[squeak-dev] Crashes on snapshot with the new compactor

H. Hirzel hannes.hirzel at gmail.com
Mon Mar 27 08:12:23 UTC 2017


On 3/26/17, Ben Coman <btc at openinworld.com> wrote:
> On Sun, Mar 26, 2017 at 10:41 AM, Ben Coman <btc at openinworld.com> wrote:
>> On Sun, Mar 26, 2017 at 4:27 AM, Eliot Miranda <eliot.miranda at gmail.com>
>> wrote:
>>> Hi All,
>>>
>>>     a number of people are being affected by crashes on snapshotting the
>>> image, the worst possible time for a crash.  There is a bug in the new
>>> compactor that unfortunately bites when saving.  The compactor is invoked
>>> as
>>> part of a full garbage collect after the garbage collector has freed
>>> unreachable objects.  Normally the new compactor makes only a single pass
>>> through the heap, which may not move all the objects that are possible to
>>> move.  (The number of objects that can be moved in a single pass is
>>> limited
>>> by available free space.)  But on snapshot the compactor makes as many
>>> passes
>>> as are necessary to slide all movable objects down as far as possible.
>>> Unfortunately there is a bug in this second pass.
>>>
>>> Fixing this bug is now my priority.  I have an example image from Esteban
>>> Lorenzano to test.  I am asking anyone else that can provide an image
>>> that
>>> reliably crashes when trying to save it to make the image and changes
>>> available to me for testing if possible.
>>>
>>> In the meantime one may be able to work around the problem by doing a
>>> full
>>> garbage collect before snapshot.  This should do a GC with a single
>>> compaction pass which should not fail, and then make it much more likely
>>> that the GC during snapshot will do a single compaction pass, since fewer
>>> objects should be mobile after the single pass compaction in the explicit
>>> GC.
>>
>> Rather than avoid the problem, in which case you'll get fewer samples,
>> can we temporarily have the snapshot create a second file
>> "my.image.beforeSnapshotGC",
>> so that when it crashes, we'll have a great sample for you?
>>
>> I'm sure we are all keen (and grateful) to get a reliable compactor.
>> The pain is not so much that it crashes, but that the image is corrupted.
>> If it's possible/likely that "my.image.beforeSnapshotGC" might be renamed
>> and successfully opened, I'm sure those of us following bleeding edge
>> are capable and willing to operate like that, to help bring a faster
>> resolution.
>>
>> cheers -ben
>
> Another thing (seeing Andrei's post about a crash during a big computation):
> what would be the performance hit of creating a file
> "my.image.beforeCompaction"
> prior to *every* compaction?  The double benefit is:
> * recoverable for the user
> * a good ready-to-crash sample for you
>
> This could be a good permanent feature enabled by command line or
> in-Image setting/preference.
+1
I also suggest that such an additional image save before compacting
be added to trunk.
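
Until something like that exists, a rough image-side approximation is to
copy the image file that is already on disk before saving. This is only a
sketch for Squeak, not the feature Ben describes. It assumes the image lives
in the default directory. The '.beforeSnapshotGC' suffix is borrowed from
Ben's suggestion. It preserves the last *saved* state, not the in-memory heap
just before compaction, which would need VM support.

```smalltalk
"Sketch only: keep a copy of the last saved image so that a crash
 during snapshot cannot leave us with only a corrupted file.
 Assumption: the image file lives in FileDirectory default."
| dir imageFile backupFile |
dir := FileDirectory default.
imageFile := FileDirectory localNameFor: Smalltalk imageName.
backupFile := imageFile, '.beforeSnapshotGC'.
"Remove a stale backup first so that creating the copy does not prompt."
(dir fileExists: backupFile)
	ifTrue: [dir deleteFileNamed: backupFile].
dir copyFileNamed: imageFile toFileNamed: backupFile.
```

If the save then crashes, renaming the backup to the image name should give
back the previously saved image. The .changes file is only appended to, so
it is not copied here.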

> cheers -ben
>
>>
>>>
>>> To do this in Pharo I would put a full gc here:
>>>
>>> SessionManager>>snapshot: save andQuit: quit
>>> | isImageStarting snapshotResult |
>>> ChangesLog default logSnapshot: save andQuit: quit.
>>>
>>>>> SmalltalkImage current primitiveGarbageCollect.
>>>
>>> self currentSession stop: quit. "Image not usable from here until the
>>> session is restarted!"
>>> ...
>>>
>>> In Squeak I would put a full GC here:
>>>
>>> snapshot: save andQuit: quit withExitCode: exitCode embedded:
>>> embeddedFlag
>>> "Mark the changes file and close all files as part of
>>> #processShutdownList.
>>> If save is true, save the current state of this Smalltalk in the image
>>> file.
>>> If quit is true, then exit to the outer OS shell.
>>> If exitCode is not nil, then use it as exit code.
>>> The latter part of this method runs when resuming a previously saved
>>> image.
>>> This resume logic checks for a document file to process when starting
>>> up."
>>>
>>> | resuming msg |
>>> Object flushDependents.
>>> Object flushEvents.
>>>
>>> ...
>>> Smalltalk processShutDownList: quit.
>>>>> SmalltalkImage current primitiveGarbageCollect.
>>> Cursor write show.
>>> save ifTrue: [resuming := embeddedFlag
>>> ifTrue: [self snapshotEmbeddedPrimitive]
>>> ifFalse: [self snapshotPrimitive]]  "<-- PC frozen here on image file"
>>> ifFalse: [resuming := false].
>>>
>>> I do apologise for the bug.  I hope it will be fixed within a few days.
>>>
>>> _,,,^..^,,,_
>>> best, Eliot
>>>
>>>
>>>
>
>

