[squeak-dev] Crashes on snapshot with the new compactor

Ben Coman btc at openinworld.com
Sun Mar 26 02:41:10 UTC 2017


On Sun, Mar 26, 2017 at 4:27 AM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> Hi All,
>
>     a number of people are being affected by crashes on snapshotting the
> image, the worst possible time for a crash.  There is a bug in the new
> compactor that unfortunately bites when saving.  The compactor is invoked as
> part of a full garbage collect after the garbage collector has freed
> unreachable objects.  Normally the new compactor makes only a single pass
> through the heap, which may not move all the objects that are possible to
> move.  (The number of objects that can be moved in a single pass is limited
> by available free space.)  But on snapshot the compactor makes as many passes
> as are necessary to slide all movable objects down as far as possible.
> Unfortunately there is a bug in this second pass.
>
> Fixing this bug is now my priority.  I have an example image from Esteban
> Lorenzano to test.  I am asking anyone else that can provide an image that
> reliably crashes when trying to save it to make the image and changes
> available to me for testing if possible.
>
> In the meantime one may be able to work around the problem by doing a full
> garbage collect before snapshot.  This should do a GC with a single
> compaction pass which should not fail, and then make it much more likely
> that the GC during snapshot will do a single compaction pass, since fewer
> objects should be mobile after the single pass compaction in the explicit
> GC.

Rather than avoid the problem, in which case you'll get fewer samples,
can we temporarily have the snapshot create a second file,
"my.image.beforeSnapshotGC"?
Then when it crashes, we'll have a great sample for you.

I'm sure we are all keen (and grateful) to get a reliable compactor.
The pain is not so much that it crashes, but that the image is corrupted.
If it's possible/likely that "my.image.beforeSnapshotGC" might be renamed
and successfully opened, I'm sure those of us following bleeding edge
are capable and willing to operate like that, to help bring a faster resolution.
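
A rough image-side approximation of that idea, as a workspace sketch for
Squeak only (Pharo's file API differs). It is an assumption, not the
VM-level change suggested above: it copies the image file already on disk,
i.e. the last successful save, not the in-memory state just before the
snapshot GC. The ".beforeSnapshotGC" suffix is just the name proposed here.

```smalltalk
"Untested sketch, Squeak only. Keeps a copy of the last good image file
 so that a crash during save cannot leave only a corrupted image behind."
| imagePath backupPath |
imagePath := Smalltalk imageName.              "full path of the running image file"
backupPath := imagePath , '.beforeSnapshotGC'. "hypothetical backup name"

"Copy the on-disk image. If backupPath already exists, Squeak may ask
 what to do with the existing file, so delete or rename old backups first."
FileDirectory default copyFileNamed: imagePath toFileNamed: backupPath.

"Eliot's workaround: an explicit full GC before the save, so that the GC
 during snapshot is more likely to need only a single compaction pass."
Smalltalk garbageCollect.

"Save without quitting."
Smalltalk snapshot: true andQuit: false.
```

If the save crashes, the backup can be renamed back to the original image
name (the .changes file is left untouched by this sketch).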

cheers -ben

>
> To do this in Pharo I would put a full gc here:
>
> SessionManager>>snapshot: save andQuit: quit
> | isImageStarting snapshotResult |
> ChangesLog default logSnapshot: save andQuit: quit.
>
>>> SmalltalkImage current primitiveGarbageCollect.
>
> self currentSession stop: quit. "Image not usable from here until the
> session is restarted!"
> ...
>
> In Squeak I would put a full GC here:
>
> snapshot: save andQuit: quit withExitCode: exitCode embedded: embeddedFlag
> "Mark the changes file and close all files as part of #processShutdownList.
> If save is true, save the current state of this Smalltalk in the image file.
> If quit is true, then exit to the outer OS shell.
> If exitCode is not nil, then use it as exit code.
> The latter part of this method runs when resuming a previously saved image.
> This resume logic checks for a document file to process when starting up."
>
> | resuming msg |
> Object flushDependents.
> Object flushEvents.
>
> ...
> Smalltalk processShutDownList: quit.
>>> SmalltalkImage current primitiveGarbageCollect.
> Cursor write show.
> save ifTrue: [resuming := embeddedFlag
> ifTrue: [self snapshotEmbeddedPrimitive]
> ifFalse: [self snapshotPrimitive]]  "<-- PC frozen here on image file"
> ifFalse: [resuming := false].
>
> I do apologise for the bug.  I hope it will be fixed within a few days.
>
> _,,,^..^,,,_
> best, Eliot
>
>
>

