[Vm-dev] Minifying Woes

Eliot Miranda eliot.miranda at gmail.com
Fri Mar 24 18:03:21 UTC 2023


Hi Tom,

On Fri, Mar 24, 2023 at 8:53 AM Tom Beckmann <tomjonabc at gmail.com> wrote:

>
> Hi list,
>
> we're on a bit of an adventure to try and find the minimum size of a
> Squeak image that can still run a stdio REPL. After narrowing it down to
> around 6MB, we noticed that SpaceTally reported ~3MB of objects (as opposed
> to the 6MB that were saved on-disk).
>

Cool; a fine effort.  I hope this leads to a small image running on a
minheadless VM for command-line scripting.


> After a further deep-dive (in which SqueakJS and later the VM simulator
> were of immense help), we found that there were 60 objects of class index
> 19, which took up 3MB of space in the .image file. After some digging, we
> eventually found out that class index 19 are from
> SpurMemoryManager>>sixtyFourBitLongsClassIndexPun. As we understand it the
> "pun" objects are internal clones of the built-in classes (such as
> WeakArray, Array, ...), to prevent them from being found by a user.
>

Almost.  Puns are used to separate heap objects Spur uses internally from
"user" Smalltalk objects. These are:
- the class table: a sparse table used to map the class index in every
  Smalltalk object header to the relevant class object
- the remembered set: the objects in old space that reference new objects and
  are hence roots for scavenging
- the mark stack: the stack used while marking old space; it holds objects
  that are being scanned for unmarked objects to scan and mark
- the ephemeron stack: the stack used to hold potentially triggerable
  ephemerons found during scan-mark

Some of these objects look like raw data, some of them look like arrays.
But all of them should be invisible to Smalltalk.  So setting their class
index to a pun hides them from allObjects and allInstances.  You'll find
that the first few class indices, 1, 2 & 4, are those for the immediate
classes (e.g. {SmallInteger. Character. SmallFloat64} collect:
#identityHash).  Then you'll find that the lowest class identityHash is 32,
that of LargeNegativeInteger, hence:
    ((Smalltalk specialObjectsArray select: [:e | e isBehavior])
        collect: [:b | {b identityHash. b}])
        sort: [:aa :ab | aa first < ab first]

The class indices from 8 through 31 are used for puns.
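
One way to sanity-check that claim in a workspace, reusing the expression
above, is to collect the behaviors in the special objects array whose
identityHash falls in the pun range. This is only a sketch: it inspects just
the classes in specialObjectsArray, so an empty result is consistent with the
claim rather than proof of it.

```smalltalk
"Answer the special-objects behaviors whose class index lies in 8..31.
 This should come back empty if those indices are reserved for puns."
(Smalltalk specialObjectsArray select: [:e | e isBehavior])
	select: [:b | b identityHash between: 8 and: 31]
```

The check deliberately sticks to specialObjectsArray, as the expression above
does, rather than walking every class in the image.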

> We even managed to locate one of the larger class-index=19 objects with the
> help of Tom (WoC): the hiddenRootsObj contains in its 4099's slot the
> RememberedSet, which in our image was just over 1MB in size.
>
> Now, we're wondering whether we can get closer to our goal of getting to
> the smallest possible on-disk image size (don't ask why, at this point it's
> more of a challenge...). Does the RememberedSet need to be persisted or
> could we (easily?) nil it before saving to disk? Are there other low
> hanging fruits in terms of VM-internal objects that could be freed during
> snapshot generation?
>

It must be persisted. But it doesn't need to be that big.  There is a tool
in the VMMaker for eliminating this wasted space: SpurImagePreener.  I
can't guarantee that it currently prunes the remembered set (I just
checked; it doesn't; should I fix it or would you like to fix it? It might
be empowering for me to leave it to you).

So you do e.g. SpurImagePreener new preenImage: 'trunk', and it outputs a
hopefully shrunk trunk-preen.image. See SpurImagePreener's class comment.
If you look at SpurImagePreener>>#cloneObjects you'll see how to reduce the
size of the remembered set (currently it only handles the free lists).
I'll fix the mark stack, as the format of pages on the mark stack is a bit
tricky, but I'll leave it to you to fix the remembered set size.
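
As a rough illustration of "it doesn't need to be that big", here is a toy
workspace model. It is not VMMaker code and says nothing about how
SpurImagePreener actually lays out the remembered set; the variable names are
made up, and it simply assumes the waste is unused capacity behind a small
number of live entries.

```smalltalk
"Toy model only: an over-allocated array standing in for the remembered set.
 All names here are hypothetical; none of this is VMMaker code."
| oversized usedCount trimmed |
oversized := Array new: 131072.   "capacity left over from an earlier peak"
usedCount := 3.                   "entries actually in use"
oversized at: 1 put: #a; at: 2 put: #b; at: 3 put: #c.

"Keep the live entries, drop the unused tail."
trimmed := oversized copyFrom: 1 to: usedCount.
{oversized size. trimmed size}
```

The same idea, applied while the preener clones objects into the output
image, is presumably what shrinking the remembered set would amount to.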


>
> Best,
> Jens (jl) and Tom (tobe)
>
> PS: A not-so-clean version of the minification process can be found here
> https://github.com/hpi-swa-lab/cloud-squeak
> We're in the process of cleaning it up and might send out a proper
> announcement once it's pretty.
>

Super cool!

_,,,^..^,,,_
best, Eliot