[Vm-dev] Minifying Woes

Eliot Miranda eliot.miranda at gmail.com
Fri Mar 24 18:07:35 UTC 2023


Hi Jens & Tom,

    Ah, SpurImagePreener doesn't clone the mark stack, so nothing needs
to be done for it.  Only the remembered set needs to be shrunk.  But
empty class table pages could also be discarded, and if you could figure
out how to rehash every dictionary you could reorder class indices to
compact the class table.
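
To make the class table idea concrete, here is a toy model in Python.
It is not VMMaker code.  It assumes the table is split into fixed-size
pages (PAGE_SIZE is a made-up small value to keep the example readable),
keeps the reserved indices below 32 in place, renumbers the remaining
classes densely and drops the pages that end up empty.  The names
compact_class_table, PAGE_SIZE and FIRST_CLASS_INDEX are hypothetical.

```python
PAGE_SIZE = 4           # hypothetical; chosen small for readability
FIRST_CLASS_INDEX = 32  # per the thread: indices below 32 are immediates and puns


def compact_class_table(pages):
    """Renumber ordinary classes densely and drop empty pages.

    `pages` is a list of pages; each page is a list of PAGE_SIZE entries,
    where None means "no class at this index".
    Returns (new_pages, remap), with remap mapping old index -> new index.
    """
    flat = [entry for page in pages for entry in page]
    remap = {}
    new_flat = []
    for old_index, entry in enumerate(flat):
        if old_index < FIRST_CLASS_INDEX:
            # Reserved range: keep every slot where it is.
            new_flat.append(entry)
        elif entry is not None:
            remap[old_index] = len(new_flat)
            new_flat.append(entry)
    # Pad the last page, then split the flat list back into pages.
    while len(new_flat) % PAGE_SIZE:
        new_flat.append(None)
    new_pages = [new_flat[i:i + PAGE_SIZE]
                 for i in range(0, len(new_flat), PAGE_SIZE)]
    return new_pages, remap
```

The returned remap is the hard part in a real image: every object
header carrying an old index would need rewriting, and any collection
hashed on a class's identityHash would need rehashing afterwards.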

On Fri, Mar 24, 2023 at 11:03 AM Eliot Miranda <eliot.miranda at gmail.com>
wrote:

> Hi Tom,
>
> On Fri, Mar 24, 2023 at 8:53 AM Tom Beckmann <tomjonabc at gmail.com> wrote:
>
>>
>> Hi list,
>>
>> we're on a bit of an adventure to try and find the minimum size of a
>> Squeak image that can still run a stdio REPL. After narrowing it down to
>> around 6MB, we noticed that SpaceTally reported ~3MB of objects (as opposed
>> to the 6MB that were saved on-disk).
>>
>
> Cool; a fine effort.  I hope this leads to a small image running on a
> minheadless VM for command-line scripting.
>
>
>> After a further deep-dive (in which SqueakJS and later the VM simulator
>> were of immense help), we found that there were 60 objects of class index
>> 19, which took up 3MB of space in the .image file. After some digging, we
>> eventually found out that class index 19 comes from
>> SpurMemoryManager>>sixtyFourBitLongsClassIndexPun. As we understand it, the
>> "pun" objects are internal clones of the built-in classes (such as
>> WeakArray, Array, ...), to prevent them from being found by a user.
>>
>
> Almost.  Puns are used to separate heap objects Spur uses internally from
> "user" Smalltalk objects. These are:
> - the class table: a sparse table used to map the class indices in every
>   Smalltalk object header into the relevant class object
> - the remembered set: the objects in old space that reference new objects
>   and are hence roots for scavenging
> - the mark stack: the stack used when marking all old space objects; it
>   holds objects that are being scanned for unmarked objects to scan and mark
> - the ephemeron stack: the stack used to hold potentially triggerable
>   ephemerons found during scan-mark
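
As a rough illustration of the remembered set described above, here is
a toy model in Python (not VM code).  The names Obj, Heap, store and
scavenge_roots are made up for this sketch.  It only shows the idea: an
old-space object that stores a reference to a new-space object is
recorded once, and the recorded objects then serve as roots for a
scavenge.

```python
class Obj:
    """A toy heap object living in either old space or new space."""

    def __init__(self, name, is_old):
        self.name = name
        self.is_old = is_old
        self.slots = []
        self.is_remembered = False


class Heap:
    def __init__(self):
        self.remembered_set = []

    def store(self, holder, value):
        """Store `value` into `holder`, applying a generational write barrier."""
        holder.slots.append(value)
        if holder.is_old and not value.is_old and not holder.is_remembered:
            # First old -> new reference from this holder: remember it once.
            holder.is_remembered = True
            self.remembered_set.append(holder)

    def scavenge_roots(self):
        """New-space objects directly referenced by remembered old objects."""
        return [v for h in self.remembered_set for v in h.slots if not v.is_old]
```

Note that old-to-old stores never add an entry, so in this model the
set only needs room for the old objects that currently point into new
space.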
>
> Some of these objects look like raw data, some of them look like arrays.
> But all of them should be invisible to Smalltalk.  So setting their class
> index to a pun hides them from allObjects and allInstances.  You'll find
> that the first few class indices, 1, 2 & 4, are those for the immediate
> classes (e.g. {SmallInteger. Character. SmallFloat64} collect:
> #identityHash).  Then you'll find that the lowest class identityHash is 32,
> that of LargeNegativeInteger, hence:
>     ((Smalltalk specialObjectsArray select: [:e| e isBehavior]) collect:
> [:b| {b identityHash. b}]) sort: [:aa :ab| aa first < ab first]
>
> The class indices from 8 through 31 are used for puns.
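
The index ranges above can be summarised in a small Python sketch.
This is a toy classification built only from the numbers in this
thread, not VM code; classify_class_index and visible_objects are
hypothetical names, and indices the thread does not mention are
reported as "other" rather than guessed at.

```python
IMMEDIATE_INDICES = {1, 2, 4}  # SmallInteger, Character, SmallFloat64 per the thread
PUN_RANGE = range(8, 32)       # 8 through 31 are used for puns
FIRST_CLASS_INDEX = 32         # LargeNegativeInteger per the thread


def classify_class_index(index):
    """Say which range a class index falls into."""
    if index in IMMEDIATE_INDICES:
        return "immediate"
    if index in PUN_RANGE:
        return "pun"
    if index >= FIRST_CLASS_INDEX:
        return "class"
    return "other"  # 0, 3 and 5 through 7 are not described in the thread


def visible_objects(heap):
    """Drop pun-classed entries, the way an allObjects-style walk hides them.

    `heap` is a list of (class_index, payload) pairs.
    """
    return [obj for obj in heap if classify_class_index(obj[0]) != "pun"]
```

For example, classify_class_index(19) reports "pun", which is why the
class-index-19 objects never showed up in SpaceTally.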
>
>> We even managed to locate one of the larger class-index=19 objects with
>> the help of Tom (WoC): the hiddenRootsObj contains in its 4099th slot the
>> RememberedSet, which in our image was just over 1MB in size.
>>
>> Now, we're wondering whether we can get closer to our goal of getting to
>> the smallest possible on-disk image size (don't ask why, at this point it's
>> more of a challenge...). Does the RememberedSet need to be persisted or
>> could we (easily?) nil it before saving to disk? Are there other low
>> hanging fruits in terms of VM-internal objects that could be freed during
>> snapshot generation?
>>
>
> It must be persisted. But it doesn't need to be that big.  There is a tool
> in the VMMaker for eliminating this wasted space: SpurImagePreener.  I
> can't guarantee that it currently prunes the remembered set (I just
> checked; it doesn't; should I fix it or would you like to fix it? It might
> be empowering for me to leave it to you).
>
> So you do e.g. SpurImagePreener new preenImage: 'trunk', and it outputs a
> hopefully shrunk trunk-preen.image. See SpurImagePreener's class comment.
> If you look at SpurImagePreener>>#cloneObjects you'll see how to reduce the
> size of the remembered set (currently it only handles the free lists).
> I'll fix the mark stack, as the format of pages on the mark stack is a bit
> tricky, but I'll leave it to you to fix the remembered set size.
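
For what shrinking the remembered set amounts to, here is a minimal
Python sketch.  It is a toy model and not SpurImagePreener code: it
assumes the set is an over-allocated array with the live entries at the
front, and copies them into a right-sized array.  The function name, the
min_capacity floor and the rounding up to a power of two are all
assumptions of this sketch.

```python
def shrink_remembered_set(backing, used, min_capacity=4):
    """Return a new backing array sized to the live entries.

    `backing` is the over-allocated array and `used` the number of live
    entries at its front. The new capacity is the smallest power-of-two
    multiple of `min_capacity` that holds `used` entries.
    """
    if not 0 <= used <= len(backing):
        raise ValueError("used must be within the backing array")
    capacity = min_capacity
    while capacity < used:
        capacity *= 2
    shrunk = list(backing[:used])               # keep the live entries
    shrunk.extend([None] * (capacity - used))   # small headroom for growth
    return shrunk
```

With three live entries in a 1024-slot array, the result has 4 slots
instead of 1024.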
>
>
>>
>> Best,
>> Jens (jl) and Tom (tobe)
>>
>> PS: A not-so-clean version of the minification process can be found here
>> https://github.com/hpi-swa-lab/cloud-squeak
>> We're in the process of cleaning it up and might send out a proper
>> announcement once it's pretty.
>>
>
> Super cool!
>
> _,,,^..^,,,_
> best, Eliot
>


-- 
_,,,^..^,,,_
best, Eliot