[Vm-dev] In reference to: Minifying Woes

Eliot Miranda eliot.miranda at gmail.com
Mon Apr 3 13:15:19 UTC 2023


Hi Tom,


> On Apr 3, 2023, at 2:17 AM, Tom Beckmann <tomjonabc at gmail.com> wrote:
> 
> 
> Hi Eliot,
> 
> many thanks for the detailed insights! I have since started
> familiarizing myself with the VM source code.
> 
> Reducing the size of the remembered set object seemed easy enough (and
> indeed, the image file got significantly smaller after my change).

Have you contributed back your change?  There’s a VMMakerInbox for this.

> However, as Tom (Braun) pointed out previously, the generated images
> are crashing. My investigation so far did not yield any meaningful
> insights. The VM crashes in initializeObjectMemory when it attempts to
> access the first objects as they are pointing to incorrect memory
> regions. Oddly, the byte offsets match those that are supposed to have
> been written but the contents seem to be incorrect.
> 
> Would you happen to have any pointers on how to best approach debugging
> this?

I took a look and found the same thing.  So far I haven’t been able to make sense of it. I am loading the image in the simulator, and getting that crash.  I haven’t had time to look in sufficient detail.  But since one is looking at very few objects at the start of memory I would be looking in detail at the object headers in the heap by inspecting the memory array itself and checking that the objects were as expected.  Maybe you and I could find time to pair this week.  I’m skiing today and tomorrow.  Maybe Thursday morning PST/evening CET? 
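
For checking headers by hand, a header word read from the memory array can be picked apart in a workspace along these lines. This is only a sketch: it assumes the 64-bit Spur header layout (classIndex in bits 0-21, format in bits 24-28, identityHash in bits 32-53, numSlots in bits 56-63), and the header value and variable names are made up for illustration, not taken from the crashing image.

| header classIndex format hash numSlots |
header := 16r0200000A0A000010. "made-up example word, not from a real image"
classIndex := header bitAnd: 16r3FFFFF. "low 22 bits"
format := (header bitShift: -24) bitAnd: 16r1F. "5 bits"
hash := (header bitShift: -32) bitAnd: 16r3FFFFF. "22 bits"
numSlots := (header bitShift: -56) bitAnd: 16rFF. "top 8 bits"
{classIndex. format. hash. numSlots}

A numSlots of 255 would mean the real slot count is in an overflow word preceding the header, so the first object in memory is worth checking for that too.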

> 
> Best,
> Tom
> 
>> On Fri, 2023-03-24 at 11:49 -0700, Eliot Miranda wrote:
>>  
>> Hi Tom,
>> 
>>> On Fri, Mar 24, 2023 at 11:04 AM Tom Braun <me at tom-braun.de> wrote:
>>>  
>>> Hi all,
>>> 
>>> I am new to the list, so I couldn’t answer directly to Minifying
>>> Woes (or at least didn’t know how to).
>>> 
>>> I identified 75 VM-internal objects in Tom’s (tobe) image by loading
>>> it in the simulator, setting a halt after the image had loaded, and
>>> running the following code:
>>> 
>>> | collection |
>>> collection := OrderedCollection new.
>>> self allOldSpaceEntitiesDo: [:obj |
>>>     (((self classIndexOf: obj) < self lastClassIndexPun)
>>>         and: [(self isImmediate: obj) not])
>>>             ifTrue: [collection add: obj]].
>>> (collection sorted: [:a :b | (self bytesInBody: a) > (self bytesInBody: b)])
>>>     collect: [:ea | ea hex -> (self bytesInBody: ea)]
>>> 
>> 
>> 
>> Right.  A minor thing is that you don't need isImmediate: because by
>> definition allOldSpaceEntitiesDo enumerates over objects, not
>> immediates.  Instead you could say
>> 
>> self allOldSpaceEntitiesDo: [:obj || ci |
>>     ci := self classIndexOf: obj.
>>     (ci isZero or: [ci between: self firstClassIndexPun and: self lastClassIndexPun])
>>         ifTrue: [collection add: obj]]
>> 
>> or more simply
>> 
>> self allOldSpaceEntitiesDo: [:obj |
>>     (self classIndexOf: obj) <= self lastClassIndexPun
>>         ifTrue: [collection add: obj]]
>> 
>>> I get the following 75 objects:
>>> 
>>> 1. A free chunk after all other objects. Can be ignored for the
>>> sake of minimising image size
>>> 2. Remembered set -> 1048592 bytes
>>> 3. hiddenRootsObj -> 32848 bytes
>>> 4 - 61. Pages of the mark and weakling stacks -> 32752 bytes each
>>> 62 - 74. Arrays of the class table -> 8208 bytes each
>>> 75. specialObjectsOop -> 520 bytes
>>> 
>>> As far as I can judge the stack pages and remembered set could be
>>> removed from the image to minimize it further.
>>> If I understand correctly the StackPages could be removed by using
>>> the SpurImagePreener. 
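>>> 
>>> As a rough check of what that could save, the sizes in the list can
>>> be added up in a workspace. This is only a sketch: it takes the
>>> listed byte counts at face value, ignores the free chunk, and the
>>> variable names are made up for illustration.
>>> 
>>> | stackPages rememberedSet classTable removable total |
>>> stackPages := 58 * 32752. "entries 4 - 61"
>>> rememberedSet := 1048592.
>>> classTable := 13 * 8208. "entries 62 - 74"
>>> removable := stackPages + rememberedSet.
>>> total := removable + 32848 + classTable + 520.
>>> {removable. total} "{2948208. 3088280}"
>>> 
>>> So the stack pages and the remembered set account for almost all
>>> of the roughly 3 MB listed.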
>>> 
>> 
>> 
>> I thought that this, in SpurImagePreener>>cloneObjects, would prevent
>> any mark stack/ephemeron stack pages getting cloned.  Looks like I'm
>> wrong.
>> 
>>               (self shouldClone: obj) ifTrue:
>>                     [self cloneObject: obj]
>> 
>> shouldClone: obj
>>     ^(sourceHeap isValidObjStackPage: obj) not
>> 
>> shouldClone: might be as simple as
>> 
>> shouldClone: obj
>>     | classIndex |
>>     classIndex := self classIndexOf: obj.
>>     classIndex = 0 ifTrue: [^false]. "free objects have a class index of 0"
>>     (classIndex between: self firstClassIndexPun and: self lastClassIndexPun) ifFalse:
>>         [^true].
>>     "The hiddenRootsObject must be cloned; the remembered set must be
>>      cloned (but may be reduced in size); the classTable pages must be
>>      cloned. Anything else can be discarded."
>>     ^obj = sourceHeap hiddenRootsObject
>>     or: [obj = sourceHeap rememberedSetObj
>>     or: [classIndex = self arrayClassIndexPun]] "class table pages"
>> 
>>> The remembered set could be removed in the SpurImagePreener too
>>> (when the VM initialises the memory it initialises a new remembered
>>> set too, if it is nil).
>>> 
>> 
>> 
>> Cool; I wasn't sure if it was initialized on start-up. It does seem
>> to be. We need to check that the old one gets collected.  In fact,
>> the old one should be used if it exists, because its size is a good
>> predictor of how big it needs to be.
>>> After a quick read of the SpurImagePreener I didn’t see that the
>>> remembered set gets removed.
>>> 
>> 
>> 
>> Right. I had forgotten to do this.
>>> 
>>> When I tried using the preener it resulted in an unusable image
>>> (both simulator and compiled VM couldn’t load it).
>>> I tried both:
>>> 
>>> SpurImagePreener new 
>>>    preenImage: '/Users/tombraun/Desktop/Squeak6.0-22104-64bit copy.image'
>>> 
>>> SpurImagePreener new 
>>>    writeDefaultHeader: true;
>>>    savedWindowSize: 1@1;
>>>    preenImage: '/Users/tombraun/Desktop/Squeak6.0-22104-64bit copy.image'
>>> 
>>> Could be a me problem, as I did some changes to the memory
>>> management in my VMMaker image, although this shouldn’t influence 
>>> the preener….
>>> 
>> 
>> 
>> Well, it can be fixed :-)
>>> On this note @Eliot: why the decision to make the object stacks and
>>> the remembered set VM managed objects instead of allocating them 
>>> separately? 
>>> 1. We don’t need to keep them in a snapshot. All object stacks are
>>> empty after GC, and since we flush new space before a snapshot,
>>> the remembered set shouldn't need to be persisted either.
>>> 2. During GC we simply mark all stack pages and keep them alive.
>>> If we at least freed empty pages (above some limit, to prevent the
>>> running VM from having to allocate too many pages every GC?) I
>>> would see the value.
>>> 
>>> What did I overlook or does it simply have historical reasons?
>>> 
>> 
>> 
>> If one has good to high quality machinery for a heap manager then it
>> makes sense to use it for all allocations, not just those of the
>> mutator.  That way their memory usage is included in footprint
>> measurements etc., instead of being hidden in the C allocator. It
>> also means that it is easy to find these allocations, as above,
>> whereas if allocated in C one could easily lose space if there was a
>> leak, etc.  So in my opinion (and in others') high quality heap
>> managers should manage as much of their internal storage as possible.
>>  
>>> Best,
>>> Tom (WoC)
>>> 
>> 
>>  
>> _,,,^..^,,,_
>> best, Eliot
> 

