[Vm-dev] Re: [Pharo-project] Plan/discussion/communication around new object format

Wed Jun 13 18:50:51 UTC 2012

On Mon, Jun 11, 2012 at 1:36 AM, Igor Stasenko <siguctua at gmail.com> wrote:

>
> Some extra ideas.
>
> 1. Avoiding extra header for big sized objects.
> I'm not sure about this, but still ..
>
> according to Eliot's design:
> 8: slot size (255 => extra header word with large size)
>
> What if we extend size to 16 bits (so in total it will be 65536 slots)
>

This simply doesn't make sense within the overall context of the header
(i.e. relatively large identityHash the same size as the class index).  A
large size field increases the size of the header for all objects.  An
extra size word for large objects only increases the size (probably by 8
bytes) for large objects.  But that's a very small percentage overhead of
at most 8 / (256 * 4), or 0.8%.  Few objects are large.  The bulk of
objects are smaller than 256 slots.  It's a no-brainer; have a small size
field and overflow only for large objects.
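
To put numbers on that estimate, here is a minimal Python sketch of the
overhead calculation. The constants are assumptions taken from the figures
above (4-byte slots, an 8-byte overflow word, overflow starting at the
255 marker); none of the names come from an actual VM.

```python
# Assumed figures, matching the 8 / (256 * 4) estimate above.
SLOT_BYTES = 4            # bytes per slot (32-bit case)
OVERFLOW_WORD_BYTES = 8   # cost of the extra size word
OVERFLOW_MARKER = 255     # size-field value meaning "see extra header word"


def overflow_overhead(num_slots):
    """Fraction of the object body spent on the extra size word.

    Objects small enough for the 8-bit size field pay nothing.
    """
    if num_slots < OVERFLOW_MARKER:
        return 0.0
    return OVERFLOW_WORD_BYTES / (num_slots * SLOT_BYTES)


if __name__ == "__main__":
    for n in (10, 256, 1024, 65536):
        print(n, "slots:", format(overflow_overhead(n), ".4%"))
```

The overhead is largest for the smallest objects that overflow and shrinks
as objects grow, which is why the cost stays well under 1%.
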

> and we have a single flag, indicating how to calculate the object size:
>
> flag(0)   object size = (size field) * 8
> flag(1)   object size = 2^(size field)
>
> which means that past 2^16 (or how many bits we dedicate to size field
> in header) all object sizes
> will be power of two.
> Since most of the objects will fit under 2^16, we don't lose much.
> For big arrays, we could have a special collection/array, which will
> store the exact size in its inst var (and we don't even need to care in
> the case of Sets/Dicts/OrderedCollections).
> Also we can actually make it transparent:
>
> Array class>>new: size
>  size > (max exact size ) ifTrue: [ ^ ArrayWithBigSizeWhatever new: size ]
>
> of course, care must be taken for those variable classes which
> potentially can hold large amounts of bytes (like Bitmap).
> But I think code can be quickly adapted to this feature of the VM, which
> will simply fail a #new: primitive
> if size is not a power of two for sizes greater than the max "exact size"
> which can fit into the size field of the header.
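
For reference, the encoding proposed in the quoted text could be sketched
roughly as follows in Python. This is only an illustration under stated
assumptions: a 16-bit size field, sizes in bytes, and function names
(encode_size, decode_size) that are hypothetical.

```python
SIZE_FIELD_BITS = 16                       # width suggested in the proposal
MAX_EXACT_FIELD = (1 << SIZE_FIELD_BITS) - 1


def decode_size(flag, field):
    """flag 0: exact size in 8-byte units; flag 1: size is 2**field."""
    if flag == 0:
        return field * 8
    return 1 << field


def encode_size(nbytes):
    """Return (flag, field), or raise ValueError if not representable.

    The failure case mirrors the proposed #new: primitive failing for
    large sizes that are not a power of two.
    """
    if nbytes < 0:
        raise ValueError("negative size")
    if nbytes % 8 == 0 and nbytes // 8 <= MAX_EXACT_FIELD:
        return (0, nbytes // 8)
    if nbytes > 0 and nbytes & (nbytes - 1) == 0:
        return (1, nbytes.bit_length() - 1)
    raise ValueError("large sizes must be a power of two")
```

Under these assumptions encode_size(64) gives an exact encoding, a 1 MB
object uses the power-of-two form, and 1 MB plus 8 bytes is rejected.
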
> ----
>
> 2. Slot for arbitrary properties.
> If you read carefully, Eliot said that to make lazy become work it is
> necessary to always have some extra space per object, even if the object
> doesn't have any fields:
>
> <<We shall probably keep the minimum object size at 16 bytes so that
> there is always room for a forwarding pointer. >>
>
> So, this fits quite well with the idea of having a slot for dynamic
> properties per object. What if, instead of "extending the object" when it
> requires an extra properties slot, we just reserve the slot for
> properties at the very beginning:
>
> [ header ]
> [ properties slot]
> ... rest of data ..
>
> so, any object will have that slot. And in the case of lazy become, we
> can use that slot for holding the forwarding pointer. Voila.
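
A toy Python model of that layout, to make the idea concrete. Everything
here is hypothetical (the HeapObject class, the forwarded flag, the helper
names); it only shows one reserved slot doing double duty as a properties
slot and a forwarding pointer.

```python
class HeapObject:
    """Toy object: a header flag, one reserved slot, then the fields."""

    def __init__(self, *fields):
        self.forwarded = False     # hypothetical header bit
        self.slot = None           # properties, or the forwarding pointer
        self.fields = list(fields)


def forward(old, new):
    """Lazy become: mark old as forwarded and reuse its reserved slot."""
    old.forwarded = True
    old.slot = new


def follow(obj):
    """Chase forwarding pointers until a live object is reached."""
    while obj.forwarded:
        obj = obj.slot
    return obj
```

Note that in this sketch forwarding overwrites whatever properties the old
object had, which is acceptable only because the old object is dead once
it has been forwarded.
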
>
> 3. From 2. we go straight back to the hash. The VM doesn't need to know
> about such a thing as an object's hash; it has no semantic load inside
> the VM, it just answers those bits via a single primitive.
>
> So, why is it a kind of enforced, inherent property of all objects in
> the system? And why does nobody ask: if we have that one, why could we
> not have more than one, or as many as we want? This is my central
> question around the idea of having per-object properties.
> Once the VM guarantees that any object can have at least one slot for
> storing an object reference (the property slot),
> then the VM no longer needs to care about the identity hash.
>
> Because it can be implemented completely on the language side. But most
> of all, we are NO longer limited in how big or small hash values can be,
> which translates directly into bonuses: fewer hash collisions => more
> performance. Want a 64-bit hash? 128-bit?
> Whatever you desire:
>
> Object>>identityHash
>   ^ self propertiesAt: #hash ifAbsentPut: [ HashGenerator newHashValue ]
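
The same pattern, written out as a small Python sketch for readers who
want to try it. The class and method names are hypothetical stand-ins,
and itertools.count is used in place of HashGenerator; the hash is only
allocated the first time it is asked for.

```python
import itertools

# Stand-in for "HashGenerator newHashValue": any source of fresh values.
_hash_source = itertools.count(1)


class PropertyObject:
    """Toy object with one reserved slot holding a lazily built dict."""

    def __init__(self):
        self.properties = None     # nothing is allocated until needed

    def properties_at_if_absent_put(self, key, make_value):
        if self.properties is None:
            self.properties = {}
        if key not in self.properties:
            self.properties[key] = make_value()
        return self.properties[key]

    def identity_hash(self):
        return self.properties_at_if_absent_put(
            "hash", lambda: next(_hash_source))
```

Objects that are never hashed pay only for the empty slot, and the width
of the hash is decided by the generator rather than by the header.
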
>
> and once we have per-object properties.. and lazy become, things
> like Magma will get HUGE benefits straight out of the box.
> Because look, lazy become and immutability - those two address many
> problems related to OODB implementation
> (I can barely see other use cases where immutability would be as useful
> as in the case of an OODB)..
> so for me it is logical to take this last step: by adding arbitrary
> properties, an OODB can now store the ID there.
>
> --
> Best regards,
> Igor Stasenko.
>

-- 
best,
Eliot