[Vm-dev] Re: [Pharo-project] Plan/discussion/communication around new object format

Mon Jun 11 08:36:43 UTC 2012

Some extra ideas.

1. Avoiding extra header for big sized objects.
I am not sure about this, but still...

According to Eliot's design:
8: slot size (255 => extra header word with large size)

What if we extend the size field to 16 bits (so it covers 65536 distinct values)
and add a single flag indicating how to calculate the object size:

flag(0)  object size = (size field) * 8
flag(1)  object size = 2^(size field)

which means that past 2^16 (or however many bits we dedicate to the size field
in the header) all object sizes
will be powers of two.
Since most objects will fit under 2^16, we don't lose much.
For big arrays, we could have a special collection/array, which will
store the exact size in its inst var (and we don't even need to care in
the case of Sets/Dicts/OrderedCollections, since they already track their
own element count separately from the capacity of the backing array).
Also, we can actually make it transparent:

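Array class>>new: size
  "maxExactSize is a hypothetical selector: the largest size the header field holds exactly"
  size > self maxExactSize ifTrue: [ ^ ArrayWithBigSizeWhatever new: size ].
  ^ super new: size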

Of course, care must be taken with those variable classes which
can potentially hold large amounts of bytes (like Bitmap).
But I think code can be quickly adapted to this feature of the VM, which
will simply fail the #new: primitive
if the size is not a power of two, for sizes greater than the max "exact size"
which can fit into the size field of the header.
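
To make the two size rules concrete, here is a small model in Python (not VM
code). It assumes a 16-bit size field and 8-byte slots; the names SIZE_BITS,
SLOT_BYTES, encode_size and decode_size are made up for this sketch and are
not part of Eliot's design.

```python
SIZE_BITS = 16                     # assumed width of the header size field
MAX_EXACT = (1 << SIZE_BITS) - 1   # largest slot count that is stored exactly
SLOT_BYTES = 8                     # assumed slot width, matching "(size field) * 8"


def encode_size(slots):
    """Return (flag, field) for a requested slot count.

    Raises ValueError where the #new: primitive would fail: a size above
    MAX_EXACT that is not a power of two.
    """
    if slots < 0:
        raise ValueError("negative size")
    if slots <= MAX_EXACT:
        return (0, slots)                      # flag(0): exact slot count
    if slots & (slots - 1) != 0:
        raise ValueError("large sizes must be a power of two")
    total_bytes = slots * SLOT_BYTES
    return (1, total_bytes.bit_length() - 1)   # flag(1): exponent of the byte size


def decode_size(flag, field):
    """Return the object size in bytes from the header bits."""
    if flag == 0:
        return field * SLOT_BYTES
    return 1 << field
```

With these assumptions a request for 70000 slots is rejected, while 65536
slots is stored as an exponent and decodes back to 65536 * 8 bytes.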
----

2. Slot for arbitrary properties.
If you read carefully, Eliot said that to make lazy become work it is
necessary to always have some extra space per object, even if the object
doesn't have any fields:

<<We shall probably keep the minimum object size at 16 bytes so that
there is always room for a forwarding pointer. >>

So, this fits quite well with the idea of having a slot for dynamic
properties per object. What if, instead of "extending the object" when it
requires an extra properties slot, we just reserve the slot for
properties at the very beginning:

[ header ]
[ properties slot]
... rest of data ..

So, any object will have that slot. And in the case of lazy become, we can
use that slot for holding the forwarding pointer. Voila.
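
A toy model of this layout in Python, only to show the idea. The FORWARDED
marker and the names HeapObject, become_forward and follow are invented for
the sketch; how the real VM would mark a forwarded object is not specified
here.

```python
FORWARDED = "forwarded"   # hypothetical header marker for a forwarded object


class HeapObject:
    """Models the layout [header][properties slot][... rest of data ...]."""

    def __init__(self, *data):
        self.header = "normal"
        self.props = None        # the properties slot every object has
        self.data = list(data)   # may be empty: the slot is still there


def become_forward(old, new):
    """Lazy become: reuse old's properties slot as the forwarding pointer.

    Whatever properties old had are overwritten; old is dead after this.
    """
    old.header = FORWARDED
    old.props = new


def follow(obj):
    """Chase forwarding pointers until a non-forwarded object is reached."""
    while obj.header == FORWARDED:
        obj = obj.props
    return obj
```

Even an object created with no data fields, HeapObject(), can be forwarded,
because the properties slot always exists.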

3. From 2. we go straight back to the hash. The VM doesn't need to know
such a thing as an object's hash; it has no semantic load inside the VM,
which just answers those bits via a single primitive.

So why is it kind of an enforced, inherent property of all objects in the
system? And why does nobody ask: if we have that one, why could we not
have more than one, or as many as we want? This is my central question
around the idea of having per-object properties.
Once the VM guarantees that any object can have at least one slot for
storing an object reference (the property slot),
then the VM no longer needs to care about identity hash.

Because it can be implemented completely on the language side. But most of
all, we are NO longer limited
in how big/small hash values are, which converts directly into bonuses: fewer
hash collisions > more performance. Want a 64-bit hash? 128-bit?
Whatever you desire:

Object>>identityHash
   ^ self propertiesAt: #hash ifAbsentPut: [ HashGenerator newHashValue ]
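
The same idea modelled in Python, plus a rough estimate of why wider hashes
mean fewer collisions. This is a sketch: PropObject, its methods and
expected_colliding_pairs are invented names, the random generator stands in
for HashGenerator, and the estimate assumes uniformly random hash values.

```python
import secrets


class PropObject:
    def __init__(self):
        self.props = None   # properties slot, empty until first use

    def properties_at_if_absent_put(self, key, make):
        """Model of propertiesAt:ifAbsentPut: on top of the single slot."""
        if self.props is None:
            self.props = {}
        if key not in self.props:
            self.props[key] = make()
        return self.props[key]

    def identity_hash(self, bits=64):
        # The hash is created on first request and then stays stable.
        return self.properties_at_if_absent_put(
            "hash", lambda: secrets.randbits(bits))


def expected_colliding_pairs(n, bits):
    """Expected number of pairs sharing a hash among n uniform random hashes."""
    return n * (n - 1) / 2 / 2 ** bits
```

For a fixed number of objects, every extra hash bit halves the expected number
of colliding pairs, so the hash width becomes a choice made in the image
rather than a limit fixed by the header.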

And once we have per-object properties and lazy become, things
like Magma will get HUGE benefits straight out of the box.
Because look: lazy become and immutability, those two address many
problems related to OODB implementation
(I can barely see other use cases where immutability would be as useful
as in the case of an OODB).
So for me it is logical to take this last step: by adding arbitrary
properties, an OODB can now store the ID there.

-- 
Best regards,
Igor Stasenko.