[Vm-dev] Immediate and heap objects

Bert Freudenberg bert at freudenbergs.de
Fri Dec 5 01:05:18 UTC 2014


I just thought of a unified explanation for immediate and non-immediate objects. It somewhat inverts the notion of "normal", but maybe this way it is easier to understand?

--------------------------------------------------------------

In Squeak, everything is an "object". Each object has a reference to another object defining its behavior. This is called the object's "class". Many objects can reference the same class object; they are called the class's "instances". In addition to the class reference, an object may hold other data, the so-called "instance data". The interpretation of this data is defined by the class.

Each object is stored in main memory using at least 1 machine word. Different variants of Squeak use either 32-bit or 64-bit words. For efficiency reasons, the storage format for an object is akin to a "Huffman code", using fewer bits and words for more common kinds of objects.

How exactly the object's bits encode the class and instance data is not visible to the user. The Virtual Machine transparently handles the details and makes all objects appear alike.

Some objects encode both the class reference and instance data in 1 word. These are called "immediate objects".

Most objects do not fit in 1 word. These have a second part dynamically allocated on the heap. They are called "heap objects".

The 1-word first part (the only word in immediates) is called an "oop". It is used to reference an object from another object's instance data.

The oop has some "tag bits" and some data bits. The tag bits encode the class, and the data bits encode the instance data. One special combination of tag bits is reserved to denote heap objects. The other combinations of tag bits correspond to different classes of immediate objects.

32-bit oops have 2 tag bits. This allows four combinations of tag bits (00, 01, 10, 11). The tags 01 and 11 are used for immediate "SmallInteger" instances, which represent signed numbers between -1073741824 and 1073741823. The tag 10 will be used in Spur for immediate Characters.
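To see why SmallIntegers occupy two of the four tags, here is a minimal Python sketch of the 32-bit case. It assumes the integer's value sits in the upper 31 bits and only the lowest bit marks the oop as a SmallInteger, so the second tag bit is really part of the number. The function names are made up for illustration and are not the VM's own.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1

SMALL_INT_MIN = -(1 << 30)      # -1073741824
SMALL_INT_MAX = (1 << 30) - 1   #  1073741823

def encode_small_integer(value):
    """Pack a signed integer into a 32-bit oop whose lowest bit is 1."""
    if not SMALL_INT_MIN <= value <= SMALL_INT_MAX:
        raise OverflowError("value does not fit in a 31-bit SmallInteger")
    return ((value << 1) | 1) & WORD_MASK

def is_small_integer(oop):
    """Tags 01 and 11 both have the lowest bit set."""
    return oop & 1 == 1

def decode_small_integer(oop):
    """Recover the signed value from the upper 31 bits of the oop."""
    if not is_small_integer(oop):
        raise ValueError("oop is not a SmallInteger")
    payload = oop >> 1           # 31-bit unsigned payload
    if payload >= 1 << 30:       # top payload bit set: negative number
        payload -= 1 << 31
    return payload
```

An even value such as 2 encodes to an oop ending in 01, an odd value such as 3 to one ending in 11, which is why both tags have to be given to SmallInteger.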

64-bit oops have 3 tag bits in Spur. Only half of the 8 tags are assigned at the moment: one each for SmallIntegers, Characters, and SmallFloat64s, plus the all-zero tag that denotes heap objects.

If all tag bits in an oop are zero, this denotes a heap object. In this case, the oop does not immediately encode the class and instance data, but instead it identifies a chunk of memory where that information is stored. Such an untagged oop is used as a direct pointer into the heap.
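The distinction can be sketched in a few lines of Python for the 32-bit case. This is only an illustration: the toy heap is a plain dictionary standing in for real memory, the names are hypothetical, and it assumes heap objects start on addresses that are multiples of 4, so that a pointer's two lowest bits are always zero.

```python
TAG_MASK = 0b11  # the two lowest bits of a 32-bit oop

# Hypothetical toy heap: address -> (class name, instance data).
toy_heap = {0x1000: ("Point", (3, 4))}

def classify_oop(oop):
    """Decide from the tag bits alone what kind of object an oop denotes."""
    tag = oop & TAG_MASK
    if tag == 0b00:
        return "heap object"
    if tag & 0b01:               # tags 01 and 11
        return "SmallInteger"
    return "Character"           # tag 10, in Spur

def fetch(oop):
    """Follow an untagged oop into the toy heap."""
    if classify_oop(oop) != "heap object":
        raise ValueError("immediate oops carry their data in the oop itself")
    return toy_heap[oop]         # the oop is used directly as the address
```

Here `fetch(0x1000)` looks up the class and instance data in memory, while an oop such as 7 never touches the heap at all.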

The memory layout of heap objects is specified by the object's class. If you're interested in that layout or the actual assignment of tag bits, read Clement's excellent post:
	https://clementbera.wordpress.com/2014/01/16/spurs-new-object-format/

--------------------------------------------------------------

Of course we normally call heap objects "regular objects", and as users we rarely have to care about the distinction anyway. But maybe when we do, explaining it the other way around is actually helpful ...

- Bert -

PS: Another idea would be to distinguish between "register objects" and "memory objects" and explain it in terms of CPU operations, like I did in my previous attempt. Actually, that may not be such a bad idea?
