[squeak-dev] A couple of memory management related questions

Eliot Miranda eliot.miranda at gmail.com
Tue Jun 15 22:32:34 UTC 2010


On Tue, Jun 15, 2010 at 2:44 PM, Ralph Johnson <johnson at cs.uiuc.edu> wrote:

> If people define classes by hand, I imagine 64k would be enough.  But
> suppose they are using light-weight classes, i.e. giving each object
> its own behavior.   Then it would be fairly easy to have more than 64K
> of them, though 1M is still a reasonable limit.
>
> What is the difference in cost between choosing 64K as the limit and
> choosing 1M?  I know, one byte, but what else would you use that byte
> for?
>

So the header format is 64 bits.  It's not about the bits.  It's about speed
of access.  We need of the order of 8 bits for GC flags (marked, free,
forwarded, weak etc).  We would like of the order of 8 bits for field size
and 8 bits for inst size, with objects >= 255 32-bit fields having an
overflow size field.  That leaves 5 bytes, 40 bits, for identity hash and
class index.  A symmetric 20 bits each gives us slightly slower class index
access but provides for 1M classes (apologies for my math earlier).  But a
16-bit class index would be faster, or at least have more compact code, on
x86, and would leave 24 bits, allowing identity hashes for non-classes of up
to 16M.  So if 64K classes is enough then a 16-bit class index field is the
way to go.  If it's tight I would stick with 20/20.  VW's 64-bit VM uses
exactly this.  It's essentially:

8 bits of GC flags (1 bit being pointers vs non-pointers)
8 bits of field size (64 bits per field)
8 bits of inst size for pointer objects/5 unused bits, 3 bits of odd size
for non-pointers, so size = 8 * # of fields - odd size
20 bits of class index
20 bits of identity hash
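
As a rough sketch of the odd-size arithmetic above (size = 8 * # of fields -
odd size) for non-pointer objects: the function names are hypothetical, and
treating a field count of 255 as the "overflow size field" case is an
assumption carried over from the earlier paragraph, not something taken from
the VW sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte length of a non-pointer object from its
   8-bit field count (64-bit fields) and 3-bit odd size. */
static inline uint32_t byte_size(uint8_t field_count, uint8_t odd_size) {
    assert(odd_size < 8);                        /* only 3 bits available */
    assert(field_count > 0 || odd_size == 0);    /* an empty object has no slack */
    return 8u * field_count - odd_size;
}

/* Hypothetical inverse: round the byte length up to whole 64-bit fields
   and record the unused bytes of the last field as the odd size. */
static inline void encode_byte_size(uint32_t nbytes,
                                    uint8_t *field_count,
                                    uint8_t *odd_size) {
    uint32_t fields = (nbytes + 7u) / 8u;
    assert(fields < 255);   /* assumption: 255 would mean "see overflow size" */
    *field_count = (uint8_t)fields;
    *odd_size = (uint8_t)(fields * 8u - nbytes);
}
```

So a 13-byte object would occupy 2 fields with an odd size of 3.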

But it's organized to put each 20-bit field at the least significant end of
its 32-bit word, the inst and field sizes in bytes and the GC flags distributed:

| field size byte | 4 flags | 20 bit id hash | inst size/odd size byte | 4 flags | 20 bit class idx |

so extracting the class idx, the high-frequency operation done on every
send, is a 64-bit fetch & mask, which in a 32-bit implementation would be a
32-bit fetch and mask.  On 32 bits it might be better to do

| field size byte | 20 bit id hash | 4 flags | 20 bit class idx | 4 flags | inst size/odd size byte |

which would be a 32-bit fetch followed by a 12-bit shift, which is typically
fewer code bytes than the 0xFFFFF mask, and fast now that we can assume a
barrel shifter.  But that would be incompatible with the 64-bit scheme.  So
I'll stick with the mask.  It's still way shorter than the Cog code for the
current compact class scheme.
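
To make the two extraction sequences concrete, here is a minimal C sketch.
It assumes the diagrams above read from most significant to least
significant bit; all names are hypothetical and this is not the actual VW
or Cog code.

```c
#include <assert.h>
#include <stdint.h>

#define INDEX_MASK 0xFFFFFu   /* 20 bits */

/* Mask layout (assumed MSB to LSB):
   | field size:8 | flags:4 | id hash:20 | inst/odd size:8 | flags:4 | class idx:20 | */
static inline uint64_t make_header_mask(uint32_t class_idx, uint32_t id_hash) {
    assert(class_idx <= INDEX_MASK && id_hash <= INDEX_MASK);
    return ((uint64_t)id_hash << 32) | class_idx;
}

/* Class index: fetch and mask with 0xFFFFF. */
static inline uint32_t class_index_mask(uint64_t header) {
    return (uint32_t)header & INDEX_MASK;
}

/* Identity hash: same mask, applied to the upper 32-bit word. */
static inline uint32_t identity_hash_mask(uint64_t header) {
    return (uint32_t)(header >> 32) & INDEX_MASK;
}

/* Shift layout, low 32-bit word only (assumed MSB to LSB):
   | class idx:20 | flags:4 | inst/odd size:8 | */
static inline uint32_t make_low_word_shift(uint32_t class_idx,
                                           uint32_t flags,
                                           uint32_t inst_size) {
    assert(class_idx <= INDEX_MASK && flags <= 0xFu && inst_size <= 0xFFu);
    return (class_idx << 12) | (flags << 8) | inst_size;
}

/* Class index: 32-bit fetch followed by a 12-bit shift, no mask needed. */
static inline uint32_t class_index_shift(uint32_t low_word) {
    return low_word >> 12;
}
```

In the mask layout the same 0xFFFFF mask serves both 20-bit fields, whichever
32-bit word is fetched; the shift layout gives that up in exchange for the
shorter shift instruction.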


Clearly with the 20/20 scheme it's easy to steal a couple of bits for more GC
flags by taking a bit from each 20-bit field.  Further, with Squeak, the
inst size byte might be pointless (although contexts are still indexable
objects with fixed fields) and so I doubt bits are at a premium.  Hence the
real question is whether I should be whorish and go for the 16-bit compact
fast class index access or not.

best
Eliot


> -Ralph
>
>