Image format proposals... Re: [SqF]Report of VI4 Project for Feb '02

Scott A Crosby crosby at qwes.math.cmu.edu
Sun Feb 3 08:55:25 UTC 2002


On Sun, 3 Feb 2002, Hans-Martin Mosner wrote:

> Martin McClure wrote:
>
> > At 9:06 PM -0500 2/2/02, Scott A Crosby wrote:
> > >  >
> > >>  This second class of usage is the primary motivation for adding the
> > >>  header bit. All the alternatives to this that I've seen either
> > >>  severely limit the functionality or are extremely ugly.
> > >>
> > >
> > >Do you really need a header bit? What about just reserving a seperate
> > >range of memory for such objects, then, you just see if its in that range
> > >before deciding whether or not to allow the mutation.
> >
> > This would be of some use, and is a conceivable compromise
> > implementation. However, I need to be able to toggle a given object
> > in and out of this state fairly frequently, and that gets more
> > complicated and less performant with a separate memory area.
> >
> > Also, when objects are created I can't always tell whether they
> > should go in the separate memory area or not, and that complicates
> > things further.

Yaw, bad idea on my part. Maybe the below is more to your liking?

>
> At the moment, the image has only 15 compact classes, which means that 4 bits
> for the compact class index would suffice. However, that would prevent Squeak
> users from making their own heavily-used classes compact. Would we be willing
> to pay the price?

Or, remove compact classes entirely?

That frees up 5 whole header bits. Give 2 to the hash, reserve one for
future GC tricks, and leave two for immutable and/or other flags.

You can still make a quick VM-level way to recognize special object types,
say an extra header field in each class that is set to a magic value for
system classes.

The image gets about 6% larger. (See my last message.)

*** But ***

But if you're going to do that, why not go 'whole hog', and, for the
*exact same cost*, use the 16 or so unnecessary bits that are wasted
remembering which class is which.

--
Switch to 128/64-bit headers?

-- 64 bits
18 length (all zeros indicates long header)
 2 object type
     00 - Object contains no references
     01 - Object is a forwarding pointer.
     10 - (long header) 4th word in header is an oop to be checked in GC
     11 - Object weakly references other objects.
 4 reserved for GC.
 8 unused. (for immutable, special behavior, etc) [*]

16 hashbits
16 class# (looked up in a separate table)
--
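
A rough C sketch of the 64-bit layout above, just to show the field widths
add up and how packing might look. The bit positions (length in the low
bits, class# in the high bits) and all the names here (header_t,
make_header, get_field, ...) are my assumptions for illustration; the
proposal only fixes the widths.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths come from the proposal; the ordering is an assumption. */
enum {
    LENGTH_BITS = 18, TYPE_BITS = 2, GC_BITS = 4, FLAG_BITS = 8,
    HASH_BITS = 16, CLASS_BITS = 16
};
enum {
    LENGTH_SHIFT = 0,
    TYPE_SHIFT   = LENGTH_SHIFT + LENGTH_BITS,  /* 18 */
    GC_SHIFT     = TYPE_SHIFT + TYPE_BITS,      /* 20 */
    FLAG_SHIFT   = GC_SHIFT + GC_BITS,          /* 24 */
    HASH_SHIFT   = FLAG_SHIFT + FLAG_BITS,      /* 32 */
    CLASS_SHIFT  = HASH_SHIFT + HASH_BITS       /* 48 */
};

typedef uint64_t header_t;  /* hypothetical name for the short header */

static uint64_t field_mask(int bits) { return (UINT64_C(1) << bits) - 1; }

/* Extract one field from the header. */
static uint64_t get_field(header_t h, int shift, int bits) {
    return (h >> shift) & field_mask(bits);
}

/* Return a copy of h with one field replaced; the value must fit. */
static header_t set_field(header_t h, int shift, int bits, uint64_t v) {
    assert(v <= field_mask(bits));
    return (h & ~(field_mask(bits) << shift)) | (v << shift);
}

static header_t make_header(uint32_t length, unsigned type,
                            unsigned hash, unsigned class_index) {
    header_t h = 0;
    h = set_field(h, LENGTH_SHIFT, LENGTH_BITS, length);
    h = set_field(h, TYPE_SHIFT, TYPE_BITS, type);
    h = set_field(h, HASH_SHIFT, HASH_BITS, hash);
    h = set_field(h, CLASS_SHIFT, CLASS_BITS, class_index);
    return h;
}

/* An all-zero length field marks the 128-bit (long) header. */
static int is_long_header(header_t h) {
    return get_field(h, LENGTH_SHIFT, LENGTH_BITS) == 0;
}
```

The GC and flag fields are left at zero by make_header; set_field can fill
them in once somebody decides what they mean.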


--  128 bits.
18 length (all zeros)
 2 object type
     00 - Object contains no references
     01 - Object is a forwarding pointer.
     10 - (long header) 4th word in header is an oop to be checked in GC
     11 - Object weakly references other objects.
 4 reserved for GC.
 8 unused. (for immutable, special behavior, etc) [*]

16 hashbits
16 class# (looked up in a separate table)

32 for length
32 unused.[*]
--

[*] Unused in this example, but I'm sure we'll find good uses for them. :)
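
To make the short/long distinction concrete, here is a small C sketch of
how the VM might read an object's length under this scheme. It assumes the
18-bit length sits in the low bits of the first 64-bit word and the 32-bit
length sits in the low half of the second word; those positions and the
name object_length are mine, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHORT_LENGTH_MASK UINT64_C(0x3FFFF)    /* 18 bits */
#define LONG_LENGTH_MASK  UINT64_C(0xFFFFFFFF) /* 32 bits */

/* hdr points at the first 64-bit header word. The second word is only
 * read when the short length field is all zeros (long header). */
static uint32_t object_length(const uint64_t *hdr) {
    assert(hdr != NULL);
    uint32_t short_len = (uint32_t)(hdr[0] & SHORT_LENGTH_MASK);
    if (short_len != 0)
        return short_len;                       /* common case: 64-bit header */
    return (uint32_t)(hdr[1] & LONG_LENGTH_MASK); /* rare case: 128-bit header */
}
```

Almost every object takes the first return, so the extra test should
predict well.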

Yaw, we're limited to 64k classes, but we've got only 2 types of
header. About 4 objects in the system will need the long header. If the
consensus is that 64k classes (15x what the system currently contains)
isn't enough, shift a few bits toward that field.

And, #become becomes only as expensive as a shallowCopy; we use one of the
bits in the first header word to indicate that the other header word is
actually a forwarding pointer, pointing to the relocated object. The same
trick makes it easy to retrofit a copying collector onto Squeak. If GC sees
a forwarding pointer, it updates all references to the object to point to
the forwarded-to object.
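
A toy C model of the forwarding trick, assuming a simplified object with an
explicit type field and forward pointer instead of real packed header bits.
The struct and the names new_object, relocate and resolve are made up for
illustration. The point is only that relocating costs one shallow copy plus
two stores into the old object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { TYPE_NO_REFS = 0, TYPE_FORWARD = 1 };  /* two of the object types */

typedef struct object {
    unsigned       type;     /* stands in for the 2-bit object-type field */
    struct object *forward;  /* valid only when type == TYPE_FORWARD */
    size_t         nslots;
    long           slots[];  /* object body */
} object;

static object *new_object(size_t nslots) {
    object *o = calloc(1, sizeof(object) + nslots * sizeof(long));
    assert(o != NULL);
    o->nslots = nslots;
    return o;
}

/* Shallow-copy old into fresh storage with room for new_nslots slots,
 * then turn old into a forwarding pointer to the copy. */
static object *relocate(object *old, size_t new_nslots) {
    assert(old->type != TYPE_FORWARD && new_nslots >= old->nslots);
    object *copy = new_object(new_nslots);
    copy->type = old->type;
    memcpy(copy->slots, old->slots, old->nslots * sizeof(long));
    old->type = TYPE_FORWARD;
    old->forward = copy;
    return copy;
}

/* Follow forwarding pointers until a real object is reached. */
static object *resolve(object *o) {
    while (o->type == TYPE_FORWARD)
        o = o->forward;
    return o;
}
```

A real GC would later rewrite every reference to the old object and reclaim
it; this sketch leaves that part out.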

And, for objects with new and mystical properties, there are 32 unused bits
in the 128-bit header. Maybe throw in a reference to a transaction class or
some type of mirror-reflective object, or a proxy, or something else new
and cool. :)  And, as #become now costs as much as a #shallowCopy, it can
be used to cheaply reclassify any object into the 128-bit header.

--

Method lookup won't suffer much; we just combine the class# and the
hashbits from the selector. To get a pointer to the definition of a class,
you need an extra index into the class# array, but today you already have
to do the is-compact-class check and a lookup if so.
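
Roughly what I have in mind, as a C sketch. The cache size, the XOR
combination, and every name here (class_table, method_cache, cache_lookup,
...) are assumptions for illustration, not how the current VM does it. The
class pointer is only fetched from the table when something actually needs
the class definition; the cache probe uses the class# directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CLASS_TABLE_SIZE = 1 << 16 };              /* one slot per class# */
enum { CACHE_BITS = 10, CACHE_SIZE = 1 << CACHE_BITS };

static void *class_table[CLASS_TABLE_SIZE];       /* class# -> class object */

typedef struct {
    uint16_t    class_index;  /* class# of the receiver */
    const void *selector;     /* selector identity, to reject hash collisions */
    void       *method;       /* NULL means the entry is empty */
} cache_entry;

static cache_entry method_cache[CACHE_SIZE];

/* The one extra indexing step needed to reach the class definition. */
static void *class_for_index(uint16_t class_index) {
    return class_table[class_index];
}

/* Combine class# and the selector's hash bits into a cache slot. */
static unsigned cache_index(uint16_t class_index, uint16_t selector_hash) {
    return (unsigned)(class_index ^ selector_hash) & (CACHE_SIZE - 1);
}

static void cache_store(uint16_t class_index, const void *selector,
                        uint16_t selector_hash, void *method) {
    assert(method != NULL);
    cache_entry *e = &method_cache[cache_index(class_index, selector_hash)];
    e->class_index = class_index;
    e->selector = selector;
    e->method = method;
}

/* Returns the cached method, or NULL on a miss (full lookup needed). */
static void *cache_lookup(uint16_t class_index, const void *selector,
                          uint16_t selector_hash) {
    cache_entry *e = &method_cache[cache_index(class_index, selector_hash)];
    if (e->method != NULL && e->class_index == class_index
            && e->selector == selector)
        return e->method;
    return NULL;
}
```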

The cost to store a full 64k-entry table is excessive, but we could make
the table size either a VM-generation-time variable or a runtime variable.

Outside of GC, as soon as we see a forwarding pointer while trying to
access an object, we immediately update the reference to point to the final
object. Thus, we only follow a given forwarding pointer once. [**] Branch
prediction will be happy. :)
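
In C this might look like the sketch below. The object struct and the name
deref are hypothetical, and a plain flag stands in for the header bit. The
slot that held the stale reference gets overwritten, so the next access
through the same slot takes the fast path.

```c
#include <assert.h>
#include <stddef.h>

typedef struct object {
    int            is_forward;  /* stands in for the forwarding header bit */
    struct object *target;      /* valid only when is_forward is set */
    int            value;       /* placeholder payload */
} object;

/* Load the object a slot refers to. If the slot points at a forwarding
 * pointer, chase the chain to the final object and store that back into
 * the slot, so this slot never follows the chain again. */
static object *deref(object **slot) {
    assert(slot != NULL && *slot != NULL);
    object *o = *slot;
    if (o->is_forward) {            /* rare, so the branch predicts well */
        while (o->is_forward)
            o = o->target;
        *slot = o;
    }
    return o;
}
```

Other slots that still point at the forwarder get fixed the first time they
are used, or by the GC, whichever comes first.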

--
And in a basically stock image, this will be about 6% larger.

--

These are just random, not completely thought-out brainstorming ideas...

So dig in and tell me why it's crap or why it isn't... :)

Scott


[**] Technically wrong in some cases when there are forwarding pointers
pointing to forwarding pointers .... n>4 deep, but this is a good enough
approximation.
