[BC][VI4] Forked Squeak version

Scott A Crosby crosby at qwes.math.cmu.edu
Fri Jul 12 09:13:57 UTC 2002


On Tue, 11 Jun 2002, Tim Rowledge wrote:

>
> Other people have in the past asked for bits in object headers to signal
> persistence, immovability, immutability, inscrutability, incontinence,
> in a database, in a bun, inadmissible, and probably other things. Maybe
> we can provide some of them.
>
> Not all the above are strictly image format changing, but may simply be
> something worth getting in whilst already breaking backwards
> compatibility.
>

I had a couple of other ideas, including a different header scheme that I
covered a few months ago. Roughly, it's along the lines of:

  Header word 1:
     16 bits: class index.
     1 bit: root
     2 bits: GC mark bits.
     ? bits: whatever other flags we can imagine...
               - weak,
               - immutable
               - literal (do not scan for references with GC)
  Header word 2:
     18 bits: size. (all -1 means see header word 3)
     14 bits: hash.
  (Optional)
  Header word 3:
     32 bits: size.

There's a separate class index table at the head of the image.
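
As a rough sketch of the layout above in C: the field widths come from the
list, but the bit positions (class index in the low 16 bits of word 1, size
in the low 18 bits of word 2) and all of the names are my own assumptions,
not anything the VM defines.

```c
#include <stdint.h>

/* Header word 1. Widths are from the proposal; positions are assumed. */
#define CLASS_INDEX_MASK 0x0000FFFFu      /* 16 bits: class index        */
#define ROOT_BIT         (1u << 16)       /*  1 bit : root               */
#define GC_MARK_SHIFT    17u              /*  2 bits: GC mark bits       */
#define GC_MARK_MASK     (3u << GC_MARK_SHIFT)
#define FLAG_WEAK        (1u << 19)       /* hypothetical flag positions */
#define FLAG_IMMUTABLE   (1u << 20)
#define FLAG_LITERAL     (1u << 21)       /* GC does not scan the body   */

/* Header word 2. */
#define SIZE_MASK        0x0003FFFFu      /* 18 bits: size               */
#define HASH_SHIFT       18u              /* 14 bits: hash               */
#define HASH_MASK        0x00003FFFu

/* Build word 1 from a class index and a set of already-positioned flags. */
static inline uint32_t make_header1(uint32_t class_index, uint32_t flags) {
    return (class_index & CLASS_INDEX_MASK) | (flags & ~CLASS_INDEX_MASK);
}

/* Build word 2 from a size and an identity hash. */
static inline uint32_t make_header2(uint32_t size, uint32_t hash) {
    return (size & SIZE_MASK) | ((hash & HASH_MASK) << HASH_SHIFT);
}

static inline uint32_t header_class_index(uint32_t h1) {
    return h1 & CLASS_INDEX_MASK;
}

static inline uint32_t header_gc_mark(uint32_t h1) {
    return (h1 & GC_MARK_MASK) >> GC_MARK_SHIFT;
}

static inline uint32_t header_hash(uint32_t h2) {
    return (h2 >> HASH_SHIFT) & HASH_MASK;
}
```

Every field comes out with one mask and at most one shift, whatever the
object is.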


--

First, it's a lot simpler. Class lookup, for example in the methodcache
lookup, is a lot faster: just read the first header word, mask off the
other bits, and hash the class index with the selector symbol. No compact
class bits.
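
To make that concrete, here is a minimal sketch of such a lookup in C. The
cache size, the XOR hash, the entry layout, and every name here are
assumptions for illustration only; this is not the VM's actual methodcache
code.

```c
#include <stddef.h>
#include <stdint.h>

#define CLASS_INDEX_MASK 0xFFFFu  /* assumed: class index in the low 16 bits */
#define CACHE_ENTRIES    512u     /* hypothetical size; must be a power of 2 */

typedef struct {
    uint32_t selector;     /* stand-in for the selector symbol's oop */
    uint32_t class_index;
    void    *method;       /* stand-in for the compiled method       */
} CacheEntry;

static CacheEntry method_cache[CACHE_ENTRIES];

/* One load, one mask, one XOR: no test for compact vs. non-compact class. */
static inline uint32_t cache_slot(uint32_t header1, uint32_t selector) {
    uint32_t class_index = header1 & CLASS_INDEX_MASK;
    return (class_index ^ selector) & (CACHE_ENTRIES - 1u);
}

/* Returns the cached method, or NULL on a miss (caller does a full lookup). */
static void *cache_lookup(uint32_t header1, uint32_t selector) {
    uint32_t class_index = header1 & CLASS_INDEX_MASK;
    CacheEntry *e = &method_cache[cache_slot(header1, selector)];
    if (e->method != NULL && e->selector == selector
            && e->class_index == class_index)
        return e->method;
    return NULL;
}

/* Records the result of a full lookup, overwriting whatever was in the slot. */
static void cache_store(uint32_t header1, uint32_t selector, void *method) {
    CacheEntry *e = &method_cache[cache_slot(header1, selector)];
    e->selector    = selector;
    e->class_index = header1 & CLASS_INDEX_MASK;
    e->method      = method;
}
```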

And, this saves the 5 compact class bits completely.  Second, reserving 32
bits for every class not in the compact class array seems like extreme
overkill; we can afford indirection without affecting methodcache lookup.
So, we can use a 16 bit classid (or 20 bits if 65536 classes isn't enough).

Also, this can handle a wide range of object sizes with no branches based
on header type (those are expensive). Just use the second header word.
For the few objects over, say, 256 KB, you have a third header word with
the actual size.
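
A sketch of the size decode, assuming the three header words sit next to
each other in memory and the size field is the low 18 bits of word 2 (the
layout only says "18 bits", so the position and the names are my guesses):

```c
#include <stdint.h>

#define SIZE_MASK     0x3FFFFu   /* assumed: low 18 bits of header word 2  */
#define SIZE_OVERFLOW SIZE_MASK  /* all ones: real size is in header word 3 */

/* hdr points at header word 1; word 2 and the optional word 3 follow it. */
static inline uint32_t object_size(const uint32_t *hdr) {
    uint32_t size = hdr[1] & SIZE_MASK;
    if (size == SIZE_OVERFLOW)   /* rare: only very large objects */
        size = hdr[2];
    return size;
}

/* Number of header words this object carries: 2 normally, 3 when large. */
static inline uint32_t header_word_count(const uint32_t *hdr) {
    return ((hdr[1] & SIZE_MASK) == SIZE_OVERFLOW) ? 3u : 2u;
}
```

There is one compare against the all-ones value, and it is almost never
taken, instead of a switch on header type.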

And, this scheme seems to offer a lot of extra bits for whatever we may
want it to do. We can reserve 6 bits in the first header for GC (for
experimenting with different GC algorithms) and still have 10 bits for
whatever else we wish.  We can get another 4 bits or so if we had to, by
masking off a few of the high bits in the second header.

And, it should be faster. Since there's pretty much no object that uses the
third header, the CPU can accurately predict that it'll almost never be
used. Thus, we don't have to do several hard-for-CPUs-to-predict branches
in the methodcache lookup, or in the garbage collector trying to determine
object sizes or class IDs... And, as that code is inlined all over,
getting rid of it will help the CPU's instruction cache.

True, we do pay some extra RAM for instances of classes that were formerly
compact, about another meg worth of image size, but in exchange, we should
get simpler, faster code, and 10-16 bits free in every object header.

And, if you're thinking of wiping the compact class bits anyways, you're
paying the same bloat for much less than this would give you.

Scott



