64 bit images (was: A plan for 3.8/4.0...)
jmvsqueak at uolsinectis.com.ar
Fri Apr 23 12:50:15 UTC 2004
Knowing that you are working towards a VI 4 format, and 64 bit Squeak, I
want to share with you some musings about oops and object header format. I
believe that a 64 bit system could have many advantages besides huge images.
In 64-bit Squeak, words will likely be aligned on 64-bit boundaries, so of a
64 bit pointer, only 61 bits will actually be used. The 3 free bits give
eight tag values; with one of them marking ordinary pointers, that allows for
seven immediate object classes. To choose which ones to optimize, we must
take into account:
- These objects should be immutable (equality conceptually the same as
identity, this is a must)
- There should be many instances (to save a significant amount of memory)
- Their methods should be called often, and should have primitives (to have
a significant performance gain)
I have no doubts about which should be the first two options:
- The first is of course SmallInteger (with a range of 61 bits)
- Another very useful one would be Float (with a range of 61 bits, and
primitives working with 64 bit floats. The mantissa should be completed with
3 zeroes on primitive entry, and have the last 3 bits truncated on primitive
exit.) I believe this one would also be useful in a VI 4 32-bit image. I
believe 32 bit pointers should be aligned on 32 bits, needing 30 bits, and
leaving space for 2 extra immediate objects. Floats should be one.
Primitives would work with 32 bit floats, and the mantissa should lose 2
bits (instead of 3).
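
To make the 64-bit tagging idea concrete, here is a minimal sketch in Python.
It only models the bit arithmetic, not a real VM. The tag numbering
(TAG_POINTER, TAG_SMALL_INTEGER, TAG_FLOAT) and all function names are my own
assumptions for illustration; nothing above fixes which tag value goes with
which class. The Float part follows the rule described above: drop the last 3
mantissa bits when storing, and complete them with zeroes when reading back.

```python
import struct

TAG_BITS = 3                          # freed by 8-byte alignment
TAG_MASK = (1 << TAG_BITS) - 1        # 0b111
WORD_MASK = (1 << 64) - 1             # keep values inside one 64-bit word
IMMEDIATE_CLASSES = (1 << TAG_BITS) - 1   # 8 tag values, one kept for pointers

# Hypothetical tag assignments (an assumption, not part of the proposal).
TAG_POINTER = 0
TAG_SMALL_INTEGER = 1
TAG_FLOAT = 2

SMALL_INT_MIN = -(1 << 60)            # signed 61-bit range
SMALL_INT_MAX = (1 << 60) - 1


def encode_small_integer(value):
    """Store a signed 61-bit integer in the upper 61 bits of the oop."""
    if not SMALL_INT_MIN <= value <= SMALL_INT_MAX:
        raise OverflowError("value does not fit in 61 bits")
    return ((value << TAG_BITS) & WORD_MASK) | TAG_SMALL_INTEGER


def decode_small_integer(oop):
    """Recover the signed value, sign-extending from 61 bits."""
    if oop & TAG_MASK != TAG_SMALL_INTEGER:
        raise ValueError("not a SmallInteger oop")
    payload = oop >> TAG_BITS
    if payload >= 1 << 60:
        payload -= 1 << 61
    return payload


def encode_float(value):
    """Truncate the last 3 mantissa bits and put the tag in their place."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return (bits & ~TAG_MASK) | TAG_FLOAT


def decode_float(oop):
    """Replace the tag with 3 zero bits to get a normal 64-bit float."""
    if oop & TAG_MASK != TAG_FLOAT:
        raise ValueError("not a Float oop")
    bits = oop & ~TAG_MASK
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Values such as 1.5, whose low mantissa bits are already zero, survive the
round trip exactly; a value like 0.1 comes back with a small truncation
error in the last bits. The same sketch with TAG_BITS = 2 and 32-bit words
would model the 32-bit variant.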
The following are less clear, but worth considering:
- ShortSymbol. We could have some short symbols coded in the object pointer.
This would allow shrinking the symbol dictionary, saving memory and making
symbol creation faster. The only performance improvement would be on symbol
creation, mostly when compiling methods, but it could still be useful.
ShortSymbols would only be allowed to be made of: A..Z, a..z, 0..9,
and :. They would use only 6 bits per character, and they can be up to 10
characters long. Many selectors could be ShortSymbols.
- Character. I always found it strange that a character would use more memory
space than a SmallInteger. Perhaps a good immediate character representation
could make multi-lingual and multi-alphabet strings easier. I guess Yoshiki
could think if this is a good idea. Perhaps in 61 bits we could also have
space for coding some format information: bold, italic, outline, font, size,
color, etc. Some of these bits would say to which class the character
belongs (we could have a Character hierarchy). Others would be indexes to
tables in the class (e.g. font). Perhaps there would also be a LongCharacter
or FullCharacter for those that have some property not covered by the
immediate format.
- SmallFraction with numerator and denominator up to 30 bits each. They
would need image support similar to SmallInteger / LargeIntegers to work
transparently. They would need good primitives, but derived from those in
LargePositive(Negative)Integer. This would be useful in images or
applications that use many Fractions.
- SmallPoint. A Point made of two integers (or perhaps floats) up to 30 bits
each could be a SmallPoint. Warning: SmallPoints would be immutable, and
therefore are not freely exchangeable with Points. But many operations could
be primitivized (arithmetic, cartesian / polar conversion, etc). I guess
Andreas could tell if this would be useful.
- Color. RGBAlpha, 4 x 8 bits, 4 x 12 bits, 4 x 14 bits or 4 x 15 bits. This
should probably be related to any modification to Form representation.
Perhaps it could make getting and setting a pixel's color easier. Many Form
primitives could be faster and easier to use.
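
As a rough check that these payloads fit in 61 bits, here is a small Python
sketch of two of the packings: a ShortSymbol as up to 10 characters of 6 bits
each, and a pair of 30-bit fields as would be needed for a SmallPoint or
SmallFraction. The names, the use of code 0 as "no character", the exclusion
of the empty symbol, and the choice of signed 30-bit fields are all my
assumptions; the list above only gives the alphabet and the bit counts. Tag
bits are left out here and would be added on top of the payload.

```python
import string

# 26 + 26 + 10 + 1 = 63 characters, so codes 1..63 fit in 6 bits.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + ":"
CODE_OF = {ch: index + 1 for index, ch in enumerate(ALPHABET)}  # 0 = no character
BITS_PER_CHAR = 6
CHAR_MASK = (1 << BITS_PER_CHAR) - 1
MAX_LENGTH = 10                      # 10 * 6 = 60 bits of payload


def can_be_short_symbol(text):
    """True if text is non-empty, short enough, and uses only the alphabet."""
    return 0 < len(text) <= MAX_LENGTH and all(ch in CODE_OF for ch in text)


def pack_short_symbol(text):
    """Pack the characters into an integer, first character in the low bits."""
    if not can_be_short_symbol(text):
        raise ValueError("not representable as a ShortSymbol: %r" % text)
    payload = 0
    for position, ch in enumerate(text):
        payload |= CODE_OF[ch] << (BITS_PER_CHAR * position)
    return payload


def unpack_short_symbol(payload):
    """Read 6-bit codes from the low bits until the payload is exhausted."""
    chars = []
    while payload:
        code = payload & CHAR_MASK
        if code == 0:
            raise ValueError("malformed ShortSymbol payload")
        chars.append(ALPHABET[code - 1])
        payload >>= BITS_PER_CHAR
    return "".join(chars)


FIELD_BITS = 30
FIELD_MASK = (1 << FIELD_BITS) - 1
FIELD_MIN = -(1 << (FIELD_BITS - 1))
FIELD_MAX = (1 << (FIELD_BITS - 1)) - 1


def pack_pair(first, second):
    """Pack two signed 30-bit integers (e.g. x and y) into 60 bits."""
    for value in (first, second):
        if not FIELD_MIN <= value <= FIELD_MAX:
            raise OverflowError("field does not fit in 30 bits")
    return ((first & FIELD_MASK) << FIELD_BITS) | (second & FIELD_MASK)


def unpack_pair(payload):
    """Split the payload back into two signed 30-bit integers."""
    def signed(field):
        return field - (1 << FIELD_BITS) if field > FIELD_MAX else field
    return (signed((payload >> FIELD_BITS) & FIELD_MASK),
            signed(payload & FIELD_MASK))
```

With this encoding a selector like at:put: (7 characters) qualifies, while
printString (11 characters) or + (outside the alphabet) would stay ordinary
Symbols.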
If there are some bit combinations left, it would be ok to leave them free.
In the future, we might find a good use for them, or some application
developer might want to use them for application specific objects, preparing
a specific VM that supports them.
- In the object header it could be useful to dedicate one or two bits to
represent the state of remote replicas of objects. This could help gain
performance in remote object replication and synchronization in distributed
applications and database (e.g. Gemstone) clients. These states could be
Normal, Uninitialized, Dirty and perhaps Locked. I guess that someone with
more experience with distributed systems and databases could say if this
makes any sense.
- Also, as someone said recently, an immutability bit would be useful too.
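
A sketch of how such header bits might be read and written, in Python. The
bit positions (two low bits for the replica state, the next bit for
immutability) and every name here are purely hypothetical; where these bits
would actually live in the header is an open question.

```python
from enum import IntEnum


class ReplicaState(IntEnum):
    """The four states fit exactly in two bits."""
    NORMAL = 0
    UNINITIALIZED = 1
    DIRTY = 2
    LOCKED = 3


# Hypothetical positions inside the header word.
STATE_SHIFT = 0
STATE_MASK = 0b11 << STATE_SHIFT
IMMUTABLE_BIT = 1 << 2


def replica_state(header):
    """Extract the replica state from a header word."""
    return ReplicaState((header & STATE_MASK) >> STATE_SHIFT)


def with_replica_state(header, state):
    """Return a header with the state replaced and other bits untouched."""
    return (header & ~STATE_MASK) | (int(state) << STATE_SHIFT)


def is_immutable(header):
    """True if the immutability bit is set."""
    return bool(header & IMMUTABLE_BIT)


def with_immutable(header, flag):
    """Return a header with the immutability bit set or cleared."""
    return header | IMMUTABLE_BIT if flag else header & ~IMMUTABLE_BIT
```

A store primitive could then test is_immutable before writing, and a
replication layer could mark an object DIRTY on the same path.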
I hope to be of some use to you with this.
I also want to tell you that I am deeply happy and thankful for your
returning to further develop Squeak. Thank you.