[squeak-dev] 30 bit unboxed floats

Wed Oct 20 01:09:31 UTC 2010

On Tue, Oct 19, 2010 at 3:02 PM, Jecel Assumpcao Jr. <jecel at merlintec.com> wrote:

> Igor Stasenko wrote:
>
> > Btw, about GC:
> >
> > http://lambda-the-ultimate.org/node/2391
> >
> > Garbage Collection without Paging, Matthew Hertz, Yi Feng, Emery D.
> > Berger. PLDI 2005
>
> The old Mushroom papers are also worth a look:
>
> http://www.wolczko.com/mushroom/index.html
>
> About the original thread, I liked the idea Andreas had proposed of a 6
> bit class index for immediate values. It fits in well with the existing
> compact classes and is easy to optimize in software. Having worked with
> Self and Logo where characters are strings of length 1, I can say that I
> like that scheme very much and will use it in any language I design but
> don't think it is worth changing Squeak to be like this.
>
> I didn't understand Eliot's comment in his first post:
>
> > The problem 30 bit floats have is that they're not a useful subset
> > of Float.  A 60-bit immediate float is a different beast altogether.
> > You really need the number of bits to be significantly greater than
> > the size of the mantissa.  With 30 bits it is still 23 bits short.
>

Squeak uses 64 bit double precision floating-point.  VisualWorks uses both
32 bit single and 64-bit double precision floating-point.  In Squeak
therefore an immediate float of 30 bits would represent a tiny range of the
available float range, distributed sparsely within the range.  In VW I
implemented 61-bit immediate doubles in the 64-bit implementation.  These
have a full mantissa and a limited 8 bit exponent (64-bit dp has an 11 bit
exponent).  These occupy the middle of the double precision range,
approximately from +/- 10^-38 to +/- 10^38.  Do math with these values and
you get the same answers as with full 64-bit values provided they're in
range, and +/- 10^-38 to +/- 10^38 is a useful range.  Do math in Squeak
however and you either have a very small chance of representing a 64-bit
value as an immediate or you have to introduce a 32-bit single-precision
value /and/ accept a smaller range.  e.g. if you chose a 6-bit exponent (to
be able to keep  then your range is +/- 10^-9 to +/- 10^9.
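
As a rough sketch of the arithmetic above (not the VW implementation), the
following Python estimates the decimal range that a given exponent width
buys, and tests whether a 64-bit double could be kept as an immediate
without losing anything.  It assumes an IEEE-style biased exponent with the
all-zeros and all-ones patterns reserved, and it treats zero as
representable; the real VW encoding may differ.  The function names are
made up for illustration.

```python
import math

def exponent_limits(exp_bits):
    """Smallest and largest unbiased exponent of a normal number,
    assuming an IEEE-style biased exponent field (an assumption)."""
    bias = (1 << (exp_bits - 1)) - 1
    return 1 - bias, bias

def decimal_range(exp_bits):
    """Approximate range as powers of ten: (log10 of the smallest normal
    magnitude, log10 of the upper bound on magnitudes)."""
    e_min, e_max = exponent_limits(exp_bits)
    return e_min * math.log10(2), (e_max + 1) * math.log10(2)

def fits_immediate(x, exp_bits=8):
    """True if the double x keeps its full mantissa when its exponent is
    squeezed into exp_bits bits.  Zero is treated as representable
    (an assumption); infinities, NaNs and subnormals are rejected."""
    if not math.isfinite(x):
        return False
    if x == 0.0:
        return True
    _, e = math.frexp(x)          # x = m * 2**e with 0.5 <= |m| < 1
    e_min, e_max = exponent_limits(exp_bits)
    return e_min <= e - 1 <= e_max

for bits in (6, 8, 11):
    lo, hi = decimal_range(bits)
    print(f"{bits:2d}-bit exponent: about 10^{lo:.1f} to 10^{hi:.1f}")
```

With these assumptions an 8-bit exponent covers roughly 10^-38 to 10^38 and
a 6-bit exponent roughly 10^-9 to 10^9, in line with the figures quoted
above.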

So for me an immediate float type makes excellent sense in a 64-bit context
(VW's 60-bit floats are half as fast as SmallIntegers in the 64-bit VM)  but
doesn't make much sense in a 32-bit context, especially in a system that
expects 64-bit double precision.  Better spend the tag on an immediate
character type, which has plenty of benefits over the current boxed
representation (== works throughout the Unicode range, converting between
integers and characters is fast, string access is much faster because the
character table goes away).
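
To make the character case concrete, here is a minimal sketch of a tagged
immediate character in a 32-bit word.  The 2-bit tag width and the tag
value are hypothetical, chosen only for illustration; they are not taken
from any Squeak or Cog object format.

```python
TAG_BITS = 2        # hypothetical: low two bits of the word hold the tag
CHAR_TAG = 0b10     # hypothetical tag value meaning "immediate character"
TAG_MASK = (1 << TAG_BITS) - 1

def char_to_oop(code_point):
    """Encode a Unicode code point as a tagged immediate word."""
    return (code_point << TAG_BITS) | CHAR_TAG

def is_char(oop):
    """Test for character-ness by looking at the tag bits only."""
    return (oop & TAG_MASK) == CHAR_TAG

def oop_to_code_point(oop):
    """Integer conversion is a single shift, with no table lookup."""
    return oop >> TAG_BITS

# Equal code points always encode to the same word, so an identity
# comparison of the words is enough across the whole Unicode range.
a = char_to_oop(0x10FFFF)
b = char_to_oop(0x10FFFF)
print(a == b, is_char(a), hex(oop_to_code_point(a)))
```

Because the code point lives in the word itself, no table of preallocated
Character objects is needed to preserve identity.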

Does that answer your question?

> It doesn't seem that 32 bits is that much better, but it is what we have
> with FloatArrays.

Um, we can have 64-bit FloatArrays if we want too.  32-bit FloatArrays may
make sense to save space, but the Squeak model of using 64-bits for
consistent precision seems to work well.  In VW there are contortions
necessary to deal with the two different float precisions (essentially the
question of whether operations on SmallIntegers that yield floats should
yield 64-bit or 32-bit results is a thorny one).

> Let me explain my motivation for starting this discussion - when people
> hear that I am designing a hardware implementation of the SqueakVM, the
> most frequent question I get is about what kind of FPU (floating point
> unit) the device will have. My reply is that with boxed floats we have
> so much overhead (no, I don't have actual numbers...) that I prefer to
> include as little hardware support as possible. Just to give you an
> example - if you select to have a FPU in the open source Leon 3
> processor (32 bit Sparc), that will take up three times the area of the
> Leon itself. With unboxed floats, I would be more interested in having
> better hardware for them. But I don't want this hardware to be
> incompatible with the software VMs and don't want to fork over this
> issue.
>

In the immortal words of the country bumpkin asked how to get to some city,
"I wouldn't start from here".

Hardware can accelerate boxing and unboxing of floats just as Cog can.  Cog
has special machine code allocation for Floats and of course Float is a
compact class, which helps in testing for float-ness.  But IMO the current
Squeak image format is too complex and too slow and we'd do better coming up
with a better object representation and GC representation, which implies
focussing on kernel/microsqueak/tracer efforts to produce a kernel image and
VM work on implementing a new GC.

Jecel, how flexible is your design methodology?  If the bytecode set or the
object representation were to change how much work would be involved, just
some redeclaration or a manual rewrite or somewhere in between?

best
Eliot

>
> -- Jecel
>
>
>