[squeak-dev] Float hierarchy for 64-bit Spur

Eliot Miranda eliot.miranda at gmail.com
Fri Nov 21 18:25:38 UTC 2014


Hi Tobias,

On Fri, Nov 21, 2014 at 8:01 AM, Tobias Pape <Das.Linux at gmx.de> wrote:

>
> On 21.11.2014, at 15:30, Bert Freudenberg <bert at freudenbergs.de> wrote:
>
> >
> > On 21.11.2014, at 13:53, Tobias Pape <Das.Linux at gmx.de> wrote:
> >
> >> On 21.11.2014, at 13:44, Bert Freudenberg <bert at freudenbergs.de> wrote:
> >>> Also, with the 64 bit format we get many more immediate objects. There
> >>> already are immediate integers and characters, floats will be the third,
> >>> there could be more, like immediate points. For those, the small/large
> >>> distinction does not make sense.
> >>>
> >>> Maybe Eliot's idea of keeping "Float" in the name was best, but
> >>> instead of "small" use "immediate":
> >>>
> >>>     Float - BoxedFloat - ImmediateFloat
> >>>
> >>>     A Float is either a BoxedFloat or an ImmediateFloat, depending on
> >>>     the magnitude of its exponent.
> >>
> >> I don't like the idea of putting a VM/Storage detail into the Class name.
> >> The running system itself does not care about whether Floats or Integers
> >> are boxed or immediate.
> >
> > Good point. Do you have a suggestion for names reflecting that?
>
>
> First: I think it is possible to have both SmallInteger/Large*Integer as well
> as all Float stuff combined such that we only have
>         - Integer
>         - Float
> and the VM has to deal with internal stuff, i.e. representing small enough
> numbers tagged and larger ones as boxed (which could, for example, mean not
> being able to access the boxed values from the image side…).
>   However, this is “Zukunftsmusik” or “ungelegte Eier” (things to come or not
> even considered).
>

I don't find this compelling for reasons I've expressed earlier in the
thread.  Personally I think the VM shouldn't be in the business of hiding
much.  There are advantages to it hiding the machinery that connects
contexts to stack frames and methods to machine code because that allows us
to use the same system with very different VMs and that's hugely
advantageous (see the Stack VM and SqueakJS for examples).  But that
doesn't, for example, hide contexts; it just optimizes their use.

> Second: I think the small/large stuff is semantically correct, because that
> is what it is, whether immediate or not:
>         - Integer: SmallInteger, LargeInteger
>         - Float: SmallFloat, LargeFloat
> I don't think there's confusion about the single=float thing when you don't
> have the name double somewhere.
>

Agreed.



>   Rationale against immediate in the name: Immediate/Non-Immediate is a means
> to an end, which is speed for small or “few” things: ints, floats, chars.
> When you make something different immediate (just for fun: very short ASCII
> strings like "hello" stored as 0x000068656C6C6F04, with 04 being the tag),
> you shouldn't name it ImmediateString but TinyString, because that's why it
> is there: an optimization for very tiny things.
>

Agreed.  But note that I will /not/ be pursuing things like immediate
strings.  IMO this is a bad idea.  Whereas there are really compelling
arguments for immediate integers, characters and floats, there aren't for
strings or symbols.  Most strings and most symbols are longer than 7 bytes:

(ByteSymbol allInstances collect: [:ea| ea size]) sum asFloat / ByteSymbol allInstances size
    => 17.905990063082676
(ByteString allInstances collect: [:ea| ea size]) sum asFloat / ByteString allInstances size
    => 192.12565808504485

So choosing this representation doesn't save much space and loses time
because the more complex mixed representation is involved in many
operations (e.g. replaceFrom:to:with:startingAt: is now way more complex).
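
The averages above can be skewed by a few very long instances, so a complementary check is the share of instances that would actually fit in a 7-byte payload.  The following is only a workspace sketch, not something measured for this post; the 7-byte limit is the figure mentioned above, and the result will differ from image to image.

```smalltalk
"Sketch: fraction of existing symbols and strings of at most 7 bytes.
 Answers a two-element Array: {symbol share. string share}."
| symbols strings |
symbols := ByteSymbol allInstances.
strings := ByteString allInstances.
{ (symbols count: [:ea | ea size <= 7]) / symbols size asFloat.
  (strings count: [:ea | ea size <= 7]) / strings size asFloat }
```

A low share would back up the space argument directly, independent of the mean.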

In fact, I'm thinking that a 2 bit tag is probably better.  AFAIA, since I
implemented 64-bit VisualWorks with a 3 bit tag no one has added any new
immediate types.  Points don't have the necessary dynamic frequency and
indeed points with floats may be very common in newer UI architectures.
Making nil, true and false immediates doesn't have much benefit either;
they're unique values, and unique addresses work just as well as
immediates.  Essentially expanding the number of tagged types, and
especially making the tagged type organization non-uniform (see e.g. Eliot
Moss's VMs where nil, true, false have one organization, character has
another and SmallInteger another one still) makes the decode bloat, which
slows down message send.  So I think for the moment I'll go with a 2 bit
tag, giving us an even larger range for SmallDouble and SmallInteger, and
keep the simple representation:

immediates
    [62 bit value][2 bit tag]
non-immediates
    [64 bit pointer (least 3 bits 0)] ->
        [8 bit slot count][2 gc bits][22 bit hash][3 gc bits][5 bit format][2 flag bits][22 bit class index]
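
To make the layout concrete, here is a minimal workspace sketch that models 64-bit words as plain Integers.  It rests on assumptions: the tag value 1 and the sample field values are hypothetical (no tag assignments are given here), and the bit offsets assume the header fields are listed most significant first, so that the class index occupies the low 22 bits.

```smalltalk
"Sketch only: 64-bit words modelled as Integers in a workspace."
| tagMask word header fieldAt |
"Immediates: [62 bit value][2 bit tag]"
tagMask := 2r11.
word := (42 bitShift: 2) bitOr: 1.          "value 42 with hypothetical tag 1"
"Header fields, most significant first; widths sum to 64 bits."
header := (3 bitShift: 56) + (12345 bitShift: 32) + (1 bitShift: 24) + 7.
fieldAt := [:shift :width |
	(header bitShift: shift negated) bitAnd: (1 bitShift: width) - 1].
{ word bitAnd: tagMask.                     "tag"
  word bitShift: -2.                        "value"
  fieldAt value: 56 value: 8.               "slot count"
  fieldAt value: 32 value: 22.              "hash"
  fieldAt value: 24 value: 5.               "format"
  fieldAt value: 0 value: 22 }              "class index"
```

Evaluated with print-it, this answers the six numbers 1, 42, 3, 12345, 1 and 7.  Decoding an immediate is a single mask and a single shift whatever the type, which is the uniformity the 2 bit tag is meant to keep.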
-- 
best,
Eliot