[Vm-dev] Re: [squeak-dev] Spur with Immediate Floating Point
Support implies a break
btc at openInWorld.com
Thu Dec 4 22:46:03 UTC 2014
Bert Freudenberg wrote:
>> On 04.12.2014, at 04:18, Levente Uzonyi <leves at elte.hu> wrote:
>> Hi Eliot,
>> On Wed, 3 Dec 2014, Eliot Miranda wrote:
>>> SmallFloat64 is an immediate tagged representation, like SmallInteger, so
>>> they fit within an object pointer and have no header. In 64-bit Spur there
>>> is a 3-bit tag, leaving 61 bits. SmallFloat64 steals 3 bits from the 11-bit
>>> exponent to donate to the tags, representing a full double precision
>>> floating-point value that is restricted to the ~ +/-10^+/-38 range.
>>> There's really no practical way to shoe-horn a usable range of 64-bit float
>>> into a 30-bit value. It's possible but so few values would fit that the
>>> effort would be counter-productive. Does this make sense now?
>> I didn't mean to use 30-bit values. I meant to use the same 61-bit representation as with the 64-bit Spur.
>> The object header is 64 bits long in both 32-bit and 64-bit Spur, right?
>> If yes, then why is it not possible to detect the tag of SmallFloat64 in a 32-bit VM, and treat the object as immediate?
> Because that is not what "immediate" means. There is no header, and not even an object. The value is encoded in the oop itself. You can't fit 61 bits in a 32 bit oop.
> I explained this previously, but I'll paste again:
>> The Squeak VM (and Cog and Spur) traditionally use 32 bits to identify an object. When you store a reference to an object into some other object, the VM actually stores a 32 bit word to some place in main memory.
>> When you use a Float in your code, the VM actually allocates 96 bits somewhere in memory (a 32-bit header for house keeping and 64 bits for the IEEE double) and gives you a 32-bit word back, which is a pointer to that object (we also call that an "oop"). This is called "boxing", it wraps the double inside an object. When you add two floats (say 3.0 + 4.0), the VM actually creates two objects and hands you back their oops (e.g. the two hexadecimal numbers @12345600 and @1ABCDE00). Then to add them, the VM reads 64 bits from the memory addresses 12345604 and 1ABCDE04 (skipping the object header), adds these two doubles, allocates another 96 bits in memory (say @56780000), and writes 64 bits of the result to the address 56780004.
>> If this sounds expensive to you, that's because it is. It is even more expensive than that because we have just created 3*96 = 288 bits of garbage that needs to be cleaned up later, otherwise we would soon run out of memory if we keep allocating. Since everything in Smalltalk is an object, that is what the VM has to do.
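To make the cost concrete, here is a minimal Python sketch of the boxed addition described above. It only models the sizes (a 32-bit header plus a 64-bit IEEE double per Float); the names box_float, unbox_float and boxed_add are made up for illustration, and the header value 0 is a placeholder, not what a real VM would store.

```python
import struct

def box_float(value: float) -> bytes:
    # "<Id" = little-endian, no padding: 32-bit header followed by a 64-bit double.
    # The header content (0) is a placeholder for this sketch.
    return struct.pack("<Id", 0, value)

def unbox_float(boxed: bytes) -> float:
    # Skip the 4-byte header and read the 8-byte double behind it.
    return struct.unpack_from("<d", boxed, 4)[0]

def boxed_add(a: float, b: float) -> list:
    # Two allocations for the operands, one more for the result.
    heap = [box_float(a), box_float(b)]
    heap.append(box_float(unbox_float(heap[0]) + unbox_float(heap[1])))
    return heap

heap = boxed_add(3.0, 4.0)
bits_allocated = sum(len(obj) * 8 for obj in heap)
print(bits_allocated, unbox_float(heap[2]))
```

Each boxed Float takes 12 bytes (96 bits), so the three objects involved in one addition account for the 288 bits mentioned above.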
>> But there is a trick. The VM uses it to avoid all this allocating and memory fetching for the most common operations, namely working with smallish integers, which are used everywhere.
>> That trick is to hide some data in the oop itself. In the 32 bits of object pointers, the lowest two bits are actually always 0, because objects are always allocated at addresses that are a multiple of 4 (32 bits = 4 bytes). If these are always 0, we don't actually need to store them. But since there is no good way to store just 30 bits, we can also use those two bits for something else.
>> And we do. The VM currently just uses one bit, the least significant bit (LSB). If the LSB is 0, this is a regular pointer to an object in main memory. If the LSB is 1, then the VM uses the other 31 bits to store an integer. Inside the oop itself, not at some place in memory! It does not need to be allocated, or garbage-collected. It's just there, hidden inside the 32-bit oop.
>> This makes operations on these "small integers" extremely efficient. To add e.g. 3 and 4, the VM gets the oops @00000007 and @00000009, shifts them 1 bit to get the actual integers (7 >> 1 = 3 and 9 >> 1 = 4), adds them, and shifts it back, sets the LSB, and answers @0000000F. All this happens in CPU registers, no memory access needed, which is why this is so fast. Access to main memory is orders of magnitude slower than register access.
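The tagging scheme just described can be sketched in a few lines of Python. This assumes 32-bit oops with the LSB as the SmallInteger tag, as in the text; the function names are hypothetical, and overflow checking (which a real VM must do before answering a SmallInteger) is left out.

```python
OOP_MASK = 0xFFFFFFFF  # model a 32-bit oop

def is_small_integer(oop: int) -> bool:
    # LSB 1 means "immediate integer", LSB 0 means "pointer to an object".
    return oop & 1 == 1

def encode_small_integer(value: int) -> int:
    # Shift left by one and set the tag bit.
    return ((value << 1) | 1) & OOP_MASK

def decode_small_integer(oop: int) -> int:
    # Reinterpret the 32-bit word as signed, then shift the tag bit out.
    if oop & 0x80000000:
        oop -= 1 << 32
    return oop >> 1

def add_small_integers(a_oop: int, b_oop: int) -> int:
    # Untag, add, retag. No overflow check in this sketch.
    return encode_small_integer(decode_small_integer(a_oop) + decode_small_integer(b_oop))

print(hex(add_small_integers(0x00000007, 0x00000009)))
```

With the oops from the example, 3 is 0x7, 4 is 0x9, and their sum 7 comes back as 0xF.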
>> We call that an "immediate object". The Squeak VM currently uses only one kind of immediate object, although there could be more, since we still have an unused bit. It would be great to speed up floating point operations, too. But there is no way to hide a 64-bit double in a 32-bit oop.
>> Which brings us to the proposed 64-bit object format. Objects are allocated in chunks of 64 bits = 8 bytes, meaning addresses are multiples of 8, leaving the 3 lowest bits for identifying immediate objects.
>> But there still is no way to hide a 64-bit double inside a 64-bit oop, because the VM needs at least 1 bit to distinguish between regular object pointers and immediate objects.
>> So Eliot is proposing a 61-bit immediate Float which (just like SmallIntegers) the VM can process using register operations only. This will be a major boost for most floating point operations (as long as your values are not larger than 10^38).
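As a rough illustration of how a double can be squeezed into 61 bits by giving up 3 exponent bits, here is a simplified Python sketch. It is not claimed to be the exact bit layout Spur uses: the tag value, the field order, the exponent offset of 896 and the function names are all assumptions chosen for the example. Values whose exponent does not fit the 8-bit window answer None, meaning they would have to stay boxed.

```python
import struct

FLOAT_TAG = 0b100      # assumed 3-bit tag pattern, illustration only
EXP_OFFSET = 896       # 1023 - 127: maps the 11-bit exponent onto an 8-bit window
MANTISSA_MASK = (1 << 52) - 1

def double_to_bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_double(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def try_encode_small_float(x: float):
    bits = double_to_bits(x)
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & MANTISSA_MASK
    if exponent == 0 and mantissa == 0:
        small_exp = 0                      # +0.0 or -0.0
    elif EXP_OFFSET < exponent <= EXP_OFFSET + 255:
        small_exp = exponent - EXP_OFFSET  # fits in 8 bits (1..255)
    else:
        return None                        # out of range: needs a boxed Float
    payload = (sign << 60) | (small_exp << 52) | mantissa  # 1 + 8 + 52 = 61 bits
    return (payload << 3) | FLOAT_TAG

def decode_small_float(oop: int) -> float:
    assert oop & 0b111 == FLOAT_TAG
    payload = oop >> 3
    sign = payload >> 60
    small_exp = (payload >> 52) & 0xFF
    mantissa = payload & MANTISSA_MASK
    exponent = 0 if small_exp == 0 else small_exp + EXP_OFFSET
    return bits_to_double((sign << 63) | (exponent << 52) | mantissa)

print(decode_small_float(try_encode_small_float(3.0) ), try_encode_small_float(1e300))
```

All 52 mantissa bits survive, so precision is unchanged; only the range shrinks, to magnitudes between roughly 10^-38 and 10^38 in this sketch.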
btw, I forgot to say so when you last wrote that, it was enlightening -
thanks for taking the time to write it.