[Vm-dev] [squeak-dev] Spur with Immediate Floating Point Support implies a break

Chris Cunnington brasspen at gmail.com
Thu Dec 4 23:50:05 UTC 2014


> On Dec 4, 2014, at 6:00 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> 
> 
> 
> On Thu, Dec 4, 2014 at 2:46 PM, Ben Coman <btc at openinworld.com> wrote:
> Bert Freudenberg wrote:
> On 04.12.2014, at 04:18, Levente Uzonyi <leves at elte.hu> wrote:
> 
> Hi Eliot,
> 
> On Wed, 3 Dec 2014, Eliot Miranda wrote:
> 
> SmallFloat64 is an immediate tagged representation, like SmallInteger, so
> they fit within an object pointer and have no header.  In 64-bit Spur there
> is a 3-bit tag, leaving 61 bits.  SmallFloat64 steals 3 bits from the 11-bit
> exponent to donate to the tags, representing a full double precision
> floating-point value that is restricted to the ~ +/-10^+/-38 range.
> There's really no practical way to shoe-horn a usable range of 64-bit float
> into a 30-bit value.  It's possible but so few values would fit that the
> effort would be counter-productive.  Does this make sense now?
> I didn't mean to use 30-bit values. I meant to use the same 61-bit representation as with the 64-bit Spur.
> The object header is 64 bits long in both 32-bit and 64-bit Spur, right?
> If yes, then why is it not possible to detect the tag of SmallFloat64 in a 32-bit VM, and treat the object as immediate?
> 
> Because that is not what "immediate" means. There is no header, and not even an object. The value is encoded in the oop itself. You can't fit 61 bits in a 32 bit oop.
> 
> I explained this previously, but I'll paste again:
> 
> The Squeak VM (and Cog and Spur) traditionally use 32 bits to identify an object. When you store a reference to an object into some other object, the VM actually stores a 32 bit word to some place in main memory.
> 
> When you use a Float in your code, the VM actually allocates 96 bits somewhere in memory (a 32-bit header for house keeping and 64 bits for the IEEE double) and gives you a 32-bit word back, which is a pointer to that object (we also call that an "oop"). This is called "boxing", it wraps the double inside an object. When you add two floats (say 3.0 + 4.0), the VM actually creates two objects and hands you back their oops (e.g. the two hexadecimal numbers @12345600 and @1ABCDE00). Then to add them, the VM reads 64 bits from the memory addresses 12345604 and 1ABCDE04 (skipping the object header), adds these two doubles, allocates another 96 bits in memory (say @56780000), and writes 64 bits of the result to the address 56780004. 
> If this sounds expensive to you, that's because it is. It is even more expensive than that because we have just created 3*96 = 288 bits of garbage that needs to be cleaned up later, otherwise we would soon run out of memory if we keep allocating. Since everything in Smalltalk is an object, that is what the VM has to do.
> 
> But there is a trick. The VM uses it to avoid all this allocating and memory fetching for the most common operations, namely working with smallish integers, which are used everywhere.
> 
> That trick is to hide some data in the oop itself. In the 32 bits of object pointers, the lowest two bits are actually always 0, because objects are always allocated at addresses that are a multiple of 4 (32 bits = 4 bytes). If these are always 0, we don't actually need to store them. But since there is no good way to store just 30 bits, we can also use those two bits for something else.
> 
> And we do. The VM currently just uses one bit, the least significant bit (LSB). If the LSB is 0, this is a regular pointer to an object in main memory. If the LSB is 1, then the VM uses the other 31 bits to store an integer. Inside the oop itself, not at some place in memory! It does not need to be allocated, or garbage-collected. It's just there, hidden inside the 32-bit oop.
> 
> This makes operations on these "small integers" extremely efficient. To add e.g. 3 and 4, the VM gets the oops @00000007 and @00000009, shifts them 1 bit to get the actual integers (7 >> 1 = 3 and 9 >> 1 = 4), adds them, and shifts it back, sets the LSB, and answers @0000000F. All this happens in CPU registers, no memory access needed, which is why this is so fast. Access to main memory is orders of magnitude slower than register access.
> 
> We call that an "immediate object". The Squeak VM currently uses only one kind of immediate objects, although there could be more, since we still have an unused bit. It would be great to speed up floating point operations, too. But there is no way to hide a 64-bit double in a 32 bit oop.
> 
> Which brings us to the proposed 64-bit object format. Objects are allocated in chunks of 64 bits = 8 bytes, meaning addresses are multiples of 8, leaving the 3 lowest bits for identifying immediate objects.
> 
> But there still is no way to hide a 64-bit double inside a 64-bit oop, because the VM needs at least 1 bit to distinguish between regular object pointers and immediate objects.
> 
> So Eliot is proposing a 61-bit immediate Float which (just like SmallIntegers) the VM can process using register operations only. This will be a major boost for most floating point operations (as long as your values are not larger than 10^38).
> 
> btw, I forgot to say so when you last wrote that, it was enlightening - thanks for taking the time to write it.
> 
> If the information was in a class comment somewhere would you have found it and read it?  

Yes, but I don't know how much help it would have been. Often class comments are slivers of a bigger picture. I've read the class comment for ObjectMemory. [1] I've read Igor's pdf from Lille in 2011. 

https://www.dropbox.com/s/5r48bzzq006dgpe/JourneyInTheVM.key.pdf?dl=0

And it's still pretty confusing. The fact that it can confuse Levente Uzonyi is both liberating and a salutary lesson. And as I've opened my big mouth, I may as well put my head on the chopping block. 

- the heap readdresses 32-bit locations, so they are not the same as the addresses in memory, the numbers on the metal 
- a 32-bit address can be an immediate object, because the address can hold a 31-bit number. (i.e. @00000007) 
- a 32-bit address can be a reference to a 96-bit object somewhere. The 32-bit address leads to a header that describes the object, as laid out in the ObjectMemory comment [1]

If these things are true, then doesn't that mean every time a number is used, then that 32-bit space, which could have been an address, is now invalid as an address to an object? If that's right, then it's a tad odd, right? The more math you do then the more @00000007 and @00000008 numbers are consumed leaving fewer, of the possible 4.3 billion 32-bit words to address objects with headers. And if that's true, then some intelligence in the VM somewhere is saying: "No, that address is working as an 'immediate object' for math. You'll have to use another 32-bit address to lead you to the 96-bits that come complete with a header". 

Sometimes the number on the post office box is the datum. Other times, you need to look inside the post office box for the datum. 
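
To make the post office box picture concrete, here is a minimal sketch in Python of the tagging scheme Bert describes above for 32-bit oops. The function names are hypothetical (they are not the VM's), and overflow out of the 31-bit range is deliberately not handled; the real VM has to fall back to large integers there.

```python
# Sketch only: hypothetical helper names, 32-bit oops,
# SmallInteger tag in the least significant bit (LSB).

def is_small_integer(oop):
    """LSB = 1: the number on the box is the datum."""
    return oop & 1 == 1

def to_oop(value):
    """Shift the integer left one bit and set the tag bit."""
    return ((value << 1) | 1) & 0xFFFFFFFF

def to_int(oop):
    """Drop the tag bit and sign-extend the remaining 31 bits."""
    assert is_small_integer(oop)
    value = oop >> 1
    return value - (1 << 31) if value & (1 << 30) else value

def add_small_integers(a, b):
    """Untag, add, retag. No overflow check in this sketch."""
    return to_oop(to_int(a) + to_int(b))

three, four = to_oop(3), to_oop(4)      # the oops @00000007 and @00000009
total = add_small_integers(three, four)  # the oop @0000000F
```

An oop such as @12345600 has its LSB clear, so `is_small_integer` answers false for it and the VM would have to look inside the box.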

Chris 


[1] 
This class describes a 32-bit direct-pointer object memory for Smalltalk.  The model is very simple in principle:  a pointer is either a SmallInteger or a 32-bit direct object pointer.

SmallIntegers are tagged with a low-order bit equal to 1, and an immediate 31-bit 2s-complement signed value in the rest of the word.

All object pointers point to a header, which may be followed by a number of data fields.  This object memory achieves considerable compactness by using a variable header size (the one complexity of the design).  The format of the 0th header word is as follows:

	3 bits	reserved for gc (mark, root, unused)
	12 bits	object hash (for HashSets)
	5 bits	compact class index
	4 bits	object format
	6 bits	object size in 32-bit words
	2 bits	header type (0: 3-word, 1: 2-word, 2: forbidden, 3: 1-word)

If a class is in the compact class table, then this is the only header information needed.  If it is not, then it will have another header word at offset -4 bytes with its class in the high 30 bits, and the header type repeated in its low 2 bits.  If the object's size is greater than 255 bytes, then it will have yet another header word at offset -8 bytes with its full word size in the high 30 bits and its header type repeated in the low two bits.

The object format field provides the remaining information as given in the formatOf: method (including isPointers, isVariable, isBytes, and the low 2 size bits of byte-sized objects).

This implementation includes incremental (2-generation) and full garbage collection, each with compaction and rectification of direct pointers.  It also supports a bulk-become (exchange object identity) feature that allows many objects to be becomed at once, as when all instances of a class must be grown or shrunk.

There is now a simple 64-bit version of the object memory.  It is the simplest possible change that could work.  It merely sign-extends all integer oops, and extends all object headers and oops by adding 32 zeroes in the high bits.  The format of the base header word is changed in one minor, not especially elegant, way.  Consider the old 32-bit header:
	ggghhhhhhhhhhhhcccccffffsssssstt
The 64-bit header is almost identical, except that the size field (now being in units of 8 bytes) has a zero in its low-order bit.  At the same time, the byte-size residue bits for byte objects, which are in the low order bits of formats 8-11 and 12-15, are now in need of another bit of residue.  So, the change is as follows:
	ggghhhhhhhhhhhhcccccffffsssssrtt
where bit r supplies the 4's bit of the byte size residue for byte objects.  Oh, yes, this is also needed now for 'variableWord' objects, since their size in 32-bit words requires a low-order bit.

See the comment in formatOf: for the change allowing for 64-bit wide bitmaps, now dubbed 'variableLong'.
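
For anyone else trying to follow the comment above, here is a rough Python sketch of pulling the fields out of the old 32-bit base header word. It assumes the fields are laid out from the most significant bit down, in the order of the pattern ggghhhhhhhhhhhhcccccffffsssssstt; the function and key names are made up for illustration and are not taken from the VM source.

```python
# Sketch only: field positions are read off the pattern
# ggghhhhhhhhhhhhcccccffffsssssstt, most significant bit first.

def decode_base_header(word):
    """Split a 32-bit base header word into its six fields."""
    return {
        "gc": (word >> 29) & 0x7,              # 3 bits: mark, root, unused
        "hash": (word >> 17) & 0xFFF,          # 12 bits: object hash
        "compact_class": (word >> 12) & 0x1F,  # 5 bits: compact class index
        "format": (word >> 8) & 0xF,           # 4 bits: object format
        "size_words": (word >> 2) & 0x3F,      # 6 bits: size in 32-bit words
        "header_type": word & 0x3,             # 2 bits: 0, 1 or 3 (2 is forbidden)
    }

# An arbitrary example word, built field by field for illustration.
example = (1 << 29) | (0xABC << 17) | (6 << 12) | (2 << 8) | (3 << 2) | 3
fields = decode_base_header(example)
```

The widths add up to 3 + 12 + 5 + 4 + 6 + 2 = 32 bits, which is why a compact-class object needs only this one header word.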

>  
> cheers -ben
> 
> 
> 
> 
> -- 
> best,
> Eliot
