I wish to access some of the Float constants without a message send. Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats. Loosely:
-Infinity ---------------------- Zero ------------------------ +Infinity | NaN
        0 ------------------------- (2^31) ----------------------- 2^32
Here is the method I came up with to do this conversion:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
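For illustration, a quick sanity check of the ordering property this method is meant to provide (not from the thread; it assumes the #hashKey32 above has been installed as a Float method):

#(-1.0e30 -1.0 0.0 1.0 1.0e30) collect: [:f | f hashKey32].
"should answer a strictly increasing sequence of unsigned 32-bit integers"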
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
Since I'm able to put my method within the same scope as the Float class-var constants, I guess I don't need the FloatConstants pool after all, but since these are constants that extend beyond the Milky Way to the known ends of the physical universe, I can't understand why you want access to them restricted to such a tiny bottle. FloatConstants would allow better brevity and elegance of code. Why should everyone be required to write "Float pi" over and over instead of simply "Pi"?
On 17.12.2014, at 18:17, Chris Muller asqueaker@gmail.com wrote:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
This is a Float method. You can use the NaN class variable directly.
- Bert -
On Wed, Dec 17, 2014 at 11:56 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 17.12.2014, at 18:17, Chris Muller asqueaker@gmail.com wrote:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
This is a Float method. You can use the NaN class variable directly.
Thanks. I thought about that but when I searched the entire 32-bit space, the ranges
{(2139095041 to: 2147483647). (4286578689 to: 4294967295)}
reported true for isNaN and, not knowing or caring about Float representation right now, I wasn't sure whether any of those other values needed to be NaN for optimizing its internal bit-manipulation. To be "safe" I decided to use the isNaN message.
But I guess if that were true then the NaN class-var would be too dangerous to use, wouldn't it? So it seems it should be okay to compare against the class-var?
I also don't care for the #== comparison, I much prefer #= but that would be an extra send, is that right?
thanks.
Hi Chris,
On Dec 17, 2014, at 9:17 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats.
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed? If so why not...
Float>>hashKey32
    | trunc |
    trunc := self truncated.
    ^self = trunc
        ifTrue: [trunc]
        ifFalse: [(self at: 1) bitXor: (self at: 2)]
?
Loosely:
-Infinity ---------------------- Zero ------------------------ +Infinity | NaN
        0 ------------------------- (2^31) ----------------------- 2^32
Here is the method I came up with to do this conversion:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
Since I'm able to put my method within the same scope as the Float class-var constants, I guess I don't need the FloatConstants pool after all, but since these are constants that extend beyond the Milky Way to the known ends of the physical universe, I can't understand why you want access to them restricted to such a tiny bottle. FloatConstants would allow better brevity and elegance of code. Why should everyone be required to write "Float pi" over and over instead of simply "Pi"?
On Wed, Dec 17, 2014 at 12:55 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 17, 2014, at 9:17 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats.
Ah, I can see how my wording created an ambiguous meaning..
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed?
No, I meant that I need to pickle Floats as a 32-bit Integer, but while in their pickled Integer state, I need to run #> and #< comparisons against other pickled Floats (i.e., as their Integer representation) and need those comparisons to produce the same results as if they were still in their Float state.
For example, the reason I cannot simply use asIEEE32BitWord is because negative floats have the high-order bit set, and so the pickled representations don't compare correctly:
-4.321 asIEEE32BitWord < 1.2345 asIEEE32BitWord "false" <--- I need true
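For what it's worth, here is a hedged illustration of the mechanism (not from the thread): the IEEE sign bit is the most significant bit of the unsigned word, so every negative float's word sorts above every positive float's word.

(-4.321 asIEEE32BitWord bitAnd: 16r80000000) = 16r80000000.  "true - sign bit set"
(1.2345 asIEEE32BitWord bitAnd: 16r80000000) = 16r80000000.  "false - sign bit clear"
-4.321 asIEEE32BitWord > 1.2345 asIEEE32BitWord.             "true - the opposite of -4.321 > 1.2345"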
32-bit unsigned so I made -Infinity to be 0, +Infinity to be (2^32)-1. But since NaN needs representation too, I decided to put it at the top, so I bumped +Infinity down to (2^32)-2..
If so why not...
Float>>hashKey32
    | trunc |
    trunc := self truncated.
    ^self = trunc
        ifTrue: [trunc]
        ifFalse: [(self at: 1) bitXor: (self at: 2)]
?
Loosely:
-Infinity ---------------------- Zero ------------------------ +Infinity | NaN
        0 ------------------------- (2^31) ----------------------- 2^32
Here is the method I came up with to do this conversion:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
Since I'm able to put my method within the same scope as the Float class-var constants, I guess I don't need the FloatConstants pool after all, but since these are constants that extend beyond the Milky Way to the known ends of the physical universe, I can't understand why you want access to them restricted to such a tiny bottle. FloatConstants would allow better brevity and elegance of code. Why should everyone be required to write "Float pi" over and over instead of simply "Pi"?
On Wed, Dec 17, 2014 at 11:23 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:55 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 17, 2014, at 9:17 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats.
Ah, I can see how my wording created an ambiguous meaning..
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed?
No, I meant that I need to pickle Floats as a 32-bit Integer,
Now I'm really confused. How come you can get away with 32-bits when Floats are 64-bits?
but while in their pickled Integer state, I need to run #> and #< comparisons against other pickled Floats (i.e., as their Integer representation) and need those comparisons to produce the same results as if they were still in their Float state.
For example, the reason I cannot simply use asIEEE32BitWord is because negative floats have the high-order bit set, and so the pickled representations don't compare correctly:
-4.321 asIEEE32BitWord < 1.2345 asIEEE32BitWord "false" <--- I need true
So you need a sign-insensitive absolute comparison? Easy to synthesize:
((self at: 1) bitAnd: 16r7FFFFFFF) << 32 + (self at: 2)
This will put Infinity beyond the finite values.
32-bit unsigned so I made -Infinity to be 0, +Infinity to be (2^32)-1. But since NaN needs representation too, I decided to put it at the top, so I bumped +Infinity down to (2^32)-2..
Again floats are 64-bit not 32-bit so I don't see how this can work. If you want something that orders things absolutely then something like this, which is close to my immediate float representation will work. It effectively rotates, putting the sign in the lsb:
    | mostSignificantWord |
    mostSignificantWord := self at: 1.
    ^(mostSignificantWord bitAnd: 16r7FFFFFFF) << 33
        + ((self at: 2) << 1)
        + (mostSignificantWord >> 31)
This doesn't work for +/-0.0 since they have a zero exponent, but that's the only exception
If so why not...
Float>>hashKey32
    | trunc |
    trunc := self truncated.
    ^self = trunc
        ifTrue: [trunc]
        ifFalse: [(self at: 1) bitXor: (self at: 2)]
?
Loosely:
-Infinity ---------------------- Zero ------------------------ +Infinity | NaN
        0 ------------------------- (2^31) ----------------------- 2^32
Here is the method I came up with to do this conversion:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
Since I'm able to put my method within the same scope as the Float class-var constants, I guess I don't need the FloatConstants pool after all, but since these are constants that extend beyond the Milky Way to the known ends of the physical universe, I can't understand why you want access to them restricted to such a tiny bottle. FloatConstants would allow better brevity and elegance of code. Why should everyone be required to write "Float pi" over and over instead of simply "Pi"?
On Wed, Dec 17, 2014 at 11:23 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:55 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 17, 2014, at 9:17 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats.
Ah, I can see how my wording created an ambiguous meaning..
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed?
No, I meant that I need to pickle Floats as a 32-bit Integer,
Now I'm really confused. How come you can get away with 32-bits when Floats are 64-bits?
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into an indexing system that operates on 32-bit integers (it can operate at any size, even 256-bit, but it operates much faster in a 32-bit range due to smaller and fewer LargeIntegers, and performance takes precedence).
but while in their pickled Integer state, I need to run #> and #< comparisons against other pickled Floats (e.g., as their Integer representation) and need those comparisons to produce the same results as if they were still in their Float state.
For example, the reason I cannot simply use asIEEE32BitWord is because negative floats have the high-order bit set, and so the pickled representations don't compare correctly:
-4.321 asIEEE32BitWord < 1.2345 asIEEE32BitWord "false" <--- I need true
So you need a sign-insensitive absolute comparison? Easy to synthesize:
((self at: 1) bitAnd: 16r7FFFFFFF) << 32 + (self at: 2)
No, I need it to be sign-sensitive. That does not pass the example I gave. Here are the two number lines again from my original email. I need to map Floats from:
-Infinity<-----------> +Infinity
to Integers in the range:
0<------------>((2^32)-1)
-Infinity needs to map to 0 and +Infinity to (2^32)-1.
This will put Infinity beyond the finite values.
32-bit unsigned so I made -Infinity to be 0, +Infinity to be (2^32)-1. But since NaN needs representation too, I decided to put it at the top, so I bumped +Infinity down to (2^32)-2..
Again floats are 64-bit not 32-bit so I don't see how this can work.
Squeak Floats are 64-bit, but they can easily be converted to 32-bit floats with a loss of precision.
If you want something that orders things absolutely then something like this, which is close to my immediate float representation will work. It effectively rotates, putting the sign in the lsb:
    | mostSignificantWord |
    mostSignificantWord := self at: 1.
    ^(mostSignificantWord bitAnd: 16r7FFFFFFF) << 33
        + ((self at: 2) << 1)
        + (mostSignificantWord >> 31)
This doesn't work for +/-0.0 since they have a zero exponent, but that's the only exception
It benchmarks 18% faster than my range-checking version but fails the example.
-4.321 eliotHashKey32 < 1.2345 eliotHashKey32 "false" <--- I need true
+/- 0.0 doesn't matter, I only need one 0.0.
On Wed, Dec 17, 2014 at 1:22 PM, Chris Muller ma.chris.m@gmail.com wrote:
On Wed, Dec 17, 2014 at 11:23 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:55 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 17, 2014, at 9:17 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats.
Ah, I can see how my wording created an ambiguous meaning..
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed?
No I meant that I need to pickle Floats as an 32-bit Integer,
Now I'm really confused. How come you can get away with 32-bits when Floats are 64-bits?
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into an indexing system that operates on 32-bit integers (it can operate at any size, even 256-bit, but it operates much faster in a 32-bit range due to smaller and fewer LargeIntegers, and performance takes precedence).
but while in their pickled Integer state, I need to run #> and #< comparisons against other pickled Floats (e.g., as their Integer representation) and need those comparisons to produce the same results as if they were still in their Float state.
For example, the reason I cannot simply use asIEEE32BitWord is because negative floats have the high-order bit set, and so the pickled representations don't compare correctly:
-4.321 asIEEE32BitWord < 1.2345 asIEEE32BitWord "false" <--- I need true
So you need a sign-insensitive absolute comparison? Easy to synthesize:
((self at: 1) bitAnd: 16r7FFFFFFF) << 32 + (self at: 2)
No, I need it to be sign-sensitive. That does not pass the example I gave. Here are the two number lines again from my original email. I need to map Floats from:
-Infinity<-----------> +Infinity
to Integers in the range:
0<------------>((2^32)-1)
-Infinity needs to map to 0 and +Infinity to (2^32)-1.
This will put Infinity beyond the finite values.
32-bit unsigned so I made -Infinity to be 0, +Infinity to be (2^32)-1. But since NaN needs representation too, I decided to put it at the top, so I bumped +Infinity down to (2^32)-2..
Again floats are 64-bit not 32-bit so I don't see how this can work.
Squeak Floats are 64-bit, but they can be easily converted to 32-bit floats for a loss in precision.
If you want something that orders things absolutely then something like this, which is close to my immediate float representation will work. It effectively rotates, putting the sign in the lsb:
    | mostSignificantWord |
    mostSignificantWord := self at: 1.
    ^(mostSignificantWord bitAnd: 16r7FFFFFFF) << 33
        + ((self at: 2) << 1)
        + (mostSignificantWord >> 31)
This doesn't work for +/-0.0 since they have a zero exponent, but that's the only exception
It benches to 18% faster than my range-checking but fails the example.
-4.321 eliotHashKey32 < 1.2345 eliotHashKey32 "false" <--- I need true
So you want the sign inverted then, right?
    | mostSignificantWord |
    mostSignificantWord := self at: 1.
    mostSignificantWord := mostSignificantWord bitXor: 16r80000000.
    ^mostSignificantWord << 32 + (self at: 2)
+/- 0.0 doesn't matter, I only need one 0.0.
On Wed, Dec 17, 2014 at 03:22:17PM -0600, Chris Muller wrote:
On Wed, Dec 17, 2014 at 11:23 AM, Chris Muller asqueaker@gmail.com wrote:
On Wed, Dec 17, 2014 at 12:55 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
So am I right in thinking that you want that if the Float has an integer equivalent the float and integer have the same hashKey32, and if they don't, you don't care as long as the hash is well-distributed?
No, I meant that I need to pickle Floats as a 32-bit Integer,
Now I'm really confused. How come you can get away with 32-bits when Floats are 64-bits?
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into an indexing system that operates on 32-bit integers (it can operate at any size, even 256-bit, but it operates much faster in a 32-bit range due to smaller and fewer LargeIntegers, and performance takes precedence).
Hi Chris,
A bit off topic but hopefully thinking out of the box:
You have lots of 64-bit Squeak floats. You want to store the values as 32-bit pickles, because the pickle jar can easily store them that way. You do not care about loss of precision. You do care about storing them efficiently in some 32-bit system, and you do care about being able to unpickle them back into Squeak.
So how about using a FloatArray? The primitives already exist to do this efficiently, so you can store and retrieve Float values (with loss of precision) to the FloatArray, and you can read and write the pickles with basicAt: and basicAt:put:.
For example, in a workspace:
"A float array of 32 bit floats" fa := FloatArray new: 10.
"Use at: and put: to store some 64 bit Float values in the array" (1 to: 10) do: [:i | fa at: i put: Float pi * i].
"Get the raw 32-bit raw data values of the floats from the array using basicAt:" pickles := (1 to: 10) collect: [:i | fa basicAt: i].
"Store the pickles in another FloatArray, demonstrating unpickling with basicAt:put:" newFa := FloatArray new: 10. (1 to: 10) do: [:i | newFa basicAt: i put: (pickles at: i)].
newFa = fa ==> true
So convert the Float values to pickles using a FloatArray, and store the pickles.
Dave
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into an indexing system
I’m more than a little curious about a system that needs to handle billions of anything; that’s one honkin’ great image we could be talking about!
tim
--
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
There can never be a computer language in which you cannot write a bad program.
On Wed, Dec 17, 2014 at 04:30:10PM -0800, tim Rowledge wrote:
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into an indexing system
I'm more than a little curious about a system that needs to handle billions of anything; that's one honkin' great image we could be talking about!
It does sound like an interesting problem. Chris seems to be focused on exporting to an external indexing system, but I think that we should be able to manage data sets of that scale directly in a Squeak memory.
The 64-bit Spur object memory, with partitioning of object memory spaces and with improved garbage collection, is likely to make this a real practical option that can be implemented on very low cost hardware platforms.
Even now, the simple 68002 object format on an interpreter VM enables huge object memories on low cost personal computers, so to me the potential of a more efficient object memory and garbage collector seems clear. I'd like to try loading some of Chris' billions of Floats into a 68002 memory just to see how it runs (slowly I presume, but still an interesting experiment - Chris, can I try it?). I expect that if object memories of that scale work at all on the old 68002 format, then they are likely to work very well indeed on a 64-bit Spur object memory.
Dave
On Wed, 17 Dec 2014, David T. Lewis wrote:
On Wed, Dec 17, 2014 at 04:30:10PM -0800, tim Rowledge wrote:
Because I need speed and efficiency more than precision. I'll be loading _billions_ of 64-bit Squeak Floats into a indexing system
I'm more than a little curious about a system that needs to handle billions of anything; that's one honkin' great image we could be talking about!
It does sound like an interesting problem. Chris seems to be focused on exporting to an external indexing system, but I think that we should be able to manage data sets of that scale directly in a Squeak memory.
The 64-bit Spur object memory, with partitioning of object memory spaces and with improved garbage collection, is likely to make this a real practical option that can be implemented on very low cost hardware platforms.
Even now, the simple 68002 object format on an interpreter VM enables huge object memories on low cost personal computers, so to me the potential of a more efficient object memory and garbage collector seems clear. I'd like to try loading some of Chris' billions of Floats into a 68002 memory just to see how it runs (slowly I presume, but still an interesting experiment - Chris, can I try it?). I expect that if object memories of that scale work at all on the old 68002 format, then they are likely to work very well indeed on a 64-bit Spur object memory.
I think it depends on the way data is stored. One can easily avoid GC slowdowns by using byte and word objects to store data, because the garbage collector doesn't have to check their contents. I implemented an in-image cache a few years ago, which stores rendered webpages. It's 1000 times faster to access or store a webpage using this cache than using memcached from the image. It has its limits of course, but it works great for what it's designed for.
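As a rough sketch of that point (not from the thread): values kept in a non-pointer object such as a FloatArray are raw bits the garbage collector never has to trace, whereas an Array of boxed Floats adds one traced object per element.

packed := FloatArray new: 1000000.                   "a single word object; contents are not scanned by the GC"
boxed := (1 to: 1000000) collect: [:i | i asFloat].  "an Array pointing at a million separate Float objects"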
Levente
Dave
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
Lou
On Wed, Dec 17, 2014 at 12:20 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Chris,
On Dec 16, 2014, at 7:24 PM, Chris Muller asqueaker@gmail.com wrote:
I wish to access some of the Float constants without a message send.
I'm curious. Why?
Speed. I need a fast map of the 32-bit Float range to unsigned 32-bit integer range such that comparisons within the integer range are consistent with comparisons of their floats. Loosely:
-Infinity ---------------------- Zero ------------------------ +Infinity | NaN
        0 ------------------------- (2^31) ----------------------- 2^32
Here is the method I came up with to do this conversion:
Float>>#hashKey32
    self == NegativeInfinity ifTrue: [ ^ 0 ].
    self == Infinity ifTrue: [ ^ 4294967294 ].
    self isNaN ifTrue: [ ^ 4294967295 ].
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ (4286578687 - self asIEEE32BitWord) + 1 ].
    "We're positive.  IEEE positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Since I need _maximum_ speed, I do not wish the checks for Infinities and NaNs, the special cases, to require a message send.
Would anyone mind if I moved the class-vars defined in Float into a new Pool called "FloatConstants", so I may import them and write myFloat == NaN?
Instead simply define the pool and initialize it. IMO it is your own business if you want to define such a pool but it does not need to be in the base image. The existing access has worked just fine so far.
Since I'm able to put my method within the same scope as the Float class-var constants, I guess I don't need the FloatConstants pool after all, but since these are constants that extend beyond the Milky Way to the known ends of the physical universe, I can't understand why you want access to them restricted to such a tiny bottle. FloatConstants would allow better brevity and elegance of code. Why should everyone be required to write "Float pi" over and over instead of simply "Pi"?
-----------------------------------------------------------
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
About the same, but I think I like your code better. Thanks.
On Sat, 20 Dec 2014, Chris Muller wrote:
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
About the same, but I think I like your code better. Thanks.
Dave has already suggested to use a FloatArray for conversion instead of #asIEEE32BitWord. We use this technique in various network protocol implementations, and it works great.
Here's a significantly faster, optimized version:
hashKey32: aFloatArray
    self - self = 0.0 ifTrue: [
        self < 0.0 ifTrue: [ ^4286578688 - (aFloatArray at: 1 put: self; basicAt: 1) ].
        ^2147483651 + (aFloatArray at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^0 ].
    ^4294967294
The argument is any FloatArray instance with at least one slot.
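A possible calling pattern, assuming the caller can safely reuse one buffer (the collection name 'floats' is hypothetical):

| buffer |
buffer := FloatArray new: 1.
floats collect: [:each | each hashKey32: buffer]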
Levente
Oh wow, thanks for clarifying that! I missed what Dave was trying to say. Now I see he meant for me to use FloatArray _as_ the 64-bit --> 32-bit conversion tool, with (basicAt: 1) providing the 32-bit Integer. Thank you for spelling that out!
On Sat, Dec 20, 2014 at 9:27 PM, Levente Uzonyi leves@elte.hu wrote:
On Sat, 20 Dec 2014, Chris Muller wrote:
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
About the same, but I think I like your code better. Thanks.
Dave has already suggested to use a FloatArray for conversion instead of #asIEEE32BitWord. We use this technique in various network protocol implementations, and it works great.
Here's a significantly faster, optimized version:
hashKey32: aFloatArray
    self - self = 0.0 ifTrue: [
        self < 0.0 ifTrue: [ ^4286578688 - (aFloatArray at: 1 put: self; basicAt: 1) ].
        ^2147483651 + (aFloatArray at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^0 ].
    ^4294967294
The argument is any FloatArray instance with at least one slot.
Levente
Hi Levente,
please rewrite using a temp to hold the raw bits, a class var to hold the float array and hexadecimal. Then one can understand much easier. Also is inlining isFinite that important for performance?
rawBits := FloatArrayBuffer at: 1 put: self; basicAt: 1 etc...
Eliot (phone hence not writing the full method)
On Dec 20, 2014, at 7:27 PM, Levente Uzonyi leves@elte.hu wrote:
On Sat, 20 Dec 2014, Chris Muller wrote:
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
About the same, but I think I like your code better. Thanks.
Dave has already suggested to use a FloatArray for conversion instead of #asIEEE32BitWord. We use this technique in various network protocol implementations, and it works great.
Here's a significantly faster, optimized version:
hashKey32: aFloatArray
    self - self = 0.0 ifTrue: [
        self < 0.0 ifTrue: [ ^4286578688 - (aFloatArray at: 1 put: self; basicAt: 1) ].
        ^2147483651 + (aFloatArray at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^0 ].
    ^4294967294
The argument is any FloatArray instance with at least one slot.
Levente
Hi Eliot,
The reason why I wrote that method was to help Chris speed up his code. Since my only goal was to maximize performance, I didn't care about other things like readability. Using a temporary variable to hold the bits would improve it for sure. I suggested that Chris use a class variable to hold the FloatArray instance. Whether he can actually do that depends on the level of concurrency he uses. This method is just an optimized version of what Louis LaBrunda posted, so I kept all constant values as they were (they don't seem to be correct though). Inlining #isFinite gives an additional 15% speedup for infinite receivers, and about 2% for finite ones.
Here's another variant with even better performance, especially for infinite receivers:
hashKey32
    | bits |
    self < Infinity ifFalse: [ ^16rFFFFFFFE ].
    NegativeInfinity < self ifFalse: [ ^0 ].
    bits := ConverterFloatArray at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^16rFF800000 - bits ].
    ^16r80000003 + bits
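For completeness, here is one way the ConverterFloatArray class variable used above might be set up; the variable name comes from Levente's method, but this initialization is an assumption (e.g. something to evaluate once, or to put in a class-side #initialize):

Float addClassVarName: 'ConverterFloatArray'.
Float classPool at: #ConverterFloatArray put: (FloatArray new: 1).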
Levente
On Sun, 21 Dec 2014, Eliot Miranda wrote:
Hi Levente,
please rewrite using a temp to hold the raw bits, a class var to hold the float array and hexadecimal. Then one can understand much easier. Also is inlining isFinite that important for performance?
rawBits := FloatArrayBuffer at: 1 put: self; basicAt: 1 etc...
Eliot (phone hence not writing the full method)
On Dec 20, 2014, at 7:27 PM, Levente Uzonyi leves@elte.hu wrote:
On Sat, 20 Dec 2014, Chris Muller wrote:
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [
            self negative
                ifTrue: [4286578688 - self asIEEE32BitWord]
                ifFalse: [self asIEEE32BitWord + 2147483651]]
        ifFalse: [
            self negative
                ifTrue: [0]
                ifFalse: [4294967294]]
About the same, but I think I like your code better. Thanks.
Dave has already suggested to use a FloatArray for conversion instead of #asIEEE32BitWord. We use this technique in various network protocol implementations, and it works great.
Here's a significantly faster, optimized version:
hashKey32: aFloatArray
    self - self = 0.0 ifTrue: [
        self < 0.0 ifTrue: [ ^4286578688 - (aFloatArray at: 1 put: self; basicAt: 1) ].
        ^2147483651 + (aFloatArray at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^0 ].
    ^4294967294
The argument is any FloatArray instance with at least one slot.
Levente
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
- Bert -
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creating a one-element FloatArray every time did not adversely affect the performance of Levente's version too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation rather than worrying about concurrency. So I ended up with Levente's latest, except I cannot risk a calculation ending up -0.0, so I have to account for it too. And NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise enough to distinguish these two floats; FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different from what was stored!
Hi Chris,
On Tue, Dec 23, 2014 at 12:50 PM, Chris Muller asqueaker@gmail.com wrote:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness?
Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creating a one-element FloatArray every time did not adversely affect the performance of Levente's version too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation rather than worrying about concurrency. So I ended up with Levente's latest, except I cannot risk a calculation ending up -0.0, so I have to account for it too. And NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
FloatArray basicNew: 1 will be a little bit faster. Please use hex to make the layout clear.
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different-enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise-enough to distinguish these two floats, FloatArray representations are not.
Chris, FloatArray stores 32-bit ieee 754 single-precision floats, Float represents 64-bit ieee 754 double-precision floats. They look like this:
single-precision: sign, 8-bit exponent, 23-bit mantissa
double-precision: sign, 11-bit exponent, 52-bit mantissa
So if you assign a large Float to a Float array it will map to Infinity:
((FloatArray new: 1) at: 1 put: 1.0e238; at: 1) => Infinity
and if you assign a small one it will map to zero:
((FloatArray new: 1) at: 1 put: 1.0e-238; at: 1) => 0.0
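The same precision loss shows up for ordinary values whose extra mantissa bits are non-zero (a small sketch for a workspace, not from the thread):

((FloatArray new: 1) at: 1 put: 0.1; at: 1) = 0.1.  "false - 0.1 needs more than 23 mantissa bits"
((FloatArray new: 1) at: 1 put: 0.5; at: 1) = 0.5.  "true - 0.5 is exactly representable in single precision"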
That these guys are considered "equal" by the FloatArray is actually
good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different than was stored..!
But that happens whenever you store a double that cannot be represented as a 32-bit float. That's exactly what we're doing here: mapping 64-bit floats to 32-bit floats, so we expect to retrieve different values than those stored most of the time, on average (2^32 - 1)/2^32 of the time. Only 1/2^32 of the double precision floats are exactly representable in 32 bits.
Chris, this is a case of gradual underflow, and it seems like it is not handled correctly in Float class>>fromIEEE32Bit: . Since the method has my initials, I'll try to sort out what the mess is...
2014-12-23 21:50 GMT+01:00 Chris Muller asqueaker@gmail.com:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness?
Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creation of a one-element FloatArray every time did not adversely affect performance of Levente's too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation than to worry about concurrency. So I ended up with Levente's latest except I cannot risk a calculation ending up -0.0, so I have to account for it too. And, NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different-enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise-enough to distinguish these two floats, FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different than was stored..!
OK, I got it, the mantissa was not shifted correctly in case of underflow... I'll publish as soon as I can get an updated trunk.
2014-12-23 22:58 GMT+01:00 Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>:
Chris, this is a case of gradual underflow, and it seems like it is not handled correctly in Float class>>fromIEEE32Bit: . Since the method has my initials, I'll try to sort out what the mess is...
2014-12-23 21:50 GMT+01:00 Chris Muller asqueaker@gmail.com:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness?
Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creation of a one-element FloatArray every time did not adversely affect performance of Levente's too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation than to worry about concurrency. So I ended up with Levente's latest except I cannot risk a calculation ending up -0.0, so I have to account for it too. And, NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32BitFloat: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different-enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise-enough to distinguish these two floats, FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different than was stored..!
OK Chris, it should be corrected now, but your assertion is false... Your floats are negative, but FloatArray basicAt: is considering unsigned 32 bit ints... So the inequality must be swapped in this case...
2014-12-23 23:10 GMT+01:00 Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>:
OK, I got it, the mantissa was not shifted correctly in case of underflow... I'll publish as soon as I can get an updated trunk.
2014-12-23 22:58 GMT+01:00 Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>:
Chris, this is a case of gradual underflow, and it seems like it is not handled correctly in Float class>>fromIEEE32Bit: . Since the method has my initials, I'll try to sort out what the mess is...
2014-12-23 21:50 GMT+01:00 Chris Muller asqueaker@gmail.com:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness?
Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creation of a one-element FloatArray every time did not adversely affect performance of Levente's too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation than to worry about concurrency. So I ended up with Levente's latest except I cannot risk a calculation ending up -0.0, so I have to account for it too. And, NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32BitFloat: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different-enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise-enough to distinguish these two floats, FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different than was stored..!
On Tue, Dec 23, 2014 at 5:08 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
OK Chris, it should be corrected now,
This time it worked! #hashKey32 passed for all 32-bit Floats. Thank you!!
but your assertion is false... Your floats are negative, but FloatArray basicAt: is considering unsigned 32 bit ints... So the inequality must be swapped in this case...
Indeed. As you know I was actually testing #hashKey32 and so that's what I get for cut-n-pasting-n-tweaking.
Thanks again.
I don't know how you measured the speedup, but I got 8-12x improvement for finite numbers. By creating a new FloatArray, the speedup decreases to 6-9x.
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creation of a one-element FloatArray every time did not adversely affect performance of Levente's too significantly (only 3.7X instead of 4.0X faster), I decided it was worth the cost of the allocation than to worry about concurrency. So I ended up with Levente's latest except I cannot risk a calculation ending up -0.0, so I have to account for it too. And, NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32-bits worth of IEEE 32-bit floats (e.g., several thousand convert to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs..?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32BitFloat: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations between IEEE floats and FloatArray floats are different-enough that their precisions align differently onto the 32-bit map for these two floats. IEEE's are precise-enough to distinguish these two floats, FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If this or the #at:put: primitive were to ever fail on the storage (at:put:) exclusive-or the access (at:) side, then it appears FloatArray itself would retrieve a value different than was stored..!
Here is what I used to measure:
| rand |
rand := Random seed: 12345.
[ rand next hashKey32 ] bench
This baseline version reports '902,000 per second.'
hashKey32
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    "Identity check to allow a distinction between -0.0 and +0.0."
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ ("4286578687" 4286578688 - self asIEEE32BitWord) "+ 1" ].
    "We're positive. IEEE 32-bit positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Switching it to use FloatArray reports '3,530,000 per second.'
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray basicNew: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Do you think the difference is less pronounced than yours due to my going through Random #next?
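For what it's worth, here is a sketch that takes Random>>#next out of the measured block and also exercises negative inputs (my own variation, not something from the thread); note that each run of the block now performs 10000 conversions, so the numbers are not directly comparable to the ones above:

    | rand floats |
    rand := Random seed: 12345.
    floats := (1 to: 10000) collect: [ :i | rand next - 0.5 ].
    [ floats do: [ :f | f hashKey32 ] ] bench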
On Tue, Dec 23, 2014 at 5:52 PM, Levente Uzonyi leves@elte.hu wrote:
I don't know how you measured the speedup, but I got 8-12x improvement for finite numbers. By creating a new FloatArray, the speedup decreases to 6-9x.
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creating a one-element FloatArray every time did not hurt the performance of Levente's version too significantly (only 3.7X instead of 4.0X faster), I decided it was worth paying for the allocation rather than worrying about concurrency. So I ended up with Levente's latest, except that I cannot risk a calculation ending up -0.0, so I have to account for that too. And NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32 bits' worth of IEEE 32-bit floats (e.g., several million bit patterns decode to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations of IEEE floats and FloatArray floats differ enough that their precisions align differently onto the 32-bit map for these two floats. The IEEE representations are precise enough to distinguish these two floats; the FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If either the #at: or the #at:put: primitive were ever to fail on just one side (storage but not access, or access but not storage), then it appears FloatArray itself would retrieve a value different from the one that was stored!
I see many reasons why the difference is smaller:
- you're also measuring the generation of the input
- this creates new numbers and triggers GC more often
- you're only benchmarking numbers between 0 and 1. #asIEEE32BitWord is a lot slower for negative values
- you're using #bench, which has high overhead
- you are comparing different versions than I did
About your modifications: self = NaN will always return false, so that comparison is wrong. self == NegativeZero will almost never be true (try -0.0 == Float negativeZero). Use #= instead.
After trying to understand what the code is trying to do, I came to the conclusion that there's no reason to treat negative infinity and infinity separately.
hashKey32
    self > 0.0 ifTrue: [ ^16r80000003 + ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^16rFF800000 - ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
    self = self ifFalse: [ ^16rFFFFFFFF "NaN" ].
    (self at: 1) = 0 ifTrue: [ ^16r80000003 "Zero" ].
    ^16r7F800000 "Negative zero"
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
Here is what I used to measure:
| rand |
rand := Random seed: 12345.
[ rand next hashKey32 ] bench
This baseline version reports '902,000 per second.'
hashKey32
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    "Identity check to allow a distinction between -0.0 and +0.0."
    self == NegativeZero ifTrue: [ ^ 2147483650 ].
    "Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
    self negative ifTrue: [ ^ ("4286578687" 4286578688 - self asIEEE32BitWord) "+ 1" ].
    "We're positive. IEEE 32-bit positives range from (0 to: 2139095039)."
    ^ self asIEEE32BitWord + 2147483651
Switching it to use FloatArray reports '3,530,000 per second.'
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray basicNew: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Do you think the difference is less pronounced than yours due to my going through Random #next?
On Tue, Dec 23, 2014 at 5:52 PM, Levente Uzonyi leves@elte.hu wrote:
I don't know how you measured the speedup, but I got 8-12x improvement for finite numbers. By creating a new FloatArray, the speedup decreases to 6-9x.
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
Might still be faster?
Yes. Since creating a one-element FloatArray every time did not hurt the performance of Levente's version too significantly (only 3.7X instead of 4.0X faster), I decided it was worth paying for the allocation rather than worrying about concurrency. So I ended up with Levente's latest, except that I cannot risk a calculation ending up -0.0, so I have to account for that too. And NaN too. Thus:
hashKey32
    | bits |
    self = NegativeInfinity ifTrue: [ ^ 0 ].
    self = Infinity ifTrue: [ ^ 4294967294 ].
    self = NaN ifTrue: [ ^ 4294967295 ].
    self = NegativeZero ifTrue: [ ^ 2147483651 ].
    bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
    self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
    ^ 2147483651 + bits
Since there are not a full 32 bits' worth of IEEE 32-bit floats (e.g., several million bit patterns decode to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
    assert: next > this;
    assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
        > ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations of IEEE floats and FloatArray floats differ enough that their precisions align differently onto the 32-bit map for these two floats. The IEEE representations are precise enough to distinguish these two floats; the FloatArray representations are not.
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
    <primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
    ^Float fromIEEE32Bit: (self basicAt: index)
If either the #at: or the #at:put: primitive were ever to fail on just one side (storage but not access, or access but not storage), then it appears FloatArray itself would retrieve a value different from the one that was stored!
On Wed, Dec 24, 2014 at 9:38 AM, Levente Uzonyi leves@elte.hu wrote:
I see many reasons why the difference is smaller:
- you're also measuring the generation of the input
- this creates new numbers and triggers GC more often
- you're only benchmarking numbers between 0 and 1. #asIEEE32BitWord is a
lot slower for negative values
- you're using #bench, which has high overhead
- you are comparing different versions than I did
About your modifications: self = NaN will always return false, so that comparison is wrong.
Doh! Thanks. I guess I burned myself again by relying on that "invariant" that two identical objects can be considered equal.
I see you used self = self; is that better than an identity check against NaN? I guess it's safer just in case some other NaN instance were generated in the system? I've just always had an aversion to sending #= to a Float and expecting to get back true, but I guess if the argument is the receiver itself, it should be okay.
self == NegativeZero will almost never be true (try -0.0 == Float negativeZero).
Oh wow. I had changed it to #== to avoid a different issue, but introduced this one..!
Use #= instead.
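To make the two pitfalls concrete, here are a few workspace checks (my own examples; print-it to confirm in your image):

    Float nan = Float nan.        "false - NaN is not #= to anything, not even itself"
    Float nan == Float nan.       "true, if #nan answers the shared NaN class-variable instance"
    -0.0 = Float negativeZero.    "true - #= compares the value"
    -0.0 == Float negativeZero.   "false - the literal -0.0 is a different object"
    -0.0 = 0.0.                   "true - which is why the sign bit still needs a separate check"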
After trying to understand what the code is trying to do, I came to the conclusion that there's no reason to treat negative infinity and infinity separately.
Yes! Something bugged me about putting +Infinity all the way up at (2^32)-2 because of the non-symmetry with the negative side. I like yours better!
hashKey32
    self > 0.0 ifTrue: [ ^16r80000003 + ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^16rFF800000 - ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
    self = self ifFalse: [ ^16rFFFFFFFF "NaN" ].
    (self at: 1) = 0 ifTrue: [ ^16r80000003 "Zero" ].
    ^16r7F800000 "Negative zero"
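Spot-checking that mapping with a few values (my own arithmetic from the standard IEEE-754 single-precision bit patterns; print-it in a workspace to verify):

    Float infinity negated hashKey32.   "0"
    -1.0 hashKey32.                     "16r40000000, i.e. 16rFF800000 - 16rBF800000"
    Float negativeZero hashKey32.       "16r7F800000"
    0.0 hashKey32.                      "16r80000003"
    1.0 hashKey32.                      "16rBF800003, i.e. 16r80000003 + 16r3F800000"
    Float infinity hashKey32.           "16rFF800003, i.e. 16r80000003 + 16r7F800000"
    Float nan hashKey32.                "16rFFFFFFFF"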
I'm going with Levente's version above (I'm testing it against the full 32-bit range now), and I even remembered to avoid using the code formatter, to preserve the hex representations for Eliot.
Levente, that you can do such positive critical review of this one method makes me shiver to wonder how many improvements you could discover for Ma-Object-Serializer! ;) Mucho thanks.
Implementing isFinite by looking at the exponent field would avoid the creation of a transient floating point number.
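A sketch of that idea (my guess at an implementation, not code from the thread; the selector name is made up, and it assumes #basicAt: 1 answers the high-order word of a boxed Squeak Float, which holds the sign and the 11 exponent bits):

    Float>>#isFiniteFast
        "Answer whether the receiver is neither an infinity nor a NaN,
        without creating any temporary objects: a 64-bit IEEE float is
        finite exactly when its exponent bits are not all ones."
        ^ ((self basicAt: 1) bitAnd: 16r7FF00000) ~= 16r7FF00000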
On 12/21/14 13:11 , Eliot Miranda wrote:
Hi Levente,
please rewrite using a temp to hold the raw bits, a class var to hold the float array, and hexadecimal constants. Then one can understand it much more easily. Also, is inlining isFinite that important for performance?
rawBits := FloatArrayBuffer at: 1 put: self; basicAt: 1 etc...
Eliot (phone hence not writing the full method)
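For reference, one shape the requested rewrite might take (my sketch only, combining Eliot's naming with Levente's Dec 20 logic; FloatArrayBuffer would be a class variable of Float initialized to a one-element FloatArray, sharing it is not process-safe as noted elsewhere in the thread, and -0.0 and NaN still need the separate handling Chris adds later):

    hashKey32
        | rawBits |
        self - self = 0.0 ifTrue: [
            "Finite: convert to the IEEE 32-bit word via the shared buffer."
            rawBits := FloatArrayBuffer at: 1 put: self; basicAt: 1.
            self < 0.0 ifTrue: [ ^16rFF800000 - rawBits ].
            ^16r80000003 + rawBits ].
        self < 0.0 ifTrue: [ ^0 "negative infinity" ].
        ^16rFFFFFFFE "positive infinity (and, unhandled here, NaN)"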
On Dec 20, 2014, at 7:27 PM, Levente Uzonyi leves@elte.hu wrote:
On Sat, 20 Dec 2014, Chris Muller wrote:
On Fri, Dec 19, 2014 at 4:01 PM, Louis LaBrunda Lou@keystone-software.com wrote:
Hi Chris,
Is this any faster?
Float>>#hashKey32
    ^self isFinite
        ifTrue: [ self negative
            ifTrue: [ 4286578688 - self asIEEE32BitWord ]
            ifFalse: [ self asIEEE32BitWord + 2147483651 ] ]
        ifFalse: [ self negative ifTrue: [ 0 ] ifFalse: [ 4294967294 ] ].
About the same, but I think I like your code better. Thanks.
Dave has already suggested to use a FloatArray for conversion instead of #asIEEE32BitWord. We use this technique in various network protocol implementations, and it works great.
Here's a significantly faster, optimized version:
hashKey32: aFloatArray
    self - self = 0.0 ifTrue: [
        self < 0.0 ifTrue: [ ^4286578688 - (aFloatArray at: 1 put: self; basicAt: 1) ].
        ^2147483651 + (aFloatArray at: 1 put: self; basicAt: 1) ].
    self < 0.0 ifTrue: [ ^0 ].
    ^4294967294
The argument is any FloatArray instance with at least one slot.
Levente
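A usage sketch (mine, not from the thread): allocate the buffer once and reuse it across calls; as discussed earlier in the thread, sharing one buffer between processes is not safe.

    | buffer |
    buffer := FloatArray new: 1.
    #(0.25 -3.5 1.0e10) collect: [ :f | f hashKey32: buffer ]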