[squeak-dev] 32 vs 64 bits and large integer hash

Levente Uzonyi leves at caesar.elte.hu
Thu Nov 22 00:15:22 UTC 2018


Why not do it the other way around and implement SmallInteger >> #hash as 
a primitive with the following fallback code?

SmallInteger >> #hash

 	<primitive: XXX>
 	| remainder hash |
 	self < 0
 		ifTrue: [
 			remainder := 0 - self.
 			hash := LargeNegativeInteger hash ]
 		ifFalse: [
 			remainder := self.
 			hash := LargePositiveInteger hash ].
 	[ remainder > 0 ] whileTrue: [
 		hash := (hash + (remainder bitAnd: 16rFF)) hashMultiply.
 		remainder := remainder bitShift: -8 ].
 	^hash

The only problem to solve is the calculation of the initial hash value.
The VM has to know the initial hash value (e.g. it could be a constant 
based on the sign of the receiver) or it has to know how to calculate it 
(but that's currently done differently among forks) or the value has to be 
an argument of the primitive.

Levente

P.S.: This is another case where Behavior >> #hash bites Squeak and causes 
additional slowdown.
P.P.S.: Float >> #hash should use #bitXor: and #hashMultiply instead of 
#+ and #bitShift:

On Wed, 21 Nov 2018, Eliot Miranda wrote:

> Hi All,
>     right now we have the following definition of Large(Positive)Integer>>hash:
> 
> hash
> ^ByteArray hashBytes: self startingWith: self species hash
> 
> which means that for all integers outside of the 32-bit SmallInteger range (-2 ^ 30 to 2 ^ 30 - 1), the 32-bit system and the 64-bit system answer different values for hash.
> 
> e.g. in 64 bits: (2 raisedTo: 30) hash 1073741824
>  but in 32 bits: (2 raisedTo: 30) hash 230045764
> 
> This is unsatisfactory.  I propose changing Large(Positive)Integer>>hash to
> 
> hash
> ^self digitLength <= 8
> ifTrue: [self]
> ifFalse: [ByteArray hashBytes: self startingWith: self species hash]
> 
> 
> P.S. Note that this will not break Float hash, which is defined as
> 
> Float>>hash
> "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same
> hash as an equal Integer."
> 
> (self isFinite and: [self fractionPart = 0.0]) ifTrue: [^self truncated hash].
> ^ ((self basicAt: 1) bitShift: -4) +
>   ((self basicAt: 2) bitShift: -4)
> 
> P.P.S. I *think* that "(self isFinite and: [self fractionPart = 0.0])" is equivalent to "self - self = self fractionPart" ;-)
> 
> _,,,^..^,,,_
> best, Eliot
> 
>


More information about the Squeak-dev mailing list