[squeak-dev] 32 vs 64 bits and large integer hash

Eliot Miranda eliot.miranda at gmail.com
Thu Nov 22 18:32:54 UTC 2018


> On Nov 21, 2018, at 11:48 PM, Tobias Pape <Das.Linux at gmx.de> wrote:
> 
> 
>> On 22.11.2018, at 01:15, Levente Uzonyi <leves at caesar.elte.hu> wrote:
>> 
>> Why not do it the other way around and implement SmallInteger >> #hash as a primitive with the following fallback code?
>> 
>> SmallInteger >> #hash
>> 
>>    <primitive: XXX>
>>    | remainder hash |
>>    self < 0
>>        ifTrue: [
>>            remainder := 0 - self.
>>            hash := LargeNegativeInteger hash ]
>>        ifFalse: [
>>            remainder := self.
>>            hash := LargePositiveInteger hash ].
>>    [ remainder > 0 ] whileTrue: [
>>        hash := (hash + (remainder bitAnd: 16rFF)) hashMultiply.
>>        remainder := remainder bitShift: -8 ].
>>    ^hash
>> 
>> The only problem to solve is the calculation of the initial hash value.
>> The VM has to know the initial hash value (e.g. it could be a constant based on the sign of the receiver) or it has to know how to calculate it (but that's currently done differently among forks) or the value has to be an argument of the primitive.
>> 
>> Levente
> 
> I'm a bit puzzled. I thought  (small)Integers being their own hash is a good thing?

+1

> 
> Best regards
>    -Tobias
> 
>> 
>> P.S.: This is another case where Behavior >> #hash bites Squeak and causes additional slowdown.
>> P.P.S.: Float >> #hash should use #bitXor: and #hashMultiply instead of #+ and #bitShift:
>> 
>>> On Wed, 21 Nov 2018, Eliot Miranda wrote:
>>> 
>>> Hi All,
>>>    right now we have the following definition of Large(Positive)Integer>>hash:
>>> hash
>>> ^ByteArray hashBytes: self startingWith: self species hash
>>> which means that for all integers outside of the 32-bit SmallInteger range (-2 ^ 30 to 2 ^ 30 - 1), the 32-bit system and the 64-bit system answer different values for hash.
>>> e.g. in 64 bits: (2 raisedTo: 30) hash 1073741824
>>> but in 32 bits: (2 raisedTo: 30) hash 230045764
>>> This is unsatisfactory.  I propose changing Large(Positive)Integer>>hash to
>>> hash
>>> ^self digitLength <= 8
>>> ifTrue: [self]
>>> ifFalse: [ByteArray hashBytes: self startingWith: self species hash]
>>> P.S. Note that this will not break Float hash, which is defined as
>>> Float>>hash
>>> "Hash is reimplemented because = is implemented. Both words of the float are used. (The bitShift:'s ensure that the intermediate results do not become a large integer.) Care is taken to answer same
>>> hash as an equal Integer."
>>> (self isFinite and: [self fractionPart = 0.0]) ifTrue: [^self truncated hash].
>>> ^ ((self basicAt: 1) bitShift: -4) +
>>>  ((self basicAt: 2) bitShift: -4)
>>> P.P.S. I *think* that "(self isFinite and: [self fractionPart = 0.0])" is equivalent to "self - self = self fractionPart" ;-)
>>> _,,,^..^,,,_
>>> best, Eliot
>> 
> 
> 


More information about the Squeak-dev mailing list