Incongruent hash

Eliot & Linda elcm at pacbell.net
Mon Feb 9 17:03:55 UTC 1998


David N. Smith wrote:
> 
> At 14:16 -0500 2/8/98, Leandro Caniglia wrote:
> Hi!
> 
> Luciano Notarfancesco told me this bug:
> 
>     1 = 1.0 true
> but
>     1 hash = 1.0 hash false
> 
> 
> Saludos,
> Leandro
> 
> That may be my fault. The hash code for Float was rewritten (by me) so that
> it produces a passable hash value. The old one was truly awful. There was
> discussion on the issue in the group; hopefully the archive still has it
> (May 97). If not, I may have it stuck away somewhere.
> 
> If you really want to fix the problem, fix integer hash to convert an
> integer to float before taking the hash. SmallInteger uses a primitive to
> convert to float, so:
> 
>    hash
>       self asFloat hash
> 
> should be fairly cheap. Long integers pose a bigger problem since floats
> can represent about 52 bits of value and you might want to convert to float
> when there are 52 or less bits in the long integer, and do something else
> when more.
> 
> There is a real problem with the rule that objects which compare equal must
> have equal hashes because it promotes really bad hash algorithms. For
> example, the hash for floats in IBM Smalltalk, and I think in Squeak before
> I 'fixed' it, is/was:
> 
>    hash
>       self truncated hash
> 
> This means that only the integer part of the float is considered, making
> fully half of the range of a floating point number answer 0, one-quarter
> answer 1, one-eighth answer 2, etc.

[snip]

> I think this whole problem needs a fresh look. Anyone have any thoughts on
> how to have reasonably fast yet reasonably good hashes? Should 'the rule'
> be changed? How?

Steve Dahl reimplemented numeric hashes for VisualWorks 3.0 and now we
have equal hashes for fractions representable as doubles, and for the
integers.  The double hash function explicitly tests for integerness,
i.e.

Double methodsFor: 'comparing'
hash
    | tr |
    tr := self truncated.
    ^self = tr
        ifTrue: [tr hash]
        ifFalse:
            ["clever algorithm with good distribution deleted"
            ...


Then Fraction's hash is derived by coercing to a Double.

Point's hash needs to be something like
Point methodsFor: 'comparing'
hash
    ^x = y
        ifTrue: [x hash]
        ifFalse:
            ["original Point hash function deleted"
            ...

_______________,,,^..^,,,_______________
Eliot





More information about the Squeak-dev mailing list