hash and equal problem
stéphane ducasse
ducasse at iam.unibe.ch
Tue Mar 28 08:20:34 UTC 2006
Nicolas
do you have a fix for that behavior?
Stef
On 27 mars 06, at 21:44, nicolas cellier wrote:
> Hi all,
>
> i do not think that this behavior is correct
> (2 = 2.0) = true.
> (2 hash = 2.0 hash) = false.
>
> i saw a discussion about it in an old old thread (1999), but did not
> understand the rationale for such a behavior.
>
> i can exhibit a lot of examples of code completely broken by this
> behavior.
>
> Try this one:
>
> | aSet |
> aSet := Set new.
> aSet add: #( 1.0 2 3 4 5 6).
> aSet add: #( 1 2.0 3 4 5 6).
> aSet add: #( 1 2 3.0 4 5 6).
> aSet add: #( 1 2 3 4.0 5 6).
> aSet add: #( 1 2 3 4 5.0 6).
> aSet add: #( 1 2 3 4 5 6.0).
> aSet size
>
> Then, try this slight modification:
>
> | aSet |
> aSet := Set new: 12.
> aSet add: #( 1.0 2 3 4 5 6).
> aSet add: #( 1 2.0 3 4 5 6).
> aSet add: #( 1 2 3.0 4 5 6).
> aSet add: #( 1 2 3 4.0 5 6).
> aSet add: #( 1 2 3 4 5.0 6).
> aSet add: #( 1 2 3 4 5 6.0).
> aSet size
>
> You will get 6 or 5 depending on initial Set capacity...
>
> Secondly, Array hash, Matrix hash, WideSstring hash are hashing
> each and every
> element... Sawing this, I imagine one never store 100x100 Matrices
> in a
> Dictionary, unless prepared to spend a lot of time at coffee
> machine...
>
> In other words, finding a good balance between long to run hash
> algorithm and
> too many identical hash values is difficult. I am not sure that the
> exhaustive way is a good balance.
>
> The bet that string will be short is maybe not the worst one
> (unless you store
> whole sentence like i saw in translation), but betting that every
> array in
> Smalltalk will be short is nonsense.
>
> Try this one if you are not convinced:
> (SequenceableCollection allSubInstances collect: [:e | e size])
> asBag
> valuesAndCounts associations asSortedCollection: [:a :b | a
> key > b key]
>
> In a 3.9a image, i have an average of 74 and a maximum over the
> millon.
>
> So i raise the question again. Should we really let above behavior
> exist in
> squeak ?
> Shouldn't we look for more efficient hashing ?
> I'm asking for a good rationale (compatibility issues included).
>
>
More information about the Squeak-dev
mailing list
|