[squeak-dev] 32 vs 64 bits and large integer hash
asqueaker at gmail.com
Fri Nov 23 00:35:37 UTC 2018
> >> I'm a bit puzzled. I thought (small)Integers being their own hash is a good thing?
I was wondering exactly the same thing!
> > I would call it simple but not necessarily good.
> > The problem with it is that consecutive numbers generate long chains in HashedCollections:
> > a := (1 to: 1000) asArray.
> > s := Set withAll: a.
> > [ 1 to: 1000000 do: [ :each | s includes: each ] ] timeToRun.
> > "==> 7014"
> > The solution in Squeak is to use PluggableSet instead of Set, because it applies #hashMultiply on the hash value:
> > ps := PluggableSet integerSet.
> > ps addAll: a.
> > [ 1 to: 1000000 do: [ :each | ps includes: each ] ] timeToRun.
> > "==> 95"
> > IIRC in Pharo SmallInteger's hash is based on #hashMultiply to avoid the long chains. That was probably the main reason for the push to make #hashMultply a numbered primitive.
Indeed! When making a #hash methods, one always focuses on hash
distribution and finding the elements, but its easy to forget about
performance of NOT finding an element.
More information about the Squeak-dev