[squeak-dev] 32 vs 64 bits and large integer hash
luchiano at gmail.com
Sat Nov 24 15:41:22 UTC 2018
On Fri, Nov 23, 2018 at 12:36 AM Chris Muller <asqueaker at gmail.com> wrote:
> > >> I'm a bit puzzled. I thought (small)Integers being their own hash is
> a good thing?
> I was wondering exactly the same thing!
> > > I would call it simple but not necessarily good.
> > > The problem with it is that consecutive numbers generate long chains
> in HashedCollections:
> > >
> > > a := (1 to: 1000) asArray.
> > > s := Set withAll: a.
> > > [ 1 to: 1000000 do: [ :each | s includes: each ] ] timeToRun.
> > > "==> 7014"
> > >
> > > The solution in Squeak is to use PluggableSet instead of Set, because
> it applies #hashMultiply on the hash value:
> > >
> > > ps := PluggableSet integerSet.
> > > ps addAll: a.
> > > [ 1 to: 1000000 do: [ :each | ps includes: each ] ] timeToRun.
> > > "==> 95"
> > >
> > > IIRC in Pharo SmallInteger's hash is based on #hashMultiply to avoid
> the long chains. That was probably the main reason for the push to make
> #hashMultply a numbered primitive.
> > >
> > Interesting!
> Indeed! When making a #hash methods, one always focuses on hash
> distribution and finding the elements, but its easy to forget about
> performance of NOT finding an element.
Yes! I had this problem blow in my face a couple of years ago, and Juan
agreed to include the change in Cuis to make SmallIntegers NOT their own
I think it is a good idea in general when programming a hash method in
Smalltalk to make it somewhat random. It will never be perfect for every
use case, and it doesn't need to be cryptographically secure, but there
shouldn't be simple use cases (e.g., consecutive integers) that produce
hashes that are not uniformly distributed. This can be achieved with
minimal performance impact (and potentially big performance gains in hashed
collections) quite simply by sending some hashMultiply message in your hash
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Squeak-dev