[BUG] Symbol>>hash

Andres Valloud avalloud at exobox.com
Thu Aug 3 01:19:43 UTC 2000


Hi Bob.

> >I haven't noticed any slow
> >down in Squeak though, and certainly String>>hash delivers more than 12
> >bits :).
> 
> I'm not sure how true that is. I ran a little test in my image. I found
> 
> total strings: 32137
> unique strings: 16583
> unique hash values: 3556
> 
> The distribution of hash values for the unique strings looked like (count->hash):
> 
> 48->5644
> 39->4876
> 38->5639
> 38->5640
> 36->3235
> 36->5642
> 36->5643
> 35->3234
> 35->4877
> ...etc
> 
> That doesn't look like much more than 12 bits to me. In fact, since only 3556 hash values were used, I'd be inclined to say it's just about exactly 12 bits.
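
A tally like the one quoted above could be gathered with something along these lines. This is only a workspace sketch, not necessarily how the numbers were produced: sampling every String in the image (Symbols included, since Symbol is a String subclass) is an assumption, and the temporary names are made up for illustration.

```smalltalk
"Workspace sketch. Assumption: the sample is every String instance in the image."
| strings unique tally counts |
strings := String allSubInstances.
unique := strings asSet.

"One Bag entry per unique string, keyed by its hash value."
tally := Bag new.
unique do: [:each | tally add: each hash].

Transcript
    show: 'total strings: ', strings size printString; cr;
    show: 'unique strings: ', unique size printString; cr;
    show: 'unique hash values: ', tally asSet size printString; cr;
    show: 'bits actually used: ', (tally asSet size log: 2) printString; cr.

"count->hash pairs, most crowded hash values first."
counts := tally sortedCounts.
1 to: (9 min: counts size) do: [:i |
    Transcript show: (counts at: i) printString; cr]
```

With 3556 distinct values the base-2 logarithm comes out a little under 12, which is where the "just about exactly 12 bits" reading comes from.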

Hmmmmm... a closer look shows that String>>hash can deliver up to
14 bits of hash (going by the last line of the method). But it uses two
characters to generate that value, so it could be delivering 16 bits of
hash. And there's also that magic number it returns, which has 15 bits.
Ummm...

> biggerHash
> 
>         self size <= 2 ifTrue: [^self hash].
>         ^self inject: self size into: [ :sum :each | (sum + sum + each asciiValue) bitAnd: 16rFFFFFF]

I think this is what the current hash was meant to avoid: it walks every
character of the string. Perhaps we could change it into something less
expensive than that, but still better than what we have... something that
gives you 24 bits of hash.
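
One possible shape for such a middle ground, purely as a sketch: look at a bounded number of evenly spaced characters instead of all of them, and keep 24 bits of the result. The selector `sampledHash`, the target of roughly 8 samples, and the multiplier 31 are all hypothetical choices made for illustration; none of this is the existing String>>hash.

```smalltalk
sampledHash
    "Hypothetical sketch, not the current String>>hash.
    Mixes evenly spaced characters (never more than 15 of them)
    into a 24-bit value, so the cost does not grow with string size."

    | sum step |
    self size <= 2 ifTrue: [^self hash].
    sum := self size.
    "Stride chosen so that long strings are sampled, not fully scanned."
    step := (self size // 8) max: 1.
    1 to: self size by: step do: [:i |
        "sum stays below 2^24, so sum * 31 remains a SmallInteger."
        sum := (sum * 31 + (self at: i) asciiValue) bitAnd: 16rFFFFFF].
    ^sum
```

The obvious trade-off is that two long strings of the same size that differ only in characters the stride skips will collide, so whether this is actually better would have to be measured on real image contents.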

Andres.




