Horrible unscalability, how do I change hashBitsOf: ?

Tim Rowledge tim at sumeru.stanford.edu
Sat Oct 6 16:55:54 UTC 2001


Scott A Crosby wrote:
> The hashing strategy used by dictionaries and sets is non-optimal if the
> hashcode returned is not over a wide range.
I think a long conversation about hashing can be found in the list
archive; 'fraid I don't remember any dates to help you. It's a perennial
topic in Smalltalk.


> The fix is to make the following change:
> 
> ```
> hashBitsOf: oop
>         ^ ((self baseHeader: oop)
>                 >> 17 bitAnd: 4095) << 10
> ```
Err, I think that will simply leave the same number of distinct hash
values, each multiplied by 1024 or whatever; it's hard to see how that
could help with the problem. To really improve things would require a
bigger hash field in the object headers. Of course, that would cost
other useful bits, an extra header word, or some other complication.
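
A quick workspace check makes the point; shifting every possible 12-bit
hash left by 10 bits spreads the values out but creates no new ones:

```
"Still only 4096 distinct hash values after the shift."
((0 to: 4095) collect: [:h | h bitShift: 10]) asSet size   "=> 4096"
```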

Don't forget that you can implement hash for any object you like; the
primitive is there mainly as a convenient default these days. Again,
there are long discussions in the list archives (and on
comp.lang.smalltalk) about the best ways to do hashing, how to make sure
you keep hash and = in sync, and when to base the hash on identity
versus on the object's contents.
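
As a minimal sketch (GridPoint is a made-up class, shown in the usual
ClassName >> selector style), a class with content-based equality
overrides = and hash as a pair, something like:

```
Object subclass: #GridPoint
        instanceVariableNames: 'x y'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Hash-Examples'.

GridPoint class >> x: anX y: aY
        ^ self new setX: anX y: aY

GridPoint >> setX: anX y: aY
        x := anX.
        y := aY

GridPoint >> x
        ^ x

GridPoint >> y
        ^ y

GridPoint >> = other
        "Equal by contents, not by identity."
        ^ (other isKindOf: GridPoint)
                and: [x = other x and: [y = other y]]

GridPoint >> hash
        "Must agree with =: equal points answer the same hash."
        ^ x hash bitXor: y hash
```

With both in place, two separately created but equal points land in the
same bucket and count as one element of a Set, which is exactly what
the hashed collections rely on.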

I'm surprised that symbol hashing seemed expensive - didn't Andres
Valloud's rework of all that stuff get in some time ago? It was supposed
to make a big difference.

tim
