identityHash patch discussion.

Scott A Crosby crosby at qwes.math.cmu.edu
Mon Dec 10 20:32:34 UTC 2001


On Mon, 10 Dec 2001, David Chase wrote:

> Perhaps I am misunderstanding this proposed longer hash

Yes. All objects in squeak have 12 bits in their header that form the
default hash of that object. (Though identityhash).

The problem is that those bits are used as-is, to construct a number in
the *specific* range 0...4095, which is used as the starting index in
Set>>scanFor: This flaw means that *any* set and *any* dictionary that has
>4000 elements in it will devolve into a linear search through all of the
elements, when it could be 4000 times faster.

This is *not* a problem in there being only 12 bits in the object header,
but that those 12 bits are interpreted as values in the *specific* range
0...4095.

My patch just multiplies that 12-bit default-hash number by a constant, to
spread the numbers out to avoid this horrible unscalability. The hard part
is getting it into the image, because you have to rebuild all existing
hash tables.

See my 'Horrible unscalability in collections' thread from a few months
back.


Scott






More information about the Squeak-dev mailing list