Hash rehashed (was: [SqF]Report of VI4 Project for Feb '02)

Scott A Crosby crosby at qwes.math.cmu.edu
Tue Feb 5 02:42:17 UTC 2002


On Mon, 4 Feb 2002, Tim Rowledge wrote:

> Is there any value in making the putative hash-multiplier be dependant
> upon the size of the collection into which the object is being put?

There is a *small* reason for this. Let the multiplier be $p$. If the size
of the array is a multiple of $p$, that would be very very bad. A more
subtle problem occurs if the size of the array has many factors in common
with $p-1$/$p-1$, but that is no worse the performance we get as-is now.

So, let $p$ be a big beefy 6 digit prime that nobody will hit by mistake.
(Hell, we'll make sure the hash growth code explicitly disallows any
multiple of this number.)

If we dedicate 16-20 bits to hashBits, then a different solution then this
will be wanted. (some non-linear function.)

I'd say to keep the multiplier fixed for the reasons below. My most recent
version has some random 6 digit prime, but if you don't like it, I can
give another prime.

> This could be implemented by adding an instvar to all collections,
> though I fear that would be rather wasteful since so many collections
> are small enough that the multiplier would be 1. If a single list of
> values would be good (ie all varieties of collection can use the same
> values) then a trivial prim could take the array size, the object hash
> and do the entire lookup/algorithm - multiply - modulo operation.

Scott





More information about the Squeak-dev mailing list