[BUG] Symbol>>hash

Bob Arning arning at charm.net
Wed Aug 2 21:41:19 UTC 2000


On Wed, 02 Aug 2000 14:46:51 -0700 Andres Valloud <avalloud at exobox.com> wrote:
>I haven't noticed any slow
>down in Squeak though, and certainly String>>hash delivers more than 12
>bits :).

I'm not sure how true that is. I ran a little test in my image. I found

total strings: 32137
unique strings: 16583
unique hash values: 3556

The distribution of hash values for the unique strings looked like (count->hash):

48->5644
39->4876 
38->5639 
38->5640 
36->3235 
36->5642 
36->5643 
35->3234 
35->4877 
...etc

That doesn't look like much more than 12 bits to me. In fact, since only 3556 hash values were used, I'd be inclined to say it's just about exactly 12 bits.

To see the implications of this, I created a big set (Set new: 100000) and added all 32137 strings to it. The results:

time to add all strings to a big set: 532 seconds
time to add one string to a big set: 16 milliseconds

That's a _lot_ of time in my book.

I tried a different hash algorithm that produced a greater range of values and got these results:

time to add all strings to a fast set: 8 seconds
time to add one string to a fast set: 0.257 milliseconds

And this is with a hash that takes longer to calculate, but it sure helps in the end.

Cheers,
Bob
---- a faster hash -----
biggerHash

	self size <= 2 ifTrue: [^self hash].
	^self inject: self size into: [ :sum :each | (sum + sum + each asciiValue) bitAnd: 16rFFFFFF]





More information about the Squeak-dev mailing list