[Q] Dictionary delays

Ned Konz ned at bike-nomad.com
Sat Apr 13 02:55:57 UTC 2002


On Friday 12 April 2002 08:10 am, Harry Chesley wrote:

> Does anyone understand why I'm getting the behavior I am?

Perhaps hash collisions? The hash values are only 12 bits in size, and 
the code in IdentityDictionary>>scanFor: will first multiply by 
(size/4K):

scanFor: anObject
	| finish hash start element |
	finish _ array size.
	finish > 4096
		ifTrue: [hash _ anObject identityHash * (finish // 4096)]
		ifFalse: [hash _ anObject identityHash].
	start _ (hash \\ array size) + 1.

And then it starts scanning at start to the end, and then back from 
the beginning to start to find a free slot.

So the reason it starts slower with a bigger dictionary is that it has 
to do the multiplication and divide each time as soon as the 
dictionary is >4K in capacity.

At 409600 entries there's an average of 10 entries with the same hash 
value; the method has to skip over these.

This all assumes that your hash values are evenly distributed. If 
they're not, then the number of slots that have to be searched to 
find a new one will be much worse. As will the access time.

You might try examining the hash distribution on your keys (the hash 
is computed using identityHash).

-- 
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE




More information about the Squeak-dev mailing list