Reimplementing WeakKeyDictionary

"Martin v. Löwis" martin at v.loewis.de
Mon Mar 19 05:57:25 UTC 2007


> That's an interesting approach. I would expect this to impact the time 
> it takes to insert/find elements in the dict, no? Like:
>     wd := WeakKeyDictionary new.
>     1 to: 100 do:[:i| wd at: i put: i].
>     1 to: 99  do:[:i| wd removeKey: i].
> After this point I'd expect the operations to be slower since we have 
> about a hundred expired elements in the dict. 

It strongly depends on the operations now. The lookup for the only
remaining key (100) will not be slower, since that key is not colliding.

The next invocation of fullCheck will see that there are way too many
expired entries in the dictionary, and rehash it. So the very next 
insertion may be slightly slower, but all subsequent insertions
and other operations will have the same performance - in this specific
use case, the new code just saved a hundred rehash operations.

> Nothing that a good 
> #rehash can't fix of course, but I'm wondering if at some point the WKD 
> shouldn't recognize the ratio of expired/free/used slots and act 
> accordingly. 

Ah, it does: if 4*expired > array size, it rehashes (in fullCheck). The
precise percentage may be debatable, and it may be an option to also
consider the relationship of expired and tally, but it will definitely
rehash from time to time (and growing will drop expired entries, too).

It might also be useful to add a fullCheck into removeKey:, so that in 
above example it won't end up with with 99 expired associations, but
only one. With rehashing as implemented in the patch, this would cause
the dictionary to shrink (as it uses essentially the implementation
of Dictionary>>rehash); in turn, the loop would do 7 rehashes, at
41, 21, 14, 9, 6, 4, 3 expired elements.

With rehashing similar to the original one (i.e. one that keeps the 
size), a fullCheck on remove, in above example, would rehash every
40 removals (given that the dictionary will have 160 slots), so you
would end up with 20 expired elements after the loop.

Regards,
Martin




More information about the Squeak-dev mailing list