Need an efficient large weak identity key dictionary

Stephen Pair spair at acm.org
Sun Aug 11 14:03:56 UTC 2002


The need to associate extended attributes with potentially very large
subsets of the objects in memory seems to me to be a very common
requirement.  It's needed for the DependentsFields and the
EventsFields (even though neither of those seems to be heavily used).
It's also needed in virtually any persistence or distributed object
framework.  I could have used it for my Kats transaction service, and
it would have made things a bit simpler and more general.

The best way of doing this (aside from modifying the VM to add a
special slot) is of course to use a hashed structure.  We have
WeakIdentityKeyDictionary; however, it suffers from a few problems:

A) The optimization in IdentityDictionary that spreads the hash when
the array grows larger than 4096 slots was never added to
WeakIdentityKeyDictionary.  A small test that adds 10000 items to the
dictionary went from 150 seconds to just 5.5 seconds on my machine
after making that change (a sketch of the optimization follows after
point B).

B) When such a collection does grow very large, it has the potential
to impact incremental GC performance if that collection is in young
space, or if it is marked as a root.
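
For reference, the spread-hash logic in IdentityDictionary>>scanFor:
looks roughly like this (quoted from memory, so the code in your
image may differ in detail); the same scaling would go into
WeakIdentityKeyDictionary's scanFor:.  Since identityHash is only 12
bits (values 0 to 4095), an unscaled hash clusters every entry in the
first 4096 slots of a larger array:

scanFor: anObject
	"Scan the key array for the first slot containing either a nil
	(an empty slot) or an association whose key is anObject.  When
	the array is larger than 4096 slots, scale the 12-bit
	identityHash so that entries spread over the whole array."
	| element start finish hash |
	finish := array size.
	finish > 4096
		ifTrue: [hash := anObject identityHash * (finish // 4096)]
		ifFalse: [hash := anObject identityHash].
	start := (hash \\ finish) + 1.
	"Search from start to the end of the array..."
	start to: finish do: [:index |
		((element := array at: index) == nil
			or: [element key == anObject]) ifTrue: [^ index]].
	"...then wrap around and search from 1 to start - 1."
	1 to: start - 1 do: [:index |
		((element := array at: index) == nil
			or: [element key == anObject]) ifTrue: [^ index]].
	^ 0  "no match and no empty slot"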

I think that without modifying the VM, we could get a reasonably good
WeakIdentityKeyDictionary if we add the optimization mentioned in A
and break up the WeakIdentityKeyDictionary's array into a set of
smaller arrays.  I have an implementation of a LargeArray that does
exactly this and could be used here.  The LargeArray simply breaks an
array up into a number of smaller chunks.
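
The chunk/offset mapping is simple index arithmetic; a minimal sketch
of the idea (with chunks being an Array of equal-sized Arrays and
chunkSize the size of each chunk; the names are illustrative):

at: index
	"Answer the element at index by mapping the flat index onto
	a chunk and an offset within that chunk."
	| i |
	i := index - 1.
	^ (chunks at: i // chunkSize + 1) at: i \\ chunkSize + 1

at: index put: anObject
	"Store anObject at index using the same chunk/offset mapping."
	| i |
	i := index - 1.
	^ (chunks at: i // chunkSize + 1)
		at: i \\ chunkSize + 1
		put: anObject

Keeping each chunk small means a huge dictionary never allocates one
enormous array, so touching a single slot only makes that one small
chunk a root for the incremental GC rather than the whole array.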

I'm thinking about hacking out such a thing (I'm not exactly sure
what to name it... suggestions welcome).  I'd also like to hear
opinions on this topic: is it needed?  Are there any flaws in my
arguments?

- Stephen
