Faster dictionaries?

sig siguctua at gmail.com
Fri Jul 20 15:18:23 UTC 2007


On 20/07/07, Adrian Lienhard <adi at netstyle.ch> wrote:
> On Jul 19, 2007, at 22:53 , sig wrote:
> >>
> >> Time it takes to add 100k consecutive integers as keys of a
> >> dictionary.
> >>
> > | sd md r |
> > Transcript clear.
> > r := OrderedCollection new.
> > 1 to: 100000 do: [:each | r add: each ].
> > sd := BDictionary new.
> > Smalltalk garbageCollect.
> > Transcript cr; show: 'time to add all strings to bd: ',
> >       [ r do: [:each | sd at: each put: each ] ] timeToRun printString.
> > sd := Dictionary new.
> > Smalltalk garbageCollect.
> > Transcript cr; show: 'time to add all strings to sd: ',
> >       [ r do: [:each | sd at: each put: each ] ] timeToRun printString.
> >
> >
> > time to add all strings to bd: 719
> > time to add all strings to sd: 870
>
> However, accessing the integer keys is faster with the Smalltalk
> Dictionary!
>
> I got the folllowing numbers:
>
> time to add to bd: 298
> time to add to sd: 347
>
> time to access bd: 207
> time to access sd: 110
>

This is no longer true :)

time to add all  to md: 499
time to add all  to sd: 868

time to access to md: 315
time to add access to sd: 325

md - stands for MaDictionary. i replaced them with mine and tuned some
methods :)

>
> ok, the additional memory consumption seems reasonable as there is
> only one additional instance variable in BAssociation (note,
> Association is a compact class and hence its header is only 4 bytes
> compared to 8 bytes of the header of BAssociation; but BAssocitaion
> could also be made a compact class).
>
> As far as I can tell, your implementation performs very well for
> objects with no reasonable hash, like the identityHash of  Squeak
> (which is way too small). Else, this does not seem to be the case.
>

My dicts showing most better results , when you need to remove keys.
In most cases, dictionaries used to be populated once, then
accessing/replacing values.
But in case, when you need to remove keys (especially weak key
dictionaries) current squeak implementation is very slow.

Dictionary calls #fixCollisionsFrom:
and WeakKeyDictionary calls rehash on each key removal!
This is a show-stopper approach.
Dictionary with linked associations don't require immediate rehashing
upon removing a key.
Actually, in my case i need rehashing just for two reasons: optimize
access time and optimize memory consumption.
So, whenever you using dictionary not only for populate and access
purposes, but also deleting keys from it , then its preferable to use
linked-assoc. dicts.

> Cheers,
> Adrian
>
> ___________________
> Adrian Lienhard
> www.adrian-lienhard.ch
>
>
>



More information about the Squeak-dev mailing list