[squeak-dev] 4.1 - hashed collections still a problem

Levente Uzonyi leves at elte.hu
Fri Mar 26 14:40:43 UTC 2010


On Fri, 26 Mar 2010, Igor Stasenko wrote:

> On 25 March 2010 10:27, Levente Uzonyi <leves at elte.hu> wrote:
>> On Thu, 25 Mar 2010, Igor Stasenko wrote:
>>
>>> i think that #pointsTo: is a cheat :), which you can use in Sets but
>>> not dictionaries, because
>>> it contains associations. Also, it works only for identity-based
>>> collections.
>>
>> Dictionaries don't have to use associations (for example MethodDictionary
>> doesn't use them), that's why #pointsTo: works (MethodDictionary also uses
>> it).
>>
> But that means a linear scan of the whole collection, even if done primitively,
> this is not scalable.

If you mean that MethodDictionaries implementation of #includesKey: 
doesn't scale, then you're right. But it doesn't have to scale at all.
The average size of the MethodDictionaries in my image is ~11.1, the 
average capacity is ~17.2. For these dictionaries #includesKey: is >3x 
faster than the following (which is 30% faster than Dictionary >> 
#includesKey:):

 	^aSymbol
 		ifNil: [ false ]
 		ifNotNil: [ (array at: (self scanFor: aSymbol)) ~~ nil ]

The largest MethodDictionary contains 1176 keys in my image and the 
primitive is only 20% slower for that than the non-primitive method. The 
second largest has 610 entries and the primitive method is still 37% 
faster for that than the non-primitive version.


In LargeIdentityDictionary/LargeIdentitySet #pointsTo: does the same job 
as the list scanning loop in MaIdentityDictionary/MaIdentitySet, and
#pointsTo: is faster than that.


Levente

>
>>>
>>>> I wonder how LargeIdentityDictionary compares to your dictionaries'.
>>>>
>>> me too.
>>
>> If you give me a pointer to the source code, I can run the benchmarks.
>>
>>
>> Levente
>>
>>
>
>
>
> -- 
> Best regards,
> Igor Stasenko AKA sig.
>
>



More information about the Squeak-dev mailing list