[squeak-dev] Re: 4.1 - hashed collections still a problem
Andres Valloud
avalloud at smalltalk.comcastbiz.net
Tue Mar 30 09:44:21 UTC 2010
I mentioned implementing identityHash as hash as an example given. I
also think I mentioned the rule x == y => x identityHash = y
identityHash, so I hope it's clear that one shouldn't just blindly move
ahead in these matters. The point is that nothing prevents anybody from
adding an instance variable called identityHash to their objects,
storing arbitrary (but well chosen!) small integers in said instance
variable, and then having the identityHash message just answer said
values. If you do this for the cases in which you have significantly
more than 4096 objects, then you only pay the price to hold better
identityHash values for the objects that need them (as opposed to every
single object in the image when the header is expanded instead). Or
perhaps you don't really need identity for the cases being discussed in
practice, and just using hash as identityHash is fine. It's hard to
tell without concrete examples. One way or the other, I do not think
the size of the identityHash field *must* result in poor hashed
collection performance. Such an implication does not follow.
On 3/30/10 1:48 , Andreas Raab wrote:
> On 3/30/2010 1:09 AM, Andres Valloud wrote:
>
>> Right, so provide a better identityHash implementation in the image
>> (e.g.: implement hash and then have identityHash call hash instead), and
>> problem solved...
>>
> Except that #hash is not constant over the lifetime of most objects but
> #identityHash is. So if you have a property associated with an object in
> a IDDict and the #hash depends on a value of a variable it may change
> over the lifetime of the object and your key gets invalid.
>
> Cheers,
> - Andreas
>
>
>> On 3/26/10 3:37 , Levente Uzonyi wrote:
>>
>>> On Thu, 25 Mar 2010, Andres Valloud wrote:
>>>
>>>
>>>> If lookups find the sought object in mostly one attempt, the
>>>> primitive is
>>>> overkill... most of the time, the real issue is the quality of the hash
>>>> function.
>>>>
>>> That's true, but this is not the case with the 4096 hash values.
>>>
>>>
>>> Levente
>>>
>>>
>>>> On 3/25/10 1:27 , Levente Uzonyi wrote:
>>>>
>>>>> On Thu, 25 Mar 2010, Igor Stasenko wrote:
>>>>>
>>>>>
>>>>>
>>>>>> i think that #pointsTo: is a cheat :), which you can use in Sets but
>>>>>> not dictionaries, because
>>>>>> it contains associations. Also, it works only for identity-based
>>>>>> collections.
>>>>>>
>>>>>>
>>>>> Dictionaries don't have to use associations (for example
>>>>> MethodDictionary
>>>>> doesn't use them), that's why #pointsTo: works (MethodDictionary also
>>>>> uses it).
>>>>>
>>>>>
>>>>>
>>>>>>> I wonder how LargeIdentityDictionary compares to your dictionaries'.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> me too.
>>>>>>
>>>>>>
>>>>> If you give me a pointer to the source code, I can run the benchmarks.
>>>>>
>>>>>
>>>>> Levente
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>
>
More information about the Squeak-dev
mailing list
|