[squeak-dev] identityHash or scaledIdentityHash ?

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Thu Dec 24 00:18:16 UTC 2009


2009/12/24 Levente Uzonyi <leves at elte.hu>:
> On Thu, 24 Dec 2009, Nicolas Cellier wrote:
>
>> Maybe Levente will help me clarifying this...
>> I encountered several different implementations:
>>
>> MessageTally, Semaphore, FutureMaker, LocatedMethod, TextAnchor
>> hash
>>   ^self identityHash
>> (or ^someInstVar identityHash, which is equivalent)
>>
>> Object
>> hash
>>  ^self scaledIdentityHash
>>
>> The first does not scale well for large Set/Dictionary, and the second
>> costs a little overhed on small (Set/Dictionary).
>> My questions are:
>> - are these different implementations deliberate ? For example knowing
>> there won't be any large Set... In which case a comment would help
>
> No.
>
>> - or should we always use the scaledIdentityHash to hash ? (and let
>> identityHash for printing and a few well commented exceptional cases)
>
> Exactly.
>
> I quickly checked the classes and I think that:
> - all of the above should be #scaledIdentityHash
> - optionally MessageTally >> #hash should use the process's hash,
>  since #= also compares the process variable
>
>>
>> Beside, apart printing usage, I note these two usages of IdentityHash as
>> hash:
>> MethodDictionary scanFor: (I guess we don't expect more than 4096
>> messages too often).
>
> MethodDictionary's array's size is a power of two, and #scaledIdentityHash
> is just #identityHash bitShift: 18. Using #scaledIdentityHash would degrade
> the performance to the minimum. The use of #hash instead of #identityHash
> could help with large MethodDictionaries, since the keys are Symbols, but
> the current code assumes that the size will never reach 4096. And that's
> acceptable IMO.
>

Indeed, current trunk monster is far:
(ProtoObject withAllSubclasses asArray collect: [:e | e selectors size
-> e]) sort last.
  1167->Morph

We should accept only refactorings that reduce this number :)

>> Player recaptureUniqueCostumes
>>
>
> That should use #scaledIdentityHash, since the set defined there mimics an
> IdentitySet.
>
>> I also noted that MethodDictionary does not override
>> scanForEmptySlotFor: which is defined in term either of hash or
>> scaledIdentityHash, rather than IdentityHash
>> This is currently OK, because the unique path I found was:
>>
>> MethodDictionary>>#at:put
>> -> HashCollection>>atNewIndex:put:
>> -> MethodDictionary>>#grow
>> OK, grow overrides super, otherwise:
>> -> HashCollection>>#grow
>> -> HashCollection>>#growTo:
>> -> Dictionary>>#noCheckNoGrowFillFrom:
>> -> HashCollection>>#scanForEmptySlotFor:
>> -> Symbol>>#hash
>>
>> But it's a potential tricky trap for future.
>>
>
> Well, maybe. MethodDictionary has a different structure than other
> HashedCollections and because of this it has to reimplement #grow and
> #rehash. #scanForEmptySlotFor: is only used by those two methods in general
> (WeakKeyDictionary is the only exception).
>

You have a better view (conceptual) and maybe my fear was exagerated.
But it was not that easy to follow the senders the reverse-engineering way...

>
> Levente
>

Thanks for explanations.

>> Nicolas
>>
>>
>
>



More information about the Squeak-dev mailing list