[squeak-dev] Message>>#= & Message>>#hash

Levente Uzonyi leves at caesar.elte.hu
Wed Nov 21 23:23:09 UTC 2018


Nice find. So the reason for the change was to improve performance, but 
now that change actually makes things slower.
We have 22-bit identity hashes and #hashMultiply is quicker too, so a new 
hash function based on those could be 6x-10x quicker.

Levente
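
The rename problem raised in the quoted discussion below can be illustrated with a small sketch. It is written in Python rather than Squeak, and every name in it (toy_string_hash, NamedBehavior, IdentityBehavior, ToySet, found_after_rename) is a hypothetical stand-in, not Squeak's actual implementation. It only assumes that a hashed collection picks a bucket from the element's hash at insertion time:

```python
def toy_string_hash(s):
    """Toy deterministic string hash (an assumption, not Squeak's String>>hash)."""
    h = 0
    for ch in s:
        h = (h * 31 + ord(ch)) & 0x3FFFFFFF
    return h


class NamedBehavior:
    """Stand-in for a class whose hash is derived from its (mutable) name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return toy_string_hash(self.name)


class IdentityBehavior:
    """Stand-in for a class that keeps the default identity-based hash."""

    def __init__(self, name):
        self.name = name


class ToySet:
    """Minimal bucketed hashed collection with an explicit rehash."""

    def __init__(self, bucket_count=16):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, element):
        # The bucket is chosen from the element's *current* hash.
        return self.buckets[hash(element) % len(self.buckets)]

    def add(self, element):
        bucket = self._bucket_for(element)
        if not any(e is element for e in bucket):
            bucket.append(element)

    def includes(self, element):
        return any(e is element for e in self._bucket_for(element))

    def rehash(self):
        # Re-insert every element so it lands in the bucket for its current hash.
        elements = [e for bucket in self.buckets for e in bucket]
        self.buckets = [[] for _ in self.buckets]
        for e in elements:
            self.add(e)


def found_after_rename(behavior, rehash=False):
    """Add behavior to a ToySet, rename it, then report whether lookup finds it."""
    s = ToySet()
    s.add(behavior)
    behavior.name = "NewName"
    if rehash:
        s.rehash()
    return s.includes(behavior)
```

In this toy, a NamedBehavior created as "OldName" is no longer found after the rename, because the two names land in different buckets; it is found again once the set is rehashed. An IdentityBehavior is unaffected by the rename, since its hash never depended on the name.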

On Wed, 21 Nov 2018, Bob Arning wrote:

> I'm neutral on this, but here is a bit more history (from an earlier date, it would seem)
> 
> 
> 'From Squeak3.7beta of ''1 April 2004'' [latest update: #5963] on 22 June 2004 at 8:51:57 pm'
> "Change Set:        BehaviorHashEnh v1.2
> Date:            22 June 2004, 16.02.2006
> Author:            Stephan Rudlof, md
> md: added a line to the postscript to uncompactify the MethodProperties class. We want to add an instVar for the selector.
> Improves the default Object>>hash for Behaviors by installing Behavior>>hash. String>>hash has been changed a little to avoid infinite recursion (without changing its semantics).
> All is done in the postscript.
> 
> Important
> -----------
> This is a special changeset: Do not export and import this changeset again after importing it the first time! Then the methods are not installed alone in the postscript anymore, leading to serious
> problems!
> -----------
> 
> Rationale: Object>>hash calling ProtoObject>>identityHash gives poor results for Behaviors. Therefore a new Behavior>>hash using Symbol>>hash or String>>hash (the latter slightly changed to avoid infinite recursion) will be installed.
> 
> Consequences:
> - It speeds up Set/Dictionary operations with Behaviors a lot (see below).
> - The main consequence for objects other than Behaviors seems to be a changed hash if they use
>     self species hash
> as a start value for computing their hash.
> But AFAICS this doesn't hurt, since in most cases (non meta classes as species) it maps to Symbol>>hash, which is fast.
>
> On 11/21/18 7:31 AM, Levente Uzonyi wrote:
>       On Tue, 20 Nov 2018, Chris Muller wrote:
>
>                   To make things more clear, the current implementation of Behavior >> #hash
>                   has two negative side effects:
>                   - behaviors stored in collections relying on the hash value (e.g. Set,
>                   Dictionary) will have to be rehashed whenever a behavior is renamed
>                   - objects using Behavior >> #hash to implement their own #hash, like what
>                   Eliot just did to Message, will suffer from the same issue. So Sets and
>                   Dictionaries holding those kinds of objects will have to be rehashed as
>                   well upon the rename of the behavior.
>
>                   My questions related to this:
>                   - why does Behavior >> #hash rely on the name instead of identity?
> 
>
>             If you mean #identityHash, then it's because involving an unstable
>             value in a #hash calculation is never a good idea.  #identityHash can
>             be different for the same class between two different images, or if
>             the class was ever becomed or reloaded into a new image, etc.
> 
>
>       Is there an actual user of that feature?
>
>       Bob found out that #hash had been changed during the development of Squeak 3.9. Therefore this issue is not present in Cuis (forked from 3.7). And I just checked Pharo and found that
>       Behavior >> #hash had been removed from there.
>       So, I suggest we remove it as well unless there's a really good reason to keep it.
> 
>
>                   - do we want to fix those issues mentioned above or do we just say that
>                   one should not rename classes and expect things to work?
> 
>
>             Neither.  We just say that when one renames a class, one should rehash all
>             relevant HashedCollections.
> 
>
>       That's "one should not rename classes and expect things to work", isn't it?
>
>       Levente
> 
>
>             - Chris

