[squeak-dev] Association >> #hash performance

Marcel Taeumel marcel.taeumel at hpi.de
Thu Jul 7 08:23:22 UTC 2022

Hi Christoph --

I agree with Levente in that we should not change Association >> #hash.

+1 on your proposed changes to UserInterfaceTheme.

Maybe just use "UserInterfaceTheme cleanUpAndReset" as a postscript. :-) At this point, we still do not expect custom stuff in there.

Am 26.05.2022 23:32:27 schrieb Levente Uzonyi <leves at caesar.elte.hu>:
Hi Christoph,

On Thu, 26 May 2022, christoph.thiede at student.hpi.uni-potsdam.de wrote:

> Hi all!
> while playing around with code simulation recently, I found out that the following dictionary lookup consumes extremely many bytecodes:
>     UserInterfaceTheme current get: #return for: SHTextStylerST80.
> Dictionary >> #scanFor: takes 59 (sic!) attempts to find the right item in the dictionary. In other words, almost 20% of the property table is traversed, which does not conform my idea of a hashtable.
> The reason for this can be found in Association >> #hash: Since Collections-ar.304, it does not take the value of an association into consideration for the hash any longer. As UserInterfaceTheme uses Associations as keys in
> the Dictionary, we have a lot of "hash misses". Even though you might not notice this in practice, I noticed this indeed during debugging und simulating, I can tell you. :-)
> On the other hand, reverting Association >> #hash to honor the value slows down Unicode class compileAll indeed (significantly: 60 ms vs. 2100 ms), as noted in Andreas' original comment:
>     Name: Collections-ar.304
>     Author: ar
>     Time: 11 February 2010, 11:52:58.959 pm
>     UUID: 9e97b360-adf1-f745-8f8c-562aa08d566e
>     Ancestors: Collections-ar.303
>     Fix an age-old bug in Association>>hash which causes extreme slowdowns in compiling Unicode methods. Association>>hash does not need to hash the value; it's slow and useless.
> So my question is: What would be the right fix for this situation? Is it
> a) Do not use Associations in Dictionarys when its value matters for the lookup; or
> b) Do only use Associations in Dictionarys when its value maters for the lookup?

It's the user's (the developer who decided to use a hashed collection)
responsibility to provide keys that have a good hash distribution.
If it's not possible with the default hash function, a PluggableDictionary
should be used with a better one.

In this case, however, I would use Arrays as tuples instead of
Associations because not all the keys are Associations in that dictionary,
and Array's #hash is good enough here.
Changing Association >> #hash would have too many side effects for such a
localized problem to fix.

> For a), the dictionary layout of UserInterfaceTheme needs to be changed, for which I am attaching a changeset that replaces Associations by Arrays there.* For b), the layout of litIndSet in Encoder needs to be changed, for
> instance by using Bindings instead of Associations for classPool entries. However, I'm not familiar enough with Encoder etc. to answer this question.
> *The changeset will automatically update all UserInterfaceTheme instances. Benchmark:
>     theme := UserInterfaceTheme current.
>     random := Random seed: 20220526.
>     properties := (1 to: 10000) collect: [:i | theme properties keys atRandom: random].
>     [properties collect: [:ea | theme get: ea]] bench.
> Old: 45 ms | New: 10 ms
> Looking forward to your opinion: How should we treat Associations in Dictionarys? Of course, this is not release-critical. :-)
> Best,
> Christoph
> =============== Postscript (accelerateUserInterfaceTheme.2.cs) ===============
> "Postscript: Update UserInterfaceTheme instances"
> UserInterfaceTheme allSubInstancesDo: [:theme |
>     | newProperties |
>     newProperties := theme properties copyEmpty.
>     theme properties keysAndValuesDo: [:key :value |
>         newProperties
>             at: ((key isKindOf: Association)
>                 ifTrue: [{key key. key value}]
>                 ifFalse: [key])
>             put: value].
>     theme instVarNamed: 'properties' put: newProperties].

There's a method named #atomicUpdate:. I'm not sure if it's needed here,
but the name suggests that it's better to use that to update the
properties. The rest of the changes looks good to me. +1

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20220707/0fced268/attachment.html>

More information about the Squeak-dev mailing list