[squeak-dev] The Inbox: Collections-ul.871.mcz

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Sat Jan 25 01:04:27 UTC 2020


Hi Chris,
For such small size, did u try using SmallDictionary? (see in RB)

It avoids hash and just iterate thru keys with = until hit or exhausting
the tally.

It also avoid creating Association, maintaining keys and values in 2 arrays.

It thus has no extra room requirement.
It seems a perfect fit for your requirements.

Le sam. 25 janv. 2020 à 01:47, Chris Muller <ma.chris.m at gmail.com> a écrit :

> Hi Levente,
>
> >       > That's better, but it still has that same fundamental problem.
>> Every time a developer makes a HashedCollection of a known-at-runtime size
>> (e.g., in a variable), they're forced to choose between execution
>> performance pain or code pain.
>> >       >
>> >       >     {
>> >       >    '[ Dictionary new ]'->'100% of baseline rate, 27,600,000 per
>> second. 36.2 nanoseconds per run. 11.33547 % GC time.'
>> >       >
>> >       >     "performance pain?"
>> >       >     '[ Dictionary new: 1 ]'->'60% of baseline rate, 16,600,000
>> per second. 60.1 nanoseconds per run. 5.61888 % GC time.'
>> >       >     '[ Dictionary new: 2 ]'->'61% of baseline rate, 16,900,000
>> per second. 59.2 nanoseconds per run. 5.67886 % GC time.'
>> >       >     '[ Dictionary new: 3 ]'->'59% of baseline rate, 16,300,000
>> per second. 61.5 nanoseconds per run. 6.77864 % GC time.'
>> >
>> >       Even if there's a performance overhead, you use less memory.
>> >
>> >
>> > But #new: is about optimization along *both* of those dimensions.
>> Imagine how complicated a "manpage" for #new: would have to be if it
>> weren't.  #new: must _never_ perform significantly worse than #new (for
>> sizes <= the default), because it would either trick or force developers
>> into writing less-performant
>> > code, or into acknowledging Squeak's internal Dictionary implementation
>> in their own code.  It feels like an API-design bug.
>>
>> It was okay for quite a long time. :)
>>
>
> ... until this proposal which changed the smallest internal array size
> from 5 to 3 and introduced the above anomaly.  I assume you made that
> change because you felt something else *wasn't* okay (to which I agree!).
>
>
>>
>> >
>> >        >     "into #sizeFor:"
>> >
>> >       >     '[ Dictionary new: 4 ]'->'57% of baseline rate, 15,800,000
>> per second. 63.5 nanoseconds per run. 7.87685 % GC time.'
>> >
>> >       Starting from 4, you also save time by avoiding growing, which is
>> >       more significant than what you "lose" during instance creation.
>> >
>> >
>> > Except my Dictionary is never going to grow.
>>
>> So, we actually say the same thing.
>>
>
> Did we?  You were arguing about saving time with growing, which I can
> never gain.  I only lose during instance creation...
>
>
>> > In case it helps bring clarity, my scenario is the GraphQL server.  As
>> a Request comes in, the server will know, depending on the type, how many
>> named arguments to expect (for most "normal" schema's, 0 to 4, but it can
>> define any number it wants).  So it creates right sized Dictionary to hold
>> them all, and will
>> > never grow beyond that.  I simply don't want the server to have to do
>> extra work when **vast majority** of requests will have fewer than 4
>> arguments.
>> >
>> >       We could get rid of the anomaly by changing #new to ^self new: 3.
>> >
>> >
>> > Yes, I'd be fine with that solution, too!   For me, it's only about
>> violation of #new:'s contract.
>>
>> I don't see any broken contract here. It may be surprising to see
>> somewhat worse performance with #new: than with #new (25 nanoseconds per
>> instance according to your measurements),
>
>
> a 40% hit...
>
>
>> but both of those methods do
>> what they should. Just because there was an easy optimization applied to
>> #new, nothing was broken IMHO.
>>
>
> ... for using #new: that's *supposed* to be for optimization!
>
> So if you were in my shoes (which, if this goes to trunk as-is eventually
> you will be!), would you take a 40% hit in performance-critical code
> needing to instantiate a Dictionary with:
>
>     Dictionary new: runtimeSize
>
> Or, actually violate encapsulation to preserve performance?
>
>    sz>3 ifTrue: [Dictionary new: sz] ifFalse: [Dictionary new]
>
> ?
>
> My past observations have been that you like and appreciate the
> most-efficient performing code, so I'm truly curious!
>
> Best Regards.
>   Chris
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20200125/3c307dc7/attachment.html>


More information about the Squeak-dev mailing list