[squeak-dev] The Inbox: Collections-ul.871.mcz

Chris Muller ma.chris.m at gmail.com
Thu Jan 23 20:25:16 UTC 2020


> That's better, but it still has that same fundamental problem.  Every
> time a developer makes a HashedCollection of a known-at-runtime size (e.g.,
> in a variable), they're forced to choose between execution performance pain
> or code pain.
> >
> >     {
> >     '[ Dictionary new ]'->'100% of baseline rate, 27,600,000 per second. 36.2 nanoseconds per run. 11.33547 % GC time.'
> >
> >     "performance pain?"
> >     '[ Dictionary new: 1 ]'->'60% of baseline rate, 16,600,000 per second. 60.1 nanoseconds per run. 5.61888 % GC time.'
> >     '[ Dictionary new: 2 ]'->'61% of baseline rate, 16,900,000 per second. 59.2 nanoseconds per run. 5.67886 % GC time.'
> >     '[ Dictionary new: 3 ]'->'59% of baseline rate, 16,300,000 per second. 61.5 nanoseconds per run. 6.77864 % GC time.'
> Even if there's a performance overhead, you use less memory.

But #new: is about optimization along *both* of those dimensions.  Imagine
how complicated a "manpage" for #new: would have to be if it weren't.
#new: must _never_ perform significantly worse than #new (for sizes <= the
default), because it would either trick or force developers into writing
less-performant code, or into acknowledging Squeak's internal Dictionary
implementation in their own code.  It feels like an API-design bug.
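
For anyone who wants to reproduce the comparison, here is a minimal workspace sketch. It assumes a Squeak image where BlockClosure>>bench is available; as far as I know, plain #bench does not print the percent-of-baseline column seen in the figures quoted above, so only the per-second and per-run numbers would be comparable.

```smalltalk
"Workspace sketch: select all and print-it. Each #bench call runs for several seconds."
| results |
results := OrderedCollection new.
results add: '[ Dictionary new ]' -> [Dictionary new] bench.
1 to: 4 do: [:n |
	"Build the label by concatenation so it matches the style of the quoted results."
	results add: '[ Dictionary new: ', n printString, ' ]' -> [Dictionary new: n] bench].
results
```

The absolute numbers will differ per machine and VM; what matters for this discussion is the ratio between the #new line and the #new: lines.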

> >     "into #sizeFor:"
> >     '[ Dictionary new: 4 ]'->'57% of baseline rate, 15,800,000 per second. 63.5 nanoseconds per run. 7.87685 % GC time.'
> Starting from 4, you also save time by avoiding growing, which is
> more significant than what you "lose" during instance creation.

Except my Dictionary is never going to grow.

In case it helps bring clarity, my scenario is the GraphQL server.  As a
Request comes in, the server will know, depending on the type, how many
named arguments to expect (for most "normal" schemas, 0 to 4, but a schema
can define any number it wants).  So it creates a right-sized Dictionary to
hold them all, and that Dictionary will never grow beyond that size.  I
simply don't want the server to have to do extra work when the **vast
majority** of requests will have fewer than 4 arguments.
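
To make the usage pattern concrete, a minimal sketch follows. The names argumentNames and arguments are hypothetical stand-ins, not the actual GraphQL server code; the literal array only simulates what the schema would report for one request type.

```smalltalk
"Hypothetical sketch: argumentNames stands in for whatever the schema reports."
| argumentNames arguments |
argumentNames := #(#id #first #after).
"Size the Dictionary once, from the count known at runtime."
arguments := Dictionary new: argumentNames size.
"Fill it; placeholder values are used here instead of parsed request values."
argumentNames do: [:each | arguments at: each put: each asString].
arguments size
```

The point is that the caller only says how many elements it needs and never has to special-case small sizes itself.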

> We could get rid of the anomaly by changing #new to ^self new: 3.

Yes, I'd be fine with that solution, too!  For me, it's only about the
violation of #new:'s contract.

> If we decide to keep #new as it is, then I'm not against using a similar
> optimization scheme in #new:. But there should be some tests to verify
> that the methods always return valid dictionaries.
> And, I'd also prefer to swap the branches in #new: so that < 4 is the
> first check and < 3 is the second. There should be a comment as well
> about the optimization.

Okay, sure thing.  I just wanted to get your initial feedback before
embarking on that much work.
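
As a rough idea of the kind of check being asked for, here is a workspace sketch. It is only an assumption about what "valid" should mean (every requested size accepts that many elements and reads them back correctly), and the range 0 to 10 is arbitrary; a real version would live in a test method on a TestCase subclass.

```smalltalk
"Workspace sketch: do-it; an assertion failure opens the debugger."
0 to: 10 do: [:n | | dict |
	dict := Dictionary new: n.
	"Store exactly n entries, the size the Dictionary was created for."
	1 to: n do: [:i | dict at: i put: i * i].
	self assert: dict size = n.
	"Every stored key must read back its own value."
	1 to: n do: [:i | self assert: (dict at: i) = (i * i)].
	"A key that was never stored must not be reported as present."
	self assert: (dict includesKey: n + 1) not]
```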

But, personally, I think a default size of 3 sounds like a fine way to go.  I
don't think it'd affect places currently using "Dictionary new" very much,
and anywhere it did could be easily fixed...

 - Chris