[squeak-dev] The Inbox: Collections-ul.871.mcz

Levente Uzonyi leves at caesar.elte.hu
Sat Jan 25 00:11:20 UTC 2020


Hi Chris,

On Thu, 23 Jan 2020, Chris Muller wrote:

> Hi,
>
>       > That's better, but it still has that same fundamental problem.  Every time a developer makes a HashedCollection of a known-at-runtime size (e.g., in a variable), they're forced to choose between execution performance pain or code pain.
>       >
>       >     {
>       >    '[ Dictionary new ]'->'100% of baseline rate, 27,600,000 per second. 36.2 nanoseconds per run. 11.33547 % GC time.'
>       >
>       >     "performance pain?"
>       >     '[ Dictionary new: 1 ]'->'60% of baseline rate, 16,600,000 per second. 60.1 nanoseconds per run. 5.61888 % GC time.'
>       >     '[ Dictionary new: 2 ]'->'61% of baseline rate, 16,900,000 per second. 59.2 nanoseconds per run. 5.67886 % GC time.'
>       >     '[ Dictionary new: 3 ]'->'59% of baseline rate, 16,300,000 per second. 61.5 nanoseconds per run. 6.77864 % GC time.'
>
>       Even if there's a performance overhead, you use less memory.
> 
> 
> But #new: is about optimization along *both* of those dimensions.  Imagine how complicated a "manpage" for #new: would have to be if it weren't.  #new: must _never_ perform significantly worse than #new (for sizes <= the default), because it would either trick or force developers into writing less-performant
> code, or into acknowledging Squeak's internal Dictionary implementation in their own code.  It feels like an API-design bug.

It was okay for quite a long time. :)

>  
>        >     "into #sizeFor:"
>
>       >     '[ Dictionary new: 4 ]'->'57% of baseline rate, 15,800,000 per second. 63.5 nanoseconds per run. 7.87685 % GC time.'
>
>       Starting from 4, you also save time by avoiding growing, which is
>       more significant than what you "lose" during instance creation.
> 
> 
> Except my Dictionary is never going to grow.

So, we actually say the same thing.

> 
> In case it helps bring clarity, my scenario is the GraphQL server.  As a Request comes in, the server will know, depending on the type, how many named arguments to expect (for most "normal" schema's, 0 to 4, but it can define any number it wants).  So it creates right sized Dictionary to hold them all, and will
> never grow beyond that.  I simply don't want the server to have to do extra work when **vast majority** of requests will have fewer than 4 arguments.
>
>       We could get rid of the anomaly by changing #new to ^self new: 3.
> 
> 
> Yes, I'd be fine with that solution, too!   For me, it's only about violation of #new:'s contract.

I don't see any broken contract here. It may be surprising to see 
somewhat worse performance with #new: than with #new (25 nanoseconds per 
instance according to your measurements), but both of those methods do 
what they should. Just because there was an easy optimization applied to 
#new, nothing was broken IMHO.

>
>       If we decide to keep #new as it is, then I'm not against using a similar
>       optimization scheme in #new:. But there should be some tests to verify
>       that the methods always return valid dictionaries.
>       And, I'd also prefer to swap the branches in #new: so that < 4 is the
>       first check and < 3 is the second. There should be a comment as well
>       about the optimization.
> 
> 
> Okay, sure thing.  I just wanted to get your initial feedback before embarking on that much work.
> 
> But, personally, I think default size of 3 sounds like a fine way to go.  I don't think it'd affect places currently using "Dictionary new" very much, and anywhere it did could be easily fixed...

I didn't mean to use 3 as default capacity. I meant that in #new: only a 
single optimization branch should be before #sizeFor: is applied. So, 
instead of

  		(numberOfElements < 3
  			ifTrue: [ 3 ]
  			ifFalse:
  				[ numberOfElements < 4
  					ifTrue: [ 5 ]
  					ifFalse: [ self sizeFor: numberOfElements ] ])

the code should be

  		(numberOfElements <= 3
  			ifFalse: [ self sizeFor: numberOfElements ]
  			ifTrue:
  				[ numberOfElements < 3
  					ifTrue: [ 3 ]
  					ifFalse: [ 5 ] ])


Levente

> 
>  - Chris
> 
> 
>


More information about the Squeak-dev mailing list