[squeak-dev] The Inbox: Collections-cmm.874.mcz

Levente Uzonyi leves at caesar.elte.hu
Sat Jan 25 02:32:19 UTC 2020


Hi Chris,

On Fri, 24 Jan 2020, Chris Muller wrote:

> On Fri, Jan 24, 2020 at 6:19 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
>       Hi Chris,
>
>       On Fri, 24 Jan 2020, Chris Muller wrote:
>
>       > Hi Jakob,
>       >
>       > Optimal execution performance can never be assured with #new, since the potential cost of under-allocation in various places is offset by the gains of not over-allocating in other places. 
>       >
>       > Although it's impossible to tweak the default initial size in #new to optimize for performance, we can at least optimize for space, and also for clarity-of-usage.  #new, by nature, expresses a "lack of concern for optimization", so Squeak should at least provide compactness in that case.  Places in the code that need optimization will correctly express that by writing #new: with a larger pre-allocation.
>       >
>       > It's true that system level changes of any kind could result in the need for downstream changes but, in this case, it seems very unlikely.
>
>       The way I understand it, Jakob says that your suggested change has a
>       chance to slow down things just to save a few hundred kilobytes of memory.
> 
> 
> But your suggested change is 100% *guaranteed* to slow things down, with no space savings.

Which change? To remove the optimization from #new? That wasn't a serious 
suggestion, just a possible solution to your problem (which I don't 
really consider to be a problem).

> 
> This avoids that particular slowdown, while guaranteeing to save space!  All for only a very teensy-weensy chance of other areas being slightly affected (and easily fixed if they are!).

It's funny that you call it a slowdown. You may perceive it as one, but 
that's just a simple optimization applied to #new, which cannot 
be directly applied to #new:.

>  
>       I don't think we would want to do that without measuring the effects of
>       the change somehow.
> 
> 
> Performance measuring is already part of every system that cares, right?   :)

Of course not. That's why "somehow" is there. I don't know how it could be 
measured, but I think that without measurements, it's risky to change 
something like that.

> None of those systems are depending on any particular default initial size for performance.  If they are, they should be fixed.

Here is an example scenario:
I write my code. I use [OrderedCollection new]. I see that my code is fast 
enough.
You change the default capacity from 10 to 3. My code is now too slow. I 
have to profile it to see why. It turns out that I store 7-9 elements most 
of the time, and the capacity of 10 was a good fit, but 3 is not, because 
it means growing twice (first to 6, then to 12), and my code ends up 
being slower and using more memory than before.
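
The arithmetic in this scenario can be checked with a small simulation. The
following is only a sketch in Python, not Squeak code: it assumes the capacity
doubles whenever the collection is full (as in the 3 -> 6 -> 12 sequence
above), and the name simulate_appends is made up for illustration.

```python
def simulate_appends(initial_capacity, element_count):
    """Model appending to a dynamic array that doubles when full.

    Returns (number_of_grows, final_capacity). The doubling policy is an
    assumption taken from the 3 -> 6 -> 12 sequence in the scenario; it is
    not read from the Squeak sources.
    """
    capacity = initial_capacity
    grows = 0
    for size in range(1, element_count + 1):
        if size > capacity:
            # The array is full: allocate a larger one and copy over.
            capacity *= 2
            grows += 1
    return grows, capacity


# Compare both default capacities for the 7-9 element case.
for n in (7, 8, 9):
    print(n, simulate_appends(10, n), simulate_appends(3, n))
```

Under that assumption, a start at 10 never grows for 7-9 elements, while a
start at 3 grows twice and ends with 12 slots instead of 10.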

>  
>
>       Same applies to OrderedCollections.
>
>       Following your reasoning, the optimal initial capacity for any dynamic
>       collection created with #new should be 0, but that doesn't feel right...
> 
> 
> Not 0, 3.  It's "reasoning", not extremism.  :)
> 
> 3 would be a great default for OrderedCollection.  Ultra-minimal, but with a purpose.   I mean, really, with 10, we have 92% of all instances wasting space.

"Wasting space" is a must when you implement dynamic arrays and 
hash tables. It's a space-time tradeoff[1].
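
As a rough illustration of that tradeoff, here is a hedged sketch in Python
(again a model, not Squeak code). It compares two hypothetical growth policies
for a dynamic array starting at capacity 1: growing by exactly one slot, which
never leaves unused slots, and doubling, which does. The function name
append_cost and the choice of 1000 elements are assumptions for the example.

```python
def append_cost(element_count, grow):
    """Return (elements_copied, unused_slots) after element_count appends.

    `grow` maps the current capacity to the next one. The model starts
    from capacity 1 and copies every existing element on each growth.
    """
    capacity = 1
    copied = 0
    for size in range(1, element_count + 1):
        if size > capacity:
            copied += size - 1  # existing elements move to the new array
            capacity = grow(capacity)
    return copied, capacity - element_count


exact_fit = append_cost(1000, lambda c: c + 1)  # no slack, many copies
doubling = append_cost(1000, lambda c: c * 2)   # some slack, few copies
print(exact_fit, doubling)
```

In this model the exact-fit policy copies 499500 elements and leaves no unused
slots, while doubling copies 1023 elements and leaves 24 slots unused.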


Levente

[1] https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff

> 
>  - Chris
> 
> 
>

