[squeak-dev] The Inbox: Collections-cmm.874.mcz
leves at caesar.elte.hu
Sat Jan 25 02:32:19 UTC 2020
On Fri, 24 Jan 2020, Chris Muller wrote:
> On Fri, Jan 24, 2020 at 6:19 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:
> Hi Chris,
> On Fri, 24 Jan 2020, Chris Muller wrote:
> > Hi Jakob,
> > Optimal execution performance can never be assured with #new, since the potential cost of under-allocation in various places is offset by the gains of not over-allocating in other places.
> > Although it's impossible to tweak the default initial size in #new to optimize for performance, we can at least optimize for space, and also for clarity-of-usage. #new, by nature, expresses a "lack of concern for optimization", so Squeak should at least provide compactness in that case. Places in the code that need optimization will correctly express that by writing #new: with a larger pre-allocation.
> > It's true that system level changes of any kind could result in the need for downstream changes but, in this case, it seems very unlikely.
> The way I understand it, Jakob says that your suggested change has a
> chance to slow down things just to save a few hundred kilobytes of memory.
> But your suggested change is 100% *guaranteed* to slow things down, with no space savings.
Which change? To remove the optimization from #new? That wasn't a serious
suggestion, just a possible solution to your problem (which I don't
really consider to be a problem).
> This avoids that particular slowdown, while guaranteeing to save space! All for only a very teensy-weensy chance of other areas being minorly affected (and, easily fixed if they are!).
It's funny that you call it a slowdown. You may perceive it as one but
that's just a simple optimization applied to #new, which cannot
be directly applied to #new:.
> I don't think we would want to do that without measuring the effects of
> the change somehow.
> Performance measuring is already part of every system that cares, right? :)
Of course not. That's why "somehow" is there. I don't know how it could be
measured, but I think that without measurements, it's risky to change
something like that.
> None of those systems are depending on any particular default initial size for performance. If they are, they should be fixed.
Here is an example scenario:
I write my code. I use [OrderedCollection new]. I see that my code is fast.
You change the default capacity from 10 to 3. My code is now too slow. I
have to profile it to see why. It turns out that I store 7-9 elements most
of the time, and the capacity of 10 was a good fit, but 3 is not, because
it means growing twice (first to 6, then to 12), and my code ends up
being slower and using more memory than before.
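The arithmetic in this scenario can be checked with a small model. This is only a sketch in Python, not Squeak's OrderedCollection: it assumes a dynamic array that doubles its capacity whenever it is full, which matches the "first to 6, then to 12" growth described above. The function name simulate_appends is hypothetical.

```python
def simulate_appends(initial_capacity, n_elements):
    """Model a dynamic array that doubles its capacity when full.

    Assumption: growth is by doubling, as in the 3 -> 6 -> 12 example.
    Returns (final_capacity, number_of_grows, elements_copied).
    """
    capacity = initial_capacity
    size = 0
    grows = 0
    copied = 0
    for _ in range(n_elements):
        if size == capacity:
            copied += size                   # existing elements move to the new buffer
            capacity = max(1, capacity * 2)  # doubling; max() guards a zero capacity
            grows += 1
        size += 1
    return capacity, grows, copied


if __name__ == "__main__":
    for initial in (3, 10):
        for n in (7, 9):
            cap, grows, copied = simulate_appends(initial, n)
            print(f"initial={initial} n={n}: capacity={cap} grows={grows} copied={copied}")
```

Under that assumption, storing 7 to 9 elements from an initial capacity of 3 grows twice, copies 9 elements in total, and ends with 12 slots, while an initial capacity of 10 never grows and stays at 10 slots.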
> Same applies to OrderedCollections.
> Following your reasoning, the optimal initial capacity for any dynamic
> collection created with #new should be 0, but that doesn't feel right...
> Not 0, 3. It's "reasoning", not extremism. :)
> 3 would be a great default for OrderedCollection. Ultra-minimal, but with a purpose. I mean, really, with 10, we have 92% of all instances wasting space.
"Wasting space" is a must when you implement dynamic arrays and
hash tables. It's a space-time tradeoff.
> - Chris