[squeak-dev] The Inbox: Collections-cmm.874.mcz

Levente Uzonyi leves at caesar.elte.hu
Sat Jan 25 00:19:19 UTC 2020


Hi Chris,

On Fri, 24 Jan 2020, Chris Muller wrote:

> Hi Jakob,
> 
> Optimal execution performance can never be assured with #new, since the potential cost of under-allocation in various places is offset by the gains of not over-allocating in other places. 
> 
> Although it's impossible to tweak the default initial size in #new to optimize for performance, we can at least optimize for space, and also for clarity-of-usage.  #new, by nature, expresses a "lack of concern for optimization", so Squeak should at least provide compactness in that case.  Places in the code that need optimization will correctly express that by writing #new: with a larger pre-allocation.
> 
> It's true that system level changes of any kind could result in the need for downstream changes but, in this case, it seems very unlikely.

As I understand it, Jakob is saying that your suggested change could
slow things down just to save a few hundred kilobytes of memory.
I don't think we should do that without somehow measuring the effects
of the change.

The same applies to OrderedCollections.
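
A rough sketch of what such a measurement could look like for the
HashedCollection case. It assumes the #initialize: selector shown in the
diff below and Squeak's Time class>>millisecondsToRun:; the iteration
count and the entry counts are arbitrary values chosen for illustration.
It only times creating and filling short-lived dictionaries, so it says
nothing about garbage collection or image-wide effects.

```smalltalk
"Sketch only: iteration count and entry counts are arbitrary illustration values."
| timeFor results |
timeFor := [:capacity :entries |
	"Milliseconds to create 100000 dictionaries with the given initial
	 array size and add the given number of entries to each."
	Time millisecondsToRun: [
		1 to: 100000 do: [:i |
			| d |
			d := Dictionary basicNew initialize: capacity.
			1 to: entries do: [:k | d at: k put: k]]]].
"For each entry count, pair the timing for array size 3 with that for size 5."
results := #(0 1 2 3 4 8) collect: [:entries |
	entries -> {timeFor value: 3 value: entries. timeFor value: 5 value: entries}].
results
```

Evaluating this in a workspace should yield one association per entry
count, each holding the two timings, so the cost of any extra growth
with the smaller array can be compared directly.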

Following your reasoning, the optimal initial capacity for any dynamic 
collection created with #new should be 0, but that doesn't feel right...


Levente

> 
> Best,
>   Chris
> 
> 
> On Fri, Jan 24, 2020 at 4:51 PM Jakob Reschke <forums.jakob at resfarm.de> wrote:
>       Note that your statistics do not account for transient collections. If minimizing the initial capacity made most uses of these collections slower, it might not be justified to do so.
> 
> On Fri, 24 Jan 2020 at 23:28, Chris Muller <asqueaker at gmail.com> wrote:
>       Wow, the number of oversized OrderedCollection instances is much worse, 92%.
> ((OrderedCollection allInstances count: [ : e | e size < 10 and: [ e array size >= 10 ] ]) /
> OrderedCollection allInstances size) asFloat.
> 
> On Fri, Jan 24, 2020 at 4:01 PM Chris Muller <asqueaker at gmail.com> wrote:
>       Hi all,
> In my trunk image, currently >10% of Dictionary instances have unnecessarily large internal arrays because they were created with #new but never grew.  
>      ((Dictionary allInstances count: [ : e | e size < 3 and: [ e array size >= 5 ] ]) / Dictionary allInstances size) asFloat      "0.11282051282051282"
> 
> A developer wanting compactness should not be required to probe into the internal implementation to know whether they can write the most compact code, "Dictionary new" and "Set new".  The most compact code should result in the most compact size by default, and #new: should be needed only when a *larger* initial size is desired as an optimization.
> 
> This also addresses the issue I've been discussing with Levente, ensuring #new: maintains its performance optimization along with space, as with OrderedCollections, etc.
> 
> Best,
>   Chris
> 
> 
> 
> 
> On Fri, Jan 24, 2020 at 3:32 PM <commits at source.squeak.org> wrote:
>       Chris Muller uploaded a new version of Collections to project The Inbox:
>       http://source.squeak.org/inbox/Collections-cmm.874.mcz
>
>       ==================== Summary ====================
>
>       Name: Collections-cmm.874
>       Author: cmm
>       Time: 24 January 2020, 3:32:43.628249 pm
>       UUID: 107f3a74-177c-4338-9ad3-648e427419a1
>       Ancestors: Collections-ul.871
>
>       - Optimize for system compactness by ensuring the default internal array size of any HashedCollection is not initialized larger than it may ever need to be.
>       - Let #new: be used to define larger sizes than the minimum, and perform comparably with #new even if the minimum size is specified.
>
>       =============== Diff against Collections-ul.871 ===============
>
>       Item was changed:
>         ----- Method: HashedCollection class>>new (in category 'instance creation') -----
>         new
>       +       ^ self basicNew initialize: 3!
>       -       ^ self basicNew initialize: 5!
> 
> 
> 
> 
>

