[squeak-dev] The Inbox: Collections-cmm.874.mcz

Chris Muller ma.chris.m at gmail.com
Sat Jan 25 00:06:51 UTC 2020


Hi Jakob,

Optimal execution performance can *never* be assured with #new, since the
potential cost of under-allocation in various places is offset by the gains
of *not* *over*-allocating in other places.

Although it's impossible to tweak the default initial size in #new to
optimize for *performance*, we can at least optimize for *space*, and also
for *clarity-of-usage*.  #new, by nature, expresses a "lack of concern for
optimization", so Squeak should at least provide *compactness* in that
case.  Places in the code that need optimization will correctly express
that by writing #new: with a larger pre-allocation.
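
As a rough workspace sketch of that distinction (the element count 1000 is
only an illustrative assumption, and #capacity is assumed to answer the size
of the internal array), the two spellings could be compared like this:

```smalltalk
"Workspace sketch, not part of the commit. The count 1000 is illustrative."
| small large |
small := Dictionary new.         "no sizing hint: smallest default array"
large := Dictionary new: 1000.   "explicit hint: pre-allocate for about 1000 entries"
"Print-it on the next line to compare the two internal array sizes."
{ small capacity. large capacity }
```

The exact numbers depend on the image and on how #new: rounds the requested
size, so none are quoted here.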

It's true that system level changes of any kind could result in the need
for downstream changes but, in this case, it seems very unlikely.

Best,
  Chris


On Fri, Jan 24, 2020 at 4:51 PM Jakob Reschke <forums.jakob at resfarm.de>
wrote:

> Note that your statistics do not account for transient collections. If
> minimizing the initial capacity made most uses of these collections slower,
> it might not be justified to do so.
>
> On Fri, Jan 24, 2020 at 23:28, Chris Muller <asqueaker at gmail.com>
> wrote:
>
>> Wow, the number of oversized OrderedCollection instances is much worse,
>> 92%.
>>
>> ((OrderedCollection allInstances count: [ : e |
>>     e size < 10 and: [ e array size >= 10 ] ]) /
>>   OrderedCollection allInstances size) asFloat.
>>
>> On Fri, Jan 24, 2020 at 4:01 PM Chris Muller <asqueaker at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> In my trunk image, currently >10% of Dictionary instances have
>>> unnecessarily large internal arrays because they were created with #new but
>>> never grew.
>>>
>>>      ((Dictionary allInstances count: [ : e |
>>>          e size < 3 and: [ e array size >= 5 ] ]) /
>>>        Dictionary allInstances size) asFloat
>>>      "0.11282051282051282"
>>>
>>> A developer wanting compactness should not be required to probe into the
>>> internal implementation to know whether the most compact code,
>>> "Dictionary new" or "Set new", also gives the most compact object.  The
>>> most compact code should result in the most compact size by default, and
>>> #new: should be needed only when a *larger* initial size is desired, as an
>>> optimization.
>>>
>>> This also addresses the issue I've been discussing with Levente,
>>> ensuring #new: maintains its performance optimization along with space, as
>>> with OrderedCollections, etc.
>>>
>>> Best,
>>>   Chris
>>>
>>>
>>>
>>>
>>> On Fri, Jan 24, 2020 at 3:32 PM <commits at source.squeak.org> wrote:
>>>
>>>> Chris Muller uploaded a new version of Collections to project The Inbox:
>>>> http://source.squeak.org/inbox/Collections-cmm.874.mcz
>>>>
>>>> ==================== Summary ====================
>>>>
>>>> Name: Collections-cmm.874
>>>> Author: cmm
>>>> Time: 24 January 2020, 3:32:43.628249 pm
>>>> UUID: 107f3a74-177c-4338-9ad3-648e427419a1
>>>> Ancestors: Collections-ul.871
>>>>
>>>> - Optimize for system compactness by ensuring the default internal
>>>> array size of any HashedCollection is not initialized larger than it may
>>>> ever need to be.
>>>> - Let #new: be used to define larger sizes than the minimum, and
>>>> perform comparably with #new even if the minimum size is specified.
>>>>
>>>> =============== Diff against Collections-ul.871 ===============
>>>>
>>>> Item was changed:
>>>>   ----- Method: HashedCollection class>>new (in category 'instance
>>>> creation') -----
>>>>   new
>>>> +       ^ self basicNew initialize: 3!
>>>> -       ^ self basicNew initialize: 5!
>>>>
>>>>
>>>>
>>