[squeak-dev] The Inbox: Collections-cmm.874.mcz

Jakob Reschke forums.jakob at resfarm.de
Sat Jan 25 09:21:00 UTC 2020


On Sat, 25 Jan 2020 at 06:55, Chris Muller <ma.chris.m at gmail.com> wrote:

>> Here is an example scenario:
>>
>> I write my code. I use [OrderedCollection new]. I see that my code is fast
>> enough.
>> You change the default capacity from 10 to 3. My code is now too slow. I
>> have to profile it to see why. It turns out that I store 7-9 elements most
>> of the time, and the capacity of 10 was a good fit, but 3 is not, because
>> it means growing twice (first to 6, then to 12), and my code ends up
>> being slower and using more memory than before.
>>
>
> This example makes the case for this proposal, by showing that it was
> *depending on knowing the private, internal initial size*, for its
> performance.  By having written #new instead of #new: in
> performance-critical code, it was and still is less efficient than it could
> be.  No amount of "guessing" of an initial size will help execution
> performance, but could at least guarantee space efficiency.
>

If you optimize the default for space instead of sticking with a reasonable
tradeoff, you might force people to use new: and think about the very
implementation details of those collections to get back to reasonable
results. You might turn a piece of code into a bottleneck even though it
was not considered performance-critical before. So you may break (or brake)
existing applications in a non-obvious manner and cause maintenance costs.
And free time is also precious...
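
To make the arithmetic in the quoted scenario concrete, here is a small
sketch in Python (not Squeak code). It assumes a collection that doubles its
capacity whenever it runs full, as the scenario describes; growth_steps is a
made-up helper, and the real growth policy of OrderedCollection may differ
in detail.

```python
def growth_steps(initial_capacity, element_count):
    """Return the capacities a doubling collection passes through while
    element_count elements are appended one at a time.

    Assumption: capacity doubles each time an append finds the storage full.
    """
    if initial_capacity < 1:
        raise ValueError("initial_capacity must be at least 1")
    capacities = [initial_capacity]
    capacity = initial_capacity
    for size in range(1, element_count + 1):
        if size > capacity:
            # Storage is full: grow (and, in a real collection, copy).
            capacity *= 2
            capacities.append(capacity)
    return capacities


if __name__ == "__main__":
    for initial in (10, 3):
        for count in (7, 9):
            steps = growth_steps(initial, count)
            print(
                f"initial={initial} elements={count} "
                f"capacities={steps} regrowths={len(steps) - 1} "
                f"unused={steps[-1] - count}"
            )
```

Under that assumption, starting at 3 and storing 7 to 9 elements means two
regrowths and a final capacity of 12, while starting at 10 means none.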

On the other hand, who else has been bothered so far by hashed or ordered
collections being too sparse? Is it a problem that bothers many people,
compared to the group that the change could bother?

I suppose this is premature optimization. If people have identified
compactness as a requirement, they should use #new: with (domain-specific)
expected numbers or patch #new for their application. But don't force it on
everyone. You wouldn't like me to submit a "performance optimization" that
changes the default capacity of #new to 100 because my application happened
to deal with collections of that size frequently and because memory is
comparatively cheap and plentiful nowadays, would you?

We can only really know the impact if we have a benchmark, or at least an
idea of what realistic average collection usage looks like. Maybe someone
wrote a paper about that...