[squeak-dev] The Inbox: Collections-cmm.874.mcz
asqueaker at gmail.com
Wed Jan 29 09:00:56 UTC 2020
My main reply is in the other one to you and Levente, but some quick
responses here for minor embellishment. :)
On Sat, Jan 25, 2020 at 3:21 AM Jakob Reschke <forums.jakob at resfarm.de> wrote:
> On Sat, Jan 25, 2020 at 06:55, Chris Muller <
> ma.chris.m at gmail.com> wrote:
>> Here is an example scenario:
>> I write my code. I use [OrderedCollection new]. I see that my code is
>>> You change the default capacity from 10 to 3. My code is now too slow. I
>>> have to profile it to see why. It turns out that I store 7-9 elements
>>> of the time, and the capacity of 10 was a good fit, but 3 is not,
>>> it means growing twice (first to 6, then to 12), and my code ends up
>>> being slower and using more memory than before.
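The arithmetic in the scenario above can be checked with a small simulation. This is only a sketch in Python, not Squeak code: simulate_adds is a hypothetical helper, and it assumes the backing array doubles whenever it is full, which is what the quoted "first to 6, then to 12" implies. Squeak's actual growth policy may differ.

```python
def simulate_adds(initial_capacity, n_elements):
    """Return (final_capacity, grow_count) after adding n_elements one at a time.

    Assumption (not taken from Squeak's sources): the backing array
    doubles in size whenever an add finds it full.
    """
    if initial_capacity < 1:
        raise ValueError("initial_capacity must be at least 1")
    capacity = initial_capacity
    grows = 0
    for size in range(n_elements):
        if size == capacity:  # array is full before this add
            capacity *= 2
            grows += 1
    return capacity, grows


# Compare the two defaults discussed in the thread for 7 and 9 elements.
for initial in (10, 3):
    for n in (7, 9):
        capacity, grows = simulate_adds(initial, n)
        print(f"initial={initial} n={n}: capacity={capacity} "
              f"grows={grows} unused={capacity - n}")
```

Under that doubling assumption, starting at 3 and storing 7 to 9 elements costs two grows and ends at capacity 12, while starting at 10 costs none and stays at 10.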
>> This example makes the case for this proposal, by showing that it was
>> *depending on knowing the private, internal initial size*, for its
>> performance. By having written #new instead of #new: in
>> performance-critical code, it was and still is less efficient than it could
>> be. No amount of "guessing" of an initial size will help execution
>> performance, but could at least guarantee space efficiency.
> If you optimize the default for space instead of sticking with a
> reasonable tradeoff,
"reasonable tradeoff" is what I'm trying to convince you is completely
> you might force people to use new: and think about the very implementation
> details of those collections to get back to reasonable results.
It's no different from what we have now. Thinking about the size wherever you
can is a good thing.
Your fear of changing #new is because of the fuzziness of its definition.
What you're calling "reasonable" is actually just "random". If it were
definitive (e.g., space-efficient), the impact of changing it would be, too.
> You might turn a piece of code into a bottleneck even though it was not
> considered performance-critical before.
Or it might rescue a suffering application because it's no longer paging
RAM out to disk... :)
> On the other hand, who else was bothered by too sparse hashed or ordered
> collections until now?
It's about designing the most-efficient system and the best API, not who
has been bothered yet.
> Is it a problem that bothers many, in comparison to the group which the
> change could bother?
What happens to that group when they move their code to another Smalltalk
which uses a different default?
> I suppose this is premature optimization. If people have identified
> compactness as a requirement,
When all else is equal, more compact is *always* better than less.
> they shall use #new: with (domain specific) expected numbers or patch #new
> for their application. But don't force it on everyone.
Patching #new and then using it because you patched it is a ridiculous
suggestion. That's what #new: is for. This is about Squeak, not any one
app. 10 is currently "forced" on everyone, and with 92% of
OrderedCollections in trunk over-allocated, a smaller choice might be
> You wouldn't like me to submit a "performance optimization" that changes
> the new default capacity to 100 because my application happened to deal
> with collections of that size frequently and because memory is comparably
> cheap and large nowadays, would you?
It's no less arbitrary than 10. Both guarantee nothing. At least 1, 2, or
3 guarantees space efficiency, and guarantees to make the API more
#new is fuzzy. The whole reason you're worried about uses of #new being
affected at all is because of that fuzziness. We should give it clarity,
make it definitively space-efficient...
The core library cannot *possibly* guess the shape of people's domains.
Our attempts to do so are causing more harm than good.
> We can only really know the impact if we have a benchmark or even an idea
> of realistic average collection usage. Maybe someone wrote a paper about
The beginning of a dev cycle is where such a change can be implemented,
leaving plenty of time for testing.
> Couldn't it be faster to use an OrderedCollection instead of a hashed one
> for such small numbers of elements? If the hash computation outweighs the
As mentioned, this is about Squeak system efficiency and API design, not
any one specific app.