[squeak-dev] The Trunk: Collections-cmm.1016.mcz
Levente Uzonyi
leves at caesar.elte.hu
Wed Jul 13 21:00:55 UTC 2022
Hi Chris,
On Wed, 13 Jul 2022, Chris Muller wrote:
> Hi Christoph,
>
> Thanks for the review and excellent suggestions. Please see Collections-cmm.1019, and let me know if you see anything else.
>
> Regarding the implementation: Did you run any benchmarks and how massive is the slowdown? The previous implementation used #basicAt: to avoid comparing characters (which is not fast on all platforms) and sending messages to them (in favor of inlining). I'm curious whether this could be avoided in the new implementation as well and how much performance could be won with that. #numberOrValue also looks very slow. Maybe it would be worth providing two alternative branches depending on the type of the collection argument?
>
> I tend to weigh toward a system defined tersely in terms of its own messages, and letting performance emanate from the _design_, as opposed to chasing an extra 5% of execution performance improvement at the expense of expressivity of the code. If the point of that 5% is to "save time", it seems reasonable to consider the time of future readers of the code.
>
> The method, #isAlphaNumeric, is a prime example. Originally, its implementation beautifully matched its definition.
> _____
> isAlphaNumeric
> "Answer whether the receiver is a letter or a digit."
> ^self isLetter or: [self isDigit]
> _____
>
> Compare that to now, a complex, copy-and-pasted "implementation" which is a lot harder to understand and maintain, but only 10% faster in execution. IMO, that seems past the point of diminishing returns of what a user of Smalltalk would expect.
If #isAlphaNumeric is too complex, then so are #isLetter and #isDigit.
Please revert those as well to the _simple_ implementation and redo your
benchmark.
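For redoing that comparison, a minimal sketch of a micro-benchmark could look like the following. It assumes the Latin-1 range is a representative input, the variable name is only illustrative, and no timings are implied:

```smalltalk
| chars |
"Illustrative input: all 256 Latin-1 characters (an assumption, not from the thread)."
chars := (0 to: 255) collect: [:each | Character value: each].

"Run once per variant of #isLetter / #isDigit / #isAlphaNumeric and compare the reports."
[chars do: [:each | each isAlphaNumeric]] bench.
```

Running the same expression before and after reverting the three methods would show how much of the difference comes from each of them.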
>
> Having said that, Squeak's speed is sweet, I can appreciate the desire to hyper-optimize at the bytecode level. Here's Marcel's benchmark with the latest:
Wasn't it you who considered the weak dictionaries too slow and had to
use a different implementation?
> ___
> ['Hello {1}!' format: { 'Squeak' }] bench.
>
> '3,450,000 per second. 290 nanoseconds per run. 1.35946 % GC time.' <--- new
> '3,820,000 per second. 262 nanoseconds per run. 4.22 % GC time.' <--- old
>
> 3450.0/3820 0.9031413612565445
What if there are multiple substitutions instead of just one? Is it still
just 10 percent slower?
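Something along these lines is what I have in mind. It is only a sketch: the template strings and arguments are made up for illustration, and it assumes the numbered-token syntax used in the benchmark above:

```smalltalk
"One substitution, as in the benchmark quoted above."
['Hello {1}!' format: { 'Squeak' }] bench.

"Several substitutions in a single template (illustrative strings)."
['{1} {2} {3} {4} {5}' format: { 'a'. 'b'. 'c'. 'd'. 'e' }] bench.
```

Comparing old and new for both expressions would show whether the slowdown stays constant or grows with the number of tokens.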
> ___
>
> Looks like about a 10% hit for this example. Maybe it could be improved, but I doubt by very much. Unfortunately using #basicAt: isn't convenient when alphanumeric tokens are possible.
Why limit the tokens to alphanumeric ones? Why am I not allowed to write
the following?
'{foo_bar}' format: ({ 'foo_bar' -> 1 })
Also, why do I get an error when I try this?
'{0x1}{0x2}' format: ({ '0x1' -> 1. '0x2' -> 2 } as: Dictionary)
Levente
>
> If this is still too much, let me know, I'll take it back to the original numerals-only version.
>
> Best,
> Chris
>
>