[squeak-dev] The Trunk: Collections-cmm.1016.mcz

Chris Muller asqueaker at gmail.com
Wed Jul 13 22:33:11 UTC 2022


Hi Levente,

> > I tend to weigh toward a system defined tersely in terms of its own
> > messages, and letting performance emanate from the _design_, as opposed
> > to chasing an extra 5% of execution performance improvement at the
> > expense of expressivity of the code.  If the point of that 5% is to
> > "save time", it seems reasonable to consider the time of future readers
> > of the code.
> >
> > The method, #isAlphaNumeric, is a prime example.  Originally, its
> > implementation beautifully matched its definition.
> > _____
> >     isAlphaNumeric
> >         "Answer whether the receiver is a letter or a digit."
> >         ^self isLetter or: [self isDigit]
> > _____
> >
> > Compare that to now, a complex, copy-and-pasted "implementation" which
> > is a lot harder to understand and maintain, but only 10% faster in
> > execution.  IMO, that seems past the point of diminishing returns of
> > what a user of Smalltalk would expect.
>
> If #isAlphaNumeric is too complex, then so are #isLetter and #isDigit.
> Please revert those as well to the _simple_ implementation and redo your
> benchmark.
>

There's no reason to take my comments personally.  #isLetter looks to be
sufficiently uniquely defined -- there's very little that could be factored
with #isDigit that would be worthwhile.  But we can disagree about the
trade-off of copying those implementations up to #isAlphaNumeric instead of
reusing them.


> > Having said that, Squeak's speed is sweet, I can appreciate the desire
> > to hyper-optimize at the bytecode level.  Here's Marcel's benchmark with
> > the latest:
>
> Wasn't it you who considered the weak dictionaries too slow and had to
> use a different implementation?
>

It was the IdentityDictionaries, actually (and, yes, the Weak flavor too),
back when Squeak's 12-bit #identityHash resulted in far too many collisions.
That's an example of what I was saying about realizing a huge gain by
letting performance emanate from the _design_ rather than from
micro-optimizations of the code.


> > ___
> > ['Hello {1}!' format: { 'Squeak' }] bench.
> >
> >  '3,450,000 per second. 290 nanoseconds per run. 1.35946 % GC time.'  <--- new
> >  '3,820,000 per second. 262 nanoseconds per run. 4.22 % GC time.'  <--- old
> >
> > 3450.0/3820   0.9031413612565445
>
> What if there are multiple substitutions instead of just one? Is it still
> just 10 percent slower?
>

With the following 100-substitution case, the hit increases to about 25%.
___
| str values |
str := String streamContents: [:stream |
    1 to: 100 do: [:n |
        stream nextPutAll: 'Hello {'; nextPutAll: n asString; nextPutAll: '} Squeak!  ']].
values := (1 to: 100) collect: [:each | each asWords].
[str format: values] bench.

 '83,800 per second. 11.9 microseconds per run. 2.4 % GC time.'  "<-- old"
 '63,100 per second. 15.9 microseconds per run. 1.85926 % GC time.'  "<-- new"
___

Overall, I expected it to be a little bit slower, but in exchange for more
power.
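
For reference, the ratio for the 100-substitution run can be worked out in a
workspace from the two rates reported above.  This is only a sketch of the
arithmetic; the variable names are mine, and it rounds to whole percent to
sidestep float printing.

```smalltalk
"Rates copied from the bench output above, in runs per second."
| old new percentOfOld |
old := 83800.
new := 63100.
percentOfOld := (new / old * 100) rounded.   "new throughput as a whole percentage of old"
100 - percentOfOld.                          "slowdown in whole percent"
```

That puts the new code at roughly three quarters of the old throughput for
this case, which is where the 25% figure comes from.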

> > Looks like about a 10% hit for this example.  Maybe it could be improved,
> > but I doubt by very much.  Unfortunately using #basicAt: isn't convenient
> > when alphanumeric tokens are possible.
>
> Why limit the tokens to alphanumeric ones? Why am I not allowed to write
> the following?
>
>         '{foo_bar}' format: ({ 'foo_bar' -> 1 })
>

To maximize speed and simplicity.  Underscore is a great idea, but checking
for it costs another few percent in speed.  I think it's worth it; how about
you?
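
To make the trade-off concrete, here is a minimal sketch of the per-character
test with and without underscore.  The two block names are hypothetical and
this is not the code in the commit; it only shows the one extra comparison
that allowing underscore would add to each character check.

```smalltalk
"Hypothetical per-character tests; not the implementation in the commit."
| isTokenChar isTokenCharWithUnderscore |
isTokenChar := [:c | c isAlphaNumeric].
isTokenCharWithUnderscore := [:c | c isAlphaNumeric or: [c = $_]].
'foo_bar' allSatisfy: isTokenChar.                 "false, underscore is rejected"
'foo_bar' allSatisfy: isTokenCharWithUnderscore.   "true"
```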


> Also, why do I get an error when I try this?
>
>         '{0x1}{0x2}' format: ({ '0x1' -> 1. '0x2' -> 2 } as: Dictionary)
>

As with Symbols in Squeak and identifiers in many languages, alphanumeric
tokens must begin with an alphabetic character.  Starting with a numeral
causes the optimized code to assume you're using numeric tokens.
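
The rule can be sketched roughly as follows.  This is an illustration only,
assuming a non-empty token; the block name and the three result symbols are
made up for the example and do not appear in the actual implementation.

```smalltalk
"Hypothetical sketch of the rule described above; not the code in the commit."
| tokenKind |
tokenKind := [:token |
	token first isDigit
		ifTrue: [#numeric]
		ifFalse: [
			(token first isLetter and: [token allSatisfy: [:c | c isAlphaNumeric]])
				ifTrue: [#named]
				ifFalse: [#unsupported]]].
tokenKind value: '1'.        "#numeric"
tokenKind value: 'name'.     "#named"
tokenKind value: '0x1'.      "#numeric, so it is never looked up as the key '0x1'"
tokenKind value: 'foo_bar'.  "#unsupported under the alphanumeric-only rule"
```

Under this reading, '{0x1}' is treated as a numeric token because of its
leading digit, which is why it fails instead of being looked up by name.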

Please don't take offense at my coding priorities.  I certainly appreciate
yours.  If this feature isn't worth the performance hit, let me know.  I
don't think the hit will ever be noticeable, but I'm happy to revert the
change if it is (even happier if you're able to work your magic to make it
even faster).

Regards,
  Chris