[squeak-dev] FormInspector, or also: Text>>#= and its consequences

Eliot Miranda eliot.miranda at gmail.com
Tue Sep 15 19:18:11 UTC 2020


Hi Marcel,  Hi Levente, Hi Christoph, Hi All,

On Tue, Sep 15, 2020 at 7:42 AM Marcel Taeumel <marcel.taeumel at hpi.de>
wrote:

> Hi Christoph.
>
> Performance. Change it, bench it, post the results here. :-) Please
> specify your machine and try it on a slow RaspPi, too.
>

I think it's a historical hold-over.  Here's the same method in
Smalltalk-80 v2:

!Text methodsFor: 'comparing'!
= anotherText
    ^string = anotherText string! !

Changing it to read
    other isText ifTrue: [^string = other string and: [runs = other runs]].
    other isString ifTrue: [^string = other].
    ^false

is not going to affect performance noticeably (runs are typically shorter
than the strings, and Array comparison isn't particularly slow).  However,
it corresponds much more closely to my intuitive understanding of Texts.  If I
wanted to see if two texts had the same characters I would use aText string
= bText string.  I see Levente's comment, but I think he's just commenting on
the anomaly inherited from Smalltalk-80, not saying "it must be this way".
Am I right, Levente?
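
A minimal workspace sketch of the distinction, assuming a stock Squeak image. It compares the characters and the runs separately, so it does not depend on which version of Text>>#= is installed. The variable names are only illustrative; the two texts are the 'foo' example from Christoph's message.

```smalltalk
"Compare two texts that differ only in emphasis."
| plain bold sameCharacters sameRuns |
plain := 'foo' asText.
bold := 'foo' asText allBold.
"Character-level comparison, which is all the Smalltalk-80 method looks at."
sameCharacters := plain string = bold string.
"Emphasis-level comparison, which is what the proposed method adds."
sameRuns := plain runs = bold runs.
{ sameCharacters. sameRuns. sameCharacters and: [sameRuns] }
```

The first element should be true and the second false, so the last element, which mirrors the proposed isText branch, should be false as well.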


So how bad is the performance?  I chose some texts (they happen to be in
the help browser, and as such they are representative large texts,
which I think is what we're worried about for performance) via

Text allInstances select: [:t| t size > 5000 and: [t runs runs size > (t size / 200)]]

(Why text runs runs size?  Because text runs size = text size, whereas text
runs runs answers the array holding the length of each emphasis run, so its
size is the number of runs.)

Then I benchmarked the comparison via

"Using the existing method compare strings."
| copy |
copy := self first copy.
[self first = copy] bench
'186,000 per second. 5.39 microseconds per run. 0 % GC time.'

"Estimate the additional cost of comparing runs in a typical text"
| copy |
copy := self first copy.
[self first string = copy string and: [self first runs = copy runs]] bench
'154,000 per second. 6.48 microseconds per run. 0 % GC time.'

"Estimate the additional cost when there is some difference in emphasis"
| copy |
copy := self first copy.
copy addAttribute: (TextColor color: Color red) from: copy size // 2 to: copy size.
[self first string = copy string and: [self first runs = copy runs]] bench
'187,000 per second. 5.36 microseconds per run. 0 % GC time.'


What the second one shows is that including testing for runs worsens
performance by about 20%.  For me that's acceptable.
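
The 20% figure can be reproduced from the per-run times quoted in the first two benchmarks. A small sketch of the arithmetic (the variable names are made up for illustration):

```smalltalk
"Relative overhead of benchmark two over benchmark one, using the per-run times in microseconds."
| stringsOnly withRuns overheadPercent |
stringsOnly := 5.39.
withRuns := 6.48.
overheadPercent := (withRuns - stringsOnly) / stringsOnly * 100.
overheadPercent
```

This comes out at a little over 20.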

What the third one shows is that if emphases do in fact differ the overhead
is far less, because in the runs comparison there is a size comparison, and
that fails without bothering to compare all the elements.


And of course the additional cost of comparing runs depends on how complex
typical runs are.  Here's a histogram:

| texts |
texts := Text allInstances select: [:t| t size > 0].
(10 to: 100 by: 10) collect:
    [:percentage|
    { percentage.
      (texts select: [:t| | ratio |
          ratio := t runs runs size / t size * 100.
          (ratio between: percentage - 10 and: percentage) and: [ratio ~= (percentage - 10)]]) size * 100.0 / texts size roundTo: 0.01} ]
#(#(10 67.62) #(20 15.48) #(30 6.58) #(40 2.49) #(50 2.49) #(60 0.36) #(70 0.18) #(80 0.0) #(90 0.0) #(100 4.8))

So most texts have very few emphases (typically one ;-).  Only 5.3% of
texts have a runs array longer than half the size of the text.  So in most
cases the slowdown from adding the runs comparison to the mix will be less
than the 20% overhead above.  The worst case is represented by benchmark two
above, a large text compared against an identical copy.
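
The 5.3% figure is the sum of the histogram buckets above 50. A short sketch that recomputes it from the result array printed above (the variable names are illustrative):

```smalltalk
"Sum the share of texts whose run count exceeds half the text size."
| histogram aboveHalf |
histogram := #(#(10 67.62) #(20 15.48) #(30 6.58) #(40 2.49) #(50 2.49)
	#(60 0.36) #(70 0.18) #(80 0.0) #(90 0.0) #(100 4.8)).
aboveHalf := (histogram select: [:pair | pair first > 50])
	inject: 0 into: [:sum :pair | sum + pair last].
aboveHalf
```

The buckets 60 through 100 add up to roughly 5.34, which rounds to the 5.3% quoted.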

> Best,
> Marcel
>
> Am 10.09.2020 20:32:34 schrieb Thiede, Christoph <
> christoph.thiede at student.hpi.uni-potsdam.de>:
>
> Hi all,
>
>
> is there any old thread about the design discussion of how Text>>#= works?
> (It does not consider attributes for equality.) Has this decision ever been
> questioned?
>
>
> Naively and without an overview of any existing components that could rely
> on this implementation, I would like to question it.
>
> Why should *'foo' asText allBold* be equal to *'foo' asText addAttribute:
> TextURL new*? With the same logic, we could also say that two
> dictionaries are equal iff they have got the same keys ...
>
>
> There is even a concrete client in the Trunk suffering from this design
> decision: Marcel's new FormInspector (and analogously, MorphInspector). It
> uses TextFontReference with a FormSetFont to display a screenshot right in
> the inspector pane. Unfortunately, the pane is never updated automatically
> because even if the screenshot changes, the text morph thinks the old text
> would equal the new one. I'd like to fix that without hacking any
> workaround into the inspectors.
> Even though this inspector implementation is a bit unusual, in my opinion,
> it shows that the current Text >> #= implementation might not be a perfect
> solution.
>
> I'm looking forward to your opinions.
>
> Best,
> Christoph
>
>
>

-- 
_,,,^..^,,,_
best, Eliot