Hi all,
is there any old thread about the design discussion of how Text>>#= works? (It does not consider attributes for equality.) Has this decision ever been questioned?
Naively and without an overview of any existing components that could rely on this implementation, I would like to question it.
Why should 'foo' asText allBold be equal to 'foo' asText addAttribute: TextURL new? With the same logic, we could also say that two dictionaries are equal iff they have got the same keys ...
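The behavior in question can be tried in a workspace. This is a minimal sketch that only uses the expressions from the paragraph above; the temporary names bold and link are made up for illustration, and the result noted in the comment is what this thread reports for the current implementation.

```smalltalk
| bold link |
bold := 'foo' asText allBold.
link := 'foo' asText.
link addAttribute: TextURL new.

"The current Text >> #= compares characters only, so this is reported
 to answer true even though the attributes differ."
bold = link.
```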
There is even a concrete client in the Trunk suffering from this design decision: Marcel's new FormInspector (and analogously, MorphInspector). It uses TextFontReference with a FormSetFont to display a screenshot right in the inspector pane. Unfortunately, the pane is never updated automatically because even if the screenshot changes, the text morph thinks the old text equals the new one. I'd like to fix that without hacking any workaround into the inspectors. Even though this inspector implementation is a bit unusual, in my opinion, it shows that the current Text >> #= implementation might not be a perfect solution.
I'm looking forward to your opinions.
Best, Christoph
Hi Christoph.
Performance. Change it, bench it, post the results here. :-) Please specify your machine and try it on a slow RaspPi, too.
Best, Marcel
Hi Marcel, Hi Levente, Hi Christoph, Hi All,
On Tue, Sep 15, 2020 at 7:42 AM Marcel Taeumel marcel.taeumel@hpi.de wrote:
Hi Christoph.
Performance. Change it, bench it, post the results here. :-) Please specify your machine and try it on a slow RaspPi, too.
I think it's a historical hold-over. Here's the same method in Smalltalk-80 v2:
    !Text methodsFor: 'comparing'!
    = anotherText
        ^string = anotherText string! !

Changing it to read

    other isText ifTrue: [^string = other string and: [runs = other runs]].
    other isString ifTrue: [^string = other].
    ^false

is not going to affect performance noticeably (runs are typically shorter than the strings and Array comparison isn't particularly slow). However, it corresponds much more closely to my intuitive understanding of Texts. If I wanted to see if two texts had the same characters I would use aText string = bText string. I see Levente's comment, but I think he's just commenting on the anomaly inherited from Smalltalk-80, not saying "it must be this way". Am I right, Levente?
So how bad is the performance? I chose some texts (they happen to be in the help browser, and as such they represent representative large texts, which I think is what we're worried about for performance) via
Text allInstances select: [:t| t size > 5000 and: [t runs runs size > (t size / 200)]]
(Why text runs runs size? Because text runs size = text size, whereas text runs runs answers the array holding the length of each emphasis run, so its size is the number of runs.)
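To make the difference between the two sizes concrete, here is a small workspace sketch. The sample string and the temporary name text are assumptions chosen for illustration; only selectors that already appear in this thread are used, plus TextEmphasis bold.

```smalltalk
| text |
text := 'hello world' asText.
"Make only the first word bold, so the text has more than one emphasis run."
text addAttribute: TextEmphasis bold from: 1 to: 5.

"Answers the number of characters covered by the runs, same as text size."
text runs size.

"Answers the number of emphasis runs, which is the figure used in the selection above."
text runs runs size.
```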
Then I benchmarked the comparison via

    "Using the existing method: compare strings."
    | copy |
    copy := self first copy.
    [self first = copy] bench
    '186,000 per second. 5.39 microseconds per run. 0 % GC time.'

    "Estimate the additional cost of comparing runs in a typical text"
    | copy |
    copy := self first copy.
    [self first string = copy string and: [self first runs = copy runs]] bench
    '154,000 per second. 6.48 microseconds per run. 0 % GC time.'

    "Estimate the additional cost when there is some difference in emphasis"
    | copy |
    copy := self first copy.
    copy addAttribute: (TextColor color: Color red) from: copy size // 2 to: copy size.
    [self first string = copy string and: [self first runs = copy runs]] bench
    '187,000 per second. 5.36 microseconds per run. 0 % GC time.'
What the second one shows is that including testing for runs worsens performance by about 20%. For me that's acceptable.
What the third one shows is that if emphases do in fact differ the overhead is far less, because in the runs comparison there is a size comparison, and that fails without bothering to compare all the elements.
And of course the additional cost of comparing runs depends on how complex typical runs are. Here's a histogram:
    | texts |
    texts := Text allInstances select: [:t| t size > 0].
    (10 to: 100 by: 10) collect:
        [:percentage|
        { percentage.
          (texts select:
              [:t| | ratio |
              ratio := t runs runs size / t size * 100.
              (ratio between: percentage - 10 and: percentage)
                  and: [ratio ~= (percentage - 10)]]) size * 100.0 / texts size roundTo: 0.01 }]

    #(#(10 67.62) #(20 15.48) #(30 6.58) #(40 2.49) #(50 2.49) #(60 0.36) #(70 0.18) #(80 0.0) #(90 0.0) #(100 4.8))
So most texts have very few emphases (typically one ;-). Only 5.3% of texts have a run count greater than half the size of the text. So in most cases the slowdown from adding the runs comparison to the mix will be less than the 20% overhead above. The worst case is represented by benchmark two above, a large text compared against an identical copy.
Interesting. I would not have assumed that this would be only about performance; it sounded like a more profound design decision to me.
If no one sees a problem in changing this behavior, I can try my luck. :-)
Best,
Christoph
Hi Christoph,
On Wed, 16 Sep 2020, Thiede, Christoph wrote:
Interesting, I would not have assumed that this would be only about performance, sounded like a more profound design decision to me.
If you have a look at the comment of Text >> #=, you'll find that it's a design decision (no reasoning though):
    = other
        "Am I equal to the other Text or String?
        ***** Warning ***** Two Texts are considered equal if they have the same characters in them. They might have completely different emphasis, fonts, sizes, text actions, or embedded morphs. If you need to find out if one is a true copy of the other, you must do (text1 = text2 and: [text1 runs = text2 runs])."

Though equality with Strings is not symmetric:

    'foo' asText = 'foo'. "==> true"
    'foo' = 'foo' asText. "==> false"

I don't know what relies on Text-String equality, but probably many things assume that Texts and Strings are somewhat interchangeable (remember when you changed SHParserST80 >> #initializeVariablesFromContext, and Shout ran into errors because it expected source to be a String but got a Text?)

You can't keep equality with Strings if you change #= because you'll lose transitivity:

    'foo' asText allBold = 'foo'. "==> true"
    'foo' asText = 'foo'. "==> true"
    'foo' asText allBold = 'foo' asText "==> false"
Should you decide to change #=, remember to change #hash as well, and rehash all hashed collections with text keys.
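What a matching #hash could look like is sketched below. This is a hypothetical method body, not code from Trunk; it assumes the instance variables string and runs used in the #= variants quoted in this thread, and it assumes that the class-side HashedCollection rehashAll is available in the image for the rehashing step.

```smalltalk
"Hypothetical Text >> #hash that takes the runs into account,
 to go with a #= that also compares runs."
hash
    ^ string hash bitXor: runs hash
```

```smalltalk
"Workspace expression: rehash existing hashed collections after the change
 (assumes HashedCollection class >> #rehashAll exists in the image)."
HashedCollection rehashAll.
```

Such a hash would only stay consistent with #= if a Text were no longer equal to a plain String: an attributed Text and the String with the same characters would compare equal but hash differently, which is another face of the transitivity problem above.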
Levente
Hi Levente,
hm, I think #= should be always commutative and transitive, everything else is at least confusing ...
Can't we move that "attribute invariant" comparison rather to something like Text >> #sameAs:?
Best,
Christoph
Hi Christoph --
At this point, let's not fiddle with Text >> #=.
I think that comparing sets of attributes with each other can be as tricky as comparing morphs, unless you restrict yourself to very simple emphasis (bold/italic/...).
Identity-vs-state will bite you for "non-literal" attributes:
- PluggableTextAttribute (i.e., compiled code, bindings, complex objects, ...)
- TextFontReference (i.e., various font properties, form-set fonts, pixel comparison?, ...)
- TextAnchor (i.e., morphs ...)
Thus, the very mechanism of text properties is so flexible that implementing a useful comparison should be done on a case-by-case basis. For example, compare #runs if necessary and sufficient, as suggested in the comment in #=.
Just comparing the identity of text attributes is not worth breaking backwards compatibility.
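A client that needs the stricter check can apply it locally, along the lines of the method comment of #=. The following is a minimal workspace sketch; the temporary names old, new and isTrueCopy are made up for illustration.

```smalltalk
| old new isTrueCopy |
old := 'foo' asText allBold.
new := 'foo' asText.

"Characters first, then the runs, as the comment in Text >> #= suggests."
isTrueCopy := old = new and: [old runs = new runs].
isTrueCopy.
```

Whether the runs comparison is sufficient still depends on how the attributes involved implement their own #=, which is the identity-vs-state concern listed above.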
Best, Marcel
On Jan 11, 2023, at 5:37 AM, Taeumel, Marcel Marcel.Taeumel@hpi.de wrote:
Hi Christoph --
At this point, let's not fiddle with Text >> #=.
+1
I think that comparing sets of attributes with each other can be as tricky as comparing morphs, unless you restrict yourself to very simple emphasis (bold/italic/...).
Identity-vs-state will bite you for "non-literal" attributes:
- PluggableTextAttribute (i.e., compiled code, bindings, complex objects, ...)
- TextFontReference (i.e., various font properties, form-set fonts, pixel comparison?, ...)
- TextAnchor (i.e., morphs ...)
Thus, the very mechanism of text properties is so flexible that implementing a useful comparison should be done on a case-by-case basis. For example, compare #runs if necessary and sufficient, as suggested in the comment in #=.
Just comparing the identity of text attributes is not worth breaking backwards compatibility.
+1
I faced a similar issue when defining CompiledMethod >> #=. CompiledMethods hold onto their class via the method class association. So if the same exact sequence of bytecodes and literals occurs in two methods but on different classes, comparing the method class associations will show them as different. But this kind of code duplication is often exactly what we want to identify. So CompiledMethod >> #= treats the method class associations comparison carefully; each method should have one or neither should.
As a thought experiment let’s imagine one wanted to compare two Smalltalk images to find out where they differ. One could progress through the Smalltalk dictionary in alphabetic order and compare classes and globals. But classes refer to subclasses and superclasses, and methods, so the transitive closure from any class will include the Smalltalk dictionary. That’s not a serious issue; we can add a visited set to avoid looping in the comparison. But if equality does look deep into the structure, and not just at a surface string (eg of methods and inst var names & class vars) then *any* difference, say in the value of a global in Smalltalk (release id?) will mark any class as different, and that’s not very useful. Instead we probably want to compare classes “skin deep” (do they have the same superclass name, inst var names, class vars, methods and organization) and then we have a chance of finding the much more specific difference.
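A "skin deep" class comparison of this kind might be sketched as follows. The block name skinDeepEqual is hypothetical, and the sketch leaves out the method sources and the organization, which a real comparison would also need to cover.

```smalltalk
| skinDeepEqual |
"Compare only surface properties: names, not the objects they refer to."
skinDeepEqual := [:a :b |
    a superclass name = b superclass name
        and: [a instVarNames = b instVarNames
        and: [a classVarNames asSet = b classVarNames asSet
        and: [a selectors asSet = b selectors asSet]]]].

"Within a single image this is only a smoke test: a class compared with itself."
skinDeepEqual value: Text value: Text.
```

Because only names are compared, a change to some unrelated global does not make every class look different.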
I know the text comparison issue is a little different. But the current definition is useful (after all, the visual representation of a text is affected by the font, and that’s not defined by the text itself (in our model)). And as Marcel says, one can always define a different one for special cases. And we have lots of precedent for that (isSameSequenceAs: et al).
Equality is tricky. A given definition must be useful “in a general context” (hand waving I know), and must match #hash, and must be reasonably efficient to be usable. All that implies that for non-trivial implementations a good comment should be written :-)
Happy ‘23!
Tangential to comparison are also the Morphic copy methods. What do we mean when we copy a complex object?
Best, Karl
On Wed, Jan 11, 2023 at 5:40 PM Eliot Miranda eliot.miranda@gmail.com wrote:
On Jan 11, 2023, at 5:37 AM, Taeumel, Marcel Marcel.Taeumel@hpi.de wrote:
Hi Christoph --
At this point, let's not fiddle with Text >> #=.
+1
I think that comparing sets of attributes with each other can be as tricky as comparing morphs, unless you restrict yourself to very simple emphasis (bold/italic/...).
Identity-vs-state will bite you for "non-literal" attributes:
- PluggableTextAttribute (i.e., compiled code, bindings, complex objects,
...)
- TextFontReference (i.e., various font properties, form-set fonts, pixel
comparison?, ...)
- TextAnchor (i.e., morphs ...)
Thus, the very mechanism of text properties is so flexible that implementing a useful comparison should be done on a case-by-case basis. For example, compare #runs if necessary and sufficient, as suggested in the comment in #=.
Just comparing the identity of text attributes is not worth breaking backwards compatibility.
+1
I faced a similar issue when defining CompiledMethod >> #=. CompiledMethods hold onto their class via the method class association. So if the same exact sequence of bytecodes and literals occurs in two methods but on different classes, comparing the method class associations will show them as different. But this kind of code duplication is often exactly what we want to identify. So CompiledMethod >> #= treats the method class associations comparison carefully; each method should have one or neither should.
As a thought experiment let’s imagine one wanted to compare two Smalltalk images to find out where they differ. One could progress through the Smalltalk dictionary in alphabetic order and compare classes and globals. But classes refer to subclasses and superclasses, and methods, so the transitive closure from any class will include the Smalltalk dictionary. That’s not a serious issue; we can add a visited set to avoid looping in the comparison. But if equality does look deep into the structure, and not just at a surface string (eg of methods and inst var names & class vars) then *any* difference, say in the value of a global in Smalltalk (release id?) will mark any class as different, and that’s not very useful. Instead we probably want to compare classes “skin deep” (do they have the same superclass name, inst var names, class vars, methods and organization) and then we have a chance of finding the much more specific difference.
I know the text comparison issue is a little different. But the current definition is useful (after all the visual representation of a text is affected by the font and that’s not defined by the text itself (in our model)). And as Marcel says, one can always define a different one for special cases. And we have lots of precedence for that (isSameSequenceAs: et al).
Equality is tricky. A given definition must be useful “in a general context” (hand waving I know), and must match #hash, and must be reasonably efficient to be usable. All that implies that in non-trivial implements a good comment should be written :-)
Happy ‘23!
Best, Marcel
Am 16.09.2020 16:44:07 schrieb Thiede, Christoph < christoph.thiede@student.hpi.uni-potsdam.de>:
Hi Levente,
hm, I think #= should be always commutative and transitive, everything else is at least confusing ...
Can't we move that "attribute invariant" comparison rather to something like Text >> #sameAs:?
Best,
Christoph
*Von:* Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Levente Uzonyi leves@caesar.elte.hu *Gesendet:* Mittwoch, 16. September 2020 15:00:28 *An:* The general-purpose Squeak developers list *Betreff:* Re: [squeak-dev] FormInspector, or also: Text>>#= and its consequences
Hi Christoph,
On Wed, 16 Sep 2020, Thiede, Christoph wrote:
Interesting, I would not have assumed that this would be only about
performance, sounded like a more profound design decision to me.
If you have a look at the comment of Text >> #=, you'll find that it's a design decision (no reasoning though):
    = other
        "Am I equal to the other Text or String?
        ***** Warning ***** Two Texts are considered equal if they have the same characters in them. They might have completely different emphasis, fonts, sizes, text actions, or embedded morphs. If you need to find out if one is a true copy of the other, you must do (text1 = text2 and: [text1 runs = text2 runs])."
Though equality with Strings is not symmetric:
    'foo' asText = 'foo'. "==> true"
    'foo' = 'foo' asText. "==> false"
I don't know what relies on Text-String equality, but probably many things assume that Texts and Strings are somewhat interchangeable (remember when you changed SHParserST80 >> #initializeVariablesFromContext, and Shout ran into errors because it expected source to be a String but got a Text?)
You can't keep equality with Strings if you change #= because you'll lose transitivity:
    'foo' asText allBold = 'foo'. "==> true"
    'foo' asText = 'foo'. "==> true"
    'foo' asText allBold = 'foo' asText. "==> false"
Should you decide to change #=, remember to change #hash as well, and rehash all hashed collections with text keys.
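To make the #hash requirement concrete, here is a rough workspace sketch of a hash that would go with an attribute-aware comparison. The block name charsAndRunsHash is hypothetical and not part of Text. The sketch also assumes that the attributes involved implement #= and #hash consistently, which may not hold for every attribute class.

```smalltalk
"Workspace sketch. charsAndRunsHash is a hypothetical helper, not an existing selector."
| charsAndRunsHash a b texts |
charsAndRunsHash := [:text | text string hash bitXor: text runs hash].
a := 'foo' asText allBold.
b := 'foo' asText allBold.
"If a and b counted as equal under an attribute-aware #=, these two values would have to be equal as well."
(charsAndRunsHash value: a) = (charsAndRunsHash value: b).
"Hashed collections place elements according to #hash, so they need a rehash once #hash changes."
texts := Set withAll: {a. b}.
texts rehash.
```

Combining the string hash with the runs hash keeps texts with the same characters but different attributes from always colliding, at the cost of walking the runs on every hash computation.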
Levente
If no one sees a problem in changing this behavior, I can try my luck.
:-)
Best,
Christoph
On 2023-01-11, at 10:14 AM, karl ramberg karlramberg@gmail.com wrote:
Tangential to comparison are also the Morphic copy methods. What do we mean when we copy a complex object?
..and...
On Wed, Jan 11, 2023 at 5:40 PM Eliot Miranda eliot.miranda@gmail.com wrote:
Equality is tricky. A given definition must be useful “in a general context” (hand waving, I know), must match #hash, and must be reasonably efficient to be usable. All that implies that for non-trivial implementations a good comment should be written :-)
These are important points that also impact old favourites like hashing. For copying we have delights such as deepCopy, veryDeepCopy, veryVeryDeepCopy and so forth. For comparison we have convoluted tests that confuse almost everyone in later days.
My suggestion for a flexible, self-documenting, and hopefully intelligible way of handling these cases is to be more explicit. Don't implement #= if what you actually mean is #isEquivalentInCaseOfEditingMorphStructureTo: . Provide #copyMyComplicatedThingForSavingToSIXXStream instead of misappropriating #copy.
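As a sketch of what such explicit naming could look like for Text, the two methods below spell out which comparison is meant. Both selectors are hypothetical and do not exist in Trunk; the Text >> header lines are the usual notation for showing a method and would be entered through a browser rather than evaluated in a workspace.

```smalltalk
"Hypothetical extension methods on Text; neither selector exists in Trunk."
Text >> hasSameCharactersAs: aText
	"Attribute-invariant comparison: look at the characters only."
	^ self string = aText string

Text >> hasSameCharactersAndRunsAs: aText
	"Stricter comparison: the characters and the attribute runs must both match."
	^ self string = aText string and: [self runs = aText runs]
```

A client such as an inspector pane could then call the stricter selector when it decides whether to redraw, without any change to Text >> #= itself.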
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- An 8086 in a StrongARM environment.
Since 2004, the test suite for Ma Object Serializer utilizes an abstract any-Object-comparison, and has worked really well. This shows enough about how it does it -- basically brute force.
[image: comparing-object-equivalence.png]
The above is the abstract equivalence test implementation on Object which, yes, a lot of classes have to override. But, it's not hard. Intimidation by the implementation should not be the (or, even a) deciding factor about whether Text equivalence should honor attributes.
Re: copying -- there is no veryVeryDeepCopy, and there's nothing wrong with having all of #shallowCopy, #copy, #deepCopy vs. #veryDeepCopy. The former is part of the Smalltalk language that generally goes +1 level deeper than regular #copy, whereas #veryDeepCopy is used for the Prototype design pattern, which is the fundamental design property of Morphic.
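The difference in depth can be observed with a small workspace sketch, assuming the default Object implementations of #shallowCopy and #deepCopy apply to Array:

```smalltalk
"Workspace sketch: how far do #shallowCopy and #deepCopy reach?"
| original shallow deep |
original := Array with: (OrderedCollection with: 1 with: 2).
shallow := original shallowCopy.
deep := original deepCopy.
"shallowCopy copies only the outer Array, so the inner collection is shared."
(shallow at: 1) == (original at: 1).
"deepCopy also copies what the Array refers to, so the inner collection is a separate object."
(deep at: 1) == (original at: 1).
```

Evaluating the two identity checks one at a time shows which copy still shares state with the original.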
- Chris
On Wed, Jan 11, 2023 at 7:37 AM Taeumel, Marcel Marcel.Taeumel@hpi.de wrote:
Hi Christoph --
At this point, let's not fiddle with Text >> #=.
I think that comparing sets of attributes with each other can be as tricky as comparing morphs, unless you restrict yourself to very simple emphasis (bold/italic/...).
Identity-vs-state will bite you for "non-literal" attributes:
- PluggableTextAttribute (i.e., compiled code, bindings, complex objects, ...)
- TextFontReference (i.e., various font properties, form-set fonts, pixel comparison?, ...)
- TextAnchor (i.e., morphs ...)
Thus, the very mechanism of text properties is so flexible that implementing a useful comparison should be done on a case-by-case basis. For example, compare #runs if necessary and sufficient, as suggested in the comment in #=.
Just comparing the identity of text attributes is not worth breaking backwards compatibility.
Best, Marcel
On Thu, Jan 12, 2023 at 12:00 AM Chris Muller asqueaker@gmail.com wrote:
Re: copying -- there is no veryVeryDeepCopy, and there's nothing wrong with having all of #shallowCopy, #copy, #deepCopy vs. #veryDeepCopy. The former is part of the Smalltalk language that generally goes +1 level deeper than regular #copy, whereas #veryDeepCopy is used for the Prototype design pattern, which is the fundamental design property of Morphic.
I'm with you here, there is nothing wrong with these copy methods. I just want to point out that you have to be very clear about your intentions when you use them.
Best, Karl