[BUG] equivalence between strings and symbols
Allen Wirfs-Brock
Allen_Wirfs-Brock at Instantiations.com
Mon Apr 10 19:43:03 UTC 2000
At 10:27 AM 4/10/00 -0700, John W. Sarkela wrote:
>Originally I thought along your lines, Lex. Then I actually read
>the ANSI spec. It suggests that collections of differing class
>should never be equal.
I don't really think that this is the intent of the standard as expressed
in these words (that I probably wrote) in section 5.7.8.2:
Unless specifically refined, the receiver and operand are equivalent if all
of the following are true:
1. The receiver and operand are instances of the same class.
2. They answer the same value for the #size message.
3. For all indices of the receiver, the element in the receiver at a
given index is equivalent to the element in operand at the same index.
Note that it doesn't say anything universally about collections; it is only
speaking about objects that conform to the protocol
<sequenceReadableCollection>. It explicitly allows for the possibility that
refinements of that protocol might define different rules for equivalence,
including rules that do not require identical classes for the receiver and
the operand.
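As a rough illustration, the three conditions, and a refinement that relaxes the first of them, might be modeled as follows. This is a sketch in Python rather than Smalltalk; the function names are invented for the example, and Python lists and tuples merely stand in for collections of differing class.

```python
# Hypothetical model of the default #= rule for <sequenceReadableCollection>.
# Python sequences stand in for Smalltalk collections; nothing here is
# taken from any actual Smalltalk implementation.

def default_equivalent(receiver, operand):
    """Apply the three quoted conditions literally."""
    # 1. Receiver and operand are instances of the same class.
    if type(receiver) is not type(operand):
        return False
    # 2. They answer the same value for #size.
    if len(receiver) != len(operand):
        return False
    # 3. Elements at the same index are equivalent.
    return all(a == b for a, b in zip(receiver, operand))


def refined_equivalent(receiver, operand):
    """An assumed refinement that drops the same-class condition."""
    if len(receiver) != len(operand):
        return False
    return all(a == b for a, b in zip(receiver, operand))
```

In this model a list and a tuple with the same elements fail default_equivalent only because of condition 1, while refined_equivalent accepts them. That is the kind of latitude the "unless specifically refined" wording leaves open.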
However, the standard doesn't say anything about <readableString> refining
#= so the above applies. Hence, if literal strings and literal symbols
are implemented using different classes (and it's not at all obvious to me
that this is a requirement) then
'abc' = #'abc' "should answer false"
#'abc' = 'abc' "should answer false"
One of the reasons that there isn't a refinement of #= for strings and
symbols is that there has historically been so much variation, both
between and within implementations, in the definition of string and symbol
=. This includes the fact that the definition of equality used for #>= and
#<= of strings is different from that used for #=. In many ways, the real
equality operation for all types of <readableString> objects is sameAs:. In
fact, I believe that the intent is that
'abc' sameAs: #'abc' and #'abc' sameAs: 'abc' should both answer true.
Also, I do not believe that the standard necessarily requires that the same
class be used to implement mutable string objects and immutable literal
string objects. Thus, it isn't necessarily clear that you can assume that
(String with:$a with: $b) = 'ab' will always answer true.
In fact, it is probably valid (according to the standard) to implement
string literals using the same class as is used to implement symbols
because <symbol> fulfills all the requirements of <readableString>.
>If the behavior must change (I believe it must)
>there had better be strongly compelling reasons to go against
>standard behavior.
If we could have found a way to clean up #= for strings (and symbols) without
breaking all existing implementations and many existing applications, we
probably would have done so. If the Squeak community doesn't share this
imperative and has a good solution, have at it!
>Given the standard I believe that either #= in SequenceableCollection
>should check for class equality rather than species equality or else
>redefine #= in String to preclude asymmetry when compared to symbols.
Remember that the standard only defines a protocol
<sequenceReadableCollection> that specifies certain behaviors. It does not
specify an implementation. In particular, it does not say that all objects
that implement this behavior must inherit from some particular abstract
class or how the implementation of the required behavior is factored among
classes.
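To make the asymmetry concern in the quoted suggestion concrete, here is a small model in Python. It is not Squeak's actual code: StringLike, SymbolLike, the species method as modeled here, and class_checked_eq are all assumptions made up for the illustration. One class compares by species, the other by identity, and the two directions of the comparison then disagree.

```python
class StringLike:
    """Hypothetical string class whose equality checks species."""

    def __init__(self, chars):
        self.chars = chars

    def species(self):
        return StringLike

    def __eq__(self, other):
        # Species-based check: accepts any object whose species matches.
        if not hasattr(other, "species"):
            return False
        if other.species() is not self.species():
            return False
        return self.chars == other.chars


class SymbolLike:
    """Hypothetical symbol class whose equality is identity."""

    def __init__(self, chars):
        self.chars = chars

    def species(self):
        # Assumed for the sketch: a symbol's species is the string class.
        return StringLike

    def __eq__(self, other):
        return self is other


def class_checked_eq(a, b):
    """Class-based check: symmetric, but never equates the two classes."""
    return type(a) is type(b) and a.chars == b.chars
```

With s = StringLike("abc") and y = SymbolLike("abc"), s == y is true in this model while y == s is false. class_checked_eq answers false in both directions, which is symmetric but is only one possible factoring; nothing in the protocol dictates which class the check has to live in.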
Allen_Wirfs-Brock at Instantiations.com