What does equality (=) mean in Collection classes?

Allen Wirfs-Brock Allen_Wirfs-Brock at Instantiations.com
Wed Jan 21 18:57:01 UTC 1998


Alan Darlington wrote in message <885329898.802947090 at dejanews.com>...
>...
>Shortly thereafter, co-workers Patrick Logan and Martin McClure
>joined me, and we had a long discussion on what the correct semantics
>for = should be.  ...
>...
>Our conclusion:  In the major commercial versions of Smalltalk
>(VA and VW), the vendors do not provide a meaningful way to
>compare collections....

The reason is that there is no single "correct semantics" for =. The correct
semantics is situational.  For collections it sometimes is "the identical
collection", sometimes it is "contains the 'same' (= or ==?) values",
sometimes it is "contains the 'same' keys and values", etc.

The draft standard says this about <Object>#= : "This message tests whether
the receiver and the comparand are equivalent objects at the time of
comparison" and "The meaning of 'equivalent' cannot be precisely defined
but the intent is that two objects are considered equivalent if they can be
used interchangeably". (Section 5.3.1.1) As a glossary term within the
standard, "equivalent" is circularly defined to mean that the message #=
applied to two objects returns true. The standard goes on to note that there
are "temporal invariance" issues associated with the use of #=,
particularly in the context of collections.
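
The "temporal invariance" issue can be sketched outside Smalltalk. The
fragment below is an illustration only, not taken from the draft standard;
it is written in Python so that it runs standalone, and the class Point and
its fields are hypothetical. It shows an element whose equality (and hash)
depends on mutable state, so a hashed collection may fail to find it once
that state changes.

```python
class Point:
    """Hypothetical mutable value object; equality and hash follow its fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2)
points = {p}                  # stored under the hash of (1, 2)
found_before = p in points    # the element is found while it is unchanged

p.x = 99                      # equality and hash now change in place
found_after = p in points     # lookup probes with the new hash and misses
still_listed = any(q is p for q in points)  # yet iteration still yields p
```

The object never left the set, but membership tests stop finding it: two
objects that were "equivalent at the time of comparison" need not stay so.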

These extremely weak definitions are as good as I've seen anybody come up
with for #= in Smalltalk. The basic problem is that #= is used polymorphically
as if it meant something specific when in fact it has never been
consistently defined that way by any implementation.

>This means that customers who care about
>this have a choice of (1) adding = methods to collection classes
>(with potentially dangerous side-effects on system behavior),
>(2) adding these methods but with a different name, or
>(3) writing in-line code wherever they would ordinarily have used
>the = test (fast but terrible technique).

I would recommend #2. If a programmer is interested in a particular semantic
of comparison he should explicitly define it.  Classic Smalltalk already
does this in some places, for example String>>sameAs:

>
>Our questions:  Why didn't the major vendors provide reasonable
>implementations of = in the collection hierarchy?  What _is_
>reasonable behavior, especially for dictionaries?  (Should there
>be a set of classes like IdentityValueIdentityDictionary,
>EqualityValueIdentityDictionary, etc.?  :-)  Should this issue
>be addressed in the Smalltalk standard?  What is the history
>from the Smalltalk 80 days that got us here?
>...

I think much of the source of this problem is the hard coding of #= as a
comparison operator in the collection classes. This makes it "too hard" to
define a collection that uses a situation-specific definition of
"equivalence". Some more modern implementations of the collection classes
have demonstrated that it is feasible to make the comparison policy (along
with other policy decisions) "pluggable" on a per-instance basis (see, for
example, VSE 3.1 and some collections in VW 2.5).
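
A per-instance comparison policy might look roughly like the sketch below.
This is an assumption-laden illustration in Python, not the VSE or VW
design: the class PolicyBag, its methods, and the `same` parameter are all
hypothetical names. Each instance carries its own two-argument predicate
instead of hard-coding one comparison.

```python
class PolicyBag:
    """Hypothetical collection whose comparison policy is chosen per instance."""

    def __init__(self, items=(), same=lambda a, b: a == b):
        self._items = list(items)
        self._same = same      # the pluggable policy; defaults to equality

    def add(self, item):
        self._items.append(item)

    def includes(self, item):
        # Linear scan, so any two-argument predicate can serve as the policy.
        return any(self._same(item, each) for each in self._items)


a = [1, 2]
b = [1, 2]                     # equal to a, but a distinct object

by_equality = PolicyBag([a])                            # analogous to =
by_identity = PolicyBag([a], same=lambda x, y: x is y)  # analogous to ==

equal_hit = by_equality.includes(b)      # True: b equals a
identity_miss = by_identity.includes(b)  # False: b is not the same object
identity_hit = by_identity.includes(a)   # True: a is the stored object
```

The linear scan keeps the sketch short; a hashed collection would also need
a hash function that agrees with the chosen predicate.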

If such collections are available, it then makes sense to define some
specific "equivalence" operations for collections.  For example,
#equalValues:, #identicalValues:, #equalKeysAndValues:, etc.
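
One possible reading of such operations is sketched below in Python. The
function names mirror the selectors above, but the semantics (pairwise
comparison in iteration order, key sets compared by equality) are
assumptions made for illustration; no existing protocol is being quoted.

```python
def equal_values(a, b):
    """Same length and pairwise-equal elements (==), in iteration order."""
    a, b = list(a), list(b)
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def identical_values(a, b):
    """Same length and pairwise-identical elements (is), in iteration order."""
    a, b = list(a), list(b)
    return len(a) == len(b) and all(x is y for x, y in zip(a, b))


def equal_keys_and_values(a, b):
    """Same set of keys, and equal values stored under each key."""
    return a.keys() == b.keys() and all(a[k] == b[k] for k in a)


shared = [1, 2]
left = [shared]
right_same = (shared,)   # different collection class, same element object
right_copy = [[1, 2]]    # equal element, but a distinct object

same_values = equal_values(left, right_copy)          # True
same_objects = identical_values(left, right_same)     # True
copied_objects = identical_values(left, right_copy)   # False
same_entries = equal_keys_and_values({"a": shared}, {"a": [1, 2]})  # True
other_keys = equal_keys_and_values({"a": 1}, {"b": 1})              # False
```

Because each operation states its own semantics by name, a caller chooses
the comparison explicitly instead of relying on whatever = happens to mean
for a given collection class.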

Why isn't something like this in the standard? Because nobody put forward a
proposal based upon a proven implementation and the committee generally
was loath to "invent". However, the standard was intentionally written in such
a manner that it can accommodate implementations of this type.

Personally, I would love to see somebody create and promote (perhaps
initially in the context of Squeak) some de facto standard collection
protocols of this type.

Allen_Wirfs-Brock at Instantiations.com





More information about the Squeak-dev mailing list