[FIX] #union: broken for sets
Richard A. O'Keefe
ok at cs.otago.ac.nz
Thu Feb 20 23:36:34 UTC 2003
In Squeak 3.0, 3.2, and 3.4alpha(5108) we find the same implementation
of #union:
Collection>>
union: aCollection
^self asSet addAll: aCollection; yourself
This is the _only_ implementation of #union: in those versions of Squeak.
There are three implementations of #asSet,
Collection>>
asSet
"Answer a Set whose elements are the unique elements
of the receiver."
^Set withAll: self
Bag>>
asSet
^contents keys
Set>>
asSet
^self
The method comment in Collection says that the answer is a Set,
but for bags it may be a Set or an IdentitySet, and for sets it
may be a Set, an IdentitySet, or a PluggableSet.
Collections other than sets always answer a new object to #asSet;
sets of all kinds never answer a new object.
The problem is that the point of #union: is to return a new object
without mutating any existing objects. But, irony of ironies, if
the receiver already is some kind of set, it IS mutated.
x := #(3 1 4 1) asSet<print it>
=> a Set(1 3 4)
x union: #(5 9 6)<print it>
=> a Set(1 3 4 5 6 9)
x<print it>
=> a Set(1 3 4 5 6 9)
The first problem to sort out is what the semantics of #asSet
is supposed to be. #asIdentitySet always returns an IdentitySet,
not some other kind of set. "as: Set" always returns a new object
that is precisely a Set, not any other kind of set. The method
comment in Collection suggests that the author thought #asSet
always returned a Set.
The simplest change that could possibly work is "add a colon".
Collection>>
union: aCollection
^(self as: Set) addAll: aCollection; yourself
This would eliminate the unintended mutation bug.
It would also make the result always be a Set, not any other kind of set.
Less happily, it would be less efficient for bags.
What I actually propose is a "bigger" change that makes much less
difference to the observable behaviour. In particular, the change
I propose
+ does not alter any existing method
+ does not change the costs or results of #union: for anything except sets.
? still allows the kind of set you get to depend on the class of the
receiver, and in the same way as before.
The correction is to add a new method:
Set>>
union: aCollection
"Answer the set theoretic union of the receiver and aCollection,
using the receiver's notion of equality and not side effecting
the receiver at all."
^self copy addAll: aCollection; yourself
More information about the Squeak-dev
mailing list
|