[BUG]Collection>>removeAll:

Ralph Johnson johnson at cs.uiuc.edu
Wed Aug 21 12:55:44 UTC 2002


To me, the definition of removeAll: is not an English description
of what it does, it is

removeAll: aCollection
  aCollection do: [:each | self remove: each]

The same for addAll:.  

Smalltalk might have been invented as a language for children,
but that was a long time ago.  For the past 25 years or so, it
has been a language for experienced programmers to build serious
systems.

That said, if a lot of people make the same kind of mistake, it
makes sense to change the system.  Collections should certainly
support a removeAll operation that makes a collection empty.
But the real problem here is that do: is not safe if the collection
is being changed.  It would be pretty simple to change do: such that
it makes a copy of the collection before iterating over it.  In fact,
we only have to make this change for growable collections.  If a
collection supports remove:, you should be able to say 
  things remove: things
and get an empty collection.  But if it doesn't support remove:,
then it can support do: without copying.

This would require changing do: in Set, Bag, Dictionary and
OrderedCollection.  Probably WeakDictionary, too.  

I see two arguments against this change.  One is that it would
break existing programs.  I find this hard to believe.  Nobody
depends on the behavior of changing a collection while you are
iterating over it.  It is certainly not well defined.  I don't
think it would break any existing programs.

The other argument is that it would make the system slower.
I bet it wouldn't have much impact. The only way to find out
is to try it, of course.  So I did.  The following experiments
were done in VW 7, not in Squeak.  Sorry.  

For each of these classes, I copied do: to basicDo: and redefined

do: aBlock
  self copy basicDo: aBlock

It turns out that Dictionary and IdentityDictionay can inherit it
from Set, so I deleted do: in those classes.

My first test was to make an OrderedCollection that contained the numbers
1 to 1000000 and to add them up with a do: loop.  Doing this with do:
took 3.3 sec on my old computer, and with basicDo: took 2.6 sec, nearly
half again as long.  That is what I expected, because this is a worst
case for the change.  I also tried a set of banchmarks that Cincom provides,
which aren't very realistic but are easy to run.  It was about 20% slower,
which was more than I expected.  I expected real programs to be about 5%
slower.  I don't have time to test it on a real program, but I'll do it
later.

It is easy to change do: so that it copies the receiver as a
default.  The only trick is not to make any mistakes that break the
system while you are doing it, because a system with a bug in
OrderedCollection>>do: will not work at all.  Change your system and
see if you mind the performance difference.

I think that if you are going to change Collections, you should go
all the way and change do:.  Don't patch removeAll: or addAll:.  They
are not the problem, and if you start trying to make them safe without
changing do: then you will make the system too complicated.

-Ralph Johnson



More information about the Squeak-dev mailing list