[BUG]Collection>>removeAll:

Richard A. O'Keefe ok at cs.otago.ac.nz
Wed Aug 21 05:45:45 UTC 2002


"David Griswold" <David.Griswold at acm.org> responded to my message:

	> The library is broken in at least two ways:
	> (1) The documentation says NOTHING about the two collections
	> having to be
	>     different.
	
	Virtually *all* collection operations in Smalltalk assume that the
	collections being operated on don't change unexpectedly during
	execution; this bug is just one of the infinite number of ways that can
	happen.  It is not documented because it is an obvious and universal
	principle which experienced programmers understand.

It is not obvious:  beginners have real trouble with it.
It is not universal:  no such problems exist or _can_ exist in Prolog,
    Erlang, or Haskell, to name but a few languages.

The fact that something is obvious to *experienced* programmers is no
excuse for leaving it out of the documentation.  Programmers are not
born experienced.

Still less is it an excuse for leaving it out of the run-time checks.
Even an experienced programmer who doesn't expect (x removeAll: x) to
work may type that by mistake when (x removeAll: y) was intended.

	I'm sure you understood it yourself-
	did you really need it told to you?

I was not yet a beginner when Smalltalk-72 was created.
I was no longer a beginner when I met Smalltalk-80.
(To get my dates right: I got the Goldberg et al. books at the IJCAI
conference in Karlsruhe in 1983.  I actually _played_ with Smalltalk
for the first time in about 1986, on one of those Tektronix boxes, for about
10 minutes, and gave up.)

I want David Griswold and anyone else still following this thread to
understand clearly that I am not concerned about what *I* do or do not
find obvious, but about what *students* do or do not find obvious.

	You probably just didn't think that it applied to this case- but
	it does.
	
Wrong.  It was pretty obvious to me the first time I read about #removeAll: that
it would be vulnerable to this kind of bungle.  But this thread isn't about
*me*, it was started by someone *else* who in all innocence tried to use
#removeAll: this way.

The operation (x removeAll: x) MAKES PERFECT SENSE.  It isn't a common thing
to try, but it is a sensible thing to try, and in any professionally
designed data structures library it is just plain going to work.

	> (2) If an operation isn't going to work, it should bring that to our
	>     attention instead of quietly producing incorrect answers.

	Software that is not used for external, checked interfaces (e.g. a GUI,
	file, or network interface) does *not* have the responsibility of
	detecting and reporting all possible violations of the software's
	interface contracts.

I strongly disagree.  I am struggling to find words strong enough to express
the vehemence of my disagreement with and dismay at such sentiments that are
at the same time sufficiently polite to actually use.

I suppose someone *could* propose this as a design principle,
but it is NOT a design principle that Smalltalk has ever followed.
If you don't believe me, try #(1) at: 0.  Squeak has a very great
deal of code that checks for usage errors and pops up error notifiers.
A lot of that checking code is not in "GUI, file, or network interface(s)".

How come so much Windows and UNIX code is vulnerable to crackers?
Buffer overruns.
Because C *does* follow the principle of not checking much.
Do we *really* want Squeak to be that fragile?

It isn't even clear to me what _is_ an "external" interface.
As a Smalltalk programmer, I _want_ to regard myself as being on
the "outside" of an interface with the Collection classes and Stream
classes and so on being on the "inside".

	It would be nice if that was possible, but it is
	simply theoretically impossible to do for imperative languages.

This is a straw man, having the logical form "because perfection is
impossible, it is idle to seek improvements."

Partly, it's the hoary old problem of Squeak documentation all over again.
I wonder whether Stéphane Ducasse has asked the authors who've agreed
to him putting up their books in electronic form whether they'd agree to a
Squeak edition?  Could be a good idea.  Some revision would help too.  You
see, even in these books, you tend NOT to find the clear advice

	1.  Do not modify a collection while you are traversing it,
	    whether you are using a Stream or an enumeration method.
	    If you really must modify it, consider traversing a copy
	    instead.

	2.  When you use an object as a key in a Dictionary or an element
	    of a Set or Bag, make sure you never change it in any way
	    until it is not a Dictionary key or Set/Bag element any more.

	When you pass the receiver to removeAll:, you have broken the
	contract, and that is your problem, not the collection's.

There is no such contract.  If the data structure designer did not see fit
to state an essential precondition anywhere in the documentation and did not
see fit to check it, and someone relies on the ACTUAL EXPLICIT CONTRACT,
then the bug is the data structure designer's bug, not his helpless victim's.

Now, I will concede this.  If someone writes their OWN code to modify a
data structure during a loop, then it SHOULD be obvious to them that they
could get into trouble.  Here is a Java example:

import java.util.*;

class Whoops {
    public static void main(String[] arguments) {
        Vector v = new Vector(arguments.length);
        for (int i = 0; i < arguments.length; i++)
            v.addElement(arguments[i]);
        // Enumerate v while appending to it: the Enumeration never
        // catches up with the growing Vector.
        Enumeration e = v.elements();
        while (e.hasMoreElements())
            v.addElement(e.nextElement());
        // Never reached if there was at least one argument.
        for (int i = 0; i < v.size(); i++)
            System.out.println(v.elementAt(i));
    }
}

If you run this with at least one command line argument, it will go into
an infinite loop.  (Actually, the Vector Enumeration involved *could* have
been designed and implemented so that it would have worked as expected,
but that's another long sad story.)

I am happy to accept THIS example as the user's fault.
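
For what it's worth, the cure for THAT example is to enumerate a snapshot.
Here is a hedged Java sketch of one such fix; the class and method names
are mine, not anyone's library:

```java
import java.util.*;

class NoWhoops {
    // Append every current element of v to v, safely: enumerate a
    // snapshot taken before the appends start, so the appends cannot
    // feed the enumeration.  (Class and method names are mine.)
    @SuppressWarnings("unchecked")
    static <T> void doubleUp(Vector<T> v) {
        Enumeration<T> e = ((Vector<T>) v.clone()).elements();
        while (e.hasMoreElements())
            v.addElement(e.nextElement());
    }

    public static void main(String[] arguments) {
        Vector<String> v = new Vector<>();
        for (int i = 0; i < arguments.length; i++)
            v.addElement(arguments[i]);
        doubleUp(v);                  // terminates, unlike the loop above
        System.out.println(v.size()); // twice the number of arguments
    }
}
```

Note that the programmer has to supply the snapshot here, because the
Enumeration has no way to know about the appends; keep that asymmetry
in mind for what follows.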

I am NOT happy to accept  x removeAll: x  as the user's fault because
 - the "contract", if you can dignify a rather scrappy comment by that
   name, if you take it at all seriously, appears to claim that this
   should in fact work, because there is nothing in it to lead you to
   suppose that it might not

 - from the outside, it is obvious that the method has the *opportunity*
   to detect the special case and take special action; so, contrary to
   what David Griswold is saying, there is no a priori reason to expect
   the operation to give trouble, EVEN IF YOU ARE AWARE that it is
   dangerous to modify a collection while an iteration THAT DOES NOT HAVE
   THE OPPORTUNITY TO KNOW ABOUT THE MODIFICATION is going on.

   This deserves emphasis.

   The reason that the Java example is a problem for any plausible
   implementation is that the Enumeration e isn't being _told_ by
   the programmer about the changes to v.
   
   But when you call a complete 'packaged' batch operation like
   x removeAll: x  and  x addAll: x  then the operation DOES have the
   opportunity to know what is going on and cope with it, because it
   is the operation itself that is making the changes, NOT the programmer
   doing it directly.

   Let me say again, I have myself implemented collection ADTs in an
   imperative language where this stuff just plain worked.  Admittedly,
   I needed a hack, which was
	function same_T(var x, y: T): Boolean;
	    external "C";
   In Smalltalk, that's not a hack, it's a built-in known to the compiler.
   (Mostly I tried to do without the hack.  Sometimes I would increment
   a variable in one object and check if it had changed in the other,
   then change it back.  I'm glad I don't need to do that in Smalltalk.)

 - Finally, the convincing proof that the contract AS STATED could in fact
   be carried out is the fact that the fixes to #removeAll:,
   #removeAllFoundIn:, #addAllFirst:, and #addAllLast: are about as
   trivial as you could wish for.
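
To make the shape of that trivial fix concrete outside Smalltalk, here is
a hedged Java sketch of a batch remove that survives being handed its own
receiver.  The names are illustrative, not a real API:

```java
import java.util.*;

class SelfSafe {
    // Remove every element of batch from receiver.  If the two are the
    // same object, work from a snapshot of the batch -- exactly the
    // one-line guard under discussion.
    static <T> void removeAll(Collection<T> receiver, Collection<T> batch) {
        if (batch == receiver)
            batch = new ArrayList<T>(batch);  // copy before mutating
        for (T each : batch)
            receiver.remove(each);
    }
}
```

With the guard, removeAll(x, x) simply empties x; without it, the loop
would be walking the very collection it is shrinking.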

Can I possibly make this plainer?
I'm not talking about generalities, I'm talking about specific pieces
of code where there is NO contract saying "you can't do this", where
it's not even clear that anybody even thought about the possibility,
and the fix needed to make the thing work in all cases is trivial.

	Only if such a bug is so common that it becomes a real nuisance
	should error checking code that complicates or slows down
	execution of correct code be added.

It isn't even theoretically possible to know how common the symptoms of
the bug are or how much of a nuisance they are.  The people who wrote
the code, and the people who are experienced with the system, are the
very people who are LEAST qualified to make such judgements.

How many students run into this kind of problem, and just go away thinking
"Smalltalk ****s" instead of reporting it to this list or anywhere else?
We can't know.

Because the specific methods in question just QUIETLY give wrong answers,
we don't even know whether they are producing dangerously wrong (but
plausible) answers in production code.

I don't, therefore, agree that it is even close to reasonable to demand
that decisions about bug fixes be based on information we do not and cannot
have.

Remember, we are talking about specifics.  At the extreme, the class of
bugs I am concerned with in this thread is

    <collection> batchOperation: <collection> ....

where the receiver and the "batch" are the same object.
We are talking about overhead
    (c := theBatch) == self ifTrue: [
	"either self errorWontWorkOnItself
	 or c := c copy"].
Looking at the bytecodes, I see
    pushTemp: 0   (theBatch)
    storeIntoTemp: 1  (c)
    self
    send: ==
    jumpFalse: 22
so at 24 million bytecodes/second on my Mac, the cases that previously
used to work are slowed down by about 0.2 microseconds, probably less.
If this were a serious worry, it might be possible to add a
(self; send: ==; jumpFalse: <disp>) portmanteau bytecode.
Considering everything else that happens in something like #removeAll:,
this overhead is negligible.

	A thought experiment: If you think that detecting and reporting all
	these kind of bugs is the collection's responsibility,

Who said 'all'?  Not me.  As many as we can afford, yes.
CERTAINLY the ones where there is a meaningful answer that we could easily
give.

	then tell me how
	you will detect and report this similar and even more common kind of
	bug:  The contract for a Dictionary or Set includes the requirement that
	the object being hashed on not change its hash value or the results of
	any = operation on it, during the time that it is in the collection.
	For example, if a dictionary key changes its hash while it is in the
	dictionary, the dictionary is corrupted.  This is a real bug, that
	happens not infrequently, and which can be very hard to debug.
	
Well, here's the method I tried in one version of my Pascal library.
Every object had a lock count.
When you began an iteration over a collection, the lock count was incremented.
When you ended an iteration, the lock count was decremented.
When you used an object as a key in a dictionary, the lock count was
incremented.
When you removed a key from a dictionary, its lock count was decremented.
If you tried to change an object whose lock count was positive,
that was a checked run-time error.
I eventually abandoned this, because by that time I was the only person
using the library, and in the absence of exception handling, the lock
counts weren't always decremented properly.  Squeak *has* exception handling,
so the technique *could* be used in Squeak.
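
The lock-count idea translates directly into any language with exception
handling.  A hedged Java sketch, with all names mine:

```java
class Guarded {
    private int lockCount = 0;
    private int value;

    Guarded(int value) { this.value = value; }

    // Incremented when an iteration or a dictionary takes an interest
    // in this object, decremented when that interest ends.
    void lock()   { lockCount++; }
    void unlock() { lockCount--; }

    int getValue() { return value; }

    // Mutation while locked is a checked run-time error, not quiet
    // corruption.
    void setValue(int v) {
        if (lockCount > 0)
            throw new IllegalStateException("mutating a locked object");
        value = v;
    }
}
```

The point about exception handling is that a try { ... } finally
{ g.unlock(); } guarantees the count comes back down even when the
iteration body fails -- the guarantee my Pascal library lacked.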

But it's a good example, and it's one of the reasons I try to use Symbols
as keys rather than Strings, and it's one reason why from time to time I
hack up an immutable container to make sure I _can't_ make that mistake.

	Now, according to the principles you state, it is the dictionary's
	responsibility both to note that fact in its documentation, and
	to check to make sure it doesn't happen.

To state the fact in its documentation, YES, a thousand times !YES!

To check it, only if that's feasible.

What I _would_ consider doing is providing a "repair" method which
traverses the old guts of the container stuffing things back in the
right place in new guts.  (The "Sea Cucumber" defence.)  You'd say
"yes, you may mutate keys in this dictionary, but IF you do that,
you MUST run (theDictionary repair) before invoking any other methods
on theDictionary."
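
A hedged Java illustration of both the key-mutation bug and the repair,
with MutableKey and repaired being my names: copying a hash table walks
its buckets directly, so every surviving entry gets re-hashed into the
place its key's CURRENT hash names.

```java
import java.util.*;

class SeaCucumber {
    // A key whose hash can change while it sits in a map: the bug.
    static class MutableKey {
        int n;
        MutableKey(int n) { this.n = n; }
        public int hashCode() { return n; }
        public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).n == n;
        }
    }

    // The "repair": rebuild the table wholesale, re-hashing every key.
    // The copy iterates the buckets directly, so it still finds entries
    // whose keys have drifted away from the bucket their hash now names.
    static <K, V> HashMap<K, V> repaired(Map<K, V> dict) {
        return new HashMap<K, V>(dict);
    }
}
```

After mutating a key in place, a plain lookup may miss the entry
entirely; looking it up in the repaired copy finds it again.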

	To do that, the dictionary would need
	to keep a separate record of the hash value for each key, and the
	results of any = operations applied to it, and to check to see if they
	have changed every time the collection is operated on, and then report a
	'key mutation' error if that ever happens.

Wrong.  Just off the top of my head, I can think of three other ways to do it.
In any case, I'll go along with the "operator error" cop-out IF, but ONLY IF,
the warning is clearly stated in the documentation.

	But that is clearly ridiculous to do.

It would be surprising if a straw man _weren't_ ridiculous.

	I can come up with an endless stream of bugs like this, that are
	at least as important as the removeAll:  bug, just as easy to
	detect, and if you tried to detect them all, the libraries would
	be a slow, bloated mess.  Reductio ad absurdum.
	
You think the collection classes _aren't_ a mess right now?

A key point you keep ignoring is that in the specific cases I am concerned
with,
 - the operations MAKE SENSE when the receiver and argument are the same,
   it's just that they've been coded in such a way that they don't work,
and
 - a change to make them work in all cases is TRIVIAL in terms of coding
   effort and extremely cheap in terms of run-time overhead.

	I sympathize with your wish to have all possible assumptions that the
	libraries make documented and checked.

It is simply unprofessional not to at least *document* them.
I'm an academic now, but when I was paid to develop software, I *did*
worry about bugs like this and I *did* design them out in the first place.
In particular, the "logical update" semantics for Prolog, now part of the
ISO Prolog standard, was my design and Tim Lindholm's implementation to
prevent a very similar problem in Prolog.

As for checking, it's a tradeoff.  As I tell my students, I once instrumented
a program so thoroughly that it ran 1000 times slower, BUT that was a good
thing to do because it found the bug I'd been looking for during the past
week in 5 minutes, and it was one command-line flag to take the checking code
out again.  When I was implementing AVL trees for the first time, I made the
code check the tree for well-formedness after every "external" operation and
some "internal" operations.  Gosh did that run s l  o    w   l        y.
But boy did it find the mistakes fast.  And yes, when devising new sorting
algorithms, I _do_ check that the output is a permutation of the input; with
the right test cases it's a linear time check.  And yes, it does find mistakes.
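
The permutation check I mean runs in linear expected time with a counting
table.  A hedged Java sketch, method name mine:

```java
import java.util.*;

class PermCheck {
    // True iff b is a rearrangement of a: count each value up while
    // scanning a, down while scanning b; a count that would go
    // negative, or differing lengths, means "not a permutation".
    static boolean isPermutation(int[] a, int[] b) {
        if (a.length != b.length) return false;
        Map<Integer, Integer> counts = new HashMap<>();
        for (int x : a) counts.merge(x, 1, Integer::sum);
        for (int x : b) {
            Integer c = counts.get(x);
            if (c == null || c == 0) return false;
            counts.put(x, c - 1);
        }
        return true;
    }
}
```

Run after a sort, isPermutation(input, output) combined with a sortedness
scan catches the classic bug of a "sort" that loses or duplicates elements.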

You do not want that level of checking all the time.  But whenever someone
tells me "you don't want that much checking", I always think "but yes,
sometimes I DO".

When the checks are CHEAP, as they are in the specific cases under
consideration, why leave them out?  Every check that you can afford
is going to save SOMEONE some debugging time, and it might be you.

	I've been down this path myself
	before.  I think almost every programmer goes through a phase where he
	wants to put these sorts of runtime checks in as many places as
	possible, since detecting interfaces errors is so clearly a good thing.
	But believe me, once you get started on trying to catch all possible
	contract bugs this way at runtime (as opposed to with assertions or
	other separate consistency checks, unit tests, typecheckers, etc. that
	can be run statically or in 'stress test' mode, which is the right way
	to do this sort of thing and which I strongly encourage), there is an
	infinite series of similar bugs that are just as important to detect, so
	how are you going to choose which ones to check?

You obviously are not an Eiffel programmer.
Eiffel programmers quickly learn that you should put in quite a lot of
checks.  When you're developing the code, they save your bacon over and over
and over again.  When you want speed, turn them off (globally or classwise
or even at finer grain).  When it turns out there's a bug still there, turn
them all on again.  You don't try ALL POSSIBLE checks, but you try to have
at least class invariants, argument not nil, this not the same object as
that, cheap checks with good payoff.
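
Eiffel's turn-it-off switch can be approximated in most languages with a
single flag that the compiler or JIT can strip when it is a constant.
A hedged Java sketch, names mine:

```java
import java.util.*;

class Checked {
    // One switch for the whole library: true while developing, false
    // when you want the speed back.  (Declared as a compile-time
    // constant, the dead checks disappear entirely.)
    static boolean CHECKS = true;

    static <T> void removeAll(Collection<T> receiver, Collection<T> batch) {
        if (CHECKS && batch == receiver)
            throw new IllegalArgumentException(
                "receiver and batch are the same object");
        for (T each : batch)
            receiver.remove(each);
    }
}
```

With CHECKS on, the misuse is reported at the call site; with CHECKS off,
you get exactly the silent trouble this whole thread is about.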

	It increases the complexity and size of the libraries, that's why not.
	The point is that although this check doesn't seem like a big deal in
	itself, if you did the same for all similar kinds of bugs (just as
	important as this one), it *would* result in a problem.

If every method increased in size by one line (which is what happens in
this case), Squeak *still* wouldn't be bloated.  Code that helps methods
to produce correct answers instead of garbage answers is not bloat and is
not wasted complexity.

	So unless this test can convincingly pass the "this bug is
	important" test, then it doesn't deserve to go in.

Who decides?  Remember, the "experts" are the people LEAST able to
decide how important such a bug is.

	If everyone thinks that it really is important enough, that's
	fine- put it in.

Nonononono.  The experts must not be involved in the decision because
it's not a problem for THEM so they don't really believe that it's a
problem for anyone.

	But you *do* have to jump over that "Very Important Bug"
	hurdle to justify putting it in the common execution path,
	because otherwise you are on the slippery slope to having to do
	validity checks on every argument to every method in the system.

	Programmers simply can't expect objects to fully check for
	semantically correct interface usage at runtime, given today's
	technology.

Let's get real here.

We are not just talking about a set of specific methods where there
is *nothing* they could reasonably do in a certain boundary case,
we are talking about a set of specific methods where there is a perfectly
sensible, even obvious, correct answer but they do not produce it.
We're talking about a proposed correction whose effect is that the methods
never produce worse answers, produce correct answers in more cases, and
introduce negligible overhead.

Programmers *CAN* expect objects to respond correctly to meaningful
message sends, and that's what's at issue here.

	I think a better place for this sort of test is in some place
	out of the production execution path, such as in an assertion
	that can be turned on for testing and off for production, with
	no runtime cost when it is turned off.  An easy mechanism for
	doing that in Smalltalk would be nice.

Well, we have the "self assert: [...]" mechanism.  Perhaps some way of
turning that off would do.  But it isn't relevant here, because there
are correct results to produce, and the present code produces incorrect
results.



