The standard does *not* support - a removeAll: a - [was: Re: [BUG] Collection>>removeAll:]

Richard A. O'Keefe ok at cs.otago.ac.nz
Mon Sep 2 07:39:42 UTC 2002


Preliminary:
    This thread is not about #removeAll:, but about how to read standards.

"Andrew C. Greenberg" <werdna at mucow.com> wrote:
	But the ANSI standard does not, in fact, admit such a dodge.  Indeed, 
	it DEFINES removeAll: as "the equivalent" to an iteration over the 
	elements of the parameter,

No, it doesn't.
More precisely, as I have already pointed out, it doesn't
define it as equivalent to an INTERLEAVED iteration and removals.

	as a call of #remove for each such element.  
	Specifically, it says that, "[t]he operation is defined to be 
	equivalent to removing each element of oldElements from the receiver."  

Right.  I depend completely on that for my reading.

In particular, I claim that

    removeAll: oldElements
	|togo|
	
	togo := OrderedCollection new.
	oldElements do: [:each | togo add: each].
	togo do: [:each | self remove: each].
	^oldElements "the result is unspecified; the argument is one possible choice"

implements precisely the semantics required by the ANSI standard.
I also claim that

    oldElements copy do: [:each | self remove: each]

implements precisely the semantics required by the ANSI standard.
"each element of oldElements" is visited; the text says nothing about
sending any enumeration message to oldElements itself.

What's more, the ANSI standard very carefully refrains from saying that
there is any call to #do: whatsoever.  It says "removing each element
of oldElements", but for example it says nothing about the order in
which they are removed.  It doesn't even say that #remove: is called.
It says that it is "EQUIVALENT to removing each element ... from the
receiver using the #remove:  message...", which only requires it to have
the same effect as using #remove:  would have had.  (It is the fact that
it doesn't say that #do: is used which means we have no reason to expect
a Stream argument to be accepted.)  This is the kind of language that
other standards use with the intent of allowing radically different
implementations that have the same externally observable effect.

And for Set, Bag, Dictionary, and OrderedCollection, we know perfectly
well that it is possible to get precisely the effect (precisely the
sequence of #remove: calls if we choose to take them literally) without
ever iterating over the argument as such in the case under consideration.
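
As a rough sketch of that claim, for OrderedCollection only: the method
below is hypothetical, not any vendor's code, and it assumes Squeak's
#removeFirst, which the standard's text on #removeAll: does not mention.
Because #remove: takes out the first equal element, sending #remove: for
each element of the receiver in order has the same effect as repeatedly
removing the first element (for elements that are equal to themselves).

```smalltalk
removeAll: oldElements
	"Illustrative sketch only, written as a method on OrderedCollection."
	oldElements == self
		ifTrue: [
			"Same effect as #remove: for each element in order, but
			 without enumerating the receiver while it shrinks."
			[self isEmpty] whileFalse: [self removeFirst]]
		ifFalse: [oldElements do: [:each | self remove: each]].
	"The result is unspecified; the argument is one possible choice."
	^oldElements
```

The identity test costs one comparison and leaves the ordinary case
unchanged.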

	3) The ANSI standard section 5.7.5.5 explicitly defines "removeAll: 
	oldElements" as an iteration over the elements of oldElements, sending 
	for each such element, say 'each', the message #remove: each.

It _does_ mention an iteration over a sequence of elements;
it does _not_ mention an iteration over the oldElements object itself.
And that's a big difference.  The care that seems to have been taken to
avoid calling for a #do: over oldElements itself speaks against Greenberg's
reading.  And the care that _seems_ to have been taken never to require
the iteration and the removals to overlap in time also speaks against
that reading.  If a simple #remove:-inside-#do: were intended,
why didn't they just say "it's exactly as if you had [explicit code]"?
When someone goes to such pains to avoid the obvious, you are entitled
to assume that they don't intend all the implications of the obvious.
(And if they _did_ intend that, it's not the reader's fault.)

	While one can quibble over the truth of proposition 3, 

Nobody can reasonably quibble over the truth of proposition 3.
But to support your argument, you need the text to say that
    THE REMOVALS MUST OVERLAP IN TIME WITH AN ACTUAL ITERATION OVER THE
    ARGUMENT OBJECT (not a copy).
And proposition 3 does not involve that at all.

	one must accept, at least, that it is a fair argument,

Fair, yes.  Honest, yes.  Radically flawed, in that the desired
conclusion doesn't even come close to following from it, that too.
In particular, the implementation

    removeAll: oldElements
	oldElements copy do: [:each | self remove: each]

fulfils _every_ requirement of the standard.

I'm told that in international law, there is a principle that treaties
ought to be interpreted to favour the weaker party.  That is certainly
how the Treaty of Waitangi (very important in this country) is interpreted.
Apply that principle to standards, and you see that vendors are the
stronger party (very very much so if the ANSI committee would not have
considered requiring vendors to make a one-line fix) and users are the
weaker party.  If there is a natural reading of a standard that provides
more useful guarantees to the users, and a reading that lets vendors get
away with quietly returning nonsense answers, the better reading is the
one that favours the customers.

In particular, if there is a group who have taken on themselves the
responsibility for identifying undefined/UNSPECIFIED cases, they have
accepted the responsibility for identifying ALL of them, and if some
part of the specification can be read naturally as leading to a defined
result, WHATEVER the operation in question, whether I personally like the
consequences or not (and I've been a language implementor, although not
a Smalltalk implementor), then it should be read as defined, not as
undefined.

The overwhelming context for #removeAll: (and perhaps more importantly,
for #addAll:) is a standard where the authors have gone to the extreme
of inventing new notations in order to spell out which cases are
undefined, and purport to have given full lists of undefined and
erroneous cases for these particular operations.  We are entitled to
_believe_ them about this.

I must refer again to Wirfs-Brock, who told us that the committee
overlooked these cases, and probably _would_ have listed them as
undefined (rather than make implementors introduce one-line fixes).
I must also point out again that the authors' intentions or lack of
them are not, in fairness, allowed to influence one's reading.
What the authors _would_ have done had they thought of it is not
what they _did_ do, and when interpreting a standard, you have to
deal with what they _did_ do.

	(I, for one, find it compelling, but reasonable
	people, apparently, may differ.)

Did you know how #removeAll: worked _before_ you read the standard?
Are you sure that does not influence your reading?
Because I have as yet been unsuccessful in finding _anyone_ who reads
the standard as allowing, let alone requiring, a 'hole' in the specification.
And believe me, I've tried.  By now I've asked about 60 people.

	Accordingly, it seems to me that the
	argument that "failure" of "x removeAll:  x" is clearly and in
	black-and-white a bug is, at best, overstated.
	
Well, the claim that it was a bug never did depend on the ANSI
specification.  It depends on the Smalltalk textbooks and manuals never
ever mentioning any exception.  It depends on the fact that the present
behaviour of removing odd elements and leaving even ones in place is not
what anyone expects.  Had the ANSI standard said anything which could
have been interpreted by anyone not already familiar with
implementations as leaving this case unspecified, that would have been
that.  It would have been a nonsense result, but it would have been
_official_ nonsense.  The thing would still have _been_ a bug, but a
refusal to fix it on the grounds that "ANSI Smalltalk says we don't have
to" could have been defended.  Even then it would have been necessary
to say "It is time the reference manuals and textbooks were amended."
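
For anyone who wants to see the two readings side by side, here is a
minimal workspace sketch.  The variable names are made up, and it assumes
a Squeak-style OrderedCollection.  The last statement spells out the
interleaved loop by hand rather than calling #removeAll:, since what
#removeAll: does with its own receiver depends on the image.

```smalltalk
| snapshot interleaved |

"Snapshot first: enumerate a copy, remove from the original."
snapshot := OrderedCollection withAll: #(1 2 3 4 5 6).
snapshot copy do: [:each | snapshot remove: each].
snapshot isEmpty.   "true"

"Interleaved: enumerate the very collection that is being shrunk.
 The behaviour described above is that every other element survives;
 the exact outcome depends on how #do: is implemented, and some
 images may signal an error instead."
interleaved := OrderedCollection withAll: #(1 2 3 4 5 6).
interleaved do: [:each | interleaved remove: each].
interleaved
```

Only the first form has a result that does not depend on the internals of
#do:.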

Now we have the far more serious question:

    Is a Smalltalk programmer forced to concede that the ANSI Smalltalk
    standard must be interpreted in the light of details of how the
    operations were implemented prior to the standard, hence always in
    the favour of implementors?
 
I find it fascinating that I have yet to find anyone who sees the
apparently obvious, even "compelling", hole-in-the-specification
in the ANSI Smalltalk text, _except_ for some people on this list.
Even when you point it out to them, people here say something like
"well, but that's a bit twisted".  There's a sort of groan in it,
like someone who has just heard a really stupid pun.

This obviously raises the question of what _else_ people read differently
about the ANSI Smalltalk standard.


