[BUG]Collection>>removeAll:

Richard A. O'Keefe ok at cs.otago.ac.nz
Fri Aug 23 04:32:30 UTC 2002


Sender: Allen_Wirfs-Brock at mail.instantiations.com wrote:
	#addAll: and #removeAll: where the receiver and argument are the same 
	object are clearly special cases as the "correct" semantics of those cases 
	is far from obvious.
	...
	ANSI Smalltalk neglected to define these cases.

The people who wrote the ANSI Smalltalk standard may have failed to
think about these cases, but at least in the 1.9 draft they managed
to produce a specification that DOES define these cases, DOES define
them the way I want them, and CAN be implemented cheaply.

This is the way that other standards are interpreted.
If the standard as written does specify an effect,
and the effect isn't really outrageously silly,
then that is the effect that the standard requires.

A well known example is that the ANSI C standard came with a lengthy
Rationale which was very helpful in understanding it.  The ISO C
"badge engineered" version dropped the Rationale.  It was discovered
that the text of the ISO C 89 standard, taken at face value, made
a long-standing C idiom quite illegal, even though that idiom had
actually been used as an example of something that should work.
C compilers therefore did not have to support it, and C interpreters
were perfectly within their rights to grind to a halt and whinge if
they met it at run time.
The decision was that the standard's specification was clear and
implementable and there was a fully standard workaround, so yes,
bye bye old technique, off to gaol you go.

The C99 standard has a special patch in the syntax to make an analogue
of the old technique available again.

	I consider this to be an oversight.  If somebody had brought 
	this to our attention we would have said something about these special 
	cases.

Perhaps it is lucky that nobody did.  As I have now shown in several different
ways, these methods can be made to behave usefully in this case, without
any measurable overhead.
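
To make the "no measurable overhead" point concrete, here is a rough
analogue in C rather than Smalltalk.  It is only a sketch, not Squeak's
code and not necessarily any of the approaches referred to above.  The
type IntSeq and the functions seq_remove, seq_remove_all_naive and
seq_remove_all are hypothetical names invented for the illustration,
and it assumes the useful result of removing a collection from itself
is an empty collection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity sequence of ints, standing in for an
   ordered collection.  Invented for this sketch only. */
enum { SEQ_CAP = 16 };

typedef struct {
    int    items[SEQ_CAP];
    size_t count;
} IntSeq;

/* Remove the first occurrence of value, shifting later items down.
   Returns 1 if an item was removed, 0 if value was not present. */
int seq_remove(IntSeq *s, int value) {
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->items[i] == value) {
            for (size_t j = i + 1; j < s->count; j++)
                s->items[j - 1] = s->items[j];
            s->count--;
            return 1;
        }
    }
    return 0;
}

/* Naive version: walks src while seq_remove mutates dst.  When src and
   dst are the same object, each removal shifts the remaining items
   under the loop index, so every other element is skipped. */
void seq_remove_all_naive(IntSeq *dst, const IntSeq *src) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < src->count; i++)
        seq_remove(dst, src->items[i]);
}

/* Alias-aware version: a single pointer comparison up front.
   Assumption: removing a sequence from itself should leave it empty. */
void seq_remove_all(IntSeq *dst, const IntSeq *src) {
    assert(dst != NULL && src != NULL);
    if (dst == src) {
        dst->count = 0;
        return;
    }
    seq_remove_all_naive(dst, src);
}
```

With a sequence holding 1, 2, 3, 4, calling seq_remove_all_naive on the
sequence and itself leaves 2 and 4 behind, while seq_remove_all leaves
it empty.  The extra cost in the non-aliased case is one comparison.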
	

I will say that while shortcuts may be forgivable in a project moving
at high speed where the code is expected to be massively changed in another
year or two, a standards committee is expected to consider fine details,
especially when the standard comes out 18 years after the language was
defined.  In particular, for languages with side effects, "what happens
if there is aliasing" is a painfully obvious question which needs
addressing in a standard.

Fortran addressed it by saying "watch out for this, if you get caught,
it was your fault."  C addressed it by saying "if you trust a fairly
simple model of what's going to happen, that's what you'll get, and
too bad about high performance on large arrays."  C99 addressed it by
saying "you can promise that there won't be aliasing, and the compiler
may believe you, but if you lie to the compiler it's your fault."
Ada 83 addressed it by saying "don't do that; if you get caught, it's
your fault".
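
The difference between the C and C99 positions can be sketched with two
otherwise identical functions.  The names add_first and
add_first_restrict are hypothetical, made up for this illustration; the
only real language feature involved is the C99 restrict qualifier.

```c
#include <assert.h>
#include <stddef.h>

/* Plain pointers: dst and src are allowed to overlap, and the result
   follows the simple step-by-step model.  src[0] must be treated as
   possibly changed by each store through dst. */
void add_first(int *dst, const int *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[0];
}

/* C99 restrict: the caller promises that dst and src do not overlap,
   so the compiler may load src[0] once and keep it in a register.
   Passing overlapping arrays here is undefined behaviour. */
void add_first_restrict(int *restrict dst, const int *restrict src,
                        size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[0];
}
```

Given int a[3] = {1, 10, 100}, the call add_first(a, a, 3) is well
defined and leaves a holding 2, 12, 102, because a[0] has already been
doubled by the time it is added to the later elements.  The call
add_first_restrict(a, a, 3) breaks the promise made by restrict, and
the standard then says nothing about what happens.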

A blanket warning near the front:

    If a message send is defined to modify some object,
    and that object is either the receiver or one of the arguments,
    the result is in general UNDEFINED.

would have answered the purpose.  I didn't find such a thing in the 1.9
draft. There could still have been arguments about particular cases.



