[BUG]Collection>>removeAll:

Marcel Weiher marcel at metaobject.com
Thu Aug 22 15:31:05 UTC 2002


On Thursday, August 22, 2002, at 04:43, Richard A. O'Keefe wrote:

> 	To me, the definition of removeAll: is not an English description
> 	of what it does, it is
> 	
> 	removeAll: aCollection
> 	  aCollection do: [:each | self remove: each]
> 	
> That is NOT a definition.
> It is an IMPLEMENTATION.

NO NEED TO SHOUT.  ;-)

So tell me, why can a specific implementation not be a definition?  
Sorry, rhetorical question:  of course it can.

> Not only that, it is a buggy implementation.

An underdocumented one.
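
To make the disputed behaviour concrete, here is a minimal sketch. It uses Python rather than Smalltalk, with a plain Python list standing in for an OrderedCollection; the function name `remove_all` is hypothetical and simply mirrors the quoted two-line method. The exact leftover elements in Squeak depend on the collection class, so treat the result as an illustration of the failure mode and not as Squeak's output.

```python
def remove_all(target, items):
    """Model of the quoted method:
    removeAll: aCollection
        aCollection do: [:each | self remove: each]
    """
    for each in items:
        target.remove(each)  # mutates target while items is being walked
    return items             # like the original, answers the argument


# Distinct receiver and argument: works as the English description suggests.
distinct = [1, 2, 3, 4]
remove_all(distinct, [1, 2])
print(distinct)  # [3, 4]

# Receiver and argument are the same object: the index-based iteration
# skips an element after every removal, and nothing signals the failure.
same = [1, 2, 3, 4]
remove_all(same, same)
print(same)      # [2, 4]
```

The second call neither empties the collection nor raises an error, which is the "doesn't detect that it didn't work" complaint in miniature.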

> On this view, it would be impossible for any bugs to exist anywhere,
> because "the definition of" anything would not be "an English description
> of what it does" but the code itself.

When in doubt, that is quite often the "more correct" of the two.

[class story]

> They didn't see why Squeak couldn't document this too.

We all know that Squeak is underdocumented.  No big news here.

> That could be done, but there isn't the slightest need to.
> When a programmer invokes an enumeration method, it is reasonable to
> expect that programmer to live by the rule "don't YOU change what YOU
> are iterating over."  The issue is iterations that the programmer
> DOESN'T know about.

> 	The other argument is that it would make the system slower.
> 	I bet it wouldn't have much impact. The only way to find out
> 	is to try it, of course.  So I did.
> 	[about 20% slower on some Cincom benchmarks]
>
> I tried changing OrderedCollection>>do: to
>     do: aBlock
> 	self asArray do: aBlock
> and the result of 0 tinyBenchmarks went UP (24.7 bc/sec -> 25.4 bc/sec).
> Does 0 tinyBenchmarks do any OrderedCollection iterations,

No.

>  or is this noise?

Yes.

> 1. Designers accepting "don't change a collection you are iterating over"
>    as a reasonable constraint on use.
>
>    I've cited Java above partly in order to defend this decision.  Making
>    copies to make #do: safe may seem reasonable on a 250MHz 64MB machine;

Actually, I'd contend that it still isn't reasonable, because making a 
copy of something just to iterate over it is just as unexpected as not 
being able to removeAll: self.  What if the collection doesn't like 
being copied, or requires a deep copy?  What if it is huge and copying 
makes us run out of memory?  What if the collection is a multi-gigabyte 
on-disk database?
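
The trade-off can be sketched as follows, again in Python with a list as a stand-in for the Smalltalk collection. The name `remove_all_snapshot` is hypothetical; it models the "self asArray do: aBlock" idea by iterating over a copy. This is an illustration under those assumptions, not a proposal for the actual Squeak code.

```python
def remove_all_snapshot(target, items):
    """Iterate over a snapshot so that mutating target cannot disturb the walk."""
    snapshot = list(items)  # shallow copy: O(n) extra memory on every call
    for each in snapshot:
        target.remove(each)
    return items


same = [1, 2, 3, 4]
remove_all_snapshot(same, same)
print(same)  # []
```

The self-removal case now works, but every call pays for a full shallow copy whether or not the receiver and the argument are the same object. That is harmless for four elements and is exactly the cost objected to above for a huge or disk-backed collection.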

>    it clearly WASN'T such a good idea back on the Dandelions and Dorados.

yup.

>    Other languages and other libraries have similar restrictions.
>    This _is_ something an experienced programmer can be expected to know
>    about in general, even without Smalltalk experience, and beginners
>    can be expected to grasp once it is pointed out to them.
>
> 2. Designers writing the simplest clearest code they can for operations
>    but not thinking about all the boundary cases.  (x-x and x+x are
>    perfectly reasonable in arithmetic; x\x and xUx are perfectly reasonable
>    in set theory; but they are a sort of boundary case.)
>
>    My evidence that they didn't think about this is that they don't seem
>    to have documented their decision.

Probably.

>  I seriously doubt whether there
>    WAS a decision to make the x removeAll: x or x addAll: x cases go
>    insane.

Yes, that is doubtful.

>   Had anyone thought about these cases, they would have realised
>    that at the very least returning the argument object every time was a
>    bad idea; returning the same object (as opposed to an object with the
>    right state that MAY be the same object) only works when the two are
>    different.
>
> 3. From the *outside*, the operations don't look dangerous.  There are
>    many possible ways to implement #removeAll: and #addAll: and their
>    relatives, and many of these ways do not trigger the do+mutate bug.
>
> The result of these factors is that the programmer sees an operation that
> COULD work, the documentation suggests that it SHOULD work, but it doesn't,
> and it doesn't detect that it didn't work either.

Yup.
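
One of the "many possible ways" that avoids the do+mutate problem without copying is an identity check on the argument. The sketch below is in Python with a list as a stand-in collection; `remove_all_checked` is a hypothetical name, and nothing here is claimed to be what Squeak does.

```python
def remove_all_checked(target, items):
    """Special-case removal of a collection from itself; no copy is made."""
    if items is target:
        target.clear()       # removing everything from self: just empty it
        return target
    for each in items:       # distinct objects: the plain loop is safe
        target.remove(each)
    return items


same = [1, 2, 3, 4]
remove_all_checked(same, same)
print(same)      # []

distinct = [1, 2, 3, 4]
remove_all_checked(distinct, [1, 2])
print(distinct)  # [3, 4]
```

The check only catches the case where the argument is literally the receiver. An argument that is a different object sharing the receiver's storage would still hit the original problem, so this narrows the trap; it does not remove it.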

> Part of the story, I think, is that Smalltalk-80 was just one in a series
> of experimental systems which were being developed and replaced quite
> rapidly.  I don't suppose that the original authors were expecting the
> process to stop with Smalltalk-80.

I think it is quite well documented that at least some of the original 
authors expected everything BUT that to happen, and in fact especially 
wanted it not to happen with Smalltalk-80.

>   If you are expecting to have to throw
> away or massively revise the code in another year or two, you won't be
> inclined to put a lot of effort into documentation.  But Smalltalk-80 HAS
> lasted, and we are seeing some of the consequences.

Yup.

> Come to think of it, when is that big Morphic cleanup going to happen?
> I hope this thread hasn't done anything to delay _that_!

;-)

Marcel



