[BUG]Collection>>removeAll:

Boris Gaertner Boris.Gaertner at gmx.net
Thu Aug 15 18:35:32 UTC 2002


Bergel Alexandre <bergel at iam.unibe.ch> asked:
To: <squeak-dev at lists.squeakfoundation.org>

> Hello,> 
> Is it a bug ?
> 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> t := #(1 2 3 4) asOrderedCollection.
> t removeAll: t
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> => an OrderedCollection(2 4)
> 

Bob Jarvis answered:
 Try this:

>t := #(1 2 3 4) asOrderedCollection.
>z := t removeAll: t copy.

>t will now be an empty OrderedCollection.  
>z will be an OrderedCollection containing #(1 2 3 4).

The point is that   removeAll: aCollection is implemented
in this way:

removeAll: aCollection 
    aCollection do: [:each | self remove: each].

In Alexandres example, aCollection is identical to the
method receiver  (self == aCollection holds).
Consequently, the  remove:  modifies the receiver of
the do:  and this violates the general rule, that one must
not modify a collection while its elements are enumerated.

Bobs solution ensures that the receiver of the message  do: 
in  removeAll:  remains unmodified.

It is interesting to try Alexandres example with otherSmalltalks.

VisualWorks Non-Commercial Vs. 3.0 from  May 8, 1998
runs into the debugger !  I do not have a newer version of
VisualWorks.

Dolphin Version 2.1 PatchLevel 2 (Summer '99 Edition)
returns the empty collection which is what you expected.


At home I cannot try IBM Smalltalk; I will do that on Monday,
when I have to recommence work after the end of my holidays.


All three Smalltalks implement  removeAll:  in class 
Collection and all three systems implement it in the same way:

removeAll: aCollection 
 "Remove each element of aCollection from the receiver. 
  If successful for each, answer aCollection.
  Otherwise create an error notification.
  ArrayedCollections cannot respond to this message."

 aCollection do: [:each | self remove: each].
 ^ aCollection

The implementations of  do: differ:

VisualWorks uses:

do: aBlock 
   firstIndex to: lastIndex do:
    [:index | aBlock value: (self basicAt: index)]

firstIndex and lastIndex are evaluated only once and this 
causes an error when  lastIndex is decremented during enumeration.

Dolphin and Squeak use a better do:  :

do: aBlock 
 "Override the superclass for performance reasons."
 | index |
   index _ firstIndex.
   [index <= lastIndex]
     whileTrue: 
      [aBlock value: (array at: index).
       index _ index + 1]

repeated evaluation of  lastIndex  avoids the worst.

So why does Dolphin return the empty collection and our Squeak
does not?

Both systems process the array of values from left to right.
When Squeak removes a value, it shifts all following values
one position to the left. That is why the element that follows
a removed item is placed on an already processed position.

Dolphin also shifts values, but removal of the value at
postion firstIndex is handled differently:
 
removeIndex: removedIndex
  removedIndex = firstIndex
       ifTrue:
         [array at: firstIndex put: nil.
          firstIndex := firstIndex + 1.
          ^self].  "this makes the difference! "

  array 
   replaceFrom: removedIndex 
   to: lastIndex - 1 
   with: array 
   startingAt: removedIndex+1.
  array at: lastIndex put: nil.
  lastIndex _ lastIndex - 1.


The OrderedCollection of Squeak offers a
slightly differend method for object removal::

  | t  |

t := #(1 2 3 4) asOrderedCollection.
 t removeAllSuchThat: [:x |  t includes: x].
t

Here  t  is the empty collection!

A look at the definition of  removeAllSuchThat  shows that 
this method does not move elements that are still not processed.
That is brilliant ! 
Imitation of  removeAllSuchThat: gives this improved
version of  removeAll:  in  OrderedCollection:

removeAll: aCollection 
 " Remove each element of aCollection from the receiver
   Answer aCollection. "

 | n |
 n _ firstIndex.
 firstIndex to: lastIndex do: [:index |
     (aCollection includes: (array at: index)) ifFalse: [
   array at: n put: (array at: index).
   n _ n + 1]].
 n to: lastIndex do: [:index | array at: index put: nil].
 lastIndex _ n - 1


But we could also do this:

removeAll: aCollection 
 " Remove each element of aCollection from the receiver. 
   Answer aCollection "

    ^aCollection == self
      ifTrue:
        [firstIndex to: lastIndex do: [:n | array at: n put: nil].
         firstIndex := 1. lastIndex := firstIndex - 1.
         aCollection
        ]
     ifFalse:
        [super removeAll: aCollection].

 
This is even better: When  the receiver is identical to
aCollection, the receiver is immediateely emptied.
Inclusion tests are not necessary.


Finally, we could better the situation by shifting the
processed part of the values array and not the part 
that is still not processed:

OrderedCollection>>
removeIndex: removedIndex
 array 
   replaceFrom: firstIndex + 1
   to: removedIndex
   with: array 
   startingAt: firstIndex.
 array at: firstIndex put: nil.
 firstIndex _ firstIndex + 1.

With four possible improvements at hand I tend to say
that Alexandre found a bug. (When I began to write this
message I intended to write something like: "Not a bug, but
you know that you must not modify a collection while its
elements are enumerated." Now I think that this is not
a qualified statement.)

Greetings
Boris



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20020815/27732339/attachment.htm


More information about the Squeak-dev mailing list