[BUG]Collection>>removeAll:
Richard A. O'Keefe
ok at cs.otago.ac.nz
Sun Aug 25 23:29:28 UTC 2002
Stephan Rudlof <sr at evolgo.de> wrote
What about:
Collection>>removeAll: aCollection
self isEmpty
ifFalse: [self assert: [self ~~ aCollection].
aCollection do: [:each | self remove: each]].
^ aCollection
Would also be a shortcut for removing some - potentially huge - collection
from an already emptied one...
On Saturday I tried a few benchmarks.
Remember how sokme people were alleging that adding the test
aCollection == self ifTrue: [" handle it somehow "].
"more or less previous code"
would be too slow?
Thanks to the O(self size * aCollection size) behaviour of #removeAll:,
it really does turn out that even for relatively small collections, it
is practically impossible to measure any slowdown.
Not surprising really, because it turns out that for
removeAllSafely: aCollection
aCollection == self ifTrue: [^self removeAllSafely: aCollection copy].
^aCollection do: [:each | self remove: each]
the total overhead for the normal is only two bytecodes, because changing
from
aCollection do: [...].
^aCollection
to ^aCollection do: [...]
turns out to save two bytecodes.
The real surprise came when I discovered that I could speed #removeAll:
up by 30% with very little effort. (The technique is really really obvious
and the Self compiler would do it automatically. I leave it as an exercise
for the reader.)
So:
(1) #removeAll: is quadratic, although it could be linear time
for all the classes where it currently makes sense.
(Yes, I am aware that it uses #= for deciding what to remove,
not #==. I am also aware of what we could easily do about that.)
(2) Even given its structure, there is a trivial change that would deliver
the same results in all cases that could speed it up by 30% across the
board.
I do NOT think that it is appropriate for people to worry too much about
hacks to #removeAll: that do not address any of its real problems, nor that
it is reasonable to complain overmuch about an added overhead of 2 bytecodes.
The end result of my benchmarking was a version of #removeAll: that could
handle the (x removeAll: x) case while running 30% faster for all other
cases. So much for overhead!
More information about the Squeak-dev
mailing list
|