The standard does *not* support - a removeAll: a - [was: Re: [BUG]

Richard A. O'Keefe squeak-dev at lists.squeakfoundation.org
Mon Sep 2 08:06:51 UTC 2002


Jesse Welton wrote:
	> Richard, would you say that the standard
	> requires #do: to work properly when the receiver is modified during
	> the iteration, as well?
	
The text for #do: is 5.7.1.13:
    For each element of the receiver, operation [the block] is evaluated
    [invoked] with the element as the parameter.  Unless specifically
    refined, the elements are not traversed in a particular order.
    Each element is visited exactly once.  Conformant protocols may
    refine this message to specify a particular ordering.

If you change what the elements _are_, then this definition runs into
trouble.  "each element" at what time?

Now it's clearly _possible_ to make a 'snapshot' of the collection state
and iterate over that.  Indeed, that's precisely what Prolog does and what
relational databases do.  Prolog even does it _without_ making a copy.
(Aren't virtual copies wonderful?)

However, in this case the iteration process (whether using a copy or
otherwise) and the changes _have_ to be interleaved, and it is possible
for the block argument to probe intermediate states of the receiver.
Those intermediate states are NOT spelled out.  So code cannot depend
on what they are.  (In the case of #removeAll: and #addAll:, whatever
intermediate states there might be are _not_ accessible.  Even if there
is an error, the state at the time of error is completely undefined.)

I would say that the most natural way to read this is
 - you MAY modify the receiver in any way you like as long as you don't
   change what the elements are.  For example, if Smalltalk arrays had
   lower bounds, shifting the lower bound should be OK.
 - if you change what the elements are, you are asking for trouble
   because the specification becomes unclear.
 - implementors: you don't have to support changes, but you should look
   for cheap ways of making your code robust just in case there are some.
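
For user code, one way to stay inside the first reading is to collect
the intended changes during the iteration and apply them afterwards,
so the elements never change while #do: is running.  A minimal sketch
(the variable names are only illustrative):

```smalltalk
| a toAdd |
a := OrderedCollection withAll: #(1 2 3).
toAdd := OrderedCollection new.
"Iterate over a, but record the additions elsewhere."
a do: [:e | toAdd add: e * 10].
"Apply the recorded changes only after the iteration has finished."
a addAll: toAdd.
a
```

Here the block never touches the receiver, so the question of "each
element at what time?" does not come up.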

Had I been one of the authors, I would have insisted on putting in
an explicit warning that the effect is undefined.

Of course, the doubt that arises when *user-written* code *has* to be
interleaved with an *explicit* iteration does not arise when
implementor-written code (such as #removeAll: or #addAll:) is invoked
with no explicit iteration involved and no interleaving required.

Stephan Rudlof <sr at evolgo.de> wrote:
	What should be the semantics then?  E.g.:
	
	| a num |
	num := 1.
	a := OrderedCollection with: num.
	a do: [:e | a size < 100 ifTrue: [a add: (num := num + 1)]].
	a
	
This is (1) a user-written change (2) interleaved with (3) an explicit
iteration.  I regard the ANSI specification as sufficiently ambiguous
in this case to admit both a 'copy semantics' implementation and an
'undefined effect' implementation.  So it's not defined.
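
Code that wants 'copy semantics' regardless of the implementation can
ask for it explicitly by iterating over a copy.  This is only a sketch
of that workaround applied to the example above, not something the
standard prescribes:

```smalltalk
| a num |
num := 1.
a := OrderedCollection with: num.
"Iterate over a snapshot; additions go to the original collection."
a copy do: [:e | a size < 100 ifTrue: [a add: (num := num + 1)]].
a
```

The copy holds one element when the iteration starts, so the block is
invoked once and a ends up holding 1 and 2.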

To be honest, once upon a time I would have _expected_ it to invoke the
block once, because I _expected_ OrderedCollection>>do: to have been

    firstIndex to: lastIndex do: [:index | aBlock value: (array at: index)]

and I would have expected _that_ to make a copy of the value of lastIndex.
But #do: isn't like that, and #to:do: doesn't work that way anyway.
(This seems to violate the ANSI specification of #to:do:; no _object_ is
being changed but a variable whose value would have been copied had this
been a normal message send.)
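
Code that needs the bound fixed at loop entry, however a particular
compiler treats #to:do:, can copy the bound into a temporary that the
loop body never assigns.  A minimal sketch with illustrative variable
names:

```smalltalk
| last limit count |
last := 3.
count := 0.
"Capture the bound once, before the loop starts."
limit := last.
1 to: limit do: [:i |
    count := count + 1.
    "Changing last does not affect limit."
    last < 10 ifTrue: [last := last + 1]].
count
```

Since limit is never assigned inside the loop, the body runs three
times whether or not the compiler re-reads the variable on each pass.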



