I wrote that : oldElements copy do: [:each | self remove: each] : implemenents precisely the semantics required by the ANSI standard.
"Andrew C. Greenberg" werdna@mucow.com wrote: Except on collections where it doesn't: Richard's reasonable code presumes that the collection answers to copy in the manner expected. This code fails, for example, on a LinkedList. When we are talking about the ANSI standard, it pays to remember that there is no such thing as a LinkedList class in ANSI. Now #copy _is_ an ANSI method, and 5.3.1.6 says Return a new object that must be as similar as possible to the receiver in its initial state and behavior. [Except that] Any operation that changes the state of the new object should not as a side effect change the state or behavior of the receiver. Similarly, any change to the receiver should not as a side-effect change the new object. If the receiver is an identity object, return the receiver. and the Rationale warns explicitly that there _isn't_ any promise that x copy = x.
Now let's look at LinkedList. In Squeak 3.2 there is no #copy method defined in the LinkedList class. (By the way, in the Squeak 3.2 scroll-bars, the "menu" box is in an EXTREMELY inconvenient place. Whenever I try to scroll up I find myself with a menu instead and have to spend more time than I care to twitching the mouse around trying to get the "scroll up" box. The "menu" button should be _above_ the "scroll up" box.) The #copy method is inherited from Object. LinkedList(Object)>>copy ^self shallowCopy LinkedList(SequenceableCollection>>shallowCopy ^self copyFrom: 1 to: self size LinkedList(SequenceableCOllection)>>copyFrom: start to: stop |newSize| newSize := stop - start + 1. ^(self species new: newSize) replaceFrom: 1 to: newSize with: self startingAt: start (a LinkedList) species is LinkedList. LinkedList class(Behavior)>>new: bombs out, "LinkedList cannot have variable sized instances".
Is this the fault of my suggested implementation of #removeAll:? No, as I wrote, it fully conforms to the ANSI specification. The implicit assumption is "given any collections which otherwise conform to the ANSI specification." And LinkedList is so far from doing that, in almost _any_ operation, that it cannot be said to conform to the ANSI <collection> protocol at all.
However, one _could_ add a new kind of Link. Here's one that I used for testing:
Link subclass: SimpleLink instanceVariables: 'value' ...
"method category 'accessing'" value ^value value: anObject value := anObject printOn: aStream aStream nextPutAll: 'a SimpleLink('; print: value; nextPut: #)
The problem is that LinkedList does not have a #copy that conforms to the ANSI specification. So let's give it one.
LinkedList "category 'copying'">> copy |result| result := self species new. self do: [:each | result addLast: each copy]. ^result
I was going to present an example using #collect:, but (a) that blows up for the same reason that #copy used to, (b) it's not even clear that it makes sense. The block would have to accept a Link as argument and return a different Link as result. Oh well, a collection is allowed to have type restrictions, so here's #collect: after all.
LinkedList "category 'enumerating'>> collect: aBlock |result| result := self species new. self do: [:each | result addLast: (aBlock value: each)]. ^result
Now we can do a := LinkedList new. a add: (SimpleLink new value: #x). a add: (SimpleLink new value: #y). a add: (SimpleLink new value: #z). b := a copy. b addFirst: (SimpleLink new value: #b). a removeFirst. {a. b}
The output is #(a LinkedList(a SimpleLink(#y) a SimpleLink(#z)) a LinkedList(a SimpleLink(#b) a SimpleLink(#x) a SimpleLink(#y) a SimpleLink(#z)))
which shows that we now have a LinkedList>>copy that conforms to the ANSI specification [almost, see last paragraph].
Will this make
oldElements copy do: [:each | self remove: each]
work? No, because the "elements" of a copy of a LinkedList are, must be, different objects from the "elements" of the original. (Yet another way in which LinkedLists fail to act like collections.) Of course, the ANSI specification of #copy doesn't say that have to be, just that the result is "as similar as possible", and since a Link _cannot_ belong to two lists at the same time, this _is_ as close as possible.
The basic problem here is that a key aspect of the design of LinkedList appears to have been copied from Simula 67. In the absence of generic (Ada/Eiffel term; for C++ read 'template') classes, and in the presence of static type checking, the only way you could have a general purpose list class in Simula 67 was by having a Link class which you had to make your own subclasses of. There was no need for that in Smalltalk, and the result is that LinkedList is utterly unlike every other Collection. Squeak has three kinds of collections: - general-purpose collections that can hold any kind of element at all (execpt, in some cases, nil) A non-nil object may belong to any number of general-purpose collections at the same time. - compact collections that can hold only values that are themselves immutable such as numbers and characters, restricted to a particular type. A number or character may belong to any number of compact collections (provided they accept that kind of number or character) at the same time. - LinkedList, which can hold only Links. A Link may only belong to a single LinkedList. It turns out that some "core" collection methods CANNOT be implemented for LinkedList.
Let's briefly consider all the ANSI <collection> messages:
allSatisfy: aBlock works anySatisfy: aBlock works asArray Ka-BOOM! [1] asBag works asByteArray does not and must not work [2] asOrderedCollection works asSet works asSortedCollection does not and must not work [3] asSortedCollection: sortBlock works collect: aBlock Ka-BOOM! [4] detect: aBlock works detect: aBlock ifNone: exceptionBlock works do: aBlock works [5] do: aBlock separatedBy: sepBlock Ka-BOOM! [6] includes: anElement Ka-BOOM! [7] inject: value into: aBlock works isEmpty works notEmpty works occurrencesOf: anItem works rehash works [8] reject: aBlock Ka-BOOM! [9] select: aBlock Ka-BOOM! [9] size works
[1] This needs the LinkedList>>at: method I posted a while back. Note that ANSI requires asArray to *work* and it easily could. Another implementation: LinkedList "category 'converting'">> asArray |result index| result := Array new: self size. index := 0. self do: [:each | result at: (index := index + 1) put: each]. ^result
[2] Because a Link isn't an integer between: 0 and: 255.
[3] Because #<= is not defined on Link. If we add SimpleLink "category 'comparing'">> <= anotherValuable ^value <= anotherValuable value then it can work the way you would expect. [4] As noted earlier. A working implemention can be found above.
[5] With a whopping big caveat. The only use of LinkedList I can find in Squeak 3.2 is process queues. Process queues may change as a results of events, so #do: is totally vulnerable to a LinkedList changing while the iteration is running. This means that anyCollection removeAll: someProcessQueue never _could_ be relied on to work in Squeak.
[6] You _could_ define #do:separatedBy: once and for all in terms of #do:. Collection>>#do:separatedBy: does exactly what you would expect and _would_ work perfectly for LinkedList. Except that SequenceableCollection>>do:separatedBy: (last changed at the end of 2000) assumes not merely that SequenceableCollections can be indexed, but that this is the cheapest way to traverse them. If you adopt my LinkedList>>#at: then the #do:separatedBy: in SequenceableCollection will work, but it will take quadratic time when linear time is available. Simplest fix: copy the definition from Collection down into LinkedList.
[7] #at: again. SequenceableCollections are collections that have a natural order; too many methods in SequenceableCollection make the invalid assumption that "has an order" == "is cheap to index". This leads to weirdness like Heap having _two_ orders: the order used by #at: and #do:, and the different order used by #removeFirst. Proof by demonstration: h := Heap new. h addAll: #(3 1 4 1 5 9 2 6 5). h do: [:each | Transcript space; print: each]. Transcript endEntry ==> 1 1 2 3 5 9 4 6 5 but r := OrderedCollection new. [h isEmpty] whileFalse: [r addLast: h removeFirst]. r ==> an OrderedCollection(1 1 2 3 4 5 5 6 9) Trying to grapple with a collection with two incompatible orders makes my head hurt.
[8] Because it's inherited from ProtoObject, where it's defined to do nothing. Note, however, that this is supposed to be available for general use in ANSI; Squeak collections put it in the 'private' category. In ProtoObject it's in 'objects from disk'.
[9] SequenceableCollection>>select: assumes that if a collection has a natural order, then you can create a nil-initialised object of that type of a given size. Get past that and you run headlong into #at:. I was about to show you how to implement these so that they do work, when I realised that #select: and #reject: *CANNOT* be implemented in accord with the ANSI specification for LinkedList.
Why? Because the result is supposed to contain a selection of the elements of the receiver. If the result were ever non-empty, there would have to be some element *identically* present in both the receiver and the result. But that is impossible for a Link.
So here is the conclusion:
(A) #select: and #reject: not only don't work for LinkedList in Squeak 3.2, they _cannot_ be made to work in any way that would conform to the ANSI specification of #select: and #reject: for collections nor to one's common-sense expectations.
(B) Because LinkedList does not implement the <collection> protocol, it is not a collection class for the purposes of deciding which other ANSI methods should accept it as an argument.
(C) #removeAll: and #addAll:, amongst others, are only specified for arguments that support the full <collection> protocol. Since LinkedList does not and cannot do that, #removeAll: and #addAll: are NOT DEFINED for a LinkedList argument.
(D) Therefore, if "oldElements copy do: [:each | self remove: each]" does not work when oldElements is a LinkedList, that does NOT constitute a failure to conform precisely to the ANSI semantics of #removeAll:.
(E) This is not an argument that #removeAll: _shouldn't_ work for a LinkedList argument, only that the ANSI specification doesn't require it to. However, oneLinkedList removeAll: anotherLinkedList could change things only when the two lists are the same object, which takes us right back to an argument I don't want to get back into.
For the same reason, I am NOT proposing the additions to LinkedList shown above as a [FIX]. (Because they take existing methods of core collection classes that currently give nonsensical results or none, and make them give results I regard as sensible, but the new methods are not guaranteed to be compatible with other Smalltalks.)
Wow, I am impressed Richard - this is really interesting reading. May I come with a suggestion to all who are following this discussion?
Why don't we start a "rework the whole darn Collection mess" project and let Richard take a first hit at it. You (Richard) already have new implementations for Set etc, various improvements here and there, new LinkedList implementation in the works and so on. And we have the source from StrongTalk which might contain good stuff to borrow etc. This would be a new Squeak centric project that wouldn't try to be ANSI compliant (unless it isn't a problem) nor portable.
If you started to "maintain" such a ChangeSet (perhaps by simply copying the Collection classes as a start, renaming them to NCollection, NSet and so on (so that this works in versions less than 3.3 with Modules) and then chopping away at those) we could load that in parallell in an image, use them if we like to and compare them with the old ones. We could also run performance tests and unit tests on both sets of classes to make sure they both "comply" to our tests.
What do you all say?
It would make a perfect testbed also for improvements etc while still keeping "the good ol' stuff" so that noone can get worried about ANSI, portability etc.
And in 3.3a we can publish this a dropin Module instead of the old Collections for anyone to use who want's to.
regards, Göran
squeak-dev@lists.squeakfoundation.org