[FIX] SortedCollectionFix-sr

Stephan Rudlof sr at evolgo.de
Thu Sep 26 17:24:44 UTC 2002


Richard,

Richard A. O'Keefe wrote:
> 	- >>addLast: should be forbidden as >>addFirst is already;
> 
> Why should addLast: or addFirst: be given a complete veto?
> 
> Why not
>     addFirst: anItem
>         self isEmpty ifTrue: [^super addFirst: anItem].
> 	(sortBlock
> 	    ifNil: [anItem <= self first]
> 	    ifNotNil: [sortBlock value: anItem value: self first]
> 	) ifFalse: [
> 	    self error: 'new item would be out of order'
> 	].
> 	^super addFirst: anItem.
> 
>     addLast: anItem
> 	self isEmpty ifTrue: [^super addLast: anItem].
> 	(sortBlock
> 	    ifNil: [self last <= anItem]
> 	    ifNotNil: [sortBlock value: self last value: anItem]
> 	) ifFalse: [
> 	    self error: 'new item would be out of order'
> 	].
> 	^super addLast: anItem.
> 
> This way we DON'T have to invent a new method to use when someone
> wants to build up a SortedCollection one element at a time and knows
> they can do it in the correct order, and we still don't allow these
> methods to spoil the order of the collection.
> 
> 
> 

The short answer
----------------
The current situation for SortedCollections
- is inconsistent: >>addFirst: is forbidden, >>addLast: not;
- semantically incorrect: using >>addLast: easily leads to a not sorted
SortedCollection.

I've wanted to fix this.


The longer answer
-----------------

Prenote: your suggestion is *semantically* correct. But there are
psychological/practical reasons against introducing it.

Why does a programmer use #addLast:? He wants to build a Collection in an
explicit order.
Why does a programmer use a SortedCollection? He wants to have a Collection,
which automatically ensures an order of added elements.

Using >>addLast: for a SortedCollection means to try to give an explicit
order  to something, which has an automatically 'built in' one. This mostly
is a programmer error. And it violates encapsulation: I have to know, which
built in order the SortedCollection has, to use >>addLast: correctly along
your suggestion. By using >>add: instead, there would be no problem.

So there remains the performance thing: using SortedCollection>>add:
multiple times means, that each object has to be sorted in immediately. It
is probably more performant to have a private add method, which just adds an
element, while leaving the SortedCollection in an inconsistent state, and
>>reSort it afterwards, when all elements have been added. That has been my
idea of introducing the 'bypass' method for clients which really know about
the internals.

BTW: In the ANSI standard (revision 1.9) *protocol* <SortedCollection>
doesn't conform to protocol <OrderedCollection>, which is the only one
defining >>addFirst: and  >>addLast:. This makes sense to me.



Implementation notes
--------------------

To the private SortedCollection>>addLastWithoutSorting: there could also be
added
- >>addWithoutSorting: and
- >>addFirstWithoutSorting:;
for - very few! - clients using the *internals*. After using one of these
methods >>reSort should be called to ensure consistency.
Note: I haven't proved the implicit assumption here, that >>reSort'ing an
already sorted SortedCollection is faster than adding elements in arbitrary
order.

Another - cleaner - variant avoiding these internal methods: collecting the
to be added elements in another - not sorted! - Collection and calling
SortedCollection>>addAll: for them.



Greetings,

Stephan
-- 
Stephan Rudlof (sr at evolgo.de)
   "Genius doesn't work on an assembly line basis.
    You can't simply say, 'Today I will be brilliant.'"
    -- Kirk, "The Ultimate Computer", stardate 4731.3




More information about the Squeak-dev mailing list