[FIX] SortedCollectionFix-sr

Andreas Raab Andreas.Raab at gmx.de
Thu Sep 26 18:17:11 UTC 2002


Hi Stephan,

While I agree mostly with your reasoning (not entirely though because I
think Richard has a number of equally good arguments wrt. to consistency
of collection operations) there's one issue I strongly disagree with:

> So there remains the performance thing: using SortedCollection>>add:
> multiple times means, that each object has to be sorted in 
> immediately. It is probably more performant to have a private
> add method, which just adds an element, while leaving the
> SortedCollection in an inconsistent state, and reSort it afterwards,
> when all elements have been added. That has been my
> idea of introducing the 'bypass' method for clients which 
> really know about the internals.

You don't have to use #add; you can use #addAll: which will choose the
right way to add the elements dynamically. Your proposal is actually
_really_ bad for optimization since it assumes "outside knowledge" about
the way SortedCollection is implemented and renders any attempt to
really optimize _it_ (instead of lots of clients guessing what a
reasonably good optimization would be) pretty much useless. Spreading
out "meager client side optimizations" while breaking the polymorphic
protocol is never a good idea - if you want to optimize then choose a
central, high-level, intent revealing place (such as, in fact,
SortedCollection>>addAll:) and don't ever have the client
second-guessing the structure and behavior of the object in question.

Cheers,
  - Andreas

> -----Original Message-----
> From: squeak-dev-admin at lists.squeakfoundation.org 
> [mailto:squeak-dev-admin at lists.squeakfoundation.org] On 
> Behalf Of Stephan Rudlof
> Sent: Thursday, September 26, 2002 7:25 PM
> To: squeak-dev at lists.squeakfoundation.org
> Subject: Re: [FIX] SortedCollectionFix-sr
> 
> 
> Richard,
> 
> Richard A. O'Keefe wrote:
> > 	- >>addLast: should be forbidden as >>addFirst is already;
> > 
> > Why should addLast: or addFirst: be given a complete veto?
> > 
> > Why not
> >     addFirst: anItem
> >         self isEmpty ifTrue: [^super addFirst: anItem].
> > 	(sortBlock
> > 	    ifNil: [anItem <= self first]
> > 	    ifNotNil: [sortBlock value: anItem value: self first]
> > 	) ifFalse: [
> > 	    self error: 'new item would be out of order'
> > 	].
> > 	^super addFirst: anItem.
> > 
> >     addLast: anItem
> > 	self isEmpty ifTrue: [^super addLast: anItem].
> > 	(sortBlock
> > 	    ifNil: [self last <= anItem]
> > 	    ifNotNil: [sortBlock value: self last value: anItem]
> > 	) ifFalse: [
> > 	    self error: 'new item would be out of order'
> > 	].
> > 	^super addLast: anItem.
> > 
> > This way we DON'T have to invent a new method to use when someone
> > wants to build up a SortedCollection one element at a time and knows
> > they can do it in the correct order, and we still don't allow these
> > methods to spoil the order of the collection.
> > 
> > 
> > 
> 
> The short answer
> ----------------
> The current situation for SortedCollections
> - is inconsistent: >>addFirst: is forbidden, >>addLast: not;
> - semantically incorrect: using >>addLast: easily leads to a 
> not sorted
> SortedCollection.
> 
> I've wanted to fix this.
> 
> 
> The longer answer
> -----------------
> 
> Prenote: your suggestion is *semantically* correct. But there are
> psychological/practical reasons against introducing it.
> 
> Why does a programmer use #addLast:? He wants to build a 
> Collection in an
> explicit order.
> Why does a programmer use a SortedCollection? He wants to 
> have a Collection,
> which automatically ensures an order of added elements.
> 
> Using >>addLast: for a SortedCollection means to try to give 
> an explicit
> order  to something, which has an automatically 'built in' 
> one. This mostly
> is a programmer error. And it violates encapsulation: I have 
> to know, which
> built in order the SortedCollection has, to use >>addLast: 
> correctly along
> your suggestion. By using >>add: instead, there would be no problem.
> 
> So there remains the performance thing: using SortedCollection>>add:
> multiple times means, that each object has to be sorted in 
> immediately. It
> is probably more performant to have a private add method, 
> which just adds an
> element, while leaving the SortedCollection in an 
> inconsistent state, and
> >>reSort it afterwards, when all elements have been added. 
> That has been my
> idea of introducing the 'bypass' method for clients which 
> really know about
> the internals.
> 
> BTW: In the ANSI standard (revision 1.9) *protocol* <SortedCollection>
> doesn't conform to protocol <OrderedCollection>, which is the only one
> defining >>addFirst: and  >>addLast:. This makes sense to me.
> 
> 
> 
> Implementation notes
> --------------------
> 
> To the private SortedCollection>>addLastWithoutSorting: there 
> could also be
> added
> - >>addWithoutSorting: and
> - >>addFirstWithoutSorting:;
> for - very few! - clients using the *internals*. After using 
> one of these
> methods >>reSort should be called to ensure consistency.
> Note: I haven't proved the implicit assumption here, that 
> >>reSort'ing an
> already sorted SortedCollection is faster than adding 
> elements in arbitrary
> order.
> 
> Another - cleaner - variant avoiding these internal methods: 
> collecting the
> to be added elements in another - not sorted! - Collection and calling
> SortedCollection>>addAll: for them.
> 
> 
> 
> Greetings,
> 
> Stephan
> -- 
> Stephan Rudlof (sr at evolgo.de)
>    "Genius doesn't work on an assembly line basis.
>     You can't simply say, 'Today I will be brilliant.'"
>     -- Kirk, "The Ultimate Computer", stardate 4731.3
> 
> 




More information about the Squeak-dev mailing list