[squeak-dev] SortedCollections

Levente Uzonyi leves at elte.hu
Sun May 8 18:47:17 UTC 2011


On Sun, 8 May 2011, Chris Muller wrote:

>> The case of #asSortedCollection has a simple solution: don't use it. There's
>> 99.9% chance that you don't need a SortedCollection. If you want to sort a
>> Collection, then use #sort, #sorted, #sort: or #sorted:.
>
> Something in my gut does not like SortedCollections either, and I have
> even gone into some of my domains and replaced SortedCollections with
> OrderedCollections and put in some strategically-placed #sort:'s.

The main problem with SortedCollection is that it's designed to 
automatically resort it's elements when it's modified. There was no direct 
method for sorting other collections in Squeak, so people used 
#asSortedCollection to sort their collections. This has several drawbacks:
1) in most cases you don't need another collection
2) the sorting algorithm of SortedCollection
  a) is not adaptive and doesn't have O(n*log(n)) worst case runtime
  b) is suboptimal (quickly degrades to O(n^2)) if equal elements are in 
the collection.
  c) is not stable
3) adding elements one-by-one to the SortedCollection (which is a very 
common pattern in Squeak code) is really slow (O(n^2))

Why is #sort better and how can we solve remaining problems?
1) the new collection is not avoided by #sort, because it uses merge sort,
which is not in-place, so a transient copy of the collection is created 
either way.
2a) is not fully solved by #sort yet, but replacing the merge sort 
implementation with one that's adaptive it on my list. Since merge sort 
has O(n*log(n)) runtime guarantee, this is only a minor problem.
2b) is solved by #sort, because merge sort doesn't have this problem
2c) same as above
3) the slowdown can be reduced in most cases by making SortedCollection 
lazy

>
> Since doing that, I am now second-guessing myself; I needed another
> placed call to #sort: and the question immediately came to my mind:
> Why shouldn't this encapsulate the sorting behavior automatically
> whenever it is sent an order-dependent message (like enumeration)?

Because that's what SortedCollection is for.


Levente

>
>



More information about the Squeak-dev mailing list