[FIX] SortedCollectionFix-sr
Stephan Rudlof
squeak-dev at lists.squeakfoundation.org
Fri Sep 27 13:51:41 UTC 2002
Richard,
your proposal implementing SortedCollection with 'lazy data structures' is
very interesting. It removes the burden of all thinking about adding to a
SortedCollection performantly ( ;-) ).
Any volunteer out there to give it a try?
One question is:
How is the relation between changes of the SortedCollection and looking
>>at: it, since some time is spend for the checks there. If accessing it
outweights the changes by magnitudes there could be a *small* performance
penalty.
But I guess 'normally' the proposal gives better performance and it is
always possible to convert it into a plain old OrderedCollection for
accessing it very often, if someone worries about this.
Greetings,
Stephan
Richard A. O'Keefe wrote:
<...>
> I think it's time to talk about lazy data structures.
>
> We could add elements fast to SortedCollections like this.
>
> 1. The internal representation is an array partitioned thus:
> +----+------------+--------+----+
> |1 |a z| x| n|
> +----+------------+--------+----+
>
> elements 1..a-1 : all nil
> elements a..z : sorted according to sortBlock
> elements z+1..x : added elements not sorted yet
> elements x+1..n : all nil
>
> Where I have written "a" and "x" you should understand the existing
> instance variables of OrderedCollection (inherited by SortedCollection)
> being used instead; the "z" instance variable is new.
>
> 2. add: anItem
> ^super addLast: anItem
> addFirst: anItem
> self shouldNotImplement
> addLast: anItem
> self shouldNotImplement
> addAll: aCollection
> ^super addAllLast: aCollection
> addAllFirst: aCollection
> self shouldNotImplement
> addAllLast: aCollection
> self shouldNotImplement
>
> No, I'm not being inconsistent here. I'm saying "let someone build
> a sorted collection in linear time if they can provide the elements
> in sorted order." With this kind of implementation, #add: _is_
> adequate for the job.
>
> 3. A strictly private method would exist:
>
> pvtEnsureAllSorted "rather like current addAll:"
> x = z ifTrue: [^array].
> self pvtSortAddedElements. "sorts z+1..x"
> self pvtMerge. "merges a..z with z+1..x"
> z := x.
> ^array
>
> Ideally pvtSortAddedElements would use a method which is linear when
> the elements are sorted already; Dijkstra's smooth-sort is
> just one of many algorithms which are good at that, bottom-up
> natural merge is easier to understand and works very well in practice.
>
> Using bottom-up natural merge, if you add M elements to a Sorted
> Collection having N elements and then look at the result,
> the adds would cost O(M) time,
> pvtEnsureAllSorted would cost O(M) for pvtSortAddedElements
> and O(M+N) for pvtMerge, for a total O(N+M).
> Building an entire collection from N=0 would thus be O(M).
>
> 4. Other methods would call pvtEnsureAllSorted, e.g.
>
> at: index
> self assert: [index between: 1 and: x-a+1].
> ^self pvtEnsureAllSorted at: index-a+1
>
> do: aBlock
> self pvtEnsureAllSorted.
> ^super do: aBlock
>
> 5. Setting the sortblock would simply enlarge the unsorted area:
>
> sortBlock: aBlock
> sortBlock := aBlock ifNil: [nil] ifNotNil: [aBlock fixTemps].
> z := a-1.
>
>
> This is the clean way to do it. The class invariant is weakened from
> "the elements are always sorted" to "the elements are always sorted
> whenever anyone else looks; when noone else is looking they are in a
> convenient configuration for sorting." From the outside, it looks as
> though the elements are always sorted.
<...>
--
Stephan Rudlof (sr at evolgo.de)
"Genius doesn't work on an assembly line basis.
You can't simply say, 'Today I will be brilliant.'"
-- Kirk, "The Ultimate Computer", stardate 4731.3
More information about the Squeak-dev
mailing list
|