[ENH] Faster Sorted Collections

Daniel V. Oppenheim music at watson.ibm.com
Mon Dec 6 02:22:40 UTC 1999


Thanks Stephen -- this is very useful! I wasn't aware that activating a 
block is that much more expensive than the same code in a method. I'm 
curious -- can block access be optimized? In Visual Works, for example, 
would your equivalent fix produce roughly the same improvement?

Danny Oppenheim


At 02:30 PM 12/3/99 -0800, you wrote:

>Hello all,
>
>SortedCollections assign a default sort block if none is provided. Since
>90% of these blocks are the same, we can save one block activation per
>sort comparison if we code the comparison directly for collections where
>the sort block is nil. The attached change set implements this; the
>speed-up is quite significant for large collections (>100% for 50000
>items in random order).
>
>--
>
>stp
>   Stephen Travis Pope
>   stp at create.ucsb.edu -- http://www.create.ucsb.edu/~stp'From Squeak 2.4a 
> of April 21, 1999 on 23 April 1999 at 5:33:11 am'!
>
>!SortedCollection methodsFor: 'private' stamp: 'stp 04/23/1999 05:32'!
>defaultSort: i to: j
>         "Sort elements i through j of self to be nondescending according to
>         sortBlock."     "Assume the default sort block ([:x :y | x <= y])."
>
>         | di dij dj tt ij k l n |
>         "The prefix d means the data at that index."
>         (n _ j + 1  - i) <= 1 ifTrue: [^self].  "Nothing to sort."
>         "Sort di,dj."
>         di _ array at: i.
>         dj _ array at: j.
>         (di <= dj) "i.e., should di precede dj?"
>                 ifFalse:
>                         [array swap: i with: j.
>                         tt _ di.
>                         di _ dj.
>                         dj _ tt].
>         n > 2
>                 ifTrue:  "More than two elements."
>                         [ij _ (i + j) // 2.  "ij is the midpoint of i and j."
>                         dij _ array at: ij.  "Sort di,dij,dj.  Make dij 
> be their median."
>                         (di <= dij) "i.e. should di precede dij?"
>                            ifTrue:
>                                 [(dij <= dj) "i.e., should dij precede dj?"
>                                   ifFalse:
>                                         [array swap: j with: ij.
>                                         dij _ dj]]
>                            ifFalse:  "i.e. di should come after dij"
>                                 [array swap: i with: ij.
>                                 dij _ di].
>                         n > 3
>                           ifTrue:  "More than three elements."
>                                 ["Find k>i and l<j such that dk,dij,dl 
> are in reverse order.
>                                 Swap k and l.  Repeat this procedure 
> until k and l pass each other."
>                                 k _ i.
>                                 l _ j.
>                                 [[l _ l - 1.  k <= l and: [dij <= (array 
> at: l)]]
>                                    whileTrue.  "i.e. while dl succeeds dij"
>                                   [k _ k + 1.  k <= l and: [(array at: k) 
> <= dij]]
>                                    whileTrue.  "i.e. while dij succeeds dk"
>                                   k <= l]
>                                    whileTrue:
>                                         [array swap: k with: l].
>         "Now l<k (either 1 or 2 less), and di through dl are all less 
> than or equal to dk
>         through dj.  Sort those two segments."
>                                 self defaultSort: i to: l.
>                                 self defaultSort: k to: j]]! !
>
>!SortedCollection methodsFor: 'private' stamp: 'stp 04/23/1999 05:33'!
>sort: i to: j
>         "Sort elements i through j of self to be nondescending according to
>         sortBlock."
>
>         | di dij dj tt ij k l n |
>         sortBlock ifNil: [^self defaultSort: i to: j].
>         "The prefix d means the data at that index."
>         (n _ j + 1  - i) <= 1 ifTrue: [^self].  "Nothing to sort."
>         "Sort di,dj."
>         di _ array at: i.
>         dj _ array at: j.
>         (sortBlock value: di value: dj) "i.e., should di precede dj?"
>                 ifFalse:
>                         [array swap: i with: j.
>                         tt _ di.
>                         di _ dj.
>                         dj _ tt].
>         n > 2
>                 ifTrue:  "More than two elements."
>                         [ij _ (i + j) // 2.  "ij is the midpoint of i and j."
>                         dij _ array at: ij.  "Sort di,dij,dj.  Make dij 
> be their median."
>                         (sortBlock value: di value: dij) "i.e. should di 
> precede dij?"
>                            ifTrue:
>                                 [(sortBlock value: dij value: dj) "i.e., 
> should dij precede dj?"
>                                   ifFalse:
>                                         [array swap: j with: ij.
>                                         dij _ dj]]
>                            ifFalse:  "i.e. di should come after dij"
>                                 [array swap: i with: ij.
>                                 dij _ di].
>                         n > 3
>                           ifTrue:  "More than three elements."
>                                 ["Find k>i and l<j such that dk,dij,dl 
> are in reverse order.
>                                 Swap k and l.  Repeat this procedure 
> until k and l pass each other."
>                                 k _ i.
>                                 l _ j.
>                                 [[l _ l - 1.  k <= l and: [sortBlock 
> value: dij value: (array at: l)]]
>                                    whileTrue.  "i.e. while dl succeeds dij"
>                                   [k _ k + 1.  k <= l and: [sortBlock 
> value: (array at: k) value: dij]]
>                                    whileTrue.  "i.e. while dij succeeds dk"
>                                   k <= l]
>                                    whileTrue:
>                                         [array swap: k with: l].
>         "Now l<k (either 1 or 2 less), and di through dl are all less 
> than or equal to dk
>         through dj.  Sort those two segments."
>                                 self sort: i to: l.
>                                 self sort: k to: j]]! !
>
>!SortedCollection methodsFor: 'private' stamp: 'stp 04/23/1999 05:36'!
>indexForInserting: newObject
>
>         | index low high |
>         low _ firstIndex.
>         high _ lastIndex.
>         sortBlock isNil
>                 ifTrue: [[index _ high + low // 2.  low > high]
>                         whileFalse:
>                                 [((array at: index) <= newObject)
>                                         ifTrue: [low _ index + 1]
>                                         ifFalse: [high _ index - 1]]]
>                 ifFalse: [[index _ high + low // 2.  low > high]
>                         whileFalse:
>                                 [(sortBlock value: (array at: index) 
> value: newObject)
>                                         ifTrue: [low _ index + 1]
>                                         ifFalse: [high _ index - 1]]].
>         ^low! !
>
>!SortedCollection class methodsFor: 'instance creation' stamp: 'stp 
>04/23/1999 05:34'!
>new: anInteger
>         "The default sorting function is a <= comparison on elements."
>
>         ^(super new: anInteger) "sortBlock: [:x :y | x <= 
> y]"           "nil sortBlock OK"! !
>
>
>(StringHolder new textContents:
>         ('
>         For a demonstration of the faster sorted collections, evaluate 
> the following expression
>
>"       | co ra sc |
>         co := OrderedCollection new.
>         ra := Random new.
>         50000 timesRepeat: [ co add: ra next].
>         sc := SortedCollection new.
>         Transcript cr; show: ''Fast: '',  (Time millisecondsToRun: [sc 
> addAll: co]) printString, '' msec''; cr.
>         sc := SortedCollection sortBlock: [:x :y | x <= y].
>         Transcript show: ''Slow: '', (Time millisecondsToRun: [sc addAll: 
> co]) printString, '' msec''; cr.
>"
>
>'))     openLabel: 'Faster SortedCollections test'!

---

Dr. Daniel V. 
Oppenheim 

Computer Music Center
	IBM T.J. Watson Research Center	     phone: (914) 945-1989
	P. O. Box 218 (or Route 134)	     fax:   (914) 945-3434
	Yorktown Heights, NY 10598	     www.research.ibm.com/music





More information about the Squeak-dev mailing list