MarkingKey

Richard A. O'Keefe ok at cs.otago.ac.nz
Wed Jun 12 02:02:10 UTC 2002


Niko Schwarz <niko.schwarz at gmx.net> wrote:
	Hmm, Insertion Sort would be a good example, as Insertion Sort is a part of 
	many advanced sorting algorithms.
	
I have studied more variants of insertion sort than most people
know exist.  (Hmm.  I really MUST finish that paper.)

    If I had not done that, I'd need to check for j<= 0, which would make the 
    implementation significantly slower.

No, it wouldn't.  To start with, insertion sort should never be used for
big arrays (unless you know they have only one or two elements out of place).
If you can measure the speedup, the array is too big and you should be using
a different method.

To continue, what if you have to sort an Array, which has a fixed size?
There just isn't anywhere you can PUT the sentinel.
You cannot stretch a Pascal or C array to make room for an extra element;
while you can stretch (some) Lisp arrays top, while you want to post a
sentinel at the bottom.  You *can* stretch OrderedCollections at either
end, but that costs you far far more in element access time than posting
a sentinel at the bottom can possibly save you.

But the really interesting thing about insertion sort is that you can
get a perfectly suitable sentinel from the elements of the array itself.

    Element is some type, I don't care what.
    int less(Element, Element) is a function or macro that
	returns non-zero if the first argument should precede the second.

    void isort(Element *a, int n) {
	Element t;
	int i, j;

	if (n <= 1) return;
	t = a[0];
	i = 0;
	for (j = 1; j < n; j++)
	    if (less(a[j], t)) t = a[j], i = j;
	if (i != 0) {
	    for (j = i; j > 0; j--) a[j] = a[j-1];
	    a[0] = t;
        }
        /* Now a[0] <= a[j] for all 1<=j<n */
	for (i = 1; i < n; i++) {
	    t = a[i];
	    for (j = i; less(t, a[j-1]); j--) a[j] = a[j-1];
	    a[j] = t;
	}
    }

	I don't think modifiying at: put: and so on is a serious option, as they're 
	the very heart of these algorithms, modifying them would probably cost a lot 
	of performance.

Ahem.  If you are THAT worried about performance, you should not be
sorting an OrderedCollection.  Look at the code for OrderedCollection>>at:

    at: anInteger
	(anInteger < 1 or: [anInteger + firstindex - 1 > lastIndex])
	    ifTrue:  [self errorNoSuchElement[
	    ifFalse: [^array at: anInteger + firstIndex - 1]

	Once for calculating index+1, and then the overhead of the
	method call would be serious, cos i found out that at:  index is
	a primitive, which is inlined.
	
Not for OrderedCollections it isn't.  The version of #at: which Array
(for example) inherits from Object is a primitive, but OrderedCollection
overrides it with the method shown above, which is NOT a primitive.

    For the fastest way to insertion sort an OrderedCollection in Smalltalk:
1.  copy the elements into an Array (just the right size!)
    using array := oc asArray.

2.  sort the array using (the Smalltalk equivalent of) the C code isort()
    I displayed above.

3.  stuff the array back into the OrderedCollection.
    oc setContents: array
    would do the trick, but that's a private method, so
    oc replaceFrom: 1 to: oc size with: array
    is the one to use.

The time you'll save by using a fast #at: will more than compensate for
any speedup you'd have got by using a Sentinel, and in fact the isort()
code above *does* use a sentinel on every pass except the first, it's
just a sentinel carefully selected from the data.




More information about the Squeak-dev mailing list