Sorting Algorithm so far...

Richard A. O'Keefe ok at atlas.otago.ac.nz
Fri Nov 16 01:40:43 UTC 2001


"Christopher K Roach" <chrisroach at btinternet.com> wrote:
	
	quicksort: begin to: end
	    "This is an implementation of the Quick Sort algorithm, which
	     is O(n*log(n)), where n is the size of the collection to
	     be sorted."
	
In the name of sanity, *WHY*?
First off, quick sort is NOT O(n*log(n)), it is O(n**2),
and this worst case *does* arise in practice.
Second, quick sort is extraordinarily fragile; it is very
easy to end up with something that has quite bad performance.
(See Bentley and McIlory "Engineering a Quicksort" in Software
Practice & Experience if you _really_ have to use "quick" sort for
anything).
Third, Squeak *already* contains sorting methods,
there is no need to write a new one.

The classic trick question for novice Smalltalk programmers is	
"Use Smalltalk to sort these strings", and the Right Way To Do It
is just #('first string' 'second string' 'and so on') sort.
	
Given data in 'data.raw', to produce sorted results in 'data.ord',

	|s c n r|
	"get the ASCII NUL character to use as field separator"
	n := Character value: 0.
	"open the source file"
	s := StandardFileStream oldFileNamed: 'data.raw'.
	"create an empty collection"
	c := OrderedCollection new.
	[s atEnd] whileFalse: [
	    "read a record as a string without the line terminator"
	    r := s nextLine.
	    "replace commas by NULs"
	    r replaceAll: $, with: n.
	    "add the transformed record to the collection"
	    c add: r].
	"sort the collection using <= for comparison"
	c := c asSortedCollection.
	"open the destination file"
	s := StandardFileStream newFileNamed: 'data.ord'.
	"write out each element of the sorted collection"
	c do: [:t |
	    "t is a transformed record.  Undo the transformation"
	    t replaceAll: n with: $,.
	    "write the reconstructed record to the output stream"
	    s nextPutAll: t; cr].
	"close the output stream"
	s close.

Excluding comments, that's just 14 lines.  No new class is needed (although
this must find a home in SOME class).

I'm a little puzzled as to why comma (rather than, say, TAB) was chosen
as the field separator.  It is not uncommon for titles to contain commas.




More information about the Squeak-dev mailing list