Sorting Algorithm so far...
Richard A. O'Keefe
ok at atlas.otago.ac.nz
Fri Nov 16 01:40:43 UTC 2001
"Christopher K Roach" <chrisroach at btinternet.com> wrote:
quicksort: begin to: end
"This is an implementation of the Quick Sort algorithm, which
is O(n*log(n)), where n is the size of the collection to
be sorted."
In the name of sanity, *WHY*?
First off, quick sort is NOT O(n*log(n)), it is O(n**2),
and this worst case *does* arise in practice.
Second, quick sort is extraordinarily fragile; it is very
easy to end up with something that has quite bad performance.
(See Bentley and McIlory "Engineering a Quicksort" in Software
Practice & Experience if you _really_ have to use "quick" sort for
anything).
Third, Squeak *already* contains sorting methods,
there is no need to write a new one.
The classic trick question for novice Smalltalk programmers is
"Use Smalltalk to sort these strings", and the Right Way To Do It
is just #('first string' 'second string' 'and so on') sort.
Given data in 'data.raw', to produce sorted results in 'data.ord',
|s c n r|
"get the ASCII NUL character to use as field separator"
n := Character value: 0.
"open the source file"
s := StandardFileStream oldFileNamed: 'data.raw'.
"create an empty collection"
c := OrderedCollection new.
[s atEnd] whileFalse: [
"read a record as a string without the line terminator"
r := s nextLine.
"replace commas by NULs"
r replaceAll: $, with: n.
"add the transformed record to the collection"
c add: r].
"sort the collection using <= for comparison"
c := c asSortedCollection.
"open the destination file"
s := StandardFileStream newFileNamed: 'data.ord'.
"write out each element of the sorted collection"
c do: [:t |
"t is a transformed record. Undo the transformation"
t replaceAll: n with: $,.
"write the reconstructed record to the output stream"
s nextPutAll: t; cr].
"close the output stream"
s close.
Excluding comments, that's just 14 lines. No new class is needed (although
this must find a home in SOME class).
I'm a little puzzled as to why comma (rather than, say, TAB) was chosen
as the field separator. It is not uncommon for titles to contain commas.
More information about the Squeak-dev
mailing list
|