[Biosqueak] The germ of an idea

Helge Horch Helge.Horch at munich.netsurf.de
Sat Sep 1 15:43:30 UTC 2001


At 00:05 01.09.2001 -0700, Jan Bottorf wrote:
>For example, a multiple parallel state machine pattern search (i.e. a
>parser) might beat a bunch of passes with simple pattern match searches.

It certainly does!  I use it at work. :-)

>It also seems possible the intersection of the bioinformatics wizards with 
>the
>advanced parser wizards is almost an empty set.

I know it was just an example, but for this specific example (simultaneous 
exact pattern search), see if you can locate the paper by SMITH, R.: "A 
finite state machine algorithm for finding restriction sites and other 
pattern matching applications"; in: CABIOS Computer Applications in the 
Biosciences, vol. 4 (1988), #4, pp. 459-465.  (I seem to have lost my copy 
-- IIRC, he built one DFA from the pattern list.)

Also, the Dr. Dobb's article on the implementation of fgrep was circulating 
in labs, sometime in the early 80s.  And I sure have encountered the Dragon 
Book in some unexpected places.

>Smalltalk is just super good at implementing fancy algorithms.

And rewarding, too:  I fondly recall my Smalltalk/V implementation of the 
above algorithms running rings around "compiled" simplistic site finders.

>[...] a question to ask: is it REALLY a very serious number cruncher 
>problem, or is that just what current algorithms end up being.

Well, profile searching (i.e. for fuzzy and gapped patterns) is still an 
active research area.  Fun, fame, fortune, etc. ;-)

Cheers,
Helge




More information about the Squeak-dev mailing list