Re: [Biosqueak] The germ of an idea

28 Aug 2001


      John
I have been playing around with bioinformatics and Squeak.  So far I
have a class hierarchy for DNA, protein, and coding sequences with lots
of useful methods, mostly things like correction formulae, codon usage,
base frequencies, etc.  In addition, I have a class that parses the
Genbank flat file format and can produce sequences from a menu of the
documented features.  I have used this a bit for my own research, but am
now planning to teach a computational genetics course to our biology
undergrads using Squeak...so the pressure is on to turn all this into
something useful and fun.  Right now I can drop sequence morphs onto a
playfield and produce things like dot matrix plots and phylogenetic
trees, but visually it ain't much.  The speed issue hasn't been a
problem because I'm not trying to do anything very ambitious.  I ported
this stuff from my Lisp code, where I used calls to C routines (like
Clustal) to do hard calculations.  Perhaps something similar will solve
the speed problem with Squeak.
If any of this sounds useful, I could package it up in its current
state.
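
For readers unfamiliar with the terms, here is a rough sketch of two of
the things mentioned above, base frequencies and a dot matrix plot.  It
is written in Python only for illustration; it is not the Squeak code
described in this message, and the function names (base_frequencies,
dot_matrix, render) are hypothetical.

```python
from collections import Counter


def base_frequencies(seq):
    """Return the relative frequency of A, C, G and T in a DNA sequence."""
    seq = seq.upper()
    total = len(seq)
    if total == 0:
        return {}
    counts = Counter(seq)
    # Counter returns 0 for bases that never occur.
    return {base: counts[base] / total for base in "ACGT"}


def dot_matrix(seq_a, seq_b, window=1):
    """Compare every window of seq_a against every window of seq_b.

    Row i, column j is True when the two windows starting at
    seq_a[i] and seq_b[j] are identical.
    """
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = []
        for j in range(len(seq_b) - window + 1):
            row.append(seq_a[i:i + window] == seq_b[j:j + window])
        rows.append(row)
    return rows


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )
```

Calling render(dot_matrix(a, b)) on two related sequences should show
diagonal runs of '*' wherever the sequences share a stretch; a larger
window filters out chance single-base matches.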
John Gillespie
jhgillespie@ucdavis.edu
"John Tobler" <squeaker@diganet.com> wrote:
...
The last thing I personally need right now is to tangle with a major
non-paying project, so I have decided to start on the design and development
of a Biosqueak initiative.  It rubs me the wrong way to see
Bioeveryotherlanguage.org and not to see our beloved Squeak represented.  I
would appreciate any and all comments, signs of interest, etc.  I will
probably get rolling with implementing some of the simpler bioinformatics
routines, following models already available in Biopython, Bioperl, and
Bioruby.  Anyone else who is interested in applying Squeak to bioinformatics
problems is most welcome to join in.
I am guessing that we will face some formidable obstacles.  As Heiko
Schaefer pointed out in a recent post, "... little emphasis has been given
so far on numerical work with squeak."  Will someone please correct me if
this assessment is unfair?  It looks like some related work is underway by
the Numerics group at the Camp Smalltalk connected with the ESUG conference
in Essen.  Hopefully, something approaching the capability of Numeric Python
(NumPy) will magically appear just before we need it for Biosqueak. I am
also sure that bioinformatics processing will fully test Squeak's mettle on
text searching and pattern recognition.  Where do we find support for
regular expressions and the like?  I anticipate that trying to solve such
real world problems as sequence searching, sequence alignment, and protein
structure prediction will point out areas where we can improve Squeak's
reach and performance.  There should be a challenge or two here to keep
hardy pioneers and somewhat unstable test pilots interested.
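
To give a feel for the kind of pattern matching involved, here is a
minimal sketch of a motif search built on regular expressions.  It uses
Python's standard re module purely as an illustration of what a Squeak
equivalent would need to support; the names IUPAC, motif_to_regex, and
find_motif are hypothetical and not taken from Biopython, Bioperl, or
Bioruby.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a motif written with IUPAC codes into a regex string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(seq, motif):
    """Return the 0-based start positions of every match, overlaps included."""
    # Wrapping the pattern in a lookahead makes each match zero-width,
    # so the scan advances one base at a time and overlapping hits are kept.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

For example, find_motif(seq, "GAATTC") would list the EcoRI recognition
sites in seq, and a degenerate motif such as "RYGT" matches any purine,
then any pyrimidine, then GT.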
Anyway, I intend to get started.  This is just a "heads up" that Biosqueak
is out there somewhere on the vast horizon.
More later,
John Tobler
squeaker@diganet.com
johntobler@earthlink.net