[Biosqueak] The germ of an idea

Tue Aug 28 20:21:15 UTC 2001

John

I have been playing around with bioinformatics and Squeak.  So far I
have a class hierarchy for DNA, protein, and coding sequences with lots
of useful methods, mostly things like correction formulae, codon usage,
base frequencies, etc.  In addition, I have a class that parses the
Genbank flat file format and can produce sequences from a menu of the
documented features.  I have used this a bit for my own research, but am
now planning to teach a computational genetics course to our biology
undergrads using Squeak...so the pressure is on to turn all this into
something useful and fun.  Right now I can drop sequence morphs onto a
playfield and produce things like dot matrix plots and phylogenetic
trees, but visually it ain't much.  The speed issue hasn't been a
problem because I'm not trying to do anything very ambitious.  I ported
this stuff from my Lisp code, where I used calls to C routines (like
Clustal) to do hard calculations.  Perhaps something similar will solve
the speed problem with Squeak.

If any of this sounds useful, I could package it up in its current
state.
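
As a rough illustration of two of the simpler operations mentioned
above (base frequencies and a dot matrix comparison), here is a minimal
sketch.  It is written in Python rather than Squeak and is not the code
described in this message; the names base_frequencies, dot_matrix, and
render are hypothetical, chosen only for this example.

```python
from collections import Counter


def base_frequencies(seq):
    """Return the relative frequency of each symbol in a sequence."""
    seq = seq.upper()
    total = len(seq)
    if total == 0:
        return {}
    counts = Counter(seq)
    return {base: counts[base] / total for base in sorted(counts)}


def dot_matrix(seq_a, seq_b, window=1):
    """Compare two sequences word by word.

    Cell [i][j] is True when the `window`-length words starting at
    seq_a[i] and seq_b[j] are identical (case-insensitive).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


if __name__ == "__main__":
    print(base_frequencies("AACG"))
    print(render(dot_matrix("ACGTACGT", "ACGTTCGT", window=2)))
```

A larger window suppresses chance single-base matches, so diagonal runs
of shared sequence stand out more clearly in the plot.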

John Gillespie
jhgillespie at ucdavis.edu

"John Tobler" <squeaker at diganet.com> wrote:
> The last thing I personally need right now is to tangle with a major
> non-paying project, so I have decided to start on the design and development
> of a Biosqueak initiative.  It rubs me the wrong way to see
> Bioeveryotherlanguage.org and not to see our beloved Squeak represented.  I
> would appreciate any and all comments, signs of interest, etc.  I will
> probably get rolling with implementing some of the simpler bioinformatics
> routines, following models already available in Biopython, Bioperl, and
> Bioruby.  Anyone else who is interested in applying Squeak to bioinformatics
> problems is most welcome to join in.
> 
> I am guessing that we will face some formidable obstacles.  As Heiko
> Schaefer pointed out in a recent post, "... little emphasis has been given
> so far on numerical work with squeak."  Will someone please correct me if
> this assessment is unfair?  It looks like some related work is underway by
> the Numerics group at the Camp Smalltalk connected with the ESUG conference
> in Essen.  Hopefully, something approaching the capability of Numeric Python
> (NumPy) will magically appear just before we need it for Biosqueak. I am
> also sure that bioinformatics processing will fully test Squeak's mettle on
> text searching and pattern recognition.  Where do we find support for
> regular expressions and the like?  I anticipate that trying to solve such
> real world problems as sequence searching, sequence alignment, and protein
> structure prediction will point out areas where we can improve Squeak's
> reach and performance.  There should be a challenge or two here to keep
> hardy pioneers and somewhat unstable test pilots interested.
> 
> Anyway, I intend to get started.  This is just a "heads up" that Biosqueak
> is out there somewhere on the vast horizon.
> 
> More later,
> 
> John Tobler
> squeaker at diganet.com
> johntobler at earthlink.net