[OT] Writing a parser for BASIC in Smalltalk/Squeak ?

Lex Spoon lex at cc.gatech.edu
Thu Jan 3 17:43:04 UTC 2002


"Andy Stoffel" <Andrew.Stoffel at jenzabar.net> wrote:
> This is very off-topic (I think anyway)... 
> 
> I was wondering if anyone has any links/sources/information pointers
> on writing a parser for other languages in Smalltalk ? Even just a 
> starting point would be useful... 
> 

I can give some general pointers, if that is what you are looking for. 
I dunno nothing about this Compaq Basic thing, though!


First, Smalltalk doesn't change how you write an interpreter any more
than it changes how you write any other program.  It's a perfectly fine
language for the job.
Second, writing an interpreter shares a lot with writing a compiler, and
there are more resources around on writing compilers.  Most intro
compiler books will be very helpful, and the compiler newsgroup
(comp.compilers) will be a good source of friendly help.


Here are some broad phases you will need to worry about:

First, you will need a very solid understanding of the language.  You
need to know it better than a programmer does.


Second, you will need to develop some sort of virtual machine, to keep
track of a program as it runs.  At the least, you will need to keep
track of the values of variables, and of which statement (or part of a
statement?) is currently being executed.  This is the part I'd focus on
the most, for an initial implementation -- don't skimp on it, or you
can waste a lot of time.  You might even want to post your ideas to
comp.compilers (or wherever) before you start on implementing it.
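
To make the "virtual machine" idea concrete, here is a rough sketch of
the state it has to track: the variable values and the index of the
current statement.  It is written in Python purely for illustration (the
same shape carries over to Smalltalk), and the Machine class, the tuple
instruction format, and the operation names are all made up for this
example, not taken from any real BASIC.

```python
class Machine:
    """Hypothetical minimal VM: variable values plus a current-statement index."""

    def __init__(self, program):
        self.program = program   # list of already-parsed statements (tuples)
        self.variables = {}      # variable name -> current value
        self.pc = 0              # index of the statement to execute next
        self.output = []         # collected PRINT results, kept for inspection

    def step(self):
        """Execute exactly one statement and update the machine state."""
        op, *args = self.program[self.pc]
        self.pc += 1             # default: fall through to the next statement
        if op == "let":
            name, value = args
            self.variables[name] = value
        elif op == "add":
            name, amount = args
            self.variables[name] += amount
        elif op == "print":
            self.output.append(self.variables[args[0]])
        elif op == "goto_if_less":
            name, limit, target = args
            if self.variables[name] < limit:
                self.pc = target  # a jump just overwrites the statement index
        else:
            raise ValueError(f"unknown statement: {op}")

    def run(self):
        """Step until the statement index runs off the end of the program."""
        while self.pc < len(self.program):
            self.step()


# Made-up program: count from 1 to 3.
program = [
    ("let", "I", 0),
    ("add", "I", 1),
    ("print", "I"),
    ("goto_if_less", "I", 3, 1),
]
machine = Machine(program)
machine.run()
print(machine.output)
```

Keeping all of the state in one object makes it easy to add things
later (a GOSUB return stack, say) without touching the statement code.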

Third, you will need a parser, to translate from program text into a
form that your interpreter can work with.  That is, turn a string
like '2 + 14' into a BinaryOperationExpression object.  Languages
designed within the last decade or three tend to have an initial neat
and tidy "scanning" phase for dividing the string into individual words
(e.g., '2 + 14' turns into #('2' '+' '14')), and they tend to have a
neat and tidy BNF grammar to describe the legal sequences of words, so
that tools like T-Gen can really go to town.  If your language is older
than that, however, don't be too surprised if T-Gen doesn't just do everything
for you -- you may well have to code some parts by hand that people
rarely do nowadays.
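
As a rough illustration of the scanning and parsing steps, here is a
small hand-written scanner and recursive-descent parser for arithmetic
expressions.  It is in Python only for the sake of a short example, and
it assumes a toy grammar (integers, + - * /, parentheses) rather than
anything from a real BASIC.  BinaryOperationExpression is the name used
above; NumberLiteral, scan, and parse are names invented here.

```python
import re
from dataclasses import dataclass


@dataclass
class NumberLiteral:
    value: int


@dataclass
class BinaryOperationExpression:
    operator: str
    left: object
    right: object


TOKEN = re.compile(r"\d+|[-+*/()]")


def scan(text):
    """Split the text into words, e.g. '2 + 14' -> ['2', '+', '14']."""
    tokens = TOKEN.findall(text)
    # Anything the pattern skipped (other than whitespace) is an error.
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError(f"unexpected character in {text!r}")
    return tokens


def parse(tokens):
    """Recursive descent over: expr := term (('+'|'-') term)*, and so on."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token.isdigit():
            return NumberLiteral(int(token))
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = primary()
        while peek() in ("*", "/"):
            node = BinaryOperationExpression(advance(), node, primary())
        return node

    def expression():
        node = term()
        while peek() in ("+", "-"):
            node = BinaryOperationExpression(advance(), node, term())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(scan("2 + 14"))
print(parse(scan("2 + 14")))
```

Each grammar rule becomes one function, which is why this style is
practical to write by hand when a parser generator can't cope with the
language.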

Fourth, you might need to translate your parse trees into something the
VM can understand.  Or maybe not -- if you can get away with processing
the parse trees directly, then you might want to try and do so for your
first implementation.
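
Processing the parse trees directly can be as simple as a recursive
function that walks the tree.  The sketch below is in Python for
brevity and assumes hypothetical node classes (NumberLiteral,
VariableReference, BinaryOperationExpression) and a plain dictionary
for variable values; a real BASIC would need more node types and its
own arithmetic rules.

```python
from dataclasses import dataclass
from operator import add, mul, sub, truediv


@dataclass
class NumberLiteral:
    value: int


@dataclass
class VariableReference:
    name: str


@dataclass
class BinaryOperationExpression:
    operator: str
    left: object
    right: object


# Map each operator word to the function that implements it.
OPERATIONS = {"+": add, "-": sub, "*": mul, "/": truediv}


def evaluate(node, variables):
    """Compute the value of a parse tree node by walking it recursively."""
    if isinstance(node, NumberLiteral):
        return node.value
    if isinstance(node, VariableReference):
        return variables[node.name]
    if isinstance(node, BinaryOperationExpression):
        left = evaluate(node.left, variables)
        right = evaluate(node.right, variables)
        return OPERATIONS[node.operator](left, right)
    raise TypeError(f"cannot evaluate {node!r}")


# The tree a parser might build for 'X * 3 + 1'.
tree = BinaryOperationExpression(
    "+",
    BinaryOperationExpression("*", VariableReference("X"), NumberLiteral(3)),
    NumberLiteral(1),
)
print(evaluate(tree, {"X": 5}))
```

In Smalltalk the more natural form would probably be a method on each
node class instead of one function full of type tests, but the
recursion is the same.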


Overall, good luck and have fun!  Writing language processors is as good
as programming gets, IMHO.  :)

-Lex



