Losing my latin

Sat Jan 5 16:17:22 UTC 2002

Doug Way <dway at riskmetrics.com> wrote:
> Henrik Gedenryd posted some changes recently which improve the current parser somewhat, at least to separate the optimizing parse (for compiling) from non-optimizing (for pretty-printing, etc.).  (See http://groups.yahoo.com/group/squeak/message/36808)  I think we should try to include these in the next round of harvesting.
> 
> On the other hand, I'm not sure if those changes are a big enough step toward a really analysis-friendly parser or not.  I haven't done a lot of work with parsers.

> > For the moment I will try to continue with the current parser and stabilize
> > Gutenberg but this is true that I will certainly look at the RBparser
> > because I would like to have something that is documented (not patched
> > several times) and that I can understand.
> 
> If the Refactoring Browser becomes a relatively standard tool for Squeak (maybe when the modules stuff happens), then the RBparser might become the de facto alternate parser.  (Of course anyone is free to roll their own parser, but it'd be good to have one well-supported analysis-friendly parser.)

From my experience working with the RB parser:
* It is pretty easy to co-opt for new uses.
Keeping the nodes' start and end position is very valuable for making
the parser useful in interacting with the code - replacing one message
send by another, for example, is not a matter of textual search, but can
be done precisely.

* It doesn't (yet) go quite far enough. 
It could benefit from small silly helper methods like
isInstanceVariable, isTemporaryVariable, isClassVariable. It has some
protocols like these (#isVariable), but completing these would be very
useful for many things (like a highlighting pretty printer).

* It is basically pretty clean, but it's been modified for Squeak syntax
extensions (yes, Stef, we know... ;-) so its testing is mostly
thorough, but probably not quite complete.

* It has a working Visitor framework, so adding algorithms is pretty
easy.

* It's not big - 114k parser, 26k scanner, 45k for additional parse tree
matching framework. As a rough yardstick, the Squeak parser (which, of
course, also does dialects, emits byte codes and so forth) is 220k.

* If people use it in their code, I'll make it a module separate from
the other 400k of the RB model.
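
To make the points about node positions and the Visitor framework concrete,
here is a rough sketch of the idea. It is NOT RB parser code: it uses Python's
standard ast module as a stand-in, since its nodes also record where they start
and end. The names CallFinder and rename_call are made up for illustration, and
the sketch assumes a single-line, ASCII-only source string.

```python
import ast


class CallFinder(ast.NodeVisitor):
    """Visitor that records where calls to a given function name occur."""

    def __init__(self, name):
        self.name = name
        self.spans = []  # (start_col, end_col) of each matching name token

    def visit_Call(self, node):
        func = node.func
        # Only plain-name calls such as size(x); method calls are ignored.
        if isinstance(func, ast.Name) and func.id == self.name:
            self.spans.append((func.col_offset, func.end_col_offset))
        # Keep walking so nested calls are found as well.
        self.generic_visit(node)


def rename_call(source, old, new):
    """Rename calls to `old` using node positions, not textual search.

    Assumption: `source` is one line of ASCII text, so column offsets
    can be used directly as string indices.
    """
    finder = CallFinder(old)
    finder.visit(ast.parse(source))
    # Apply edits right to left so earlier offsets stay valid.
    for start, end in sorted(finder.spans, reverse=True):
        source = source[:start] + new + source[end:]
    return source


print(rename_call('x = size("size") + size(y)', "size", "count"))
```

Only the two calls are rewritten; the string literal "size" is left alone,
which is exactly what a plain textual search and replace gets wrong.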

> - Doug Way
>   dway at riskmetrics.com

Daniel Vainsencher