SmaCC Question

Colin Putney cputney at wiresong.ca
Mon May 24 02:35:42 UTC 2004


On May 23, 2004, at 8:05 PM, David T. Lewis wrote:

> On Sun, May 23, 2004 at 07:48:57PM -0500, Colin Putney wrote:
>> Hi folks,
>>
>> A question for the SmaCC gurus out there. I'm trying to build a parser
>> that will be versioned in Monticello, so I want to keep the code in the
>> parser class as much as possible, and out of the grammar. If I read the
>> tutorial correctly, this ought to be possible.
>
> I'm not sure I understand the problem you are trying to solve, but as
> for the grammar, it is stored safely in the form of strings in class
> side methods of the generated parser and scanner. So this should be
> manageable in whatever version control system you want to use, no
> problem.
>
> As for modifying (and keeping versions of) the generated code, don't
> do that. Think of those strings in those class side methods as the
> source code. Anything labeled as "generated" is like object code that
> you should not touch. However, you can easily add your own methods to
> the generated classes, keeping your hand-written methods in version
> control. You can regenerate your parser and scanner classes with SmaCC
> any time, and it will not clobber your hand-written methods. Works
> like a champ.

Yes, modifying generated methods with tools other than SmaCC is exactly 
what I want to avoid.

The issue is this: I want the parser to generate an AST based on the 
grammar. (It's a Smalltalk grammar, actually. The parser will 
eventually be used in OmniBrowser.) But SmaCC doesn't generate parsers 
that produce ASTs. For each production rule in the grammar you have to 
supply a reducing action - Smalltalk code to handle the elements of the 
production rule.

Now, one way to supply the code for the reducing action is in the 
grammar. You stick a Smalltalk expression inside the {}, and SmaCC will 
generate methods incorporating the expression. The other way is to 
write the method yourself, and have SmaCC call it directly.

I prefer the second approach, because putting the parser's code in 
the grammar tends to obfuscate it. It makes both the code and the 
grammar more difficult to read. In terms of versioning, it's better to 
have the "canonical" source code spread out over several methods in the 
browser rather than glommed into a comment inside a single method. It 
gives a finer granularity for versioning, so merges, for example, are 
much more likely to succeed without conflicts.

John Brant mailed me off-list to say that what I need is this:

Node: <token> NonTerminal1 NonTerminal2 {#doSomething:};

My problem was that I hadn't yet implemented #doSomething:, so SmaCC 
was refusing to generate a parser that would spew MNUs. Problem solved.
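
For the archives, here is a minimal sketch of what the hand-written method 
might look like. It assumes SmaCC hands the method the parsed items of the 
production as a single collection, in grammar order. MyNode and its setters 
(token:, first:, second:) are made-up names for illustration, not part of 
SmaCC.

```smalltalk
doSomething: nodes
	"Hypothetical reduce action, defined on the instance side of the parser class.
	Assumption: nodes holds the values for <token>, NonTerminal1 and
	NonTerminal2, in that order."
	^MyNode new
		token: (nodes at: 1);
		first: (nodes at: 2);
		second: (nodes at: 3);
		yourself
```

Since the method lives in the parser class like any other method, 
Monticello versions it separately from the grammar string.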

Cheers,

Colin



