SmaCC Question
Colin Putney
cputney at wiresong.ca
Mon May 24 02:35:42 UTC 2004
On May 23, 2004, at 8:05 PM, David T. Lewis wrote:
> On Sun, May 23, 2004 at 07:48:57PM -0500, Colin Putney wrote:
>> Hi folks,
>>
>> A question for the SmaCC gurus out there. I'm trying to build a parser
>> that will be versioned in Monticello, so I want to keep the code in the
>> parser class as much as possible, and out of the grammar. If I read the
>> tutorial correctly, this ought to be possible.
>
> I'm not sure I understand the problem you are trying to solve, but as
> for the grammar, it is stored safely in the form of strings in class
> side methods of the generated parser and scanner. So this should be
> manageable in whatever version control system you want to use, no
> problem.
>
> As for modifying (and keeping versions of) the generated code, don't
> do that. Think of those strings in those class side methods as the
> source code. Anything labeled as "generated" is like object code that
> you should not touch. However, you can easily add your own methods to
> the generated classes, keeping your hand-written methods in version
> control. You can regenerate your parser and scanner classes with SmaCC
> any time, and it will not clobber your hand-written methods. Works
> like a champ.
Yes, modifying generated methods with tools other than SmaCC is exactly
what I want to avoid.
The issue is this: I want the parser to generate an AST based on the
grammar. (It's a Smalltalk grammar, actually. The parser will
eventually be used in OmniBrowser.) But SmaCC doesn't generate parsers
that produce ASTs. For each production rule in the grammar you have to
supply a reducing action - Smalltalk code to handle the elements of the
production rule.
Now, one way to supply the code for the reducing action is in the
grammar. You stick a Smalltalk expression inside the {}, and SmaCC will
generate methods incorporating the expression. The other way is to
write the method yourself, and have SmaCC call it directly.
I prefer the second approach, because putting the parser's code in
the grammar tends to obfuscate it. It makes both the code and the
grammar more difficult to read. In terms of versioning, it's better to
have the "canonical" source code spread out over several methods in the
browser rather than glommed into a comment inside a single method. It
gives a finer granularity for versioning, so merges, for example, are
much more likely to succeed without conflicts.
John Brant mailed me off-list to say that what I need is this:
    Node: <token> NonTerminal1 NonTerminal2 {#doSomething:};
My problem was that I hadn't yet implemented #doSomething:, so SmaCC
was refusing to generate a parser that would spew MNUs. Problem solved.
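For anyone following along, here is a rough sketch of the kind of hand-written method that rule expects to find. The class name MyParser is made up for illustration, and the method body is only a stand-in for real AST construction. It also assumes that SmaCC hands the matched items of the production to the method as a single ordered collection, in grammar order; check the SmaCC tutorial for the exact calling convention.

```smalltalk
"Hypothetical hand-written method on the parser class (here called MyParser).
 Assumption: the argument is a collection holding the matched items of the
 production, in the order they appear in the grammar rule."
MyParser >> doSomething: nodes
    | token first second |
    token := nodes at: 1.      "the <token> item"
    first := nodes at: 2.      "the value produced for NonTerminal1"
    second := nodes at: 3.     "the value produced for NonTerminal2"
    "Stand-in for building a real AST node: just return the three parts."
    ^ Array with: token with: first with: second
```

Because the method lives in the parser class rather than inside the grammar string, it is versioned like any other method, and the grammar only has to mention its selector.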
Cheers,
Colin