parser for context sensitive grammars?

John Brant brant at refactory.com
Mon Jan 24 16:21:50 UTC 2005


Ferdinand Strixner wrote:
> 
> SmaCC apparently handles only LALR(1) or LR(1) context-free
> grammars. Does anyone know a parser generator that can handle grammars
> like this:
> 
> <subject>: Egon | Erna | Emma;
> <object>: Egon | Erna | Edna;
> <verb>: sees;
> <point>: \.;
> <space>: \s;
> 
> Sentence: <subject> <space> <verb> <space> <object> <point>;
> 
> (SmaCC compiles this grammar, but it doesn't recognize sentences like
> "Egon sees Erna". It only recognizes "Emma sees Edna" and, perhaps
> because of the order of the token definitions, "Egon sees Edna".)

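You can work around this problem by overriding the #actionForCurrentToken
method in your parser class:

actionForCurrentToken
	"Try every token id that matched the current text and answer the
	first action that is not an error in the current parser state."
	| ids action |
	ids := currentToken id.
	1 to: ids size
		do:
			[:i |
			action := self actionFor: (ids at: i).
			(action bitAnd: self actionMask) = self errorAction ifFalse: [^action]].
	"None of the candidate ids is acceptable here."
	^self errorAction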

This method tries each of the overlapping token ids for the current
token and returns the first action that the parser can take in its
current state. Depending on your grammar, this may be good enough.
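
To make the idea concrete outside of SmaCC, here is a rough Python sketch of
the same trick. It is only a toy model: the CANDIDATES table, the EXPECTED
list standing in for the parser's action table, and all function names are
made up for illustration and are not SmaCC's API. It compares "use only the
first matching token id" with "try every matching id until one is acceptable".

```python
import re

ERROR = None  # stands in for the parser's error action

# Token ids whose definitions match each piece of text (toy data).
# "Egon" and "Erna" match both <subject> and <object>.
CANDIDATES = {
    "Egon": ["subject", "object"],
    "Erna": ["subject", "object"],
    "Emma": ["subject"],
    "Edna": ["object"],
    "sees": ["verb"],
    " ": ["space"],
    ".": ["point"],
}

# Sentence: <subject> <space> <verb> <space> <object> <point>;
# The rule is a plain sequence, so "state" is just a position in it.
EXPECTED = ["subject", "space", "verb", "space", "object", "point"]


def action_for(state, token_id):
    """Return the next state if token_id is acceptable in state, else ERROR."""
    if state < len(EXPECTED) and EXPECTED[state] == token_id:
        return state + 1
    return ERROR


def action_for_current_token(state, ids):
    """Try each candidate id in order and return the first non-error action."""
    for token_id in ids:
        action = action_for(state, token_id)
        if action is not ERROR:
            return action
    return ERROR


def accepts(text, try_all_ids=True):
    state = 0
    # Words, single whitespace characters, or any other single character
    # (which has no candidates and therefore causes an error).
    for lexeme in re.findall(r"[A-Za-z]+|\s|.", text):
        ids = CANDIDATES.get(lexeme, [])
        if not try_all_ids:
            ids = ids[:1]  # only the first matching token definition
        state = action_for_current_token(state, ids)
        if state is ERROR:
            return False
    return state == len(EXPECTED)
```

With try_all_ids=False the toy rejects "Egon sees Erna." because "Erna" is
only ever seen as a subject; with the default it is accepted. Note that the
loop commits to the first acceptable id and never backtracks, which is why
this is only "good enough" for some grammars.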

There is also some support for scanner states, but it hasn't been 
fleshed out (and I don't know if it is in the Squeak version). For 
example, you could have something like:

%states object;
object <object>: Egon | Erna | Edna;
<subject>: Egon | Erna | Emma;
<verb>: sees;
<point>: \.;
<space>: \s;

Sentence: Subject <space> <verb> <space> <object> <point>;
Subject : <subject> {scanner state: #object. '1'};
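
The intent of the scanner state can also be sketched as a toy in Python.
Again, nothing here is SmaCC's API: the SCANNER and SHARED tables and the
function name are invented for illustration, and I am assuming that the
<verb>, <space> and <point> tokens are scanned the same way in every state.
The toy switches state right after the subject token, whereas in a real LR
parser the action would run when the Subject rule is reduced.

```python
import re

# Which token each scanner state produces for a name (toy data).
SCANNER = {
    "default": {"Egon": "subject", "Erna": "subject", "Emma": "subject"},
    "object": {"Egon": "object", "Erna": "object", "Edna": "object"},
}

# Assumption: these tokens are recognized in every scanner state.
SHARED = {"sees": "verb", " ": "space", ".": "point"}

# Sentence: Subject <space> <verb> <space> <object> <point>;
EXPECTED = ["subject", "space", "verb", "space", "object", "point"]


def accepts_with_states(text):
    scanner_state = "default"
    position = 0
    for lexeme in re.findall(r"[A-Za-z]+|\s|.", text):
        # The current scanner state decides how a name is tokenized.
        token = SHARED.get(lexeme) or SCANNER[scanner_state].get(lexeme)
        if position >= len(EXPECTED) or token != EXPECTED[position]:
            return False
        position += 1
        if token == "subject":
            # Mirrors the action on the Subject rule: scanner state: #object
            scanner_state = "object"
    return position == len(EXPECTED)
```

Here the same text "Egon" becomes a subject before the switch and an object
after it, so no token is ever ambiguous at the point where it is scanned.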



John Brant