An alternative FFI/Parser proposal
andreas.raab at gmx.de
Sun Aug 20 19:54:58 UTC 2006
Hi Andrew -
I like it. This certainly addresses all my points.
Andrew Tween wrote:
> Hi all,
> Having had a chance to ponder this, I think it could be evolved into a good
>> Instead of insisting that the FFI syntax needs change, let's assume that
>> there may be differences in the pragma formats. That has been true in
>> the past and in my experience it is likely that it will be true and
>> valuable in the future, too (like for example, if somebody wants to use
>> blocks in a pragma).
> Given that the FFI syntax isn't going to change; it is a *certainty* that there
> are differences in pragma formats. So any solution should allow for that. If FFI
> is allowed to have its own format, then my imaginary XYZ extension should also
> be allowed to.
>> A good way of dealing with these differences would be if a client could
>> register specific pragmas which are parsed in a client dependent manner.
>> So that, for example, the FFI would register <apicall:> and <cdecl:> for
>> FFI specs and Tweak may register <on:> for parsing method triggers[*].
>> In this case, Parser could simply invoke the proper client for a
>> registered pragma, pass it a stream and let it decide what to do. Given
>> a sufficient interface for client and Parser, this would leave the
>> entire responsibility with the client instead of Parser, but Parser
>> could still provide a default implementation.
> The implementation could be simplified by allowing only the tokens after the
> first keyword to vary, rather than everything in a <..>. This is enough for FFI,
> and retains at least part of the pragma syntax. e.g.
> <a: i * j b: k > is allowed.
> <a: i > j b: k> is not allowed (can't have embedded > )
> <a b c d> is not allowed (can only be free form if first token is keyword)
> (If an extension wants to allow embedded > then it can specify that they are
> doubled i.e. >> )
> The parser can now collect the data from the stream for each <...> construct.
> i.e. record the start point, skip all tokens until '>' is reached, and then
> store the source from start to end. These are then recorded as AngleConstructs
> (or whatever).
> For example,
> <a: i * j b: k> produces this AngleConstruct
> ( selector: #a:
>   source: 'a: i * j b: k' )
> (Note that the b: keyword part does NOT form part of the selector)
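The scanning step quoted above (record the start, skip to the first '>', keep the raw source, take only the first keyword as the selector) might look roughly like this in Python. This is only a sketch: the names AngleConstruct, KEYWORD and scan_angle_construct are hypothetical, and doubled >> escapes are not handled here, since the quoted text leaves that to the individual extension.

```python
import re
from dataclasses import dataclass


@dataclass
class AngleConstruct:
    selector: str  # first keyword only, e.g. "a:"
    source: str    # raw text between '<' and '>'


# A keyword is an identifier immediately followed by a colon.
KEYWORD = re.compile(r"\s*([A-Za-z_]\w*:)")


def scan_angle_construct(text, start):
    """Scan one <...> construct whose '<' is at text[start].

    Returns the construct and the index just past the closing '>'.
    """
    if text[start] != "<":
        raise ValueError("expected '<'")
    first = KEYWORD.match(text, start + 1)
    if first is None:
        # <a b c d> is rejected: free form is only allowed after a keyword.
        raise ValueError("first token must be a keyword")
    end = text.find(">", first.end())
    if end < 0:
        raise ValueError("unterminated <...>")
    source = text[start + 1:end].strip()
    return AngleConstruct(first.group(1), source), end + 1
```

Because the scan stops at the first '>', an input such as <a: i > j b: k> yields the source 'a: i' and leaves the remainder for the parser to reject.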
> Having parsed all "pragmas" as free form angleConstructs, the parser then
> decides what to do with each one.
> angleConstructs do: [:angle |
>     self compilerExtensions
>         detect: [:extension |
>             (handled := extension canHandle: angle)
>                 ifTrue: [
>                     extension compile: angle for: self.
>                     (realPragma := extension pragmaFor: angle)
>                         ifNotNil: [pragmas addLast: realPragma]].
>             handled]
>         ifNone: ["error - no handler for this <...> "]].
> with some extra error handling etc.
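For readers who want to try the precedence idea outside an image, here is a rough Python sketch of the same dispatch loop. Everything in it is an assumption made for illustration: the method names can_handle, compile and pragma_for simply mirror the selectors in the quoted sketch, constructs are represented as plain source strings, and the two extension classes are invented stand-ins rather than real FFI or Parser code.

```python
class NoHandlerError(Exception):
    """Raised when no registered extension accepts a construct."""


class FFIExtension:
    """Hypothetical handler: claims cdecl:/apicall: specs, records no Pragma."""

    def __init__(self):
        self.compiled = []

    def can_handle(self, source):
        return source.startswith(("cdecl:", "apicall:"))

    def compile(self, source, parser):
        self.compiled.append(source)

    def pragma_for(self, source):
        return None


class DefaultPragmaExtension:
    """Hypothetical fallback: accepts anything that starts with a keyword."""

    def can_handle(self, source):
        tokens = source.split(None, 1)
        return bool(tokens) and tokens[0].endswith(":")

    def compile(self, source, parser):
        pass

    def pragma_for(self, source):
        return ("pragma", source)


def compile_angle_constructs(constructs, extensions, parser=None):
    """Offer each construct to the extensions in order; the first taker wins."""
    pragmas = []
    for angle in constructs:
        for extension in extensions:
            if extension.can_handle(angle):
                extension.compile(angle, parser)
                pragma = extension.pragma_for(angle)
                if pragma is not None:
                    pragmas.append(pragma)
                break
        else:
            raise NoHandlerError("no handler for <%s>" % angle)
    return pragmas
```

The order of the extensions list is the precedence: an extension placed earlier gets the first chance to claim a construct.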
> The key point is that there is a sequence of compilerExtensions, and so there is
> a precedence. Currently the order will be
> Each handler performs its own parse of the AngleConstruct's source.
> Each handler can determine whether to record each of the angle constructs it
> handles as a real Pragma object (or specialized kind of Pragma etc).
> So, for example, <a: i * j b: k> could be stored as...
> MyPragma(selector: #a:b: , arguments: #( 'i * j' 'k' ))
> or as
> MyPragma(selector: #a: , arguments: #( 'i * j' #b: 'k'))
> If a handler chooses to add a Pragma (or specialized form of Pragma), then all
> the facilities that come for free with Pragmas (searching, senders) can be used.
> This could be extended to allow a handler to add any number of real Pragmas. For
> example, <a: 1 ; b:2 ; c: 3> might result in Pragmas (#a #(1)) , (#b #(2)) and
> (#c #(3)).
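A handler that expands one construct into several Pragmas could be sketched as follows. The function name split_pragmas is hypothetical, the ';' separator is taken from the quoted example, and the sketch assumes each part is a single keyword followed by its argument. Arguments are kept as raw source strings, and selectors keep their trailing colon.

```python
def split_pragmas(source):
    """Split 'a: 1 ; b:2 ; c: 3' into one (selector, arguments) pair per part."""
    pragmas = []
    for part in source.split(";"):
        keyword, colon, argument = part.strip().partition(":")
        if not colon or not keyword:
            raise ValueError("expected 'keyword: argument', got %r" % part)
        pragmas.append((keyword + ":", [argument.strip()]))
    return pragmas
```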
> FFI compilation (rightly) fails if FFI is not installed, because the call syntax
> is such that an FFI call can never be a valid Pragma. The distinct syntax can
> therefore be seen as an advantage, rather than a disadvantage.
> As Andreas has previously stated (in this, or another, thread) the specialized
> Pragmas can also deal with decompilation.
>> [*] The main reason for Tweak to parse triggers separately is to provide
>> semantic checks. For example, the <on: event in: signaler> annotation
>> requires the signaler to be a field of the receiver. Being able to hook
>> into the parse in this way can be useful for other kinds of semantic
>> Unfortunately, there are also a couple of gotchas with the proposal:
>> Most importantly, it requires that any parser can hand off the current
>> input stream to a client and continue after it's getting the stream
>> back. Not sure if all parsers could easily do that. In addition, the
> My suggestion avoids that problem. I am not sure what the cost is - efficiency
>> client would need to have sufficient access to the parser to perform
>> whatever action it requires, including (potentially) correction or error
>> handling. This may be tricky since the existing parsers have no common
>> protocols for that. Lastly there is an issue with what exactly should
> I've only sketched out a protocol in one direction. I haven't considered how the
> extensions talk to the parser. But I don't think it would be too complex. What
> should the interface be?
>> happen if we're trying to parse a pragma but lack the proper support
>> (like parsing an FFI spec without the FFI being present). I'd rather
>> have it that the parser is aware of such problems and can raise an
>> error instead of trying to get the user to use something that won't work
>> anyway, but this may not be possible.
> I think that the 'trick' is to ensure that a non-pragma syntax is used, so that
> compilation fails when the extension is missing (assuming, of course, that if
> the extension is missing, then so are its parser/compiler extensions).
>> In any case, this is a clear alternative that offers the same benefits
>> of the original proposal ("clean", "extensible") while avoiding
>> fundamentally breaking the FFI for no good reason. If anyone were to
>> implement that proposal it would certainly find my support.
>> - Andreas