Negative precedence rules (was RE: Why we should remove {} from Squeak)
Mark van Gulik
ghoul6 at home.com
Tue Oct 2 05:19:15 UTC 2001
Yikes! I scanned my site and, well, I guess I never wrote an HTML
document about my negative precedence mechanism. I'll remedy that soon,
but for now I'll (briefly) describe the mechanism.
Method names in Avail can be decomposed into tokens that are either:
(1) an alphabetic followed by zero or more alphanumerics (e.g.,
"foo19"),
(2) a single punctuation character (excluding underscore and
whitespace) (e.g., "+"),
(3) a space character, which is only useful to separate
alphanumeric tokens (e.g., "first name")
(4) an underscore, which indicates where an argument will be at a
message send.
Some example method names from Avail are:
"_+_" - infix addition
"Print_" - unary prefix, outputs argument to console.
"cos_" - prefix cosine
"_!" - postfix factorial
"Halt" - a handy 0-ary method
"_[_]" - tuple subscripting, like Squeak's #at:
"_(_)" - block invocation, like Squeak's #value:
"_[_.._]" - like Squeak's #copyFrom:to: (e.g., myTuple[1..3])
"_<=_<=_" - like #between:and:, but with arguments in the
'natural' order
"Method_is_" - defines a multimethod -- takes a name and a (typed)
block.
An Avail statement is either a declaration or a message send, terminated
by a semicolon. Every Avail expression is type-checked statically.
Even though there may be many ways of parsing a given Avail
subexpression, an Avail statement must be parseable in exactly one way
that satisfies type-checking.
Consider the message "_+_". If I don't specify any precedence rules,
the expression "1+2+3" can be parsed in two different ways: (1+2)+3 or
1+(2+3). Therefore, the statement "Print 1+2+3" would be invalid, and
the compiler would indicate the ambiguity by showing the two possible
parses.
For some messages, this is exactly the right behavior -- for example, it
might be too misleading to allow exponentiation to be grouped
right-to-left, so we could require all double-exponent expressions to be
explicitly parenthesized.
Since we know integer addition is associative (NB: floating point
addition might *not* be considered associative), we can choose to group
them however we want. Since Avail's interpreter is similar to
Smalltalk's, we should choose a left-associative parse to minimize the
context's stack depth. To do so, we simply forbid the parse that we
don't want, by using the "_can't have_" message:
"_+_" can't have <{}, {"_+_"}>;
This says that if you parse a "_+_" send whose right side is
another "_+_" send, you should eliminate that particular parse as
invalid. Ironically (given the original topic of this thread), I
use "<_>" to construct tuples, and "{_}" to construct sets. No magic
required. Heck, I don't even support negative integers -- I just send a
positive integer the negation message "-_", as in "-5", or "- 5", or
"-x".
In the above expression, the first "_+_" is the left argument of "_can't
have_". It indicates what message name should have a restriction
imposed on its arguments. The other argument of "_can't have_" is the
tuple containing two sets, the first set being the restrictions to place
on "_+_"'s left argument, and the second set being the restrictions to
place on "_+_"'s right argument. As you can see, we don't want the
right argument of "_+_" to be another "_+_", which prevents "1+2+3" from
being interpreted as "1+(2+3)", leaving just "(1+2)+3" as the only legal
parse.
If we're not content with Presburger arithmetic, we can add the
multiplication message "_*_". The precedence declaration looks like:
"_+_" can't have <{}, {"_+_"}>;
"_*_" can't have <{"_+_"}, {"_*_", "_+_"}>;
That corresponds precisely with our usual computerese order of
operations. Avail supports addition, subtraction, multiplication,
division, exponentiation (right-grouped), negation (with a precedence
similar to multiplication), absolute value "|_|", factorial "_!", etc.
In the case of absolute value, I disallow nested absolute values, which
are almost certainly not intended:
"|_|" can't have <{"|_|"}>;
Expression such as "e^-x^2/2" work out quite nicely. Much more pleasant
than Squeak's:
"(x squared negated / 2) exp"
So far all of this could have been done with ordinary "positive"
precedence rules. The difference is in what was *not* said. Say I have
an "_ at _" message that constructs points. Also assume I don't have
automatic conversion of numbers into points or vice-versa. Then both of
the expressions:
1+2 @ 3+4
1 at 2 + 3 at 4
are perfectly valid Avail expressions that produce points (3 at 7 and 4 at 6,
respectively). I did not need to define a precedence relation
between "_+_" and "_ at _", and in fact leaving it undefined was *exactly*
the right answer. Note that the usual scheme of assigning precedence
numbers to operators is totally incapable of supporting both of the
above expressions at the same time. That's because if you use
precedence numbers you'll get a full order, but if you use my negative
precedence scheme you can get *any* precedence graph.
Besides simply keeping the parser's nose out of business it can't (and
shouldn't be asked to) understand, it also keeps precedence modular. My
number arithmetic rules don't need to know about string concatenation or
point construction, or association construction, or anything that's
added next week. Similarly, most of the time the point construction
code couldn't care less about arithmetic or associations or any of
that. This satisfies a fundamental design axiom for me: Loading a
module should have no effect on a running system, other than making more
things visible to a programmer. *Installing* a module is another
story...
While my solution in Avail may not be directly applicable to (the
current state of) Squeak, I think the underlying premises are crucial to
consider, even if for no other reason than to know what has been lost by
a particular design choice.
On Monday, October 1, 2001, at 09:37 pm, Andreas Raab wrote:
> Mark,
>
>> That *would* also be the case in my language Avail
>> ( http://www.ericsworld.com/Mark/HTML/Avail.html ), but then
>> I invented the concept of negative precedence rules.
>> It's probably worth a look for both Selfers and Squeakers.
>> Ordinary precedence rules in something big and flat like Squeak
>> or Self (ignoring the module systems for both) would be ludicrous
>> (e.g., which should bind tighter, "->" or ",", and why?).
>
> Can you say a bit more about "negative precedence rules"?! A Google
> search
> only revealed the page on Avail syntax stating "There is also a
> facility for
> setting up precedence relations between operations (e.g., + and *) to
> assist
> with disambiguation of expressions." but that's all I could find.
>
> Thanks,
> - Andreas
>
>
>
More information about the Squeak-dev
mailing list
|