Negative precedence rules (was RE: Why we should remove {} from Squeak)

Mark van Gulik ghoul6 at home.com
Tue Oct 2 05:19:15 UTC 2001


Yikes!  I scanned my site and, well, I guess I never wrote an HTML 
document about my negative precedence mechanism.  I'll remedy that soon, 
but for now I'll (briefly) describe the mechanism.

Method names in Avail can be decomposed into tokens that are either:
	(1) an alphabetic followed by zero or more alphanumerics (e.g., 
"foo19"),
	(2) a single punctuation character (excluding underscore and 
whitespace) (e.g., "+"),
	(3) a space character, which is only useful to separate 
alphanumeric tokens (e.g., "first name")
	(4) an underscore, which indicates where an argument will be at a 
message send.
Some example method names from Avail are:
	"_+_"            - infix addition
	"Print_"         - unary prefix, outputs argument to console.
	"cos_"           - prefix cosine
	"_!"               - postfix factorial
	"Halt"            - a handy 0-ary method
	"_[_]"            - tuple subscripting, like Squeak's #at:
	"_(_)"            - block invocation, like Squeak's #value:
	"_[_.._]"        - like Squeak's #copyFrom:to: (e.g., myTuple[1..3])
	"_<=_<=_"    - like #between:and:, but with arguments in the 
'natural' order
	"Method_is_"  - defines a multimethod -- takes a name and a (typed) 
block.

An Avail statement is either a declaration or a message send, terminated 
by a semicolon.  Every Avail expression is type-checked statically.  
Even though there may be many ways of parsing a given Avail 
subexpression, an Avail statement must be parseable in exactly one way 
that satisfies type-checking.

Consider the message "_+_".  If I don't specify any precedence rules, 
the expression "1+2+3" can be parsed in two different ways:  (1+2)+3 or 
1+(2+3).  Therefore, the statement "Print 1+2+3" would be invalid, and 
the compiler would indicate the ambiguity by showing the two possible 
parses.

For some messages, this is exactly the right behavior -- for example, it 
might be too misleading to allow exponentiation to be grouped 
right-to-left, so we could require all double-exponent expressions to be 
explicitly parenthesized.

Since we know integer addition is associative (NB: floating point 
addition might *not* be considered associative), we can choose to group 
them however we want.  Since Avail's interpreter is similar to 
Smalltalk's, we should choose a left-associative parse to minimize the 
context's stack depth.  To do so, we simply forbid the parse that we 
don't want, by using the "_can't have_" message:

	"_+_" can't have <{}, {"_+_"}>;

This says that if you parse a "_+_" send whose right side is 
another "_+_" send, you should eliminate that particular parse as 
invalid.  Ironically (given the original topic of this thread), I 
use "<_>" to construct tuples, and "{_}" to construct sets.  No magic 
required.  Heck, I don't even support negative integers -- I just send a 
positive integer the negation message "-_", as in "-5", or "-  5", or 
"-x".

In the above expression, the first "_+_" is the left argument of "_can't 
have_".  It indicates what message name should have a restriction 
imposed on its arguments.  The other argument of "_can't have_" is the 
tuple containing two sets, the first set being the restrictions to place 
on "_+_"'s left argument, and the second set being the restrictions to 
place on "_+_"'s right argument.  As you can see, we don't want the 
right argument of "_+_" to be another "_+_", which prevents "1+2+3" from 
being interpreted as "1+(2+3)", leaving just "(1+2)+3" as the only legal 
parse.

If we're not content with Presburger arithmetic, we can add the 
multiplication message "_*_".  The precedence declaration looks like:

	"_+_" can't have <{},      {"_+_"}>;
	"_*_" can't have <{"_+_"}, {"_*_", "_+_"}>;

That corresponds precisely with our usual computerese order of 
operations.  Avail supports addition, subtraction, multiplication, 
division, exponentiation (right-grouped), negation (with a precedence 
similar to multiplication), absolute value "|_|", factorial "_!", etc.  
In the case of absolute value, I disallow nested absolute values, which 
are almost certainly not intended:

	"|_|" can't have <{"|_|"}>;

Expression such as "e^-x^2/2" work out quite nicely.  Much more pleasant 
than Squeak's:
	"(x squared negated / 2) exp"

So far all of this could have been done with ordinary "positive" 
precedence rules.  The difference is in what was *not* said.  Say I have 
an "_ at _" message that constructs points.  Also assume I don't have 
automatic conversion of numbers into points or vice-versa.  Then both of 
the expressions:
	1+2 @ 3+4
	1 at 2 + 3 at 4
are perfectly valid Avail expressions that produce points (3 at 7 and 4 at 6, 
respectively).  I did not need to define a precedence relation 
between "_+_" and "_ at _", and in fact leaving it undefined was *exactly* 
the right answer.  Note that the usual scheme of assigning precedence 
numbers to operators is totally incapable of supporting both of the 
above expressions at the same time.  That's because if you use 
precedence numbers you'll get a full order, but if you use my negative 
precedence scheme you can get *any* precedence graph.

Besides simply keeping the parser's nose out of business it can't (and 
shouldn't be asked to) understand, it also keeps precedence modular.  My 
number arithmetic rules don't need to know about string concatenation or 
point construction, or association construction, or anything that's 
added next week.  Similarly, most of the time the point construction 
code couldn't care less about arithmetic or associations or any of 
that.  This satisfies a fundamental design axiom for me:  Loading a 
module should have no effect on a running system, other than making more 
things visible to a programmer.  *Installing* a module is another 
story...

While my solution in Avail may not be directly applicable to (the 
current state of) Squeak, I think the underlying premises are crucial to 
consider, even if for no other reason than to know what has been lost by 
a particular design choice.



On Monday, October 1, 2001, at 09:37 pm, Andreas Raab wrote:

> Mark,
>
>> That *would* also be the case in my language Avail
>> ( http://www.ericsworld.com/Mark/HTML/Avail.html ), but then
>> I invented  the concept of negative precedence rules.
>> It's probably worth a look for both Selfers and Squeakers.
>> Ordinary precedence rules in something big and flat like Squeak
>> or Self (ignoring the module systems for both) would be ludicrous
>> (e.g., which should bind tighter, "->" or ",", and why?).
>
> Can you say a bit more about "negative precedence rules"?! A Google 
> search
> only revealed the page on Avail syntax stating "There is also a 
> facility for
> setting up precedence relations between operations (e.g., + and *) to 
> assist
> with disambiguation of expressions." but that's all I could find.
>
> Thanks,
>   - Andreas
>
>
>




More information about the Squeak-dev mailing list