Literal arrays parsing

nicolas cellier ncellier at ifrance.com
Wed May 24 00:35:42 UTC 2006


On Wednesday 24 May 2006 01:16, Wolfgang Helbig wrote:
> Nicolas,
>
> you asked:
> >On Tuesday 23 May 2006 16:42, Noury Bouraqadi wrote:
> >> Hi,
> >>
> >> A stupid question: why does evaluating
> >> #("comment") lead to an empty array instead of an array with a single
> >> element #'"comment"'?
> >
> >This one is bad behaviour indeed, a side effect of the Scanner/Parser
> > internal implementation... (ASCII 30 being used to mean "end of
> > input").
>
> As long as it is "internal", I can't see anything wrong with it.
>

Hi Wolfgang,
Just try:
    (Compiler evaluate: '#') inspect.
and you will see this ASCII 30 dangerously leaking out of the internals...
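
A minimal workspace sketch of how such a leak can happen. This is a toy reader, not Squeak's actual Scanner; the names source, position, endMarker and next are made up for illustration. The only thing taken from the thread is that ASCII 30 serves as the "end of input" marker.

```smalltalk
"Toy reader, NOT the real Scanner: it answers ASCII 30 once the input is used up."
| source position endMarker next token |
endMarker := Character value: 30.
source := '#'.
position := 0.
next := [position := position + 1.
    position > source size
        ifTrue: [endMarker]
        ifFalse: [source at: position]].
next value.             "consumes the sharp sign"
token := next value.    "nothing is left, so the marker comes back like ordinary input"
token = endMarker.      "whoever builds a literal from token now holds the internal marker"
```

If the code that builds the literal after # does not check for the marker first, the marker itself becomes the content of the literal.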

If # alone were really valid syntax, then:
    (Compiler evaluate: '# inspect').
should inspect it...

It does not, because the space is just ignored:
    (Compiler evaluate: '# inspect') inspect.

The same goes for extra sharp signs:
    (Compiler evaluate: '# # # # inspect') inspect.

Do you agree with such behavior?

> >After #, I would expect a letter [a-zA-Z], a string quote ', or an
> > opening parenthesis (. Maybe a second # in the Dolphin Smalltalk extension...
> >
> >What else makes sense according to the formal definition of Smalltalk?
>
> According to the syntax diagrams in the Book (choose the book's color from
> blue, yellow or purple), the sharp character may occur as the first
> character of an array constant or a symbol constant. In these positions it
> is followed by a left parenthesis, if it marks an array constant, otherwise
> it marks a symbol constant and is followed by a letter, a special character
> or a minus character. Remember, special characters are the ones that make a
> binary selector.
>

Oh yes, I should not have forgotten... #* #-
In the latest Squeak, this also works with any number of special characters, like #***.
In VW you can have a ByteArray with #[ 0 0 ]
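
The rule quoted above from the syntax diagrams can be written down as a small workspace sketch. The block name classify, the result symbols, and the exact list of special characters (recalled from the Blue Book, so treat it as an assumption) are for illustration only; none of this is code from the Squeak Scanner.

```smalltalk
"Hypothetical helper: what may follow a sharp sign according to the syntax diagrams."
| specials classify |
specials := '+/\*~<>=@%|&?!,'.    "assumed list of the characters that make binary selectors"
classify := [:ch |
    ch = $(
        ifTrue: [#arrayConstant]
        ifFalse: [
            (ch isLetter or: [ch = $- or: [specials includes: ch]])
                ifTrue: [#symbolConstant]
                ifFalse: [#notInTheDiagrams]]].
classify value: $(.    "left parenthesis: array constant"
classify value: $*.    "special character: symbol constant"
classify value: $[.    "the VW ByteArray form is an extension, not in the diagrams"
```

Under this reading, a space, a second #, or the end of input after # all land in the last branch, which is where a syntax error would be expected.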

> Inside a string or a comment, the sharp character may be followed by any of
> the 95 graphic characters.
>
> And finally, inside a character constant, the sharp character may be
> followed by any character.
>

I do not understand this sentence. Isn't it the dollar sign that is used in
character constants?
Or do you mean inside a literal array like #( ^x:=y at z ), in which case each
character is interpreted as a single-character symbol...

For fun, note that Squeak does not complain when you write
# $a

> This holds for the language as defined formally by the syntax diagrams, but
> not for the Smalltalk programming language as described informally by the
> Blue Book, where "any character" may occur inside comments, strings and
> character constants, that is not only the graphic characters but ASCII
> control characters as well, like carriage return, horizontal tabulator or
> record separator which is ASCII 30.
>
> And this again differs from the language as accepted by the compiler in the
> V2 image of Smalltalk-80. For example, the ASCII 0 character inside a
> character constant gets you an index error. But this is another thread :-)
>

You mean using the ASCII value as an index into the scanner's character table?
I started with ST-80 v2.3 but just don't remember such details...
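
If that guess is right (it is only a guess, not something checked against the V2 sources), the failure would look like the sketch below. The names typeTable and lookup and the table size are assumptions; the point is only that Smalltalk arrays are indexed from 1, so an ASCII value of 0 is out of bounds.

```smalltalk
"Assumed mechanism: a character table indexed directly by ASCII value."
| typeTable lookup good bad |
typeTable := Array new: 256 withAll: #xDefault.
lookup := [:ch | typeTable at: ch asciiValue].
good := lookup value: $a.    "index 97, inside 1..256"
bad := [lookup value: (Character value: 0)]
    on: Error
    do: [:e | e return: #indexError].    "index 0 is outside 1..256"
```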

> Greetings,
> Wolfgang
>
> --
> Weniger, aber besser.

Nicolas



