[squeak-dev] Re: Deltastreams update

Göran Krampe goran at krampe.se
Fri Mar 13 14:44:03 UTC 2009


Hi!

Ralph Johnson wrote:
> On Thu, Mar 12, 2009 at 4:44 PM, Göran Krampe <goran at krampe.se> wrote:
>> ...so while not exactly Smalltalk code that can be fed to the regular
>> Compiler it would still be parsed and executed like a series of #perform: to
>> a builder object. It would be fast (modulo speed of #perform:), secure (you
>> can't run arbitrary code) and avoids limits of Compiler.
> 
> With a little care, a format could be both normal Smalltalk code AND
> something that would be easy to parse.  For example, it could have
> only keyword messages and strings, perhaps boolean literals and
> integers, but no binary messages, assignment, or array literals.
> Thus, you could first implement a parser by just reading in the string
> and evaluating it, and then you could build a real parser.  This would
> make it easy to develop test-first, since you can focus first on
> writing out the objects, and use the trivial way of reading them in to
> test it.

I have just implemented this little parser, calling it "Tirade". It
parses a small subset of Smalltalk that differs only in not having a
receiver to the left, though that could easily be added. I also did a
quick and dirty benchmark: it is about 3-4 times faster than Compiler,
with no real profiling done yet. I implemented brace arrays (but of
course no expressions are allowed inside them) and also associations.
And yes, there are tests. :)

Let me include the current class comment here which describes it:

Tirade - a long angry speech or scolding. Synonyms: diatribe, harangue, rant

Tirade is a fast parser for a "bastard subset of Smalltalk" that is 
intended for file formats.

The concept is that Tirade parses the input stream which consists of a 
sequence of Smalltalk messages with literals as arguments - expressions 
are not allowed. These messages are simply sent to a builder object 
supplied by you. Tirade uses the return value from the builder as the 
receiver of the next message, which means you can partition your 
protocol over multiple builders if needed.

Tirade is almost a strict subset of Smalltalk BUT there is no receiver 
to the left. The receiver is the builder according to the above logic.
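
As a rough illustration of the dispatch rule described above (a Python
sketch, not the actual Tirade code in Squeak), each parsed message can be
held as a (selector, arguments) pair and sent to whatever object the
previous message returned. The class names HeaderBuilder and BodyBuilder,
the feed function, and the mapping of 'protocol:version:' to a Python
method named protocol_version_ are all assumptions made for this example.

```python
class BodyBuilder:
    """Hypothetical second builder that handles messages after #start."""

    def __init__(self, log):
        self.log = log

    def protocol_version_(self, name, version):
        self.log.append(("protocol:version:", name, version))
        return self  # keep receiving the following messages


class HeaderBuilder:
    """Hypothetical first builder, the one supplied by the caller."""

    def __init__(self):
        self.log = []

    def start(self):
        self.log.append(("start",))
        return BodyBuilder(self.log)  # hand over to another builder


def python_name(selector):
    # Assumed naming convention: 'protocol:version:' -> 'protocol_version_'
    return selector.replace(":", "_")


def feed(builder, messages):
    """Send each message to the object returned by the previous send."""
    receiver = builder
    for selector, args in messages:
        receiver = getattr(receiver, python_name(selector))(*args)
    return receiver


builder = HeaderBuilder()
last = feed(builder, [("start", []), ("protocol:version:", ["alpha", 23])])
print(type(last).__name__, builder.log)
```

Because only methods that exist on the current receiver can be reached,
the input cannot run arbitrary code, which is the security property
mentioned earlier in the thread.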

The following example shows all allowed constructs which include:
- Unary and keyword messages (no binary) without receiver. No cascades. 
Period mandatory.
- nil/true/false pseudo variables. No thisContext, self or super.
- String and Integer literals. No scientific notation. Single quotes are 
doubled.
- Brace arrays of above, including nesting.
- Associations between the above.
- Smalltalk comments, but only between messages.
- Whitespace just like in Smalltalk.

This more or less matches the capability of JSON I think. Example input:

"You can use Smalltalk comments in the input, but only on its own line!"
"#start will be sent to builder, receiver is not written out, note 
period at end."
start.

"Keyword message using String and Integer."
protocol: 'alpha' version: 23.

"Strings follow normal Smalltalk escape rules, whitespace before, after, 
in between is ok."
	author:                 'Joe ''the tiger'' Schmoe'.

"true, false, nil are fine to use."
humpty: true dumpty: false sat: nil on: 'a wall'.

bracearray: {
	'asdasd'->123. {12. 34}->'asdasd'.
	123. true. false. nil. {'123123'.-123}}.
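
To make the grammar above concrete, here is a minimal sketch of a parser
for this subset, written in Python rather than Squeak. It is not the
Tirade implementation; the names TOKEN, Parser and parse are invented for
the example. Two simplifications are assumed: comments are skipped
wherever they appear (Tirade only allows them between messages), and
associations are returned as (key, value) tuples.

```python
import re

TOKEN = re.compile(r"""
    \s+ | "[^"]*"                # whitespace and Smalltalk comments (skipped)
  | (?P<str>'(?:[^']|'')*')      # string; a doubled quote is an escaped quote
  | (?P<int>-?\d+)               # integer, no scientific notation
  | (?P<kw>[A-Za-z_]\w*:)        # keyword part such as version:
  | (?P<id>[A-Za-z_]\w*)         # unary selector or nil/true/false
  | (?P<op>->|[{}.])             # association arrow, braces, period
""", re.VERBOSE)
LITERALS = {"nil": None, "true": True, "false": False}


class Parser:
    def __init__(self, text):
        self.tokens, self.i, pos = [], 0, 0
        while pos < len(text):
            m = TOKEN.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character at offset {pos}")
            if m.lastgroup:  # None for whitespace and comments
                self.tokens.append((m.lastgroup, m.group()))
            pos = m.end()

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def expect(self, value):
        text = self.take()[1]
        if text != value:
            raise SyntaxError(f"expected {value!r}, got {text!r}")

    def primary(self):
        kind, text = self.take()
        if kind == "str":
            return text[1:-1].replace("''", "'")
        if kind == "int":
            return int(text)
        if kind == "id" and text in LITERALS:
            return LITERALS[text]
        if text == "{":
            items = []
            while self.peek()[1] != "}":
                items.append(self.value())
                if self.peek()[1] != ".":
                    break
                self.take()
            self.expect("}")
            return items
        raise SyntaxError(f"unexpected token {text!r}")

    def value(self):
        v = self.primary()
        while self.peek()[1] == "->":  # association between two values
            self.take()
            v = (v, self.primary())
        return v

    def messages(self):
        out = []
        while self.peek()[0] is not None:
            kind, selector = self.take()
            args = []
            if kind == "kw":
                args.append(self.value())
                while self.peek()[0] == "kw":
                    selector += self.take()[1]
                    args.append(self.value())
            elif kind != "id":
                raise SyntaxError(f"expected a selector, got {selector!r}")
            self.expect(".")  # the period is mandatory
            out.append((selector, args))
        return out


def parse(text):
    return Parser(text).messages()


SAMPLE = """
start.
"Keyword message using String and Integer."
protocol: 'alpha' version: 23.
author: 'Joe ''the tiger'' Schmoe'.
bracearray: {'asdasd'->123. {12. 34}->'asdasd'. nil. {'123123'.-123}}.
"""
for selector, args in parse(SAMPLE):
    print(selector, args)
```

Each resulting (selector, arguments) pair is what would then be sent to
the builder, one message at a time.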



