New subject: Structure of objects and execution (with effects on visual programming, and reversible debugging, )

25 Oct 1998

      Hi Dan,
That'll teach me not to throw comments about randomly ;-). What I
actually have in mind is something between an abstract syntax tree (with
symbol table) and a compiled method. Create the tree (no transformations
or optimizations done in the parsing to create the AST - not even for
#ifTrue: and kin) and "flatten" it into a RPN representation encoding
the node type and and corresponding identifier string symbol into
bytecodes. Actually I guess it _is_ a simple compression scheme --
because you would have a array of identifier strings, each string with a
corresponding bytecode for its index. The pseudo-compiled code would
simply be a bytearray of node-type/array-index "pairs" (ignoring arrays
larger than 256 for the moment). The nice thing would be that this
representation is independent of implementation and it is completely
shareable -- all info is included for both compilation and for source
reconstruction and could be transmitted reasonably efficiently. It
wouldn't matter if you were compiling to another bytecode set, native
code, threaded code,.... This representation could also be used in the
image for a type of "lazy compilation", where the core methods and
classes of Squeak are fully compiled, but the rest could be represented
in this form, to be fully compiled only when finally sent a message.
On comments: if comments were considered as a peculiar type of language
construct rather than just another form of whitespace, then it could be
inserted into the tree along with the rest of the nodes; the
CommentNodes could act as placeholders and be placed at the proper
position in the tree for source reconstruction.
Since there are fewer that 16 node types, an encoding of NNNNIIII
IIIIIIII could be used for up to 4K identifiers in a single method
(yuck!).
Just some vague notions I've been rolling around a bit. I'm sure there
are a number of gotchas to doing it this way that I haven't thought of.
-- DWight
Dan Ingalls wrote:
...
Dwight Hughes dwighth@ipa.net wrote ...
...
He is wanting to improve the prettyPrinting
to the point that it would be the preferred option (and be customizeable
of course) -- allowing most of the source to be drastically reduced in
size, down to basically parse trees.
Just to be clear about a couple of things...
The prettyPrinting with comments would save space because METHODS are extremely compact; not because Parse Trees are;  they are not.  In fact I believe parse trees are the worst of the three (text, parse trees, methods) in compactness.  What's nice is that the decompiler can build parse trees from methods, and those, in turn can (re)generate source code.
...and that's not to say I'm opposed to working with parse trees either.  The original Fabrik (and ThingLab before it) built parse trees from its graphical structure and then generated code from the result.
    - Dan

P.S. There are other nice things about parse trees, such as being able to execute them for type.  A simple type inference mechanism can be built this way.  One should be able to execute the methods themselves for type, but the compilation of ifTrue: and the other optimized control constructs prevents it.

Re: Structure of objects and execution (with effects on visual programming, and reversible debugging, )