[ANN] Chuck type inferencer
cputney at wiresong.ca
Tue Jun 1 14:07:08 UTC 2004
On May 28, 2004, at 4:46 PM, Lex Spoon wrote:
> Tim Rowledge <tim at sumeru.stanford.edu> wrote:
>> "Lex Spoon" <lex at cc.gatech.edu> wrote:
>>> The major memory requirement, it turns out, is that AutoChuck is
>>> keeping around parse trees for every method. So I'm trying to either
>>> get rid of that requirement or make the parse trees take less memory.
>> It might be useful to recall one of Eliot's battle cries -
>> "CompiledMethods are flattened parsetrees"
That's an interesting position given his work on the AOStA bytecode
optimizer. My guess would be that the more you optimize the bytecodes,
the less they look like the parse tree that created them.
> It is very tempting to try and use the compiled methods themselves as
> the compressed parse trees. That would mean there is zero overhead
> "keeping around parse trees". However, I don't know if I will pursue
> this idea myself. There is one very annoying issue: the decompiler
> does not always produce *exactly* the parse tree that was compiled to
> begin with.
This is something I've been working on addressing for the next version
of OmniBrowser. Essentially, the next version of OB will operate on an
AST that begins with Package at the top and extends down through Class
and Method to sub-method parse nodes like MessageSend and Literal.
One of the goals with this is to get consistency between different
representations of the code, and so the AST is designed to be as
lossless as possible when converting back and forth between
representations. The SourcePrinter, for example, is essentially a null
pretty-printer; it reconstructs the source code almost exactly the way
the programmer entered it. (And this, incidentally, is the reason I
didn't just reuse the RB parse nodes. They don't preserve enough
information about the original source.)
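To illustrate the "null pretty-printer" idea, here is a minimal sketch (in Python, not the actual OmniBrowser code; all names are hypothetical): each token carries the whitespace that preceded it in the original source, so printing the tree reproduces the programmer's input byte for byte.

```python
# Hypothetical sketch of lossless round-tripping: AST nodes keep the
# concrete-syntax "trivia" (leading whitespace/comments) captured during
# parsing, so a "null" printer reconstructs the source exactly.
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Token:
    text: str            # the token itself, e.g. 'ifTrue:'
    leading: str = ""    # whitespace/comments that preceded it

@dataclass
class Node:
    kind: str                                        # e.g. 'MessageSend'
    children: List[Union["Node", Token]] = field(default_factory=list)

def print_source(item) -> str:
    """Null pretty-printer: concatenate trivia + token text in order."""
    if isinstance(item, Token):
        return item.leading + item.text
    return "".join(print_source(c) for c in item.children)

# A toy tree for the source "  3 + 4" round-trips exactly:
tree = Node("MessageSend", [
    Token("3", leading="  "),
    Token("+", leading=" "),
    Token("4", leading=" "),
])
assert print_source(tree) == "  3 + 4"
```

The design choice is simply to never discard formatting at parse time; a conventional pretty-printer can still be layered on top by ignoring the stored trivia.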
Some things I plan to do with this:
- Hook it up to the closure compiler and generate bytecodes from the AST
- Adapt the closure decompiler to generate an AST from bytecodes
- Create a binary serialization format. There is some interesting
literature on compressing ASTs. Essentially the abstract grammar is
used as a statistical model for an arithmetic encoder. This gives
better compression than, for example, gzipped bytecodes. This could be
a very nice way to distribute code - more compact and faster to load.
- Enhance Monticello. This representation of code is much richer than
Monticello's current list of definitions, and would let us implement
features
that are currently difficult or impossible. Tree-diffing will be an
important part of this, and might allow us to implement RB-like
features in Monticello. (For example, you could compare two versions of
a package and get a diff expressed in terms of refactorings: "Method A
extracted from B, Method X pushed up to class Y.")
- Allow structured annotations and additions to the AST. This could be
used for all sorts of things. Stéphane's idea of attaching type
information, for example, might be interesting in relation to Chuck.
So this could be useful for Chuck in terms of either decompilation from
byte codes or AST compression.
Going in the other direction, I'm interested in exploring ways that
tools like OmniBrowser and Monticello can take advantage of the type
information provided by Chuck. The thing that leaps immediately to mind
is inter-package dependencies. I bet it would be feasible to deduce the
minimal set of packages required to run, say, the unit tests for a
package. (And the minimal code from the "as yet unpackaged" base image.)
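The dependency deduction might work along these lines (a hypothetical Python sketch; the package names and data shapes are invented, standing in for the class references a type inferencer like Chuck could report):

```python
# Hypothetical sketch: given (a) the classes each package's methods are
# inferred to reference and (b) which package defines each class, compute
# the minimal set of packages needed to load a given package.

DEFINES = {                     # class -> defining package (made up)
    "TestCase": "SUnit",
    "OrderedCollection": "Collections",
    "Socket": "Network",
    "MCPackage": "Monticello",
}
REFERENCES = {                  # package -> classes its methods touch (made up)
    "Monticello": {"OrderedCollection", "Socket", "MCPackage"},
    "Network": {"OrderedCollection"},
    "Collections": set(),
    "SUnit": {"OrderedCollection"},
}

def required_packages(root):
    """Transitive closure over package dependencies implied by class use."""
    needed, todo = set(), [root]
    while todo:
        pkg = todo.pop()
        if pkg in needed:
            continue
        needed.add(pkg)
        for cls in REFERENCES.get(pkg, ()):
            todo.append(DEFINES[cls])
    return needed

assert required_packages("Monticello") == {"Monticello", "Network", "Collections"}
```

The interesting part is step (a): static type inference would catch references that a simple syntactic scan of literals and globals misses, such as classes reached only through message sends.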
I'm also imagining a panel in OB for static type analysis. It could be
a little like the test runner - buttons for analysing the code in
various scopes (method, class, package, etc.) and a list for displaying
errors. Click on an error and the browser jumps to the relevant method
and highlights the dangerous send. Or how about enhancing SLint with
type information?
All in all, I'm pretty excited by the possibilities that Chuck opens
up. Thanks Lex!