[ANN] Chuck type inferencer

Tue Jun 8 16:15:17 UTC 2004

On Jun 7, 2004, at 5:58 PM, Bryce Kampjes wrote:

> Colin Putney writes:
>>
>> Yes, "Parse trees are structured CompiledMethods". I'd love to see
>> Squeak executing ASTs with the hotspots optimized by Exupery.

> An AOStA style approach with two levels of byte-code could make
> a lot of sense. The first bytecode is highlevel, the flattened
> AST, basically Squeak bytecode without jumps so no inlining of
> conditional (ifTrue:ifFalse: etc) and loop control flow methods.
> With a second low level bytecode that's closer to Squeaks current
> full byte-code, with maybe a type check bytecode to make inlining
> fast, to inline down to all done at the Smalltalk above bytecode
> level could provide a workable fast execution engine for what, I
> think, you are describing.

Well, not quite. I'm actually suggesting that the abstract syntax tree 
would be executed directly. The interpreter would traverse the tree, 
updating the state of the activation context as it goes. This would be 
slower than bytecode interpretation, but not unreasonably so - this is 
how the Ruby interpreter works, for example.

> The double bytecode inlining interpreter would, probably, be faster
> for normal code than our current engine. Sends are very expensive so
> removing them should provide a decent speed improvement. It's a
> design, as Marcus suggested, that is very much in the AOStA spirit,
> with both a high level bytecode and a low level bytecode.

I just had a bizarre idea. An AST interpreter with PICs!

PICs actually make even more sense in an AST, since all you have to do 
is point to the method you're inlining. No need to rewrite the send 
site, and deoptimization is easier.

The dual byte code idea is certainly workable, and would probably be an 
improvement over the status quo, but I don't think AOStA would be ideal 
for Squeak. Optimizing bytecode makes a lot of sense in VW, where they 
have a non-optimizing JIT compiler to native code at the VM level. But 
let's go Blue Sky here. Let's ignore, for the moment, that the only 
thing we actually have working right now is a somewhat optimized 
byte-code interpreter, and imagine that we have the resources to 
implement whatever we want.

The simplest, cleanest, most OO, way of executing code would be to have 
the VM execute an AST directly. This eliminates the need for a compiler 
- a parser is sufficient to translate source into something executable.
Now that's kind of slow, so let's optimize the code that gets run 
frequently. The most effective way would be to compile to native code. 
Bytecode is a nice compromise, in that it's fairly high-level 
semantically, and fairly efficient to execute, but if you're going to 
introduce multiple levels of executable forms, why not differentiate 
them more?

In general I like the idea of progressive levels of optimization, 
triggered by an execution counter for each method. The first few rounds 
of optimization would be tree-based: PICs, dead code removal, whatever 
can be done at the semantic level that the AST represents. Then at some 
level, native code generation kicks in, and we generate machine code 
that reflects the optimizations we've already done. As the execution 
counter hits new thresholds, we progressively optimize that until we 
run out of ways to speed things up.

> Of course it would be nice to use a compiler that went all the way
> down to machine code but it's always best to know that you can succeed
> without relying on other projects meeting their milestones. A backup
> plan doesn't need to be executed, it just needs to reduce the total
> risk. Once everything is done and working together there is no risk
> left.

Well, the backup plan is the status quo. The bytecode interpreter we 
have is fast enough for most purposes, and we can certainly improve the 
dev tools, implement slim binaries for distribution, do type inference 
and so on without changing that.

And frankly, this is all just speculation on what might be possible. My 
priority is to make OmniBrowser and Monticello ready for production use 
with features that exploit the AST I've been working on. I figure that 
will take will take me well into next year. I may get into messing with 
the interpreter after that.

> How is AOStA going? Is there a decent bytecode level inliner that
> could be borrowed for other projects?

Markus mentioned off list that he had translation to SSA form and back 
more-or-less working, but I don't think it does inlining.

> Bryce
> Who does hope, and is working to make, Exupery ready before it
> is needed for your work.

Bryce, please don't mistake my enthusiasm about Exupery for pressure to 
get it done faster. I know you only have so much spare time, and 
Exupery is a non-trivial project, to say the least. Not only that, but 
it's a lot further along than any of the stuff I've been talking about 
in this thread.

So consider this congratulations and encouragement, rather than 
pressure.

Colin