Re: Animorphic ST (Strongtalk) released!

20 Jul 2002


      On Thu, Jul 18, 2002 at 03:03:30PM -0400, Stephen Pair wrote:
...
> First, are you talking about adding JIT to Squeak?  Or something
> more?  What does Jitter do?  How does it relate to this discussion?
The idea of Jitter is to translate *every* method before it is run.
The good thing about this is that you can retire the Interpreter altogether
(J3 replaces the interpret()-loop) and so you don't have to think about
synchronizing the state of the Interpreter and the Jit-compiler (which
was one of the main difficulties of the old threaded-code Jitter, if
I understood Ian correctly).
The problem with this approach is that the compile time of the Jitter has
to be short: Many methods (especially during startup) are not called very
often, so the time you invest in generating good code is actually wasted.
This is especially true for startup: You'd have to compile *lots* of stuff
before the first thing happens on the screen.
So such a compiler can't optimize the code very much, it simply has no
time to do complex things.
Tim's approach allows you to postpone translation until there is time to
do it. He suggested having a single flag in each method, but you could
actually do more: 1) do profiling to find the hotspots and 2) collect some
type information.
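
To make the "flag plus profiling" idea a bit more concrete, here is a rough
sketch in Python (not Squeak code). Everything in it is made up for
illustration: the Method class, the HOT_THRESHOLD value and the translate()
stand-in are assumptions, not anything from Jitter or from Tim's proposal.

```python
class Method:
    """Hypothetical method object: runs untranslated until it gets hot."""

    def __init__(self, name, body):
        self.name = name
        self.body = body          # a Python callable stands in for the bytecodes
        self.calls = 0            # profiling counter (goes beyond a single flag)
        self.translated = False   # the per-method flag


HOT_THRESHOLD = 3  # assumed value, purely illustrative


def translate(method):
    # Stand-in for the expensive compiler; here it only sets the flag.
    method.translated = True


def invoke(method, *args):
    # Count the call, and translate only once the method has proven to be hot.
    method.calls += 1
    if not method.translated and method.calls >= HOT_THRESHOLD:
        translate(method)
    return method.body(*args)


square = Method("square", lambda x: x * x)
results = [invoke(square, n) for n in range(5)]

startup_only = Method("startup_only", lambda: "done")
invoke(startup_only)
```

A method that runs once at startup never reaches the threshold, so no
compile time is spent on it; only square gets translated.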
All "modern" systems seem to work somewhat like that: They have a *very*
fast compiler (very much like J3, maybe even generating worse code) that
is called when a method is run for the first time. This compiler adds profiling
code and uses PICs (Polymorphic Inline Caches), which collect
information about what types are actually used at runtime.
Then a slow, highly optimizing compiler is called for all hotspots, that is,
methods that are run very often. This compiler can use the type information
collected in the PICs to do very clever optimizations that are not even
possible with a static (e.g. C++) compiler.
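
As a rough illustration of how a PIC doubles as a type profile, here is a
small Python sketch. It is not how any real VM implements it: the
InlineCache class, its max_entries limit and the observed_types() helper are
hypothetical names, and a real PIC is a patched machine-code stub rather
than a dictionary.

```python
class InlineCache:
    """Hypothetical per-call-site cache: receiver class -> looked-up method."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector
        self.max_entries = max_entries  # assumed limit, illustrative only
        self.entries = {}               # class -> function found by lookup
        self.hits = {}                  # class -> count (the type information)

    def send(self, receiver, *args):
        cls = type(receiver)
        target = self.entries.get(cls)
        if target is None:
            # Cache miss: do the full (slow) lookup and remember the result.
            target = getattr(cls, self.selector)
            if len(self.entries) < self.max_entries:
                self.entries[cls] = target
        self.hits[cls] = self.hits.get(cls, 0) + 1
        return target(receiver, *args)

    def observed_types(self):
        # Most frequent receiver classes first; an optimizing compiler could
        # inline the method of the dominant class behind a type check.
        return sorted(self.hits, key=self.hits.get, reverse=True)


class Circle:
    def area(self):
        return 3


class Square:
    def area(self):
        return 4


site = InlineCache("area")
shapes = [Circle(), Circle(), Square(), Circle()]
total = sum(site.send(shape) for shape in shapes)
```

After these sends, site.observed_types() lists Circle before Square, which
is exactly the kind of information a static compiler never gets to see.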
...
> Also, once a method is compiled to machine code, why would there be any
> need to keep the bytecodes around (as Tim describes)?  Can't you just
> throw away the old bytecodes at that point?  Ah...but perhaps it's so
> that the image can remain portable to different architectures...is that
> it?
Bytecodes... I don't like bytecodes. Actually they are a bad idea if
you are generating native code: A modern jit-compiler needs a representation
of the code that is exactly the one that the Smalltalk compiler had just
before it generated the bytecodes.
So the first thing a Jit does is decompiling... it has to spend time
to regenerate stuff we already had.
So IMHO it would be interesting to look at other representations for
compiled methods than bytecodes: Something very simple, no optimizations
(like ifTrue: inlining, special sends, etc.). Just a serialised AST.
But Squeak of course needs an interpreter: Why not treat the Interpreter
as just another target for the JIT-Compiler? Then every new optimization
we implement for the compiler would even help to make Squeak run
faster in interpreted mode (actually, almost all optimizations could work
for an interpreter, even Inline Caches, inlining of sends, etc.).
This would essentially allow us to "late bind" the real-world Interpreter
and change the bytecodes used from one release to another.
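
Here is a toy Python sketch of what I mean by "AST as the stored form,
bytecodes as just one target". The node layout (nested tuples), the OPS
table and the two-instruction stack machine are all invented for this
example; nothing here corresponds to the real Squeak bytecode set.

```python
# AST nodes as nested tuples: ("lit", value) or ("send", selector, receiver, arg).
# This one stands for 3 + (4 * 5), with the grouping explicit in the tree.
AST = ("send", "+", ("lit", 3), ("send", "*", ("lit", 4), ("lit", 5)))

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def evaluate(node):
    """Target 1: walk the tree directly (very simple, but slow)."""
    if node[0] == "lit":
        return node[1]
    _, selector, receiver, arg = node
    return OPS[selector](evaluate(receiver), evaluate(arg))


def to_bytecode(node, out=None):
    """Target 2: flatten the tree into instructions for a stack machine."""
    out = [] if out is None else out
    if node[0] == "lit":
        out.append(("push", node[1]))
    else:
        _, selector, receiver, arg = node
        to_bytecode(receiver, out)
        to_bytecode(arg, out)
        out.append(("send", selector))
    return out


def run(code):
    """A minimal interpreter for the generated instructions."""
    stack = []
    for op, operand in code:
        if op == "push":
            stack.append(operand)
        else:
            arg = stack.pop()
            receiver = stack.pop()
            stack.append(OPS[operand](receiver, arg))
    return stack.pop()


code = to_bytecode(AST)
```

Both evaluate(AST) and run(code) give the same answer, and the instruction
format produced by to_bytecode() could change between releases without
touching the stored AST.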
The clean and simple representation for code in compiled methods could
maybe be used for other things, too: Maybe this could be a nice thing
for Alan's ideas of a very clean Squeak Kernel. (An interpreter for
that would be very simple, but very slow). And the eToy and omniuser
tiles could be generated from this representation... maybe we could even
think about throwing away the source (:-)).
just some strange ideas...
Marcus
-- 
Marcus Denker marcus@ira.uka.de  -- Squeak! http://squeakland.org