[Newcompiler] Optimizing compiled methods for performance [was: A day optimizing]

Wed May 30 08:02:31 UTC 2007

On 29.05.2007, at 23:38, Klaus D. Witzel wrote:

> [Stef, here's my attempt to sum up the discussion; history is at  
> the bottom]
>
> It all began when Damien asked for faster code for his Nile package  
> (so if necessary do blame him :) Nile makes heavy use of Traits and  
> features a crystal clear component plan. We exchanged a couple of  
> messages and each [message, pun intended] made the code run faster :)
>
> The Nile streams are now faster than the ST-80 (+ the Squeak hacks  
> which accumulated over time) streams and here are the results  
> Damien posted on Monday:
>
> #next is	 4% faster
> #nextPutAll: is	 9% faster
> #next: is	35% faster
> #nextPut: is	50% faster
>
> impressive. awesome. moves the same amount of data in less time  
> *and* has better features.
>

Very cool!

> When discussing which construct saved what bytecode, Stef asked,  
> would it be possible that having a smarter compiler we avoid to  
> write shit code like
>
> 	position := position + (size := 1)
>
> but write it in a readable way and the compiler does its job?
>
> Roel stepped in and said that's about the simplest thing any  
> optimizing compilers can do. He mentioned building a data-flow  
> graph [ed.: MIT's using that term] on which to do some form of  
> pattern matching. This is a huge active area of research but Roel  
> hasn't seen something in the context of blockclosures, for example.
>

Yes, there is lots of research on compiler optimizations. My brain  
has lots of knowledge buried somewhere from the compiler construction  
lectures... e.g. on SSA form and how to optimize using it.
(We do have an SSA Framework for Squeak, called "AOStA").

> Then I threw in concerns about acceptability: the debugger must know  
> about optimizations (for example when a temp var was optimized  
> away, as we did for the Nile streams). Debugger and compiler must  
> be kept in sync when doing clever optimizations.
>
> Roel sketched three things to be associated with an optimized method:
> 1- Optimized bytecodes that are interpreted,
> 2- A non-optimized format that is used when debugging (this could  
> be (unoptimized) bytecodes but will most likely be the  
> meta-information for the optimized interpreted form),
> 3- A string with the source code.
>
> Other items mentioned by Roel like the use of AST and Rewriteable  
> Reference Attribute Grammars (see Ecoop'04 paper), a grammar format  
> for specifying rewrites of ASTs, can be found in the history  
> section further down.

I will check that.
>
> Markus suggested to keep ASTs (instead of non-optimized version as  
> bytecode) as meta data for the debugger's job and pointed us to
>
> - http://www.iam.unibe.ch/~scg/Research/Reflectivity/
>
> and mentioned that it gets extremely tricky when merging methods,  
> e.g. inlining.
>

So my overall feeling is that as soon as we plan to give up a simple  
one-to-one mapping between Bytecode and Text, we should not use
the Bytecode in the debugger directly anymore. Right now, the  
Debugger interprets bytecode, but this is not needed. It could interpret
an AST instead.
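
To make "interpret an AST instead" concrete, here is a minimal sketch in Python rather than Squeak, just to keep it self-contained. The node classes (Lit, Var, Add, Assign) and the trace list are invented for illustration; they are not the NewCompiler's node classes. Add stands in for a binary message send such as #+.

```python
from dataclasses import dataclass


@dataclass
class Lit:
    value: object


@dataclass
class Var:
    name: str


@dataclass
class Add:  # hypothetical stand-in for a binary message send such as #+
    left: object
    right: object


@dataclass
class Assign:
    name: str
    value: object


def evaluate(node, temps, trace):
    """Evaluate node against temps, recording every visited node kind in trace."""
    trace.append(type(node).__name__)
    if isinstance(node, Lit):
        return node.value
    if isinstance(node, Var):
        return temps[node.name]
    if isinstance(node, Add):
        left = evaluate(node.left, temps, trace)
        right = evaluate(node.right, temps, trace)
        return left + right
    if isinstance(node, Assign):
        temps[node.name] = evaluate(node.value, temps, trace)
        return temps[node.name]
    raise TypeError(f"unknown node: {node!r}")


# position := position + 1
tree = Assign("position", Add(Var("position"), Lit(1)))
temps = {"position": 41}
trace = []
result = evaluate(tree, temps, trace)
```

Because every step is a visit to a source-level node, a debugger built this way could highlight and single-step at node granularity without ever looking at the bytecode.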

Then we do not need to map the optimized code to the old one in the  
same way. There is only one problem to solve: entering the debugger.
When e.g. an exception occurs, the debugger needs to debug a  
method, so it needs to re-create a stack that looks like a non-optimized
stack. This is tricky in some cases:

-> e.g. when methods get inlined, there is only one stack frame in  
the vm for many conceptual ones.
     ("Dynamic Deoptimization", see Self)

-> In a method the compiler could have optimized away variables. How  
can we support debugging those methods? Can we reconstruct
     the state of those variables?
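
One possible answer, sketched in Python under a strong assumption: for every temp it removes, the compiler records a "recipe" that recomputes the value from temps that are still live. The names reconstruct_temps and recipes are hypothetical, and this only works at points where the inputs of the recipe are still available and unchanged.

```python
def reconstruct_temps(live_temps, recipes):
    """Return the temps as the unoptimized method would see them.

    live_temps: temps that still exist in the optimized frame.
    recipes:    hypothetical compiler-recorded functions, one per
                eliminated temp, recomputing it from the live temps.
    """
    frame = dict(live_temps)
    for name, recipe in recipes.items():
        frame[name] = recipe(live_temps)
    return frame


# Unoptimized:  sum := a + b.  ^ sum * 2
# Optimized:    ^ (a + b) * 2        (the temp 'sum' is gone)
recipes = {"sum": lambda live: live["a"] + live["b"]}
frame = reconstruct_temps({"a": 3, "b": 4}, recipes)
```

If an input has been overwritten by the time we enter the debugger, no recipe can help; the optimizer would then have to keep the value alive or give up on that optimization.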

The points in the program flow where we can enter the debugger and  
where state reconstruction is needed are not many. Squeak checks for
interrupts only on message sends and backwards-jumps.

So, to sum up: To support real optimization, we need a System that  
has the un-optimized code available. This un-optimized version
should be as high-level as possible and it should support meta-data  
*per node*, not only per method.
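
As a rough picture of what "meta-data per node" could look like, here is a Python sketch. All names (Node, MethodRecord, nodes_with, the "source_range" and "safepoint" keys) are assumptions made up for this example, not an existing API. Each node carries its own annotations, and the optimized code stays opaque to the debugger.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    meta: dict = field(default_factory=dict)  # per-node annotations


@dataclass
class MethodRecord:
    source: str            # the source text
    ast: Node              # the un-optimized, high-level form
    optimized_code: bytes  # whatever the optimizer produced; opaque here


def nodes_with(root, key):
    """Collect, in pre-order, all nodes whose meta-data contains key."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if key in node.meta:
            found.append(node)
        stack.extend(reversed(node.children))
    return found


source = "position := position + 1"
read = Node("var", meta={"source_range": (12, 20)})
one = Node("literal", meta={"source_range": (23, 24)})
# The send is marked as a point where the debugger may be entered.
send = Node("send", [read, one], meta={"source_range": (12, 24), "safepoint": True})
assign = Node("assign", [send], meta={"source_range": (0, 24)})
record = MethodRecord(source, assign, optimized_code=b"")

safepoints = nodes_with(record.ast, "safepoint")
start, stop = safepoints[0].meta["source_range"]
highlighted = record.source[start:stop]
```

With the annotations on the nodes, the debugger can go from "we stopped at this send" to the exact source text without a bytecode-to-text map.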

Bytecode in such a system is just like binary code in a JIT compiler:  
hidden from the programmer to some extent, and the debugger does
not care. If the VM has a JIT, such a system would not need to bother  
with bytecode at all.

	Marcus
