[squeak-dev] Cog VM status update
andreas.raab at gmx.de
Wed Dec 17 03:30:30 UTC 2008
I just read Eliot's most recent blog post about the Cog VM and it
reminded me how difficult it must be for others to see where this
project stands. So here is a bit of an update on the current status:
As you may recall, we started this project in spring this year by hiring
Eliot for the express purpose of building us a new VM that would speed
up execution of our products. We decided to structure the work into
stages, at the end of each of which there would be a tangible
deliverable, i.e., a new VM that could be run and benchmarked.
The first stage in this process is what we call the "Closure VM". It is
nothing more (and nothing less) than a Squeak VM with closures and the
required support (compiler, decompiler, debugger etc). Given past
experience, we had originally expected this stage to cost us some speed
(up to 20% was estimated), since closure support has a cost which at
that stage is hard to offset with other improvements. However, thanks to
a truly ingenious bit of engineering by Eliot in the design of the
closure implementation, the resulting speed difference was negligible.
Since there was no speed penalty we decided to ship it earlier than we
had originally anticipated, and the Closure VM has been the regular
shipping VM with Qwaq products since September this year.
The second stage in the process is the "Stack VM". It is a Closure VM
that executes on the native stack and transparently maps contexts from
and to stack frames as required. The VM itself is still an interpreter
so any speed improvements come purely from the more efficient
organization of the stack layout (no allocations, overlapping frames
etc). For those of you who have been around long enough, it is
equivalent to what Anthony Hannan did a few years ago, except that it
hides the existence of the native stack entirely and gives the
programmer the naive view of just dealing with linked frames (contexts).
Eliot's original expectations for the resulting speedups were a little
higher than what we've seen in practice, but the results are in line
with those Anthony got: approx. 30% improvement across the board in
macro benchmarks. The work on the Stack VM was completed last month; we
are currently rolling it out internally, and the next product release
will ship with the Stack VM.
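The frame organization described above can be sketched in C (the
language the Squeak VM is ultimately generated into). This is a
hypothetical toy, not the actual Stack VM code: it shows how, on a
contiguous value stack, a callee's frame can overlap the caller's
outgoing arguments, so activating a method needs neither a heap
allocation nor any argument copying, which is where the interpreter's
speedup comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: a contiguous value stack where a callee's frame
 * overlaps the caller's outgoing arguments, in contrast to heap-allocated
 * Context objects, which require an allocation per send plus copying. */

#define STACK_SIZE 256

typedef struct {
    long slots[STACK_SIZE];
    long *sp;   /* top of stack (next free slot) */
    long *fp;   /* base of the current frame */
} VmStack;

static void vm_init(VmStack *s) { s->sp = s->slots; s->fp = s->slots; }

static void vm_push(VmStack *s, long v) { *s->sp++ = v; }

/* "Send": the new frame starts at the first pushed argument, so the
 * caller's outgoing arguments become the callee's incoming ones in place. */
static long *vm_activate(VmStack *s, int argc) {
    long *oldFp = s->fp;
    s->fp = s->sp - argc;                  /* frame overlaps the arguments */
    vm_push(s, (long)(oldFp - s->slots));  /* save caller's fp for return */
    return s->fp;
}

static void vm_return(VmStack *s, int argc, long result) {
    long savedFp = s->fp[argc];  /* slot pushed by vm_activate */
    s->sp = s->fp;               /* pop the whole frame, args included */
    s->fp = s->slots + savedFp;
    vm_push(s, result);          /* leave the result for the caller */
}
```

Mapping such frames back to Context objects on demand (for the debugger,
for instance) is the part the real Stack VM makes transparent; this
sketch omits it entirely.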
The third stage, which has just begun, is what we call the "Simple JIT
VM" (well, really it doesn't have a name yet, I just made it up ;-) Its
focus is send performance (as we see send performance as the single
biggest current bottleneck). It will sport a very simple JIT with inline
caches, the idea being to bring send performance up to the point
where it's no longer the single biggest bottleneck, then measure
performance again and figure out what the next best target is. I am not
going to speculate on performance (we have been wrong every single step
of the way ;-) but both Eliot and I do think that we'll see some nice
improvements in application performance here.
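To make the inline-cache idea concrete, here is a minimal sketch in C of
a monomorphic inline cache. It is an illustration of the general
technique, not the Cog design: each send site remembers the receiver
class from its last send, and when the class matches, the cached method
runs without any method lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a monomorphic inline cache. All names and the
 * 4-slot method table are illustrative only. */

typedef long (*Method)(long receiver);

typedef struct {
    const char *name;
    const char *selectors[4];  /* tiny fixed-size method table */
    Method methods[4];
} Class;

typedef struct {
    Class *cachedClass;   /* NULL until the first send from this site */
    Method cachedMethod;
    long hits, misses;
} SendSite;

/* Slow path: linear lookup in the class's method table. We assume the
 * receiver understands the selector, so no doesNotUnderstand handling. */
static Method lookup(Class *cls, const char *selector) {
    for (int i = 0; i < 4 && cls->selectors[i]; i++)
        if (strcmp(cls->selectors[i], selector) == 0)
            return cls->methods[i];
    return NULL;
}

static long send(SendSite *site, Class *cls, long receiver,
                 const char *selector) {
    if (site->cachedClass == cls) {  /* inline-cache hit: no lookup */
        site->hits++;
        return site->cachedMethod(receiver);
    }
    site->misses++;                  /* miss: full lookup, then refill */
    site->cachedClass = cls;
    site->cachedMethod = lookup(cls, selector);
    return site->cachedMethod(receiver);
}

/* An example "compiled method" a class might hold. */
static long doubleIt(long receiver) { return receiver * 2; }
```

In a real JIT the cache test and call are emitted as a couple of machine
instructions at the send site itself, which is why it attacks send
performance so directly.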
The fourth stage is a bit more speculative at this point, because the
concrete direction depends on what the results of stage 3 show the new
bottleneck to be. We have various candidates lined up: very high
on the list is a delayed code generator which can dramatically improve
the code quality. Next to it are changes in the object format moving to
a unified 32/64bit header model which would dramatically simplify some
tests for inline caching and primitives etc. However, since this work is
driven by product performance, it is possible (albeit unlikely at this
point) that the focus might shift towards FFI speed or float inlining.
There is no shortage of possible directions, the main issue will be to
figure out what the bottlenecks at that point are and how to address
them most efficiently.
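Since stage four is still speculative, so is this sketch: it only
illustrates why a unified header model can simplify the tests inline
caches and primitives perform. With the class identified by an index
packed at a fixed position in a single header word, the class check
becomes one mask-and-compare on both 32- and 64-bit images. The field
widths and positions below are made up for illustration, not any actual
Cog layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unified object header: one 64-bit word packing a class
 * index, identity hash and format bits at fixed positions. All widths
 * are illustrative, not a real Squeak/Cog header layout. */

#define CLASS_INDEX_BITS 22
#define CLASS_INDEX_MASK ((1ull << CLASS_INDEX_BITS) - 1)

static inline uint64_t makeHeader(uint32_t classIndex, uint32_t hash,
                                  uint32_t format) {
    return (uint64_t)(classIndex & CLASS_INDEX_MASK)
         | ((uint64_t)(hash & 0x3fffff) << 22)
         | ((uint64_t)(format & 0x1f) << 44);
}

static inline uint32_t classIndexOf(uint64_t header) {
    return (uint32_t)(header & CLASS_INDEX_MASK);
}

/* The inline-cache class test reduces to a single mask-and-compare;
 * no class pointer has to be loaded and followed. */
static inline int cacheHit(uint64_t header, uint32_t cachedClassIndex) {
    return classIndexOf(header) == cachedClassIndex;
}
```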
Stage four won't be the end of it, but from where we are this is how far
we've planned at this point. And if you want to know all the gory
details about the stuff that Eliot's working on, please do check out his
blog.