[squeak-dev] Cog VM status update
norbert at hartl.name
Wed Dec 17 09:58:41 UTC 2008
thanks for sharing. This was really informative and interesting.
From the reading it sounds damn impressive.
On Tue, 2008-12-16 at 19:30 -0800, Andreas Raab wrote:
> Folks -
> I just read Eliot's most recent blog post about the Cog VM and it
> reminded me how difficult it must be for others to see where this
> project stands. So here is a bit of an update on the current status:
> As you may recall we started this project in spring this year by hiring
> Eliot for the express purpose of building us a new VM that would speed
> up execution of our products. We decided to structure the work into
> stages, at the end of each of which there would be a tangible
> deliverable, i.e., a new VM that could be run and benchmarked.
> The first stage in this process is what we call the "Closure VM". It is
> nothing more (and nothing less) than a Squeak VM with closures and the
> required support (compiler, decompiler, debugger etc). Given past
> experience, we had originally expected this stage to cost us some speed
> (up to 20% were estimated) since closure support has a cost which at
> that stage is hard to offset with other improvements. However, thanks to
> a truly ingenious bit of engineering done by Eliot in the design for the
> closure implementation the resulting speed difference was negligible.
> Since there was no speed penalty we decided to jump ship earlier than we
> originally anticipated and the Closure VM has been the regular shipping
> VM with Qwaq products since September this year.
> The second stage in the process is the "Stack VM". It is a Closure VM
> that executes on the native stack and transparently maps contexts from
> and to stack frames as required. The VM itself is still an interpreter
> so any speed improvements come purely from the more efficient
> organization of the stack layout (no allocations, overlapping frames
> etc). For those of you who have been around long enough, it is
> equivalent to what Anthony Hannan did a few years ago, except that it
> hides the existence of the native stack entirely and gives the
> programmer the naive view of just dealing with linked frames (contexts).
> The original expectations for the resulting speedups by Eliot were a
> little higher than we've seen in practice, but are in line with the
> results that Anthony got: approx. 30% improvements across the board in
> macro benchmarks. The work on the Stack VM was completed last month; we
> are currently rolling it out internally and the next product release
> will ship with the Stack VM.
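To make the "maps contexts from and to stack frames as required" idea concrete, here is a minimal sketch in Python. It is not how the Stack VM is actually implemented. It assumes that activations live as plain frames on a stack and that a heap context object is only allocated when the program asks for one. All names (`Frame`, `Context`, `StackInterpreter`, `this_context`) are made up for illustration.

```python
class Context:
    """Heap object giving the linked-frame view of one activation."""

    def __init__(self, method, sender):
        self.method = method
        self.sender = sender  # another Context, or None at the bottom


class Frame:
    """Stack-resident activation; has no Context until one is requested."""

    def __init__(self, method):
        self.method = method
        self.context = None


class StackInterpreter:
    """Hypothetical interpreter: ordinary calls never allocate a Context."""

    def __init__(self):
        self.frames = []
        self.contexts_allocated = 0

    def call(self, method):
        self.frames.append(Frame(method))  # cheap: no heap context

    def ret(self):
        self.frames.pop()

    def context_for(self, index):
        """Return the Context for frames[index], creating it on demand."""
        if index < 0:
            return None
        frame = self.frames[index]
        if frame.context is None:
            # Assumption: the sender chain is materialised along with it.
            sender = self.context_for(index - 1)
            frame.context = Context(frame.method, sender)
            self.contexts_allocated += 1
        return frame.context

    def this_context(self):
        return self.context_for(len(self.frames) - 1)
```

Plain calls and returns only touch `frames`. Code that asks for `this_context()` still sees a chain of linked contexts, which is the "naive view" described above. A real VM would also have to keep a context valid after its frame returns, which this sketch leaves out.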
> The third stage, which has just begun, is what we call the "Simple JIT VM"
> (well, really it doesn't have a name yet, I just made it up ;-) Its
> focus is send performance (as we see send performance as the single
> biggest current bottleneck). It will sport a very simple JIT w/ inline
> caches with the idea being to bring up send performance to the point
> where it's no longer the single biggest bottleneck, then measure
> performance again and figure out what the next best target is. I am not
> going to speculate on performance (we have been wrong every single step
> of the way ;-) but both Eliot and I do think that we'll see some nice
> improvements in application performance here.
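For readers who have not met inline caches before, here is a rough sketch of the general technique in Python. It is not the Cog design: it assumes a monomorphic cache, meaning one remembered class per send site, and the names `SendSite` and `lookup` are invented for the example.

```python
def lookup(cls, selector):
    """Full method lookup: walk the class hierarchy (the slow path)."""
    for klass in cls.__mro__:
        if selector in vars(klass):
            return vars(klass)[selector]
    raise AttributeError(selector)


class SendSite:
    """One send site with a monomorphic inline cache (hypothetical)."""

    def __init__(self, selector):
        self.selector = selector
        self.cached_class = None
        self.cached_method = None
        self.hits = 0
        self.misses = 0

    def send(self, receiver, *args):
        cls = type(receiver)
        if cls is self.cached_class:
            # Fast path: same receiver class as last time, skip the lookup.
            self.hits += 1
        else:
            # Slow path: do the full lookup and remember the result.
            self.misses += 1
            self.cached_method = lookup(cls, self.selector)
            self.cached_class = cls
        return self.cached_method(receiver, *args)
```

The point is that most send sites see the same receiver class over and over, so the check `cls is self.cached_class` replaces a full lookup on nearly every send. In a JIT the check and the cached target are patched directly into the generated machine code at the send site rather than held in an object like this.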
> The fourth stage is a bit more speculation at this point because the
> concrete direction depends on what the results of stage 3 really show
> the new bottleneck to be. We have various candidates lined up: Very high
> on the list is a delayed code generator which can dramatically improve
> the code quality. Next to it are changes in the object format moving to
> a unified 32/64bit header model which would dramatically simplify some
> tests for inline caching and primitives etc. However, since this work is
> driven by product performance, it is possible (albeit unlikely at this
> point) that the focus might shift towards FFI speed or float inlining.
> There is no shortage of possible directions; the main issue will be to
> figure out what the bottlenecks at that point are and how to address
> them most efficiently.
> Stage four won't be the end of it, but from where we are this is how far
> we've planned at this point. And if you want to know all the gory
> details about the stuff that Eliot's working on, please do check out his
> blog at:
> - Andreas