[Vm-dev] Making a Slower VM

Sun Feb 23 17:59:50 UTC 2014

On Sun, Feb 23, 2014 at 08:45:16AM -0800, Eliot Miranda wrote:
> 
> Hi David,
> 
> On Feb 23, 2014, at 8:22 AM, "David T. Lewis" <lewis at mail.msen.com> wrote:
> 
> > 
> > On Sun, Feb 09, 2014 at 10:23:37AM -0800, tim Rowledge wrote:
> >> 
> >> On 09-02-2014, at 10:07 AM, David T. Lewis <lewis at mail.msen.com> wrote:
> >>> 
> >>> I think someone mentioned it earlier, but a very easy way to produce an
> >>> intentionally slow VM is to generate the sources from VMMaker with the
> >>> inlining step disabled. The slang inliner is extremely effective, and turning
> >>> it off produces impressively sluggish results.
> >> 
> >> Does that actually work these days? Last I remember was that turning
> >> inlining off wouldn't produce a buildable interp.c file. If someone has
> >> had the patience to make it work then I'm impressed.
> > 
> > You're right about one thing, it required a lot of patience ;-)
> > 
> > I did manage to get it working though, and the results are in VMMaker-dtl.342.
> > 
> > This turned out to be a useful exercise, as I flushed out a couple of type
> > declaration bugs along the way.
> > 
> > The major issue was that the refactoring of object memory and interpreter
> > into separate class hierarchies (which is a very good thing IMHO) requires
> > the use of accessor methods, and this leads to name conflicts in the generated
> > code if those accessor methods are not fully inlined.
> > 
> > I went with the approach of naming the accessors getFoo and setFoo: as well
> > as, for the case of array access, fooAt: and fooAt:put:. This is not very
> > pleasing from a readability point of view, but it is simple and it works.
> > 
> > If I compile a VM with inlining disabled and compiler optimization turned
> > off, the result is about 1/8th the speed of the same interpreter VM built
> > normally.
> 
> But more to the point, what's the speed with the same level of optimization as the normal VM?
> 

I did not test this very carefully, but I saw this:

Normal interpreter VM:
0 tinyBenchmarks. '906194690 bytecodes/sec; 25262862 sends/sec'
0 tinyBenchmarks. '905393457 bytecodes/sec; 25413364 sends/sec'
0 tinyBenchmarks. '906997342 bytecodes/sec; 25786444 sends/sec'

No slang inlining, normal gcc optimization:
0 tinyBenchmarks. '452696728 bytecodes/sec; 15353518 sends/sec'
0 tinyBenchmarks. '459192825 bytecodes/sec; 15759973 sends/sec'
0 tinyBenchmarks. '458370635 bytecodes/sec; 15639770 sends/sec'

No slang inlining, no gcc optimization:
0 tinyBenchmarks. '205457463 bytecodes/sec; 7075541 sends/sec'
0 tinyBenchmarks. '206451612 bytecodes/sec; 7182476 sends/sec'
0 tinyBenchmarks. '206952303 bytecodes/sec; 7218843 sends/sec'

This is less of a difference than I expected for turning off the slang inlining.
Either the gcc optimization has gotten better, or my memory has gotten worse,
because I thought I remembered getting a bigger difference the last time
I tried this (a long time ago).
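
For reference, the slowdown factors implied by the three sets of runs above
can be worked out with a few lines of Python. This is only a sketch of the
arithmetic: the figures are copied from the results above, while the
dictionary keys and the helper name are illustrative and not part of any VM
tooling.

```python
from statistics import mean

# (bytecodes/sec, sends/sec) per tinyBenchmarks run, copied from the results above.
# The configuration labels are illustrative names, not VMMaker settings.
runs = {
    "normal": [
        (906194690, 25262862),
        (905393457, 25413364),
        (906997342, 25786444),
    ],
    "no_inlining": [
        (452696728, 15353518),
        (459192825, 15759973),
        (458370635, 15639770),
    ],
    "no_inlining_no_gcc_opt": [
        (205457463, 7075541),
        (206451612, 7182476),
        (206952303, 7218843),
    ],
}


def averages(samples):
    """Return (mean bytecodes/sec, mean sends/sec) over a list of runs."""
    return mean(b for b, _ in samples), mean(s for _, s in samples)


base_bytecodes, base_sends = averages(runs["normal"])

# Slowdown factor of each configuration relative to the normal VM.
slowdowns = {}
for name, samples in runs.items():
    bytecodes, sends = averages(samples)
    slowdowns[name] = (base_bytecodes / bytecodes, base_sends / sends)
    print(f"{name}: {slowdowns[name][0]:.2f}x slower on bytecodes, "
          f"{slowdowns[name][1]:.2f}x slower on sends")
```

On these figures, turning off slang inlining alone costs roughly a factor of
2 on bytecodes (about 1.6 on sends), and turning off gcc optimization as well
brings that to roughly 4.4 on bytecodes (about 3.6 on sends).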

I could slow the VM down quite a bit more if I use the MemoryAccess package.
By itself, MemoryAccess will have no performance impact, but if you turn
off slang inlining it should slow things down considerably. Perhaps that
is what I am remembering from the earlier test. Unfortunately some bit
rot has set in on MemoryAccess, so I'll have to fix that before I can confirm.

> and does this affect the internalFoo inlining?  Does this VM have everything that uses localSP & localIP inlined in interpret or are localSP & localIP no longer local to interpret?
> 

There is no inlining in the interpret() loop, and the gnuification step is
skipped. I believe that the localSP and localIP usage is unaffected, so yes
they would still be local to interpret().

Dave