[Vm-dev] Performance of primitiveFailFor: and use of primFailCode
David T. Lewis
lewis at mail.msen.com
Mon May 23 11:55:01 UTC 2011
A correction to the code that I quoted below: The generated code
before and after the change looks like this (sorry I forgot a "foo"):
Testing success status, original:
if (foo->successFlag) { ... }
Testing success status, new:
if (foo->primFailCode == 0) { ... }
Setting failure status, original:
foo->successFlag = 0;
Setting failure status, new:
if (foo->primFailCode == 0) {
    foo->primFailCode = 1;
}
Dave
On Sun, May 22, 2011 at 11:54:18AM -0400, David T. Lewis wrote:
>
> I have been trying to gradually update trunk VMMaker to better align
> with oscog VMMaker (an admittedly slow process, but hopefully still
> worthwhile). I have gotten the interpreter primitives moved into class
> InterpreterPrimitives and verified no changes to generated code. This
> greatly reduces the clutter in class Interpreter, so it's a nice change
> I think.
>
> My next step was to update all of the primitives to use the #primitiveFailFor:
> idiom, in which the successFlag variable is replaced with primFailCode
> (integer value, 0 for success, 1, 2, 3... for failure codes). This would
> get us closer to the point where the standard interpreter and stack/cog
> would use a common set of primitives. A lot of changes were required for
> this, but the resulting VM works fine ... except for performance.
>
> On a standard interpreter, use of primFailCode seems to result in a
> nearly 12% reduction in bytecode performance as measured by tinyBenchmarks:
>
> Standard interpreter (using successFlag):
> 0 tinyBenchmarks. '439108061 bytecodes/sec; 15264622 sends/sec'
> 0 tinyBenchmarks. '433164128 bytecodes/sec; 14740358 sends/sec'
> 0 tinyBenchmarks. '445993031 bytecodes/sec; 15040691 sends/sec'
> 0 tinyBenchmarks. '440999138 bytecodes/sec; 15052960 sends/sec'
> 0 tinyBenchmarks. '445993031 bytecodes/sec; 14485815 sends/sec'
>
> After updating the standard interpreter (using primFailCode):
> 0 tinyBenchmarks. '393241167 bytecodes/sec; 14066256 sends/sec'
> 0 tinyBenchmarks. '392036753 bytecodes/sec; 15040691 sends/sec'
> 0 tinyBenchmarks. '393846153 bytecodes/sec; 14272953 sends/sec'
> 0 tinyBenchmarks. '400625978 bytecodes/sec; 14991818 sends/sec'
> 0 tinyBenchmarks. '393846153 bytecodes/sec; 15176750 sends/sec'
>
> This is a much larger performance difference than I expected to see.
> Actually I expected no measurable difference at all, and I was just
> testing to verify this. But 12% is a lot, so I want to ask if I'm
> missing something?
>
> The changes to generated code generally take the form of:
>
> Testing success status, original:
> if (successFlag) { ... }
>
> Testing success status, new:
> if (foo->primFailCode == 0) { ... }
>
> Setting failure status, original:
> successFlag = 0;
>
> Setting failure status, new:
> if (foo->primFailCode == 0) {
>     foo->primFailCode = 1;
> }
>
> My approach to doing the updates was as follows:
> - Replace all occurrences of "successFlag := true" with "self initPrimCall",
> which initializes primFailCode to 0.
> - Replace all "successFlag := false" with "self primitiveFail".
> - Replace all "successFlag ifTrue: [] ifFalse: []" with
> "self successful ifTrue: [] ifFalse: []".
> - Update #primitiveFail, #failed and #success: to use primFailCode rather
> than successFlag.
> - Remove successFlag variable.
>
> Obviously I don't want to publish the code on SqS/VMMaker, but I can mail
> an interp.c if anyone wants to see the gory details (It is too large to
> post on this mailing list though).
>
> Any advice appreciated. I suspect I'm missing something basic here.
>
> Thanks,
> Dave