[Vm-dev] Performance of primitiveFailFor: and use of primFailCode

Eliot Miranda eliot.miranda at gmail.com
Mon May 23 20:44:48 UTC 2011


Hi David,

    the difference looks to me to be due to the fact that successFlag is
flat and primFailCode is in the VM struct.  Try generating a VM where
either primFailCode is also flat or, better still, all variables are flat.
In my experience the flat form is faster on x86 (with both the intel and
gcc compilers; not tested with llvm yet).

BTW, if you use the Cog generator it'll generate accesses to variables
which might be in the VM struct as GIV(theVariableInQuestion), where GIV
stands for global interpreter variable.  This lets one choose whether
these variables are kept in a struct or as separate variables at compile
time rather than at generation time, controlled by the USE_GLOBAL_STRUCT
compile-time constant, e.g.
gcc -DUSE_GLOBAL_STRUCT=0 gcc3x-interp.c.
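
As a rough illustration of the GIV idea described above, here is a minimal
sketch in C.  It is not copied from the generated interpreter: the struct
name, the default value of USE_GLOBAL_STRUCT and the bodies of the helper
functions are assumptions made for the example.  Only GIV,
USE_GLOBAL_STRUCT, foo and primFailCode come from the discussion.

```c
/* Sketch only: struct name, default setting and helper bodies are
   illustrative assumptions, not the real generated source. */
#ifndef USE_GLOBAL_STRUCT
# define USE_GLOBAL_STRUCT 0          /* assume flat variables by default */
#endif

#if USE_GLOBAL_STRUCT
/* Struct form: variables live in one struct reached through a pointer. */
struct InterpreterGlobals {
    int primFailCode;
};
struct InterpreterGlobals interpreterGlobals;
struct InterpreterGlobals *foo = &interpreterGlobals;
# define GIV(interpVar) (foo->interpVar)
#else
/* Flat form: each variable is a separate global. */
int primFailCode;
# define GIV(interpVar) interpVar
#endif

/* Code written once against GIV() compiles in either layout. */
void initPrimCall(void) { GIV(primFailCode) = 0; }

int primitiveFailFor(int reasonCode) {
    GIV(primFailCode) = reasonCode;
    return reasonCode;
}

int failed(void) { return GIV(primFailCode) != 0; }
```

Compiling the same file with -DUSE_GLOBAL_STRUCT=1 and then with
-DUSE_GLOBAL_STRUCT=0 should make it possible to compare the two layouts
without regenerating the source.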

HTH
Eliot

On Sun, May 22, 2011 at 8:54 AM, David T. Lewis <lewis at mail.msen.com> wrote:

>
> I have been trying to gradually update trunk VMMaker to better align
> with oscog VMMaker (an admittedly slow process, but hopefully still
> worthwhile).  I have gotten the interpreter primitives moved into class
> InterpreterPrimitives and verified no changes to generated code. This
> greatly reduces the clutter in class Interpreter, so it's a nice change
> I think.
>
> My next step was to update all of the primitives to use the
> #primitiveFailFor: idiom, in which the successFlag variable is replaced
> with primFailCode (integer value, 0 for success, 1, 2, 3... for failure
> codes). This would
> get us closer to the point where the standard interpreter and stack/cog
> would use a common set of primitives. A lot of changes were required for
> this, but the resulting VM works fine ... except for performance.
>
> On a standard interpreter, use of primFailCode seems to result in a
> nearly 12% reduction in bytecode performance as measured by tinyBenchmarks:
>
> Standard interpreter (using successFlag):
>  0 tinyBenchmarks. '439108061 bytecodes/sec; 15264622 sends/sec'
>  0 tinyBenchmarks. '433164128 bytecodes/sec; 14740358 sends/sec'
>  0 tinyBenchmarks. '445993031 bytecodes/sec; 15040691 sends/sec'
>  0 tinyBenchmarks. '440999138 bytecodes/sec; 15052960 sends/sec'
>  0 tinyBenchmarks. '445993031 bytecodes/sec; 14485815 sends/sec'
>
> After updating the standard interpreter (using primFailCode):
>  0 tinyBenchmarks. '393241167 bytecodes/sec; 14066256 sends/sec'
>  0 tinyBenchmarks. '392036753 bytecodes/sec; 15040691 sends/sec'
>  0 tinyBenchmarks. '393846153 bytecodes/sec; 14272953 sends/sec'
>  0 tinyBenchmarks. '400625978 bytecodes/sec; 14991818 sends/sec'
>  0 tinyBenchmarks. '393846153 bytecodes/sec; 15176750 sends/sec'
>
> This is a much larger performance difference than I expected to see.
> Actually I expected no measurable difference at all, and I was just
> testing to verify this. But 12% is a lot, so I want to ask if I'm
> missing something?
>
> The changes to generated code generally take the form of:
>
> Testing success status, original:
>        if (successFlag) { ... }
>
> Testing success status, new:
>        if (foo->primFailCode == 0) { ... }
>
> Setting failure status, original:
>        successFlag = 0;
>
> Setting failure status, new:
>        if (foo->primFailCode == 0) {
>                foo->primFailCode = 1;
>        }
>
> My approach to doing the updates was as follows:
> - Replace all occurrences of "successFlag := true" with
>   "self initPrimCall", which initializes primFailCode to 0.
> - Replace all "successFlag := false" with "self primitiveFail".
> - Replace all "successFlag ifTrue: [] ifFalse: []" with
>   "self successful ifTrue: [] ifFalse: []".
> - Update #primitiveFail, #failed and #success: to use primFailCode rather
>   than successFlag.
> - Remove successFlag variable.
>
> Obviously I don't want to publish the code on SqS/VMMaker, but I can mail
> an interp.c if anyone wants to see the gory details (It is too large to
> post on this mailing list though).
>
> Any advice appreciated. I suspect I'm missing something basic here.
>
> Thanks,
> Dave
>
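
For reference, the size of the slowdown can be recomputed from the
tinyBenchmarks figures quoted above.  The sketch below averages the five
bytecodes/sec readings for each VM; the function and array names are made
up for the example.  Measured as a drop relative to the successFlag VM the
difference comes to roughly 10.5%, and measured as the successFlag VM's
advantage over the primFailCode VM it comes to roughly 11.7%, which is
presumably the "nearly 12%" figure.

```c
#include <stddef.h>

/* bytecodes/sec readings copied from the quoted benchmark runs */
static const double withSuccessFlag[] = {
    439108061, 433164128, 445993031, 440999138, 445993031
};
static const double withPrimFailCode[] = {
    393241167, 392036753, 393846153, 400625978, 393846153
};

/* Arithmetic mean of n readings. */
double mean(const double *values, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += values[i];
    }
    return sum / (double)n;
}

/* Percentage drop of the primFailCode VM relative to the successFlag VM. */
double percentReduction(void) {
    return 100.0 * (1.0 - mean(withPrimFailCode, 5) / mean(withSuccessFlag, 5));
}

/* Percentage by which the successFlag VM is faster than the primFailCode VM. */
double percentAdvantage(void) {
    return 100.0 * (mean(withSuccessFlag, 5) / mean(withPrimFailCode, 5) - 1.0);
}
```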
>
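
The two generated-code forms quoted above can be put side by side in a
small, self-contained C sketch.  The struct name and the legacy* function
names are assumptions for illustration; only successFlag, primFailCode and
the foo pointer appear in the quoted code.

```c
/* Sketch only: struct name and legacy* helper names are illustrative. */

/* Original form: one flat global flag. */
int successFlag = 1;

void legacyInitPrimCall(void)  { successFlag = 1; }
void legacyPrimitiveFail(void) { successFlag = 0; }   /* unconditional store */
int  legacySuccessful(void)    { return successFlag; }

/* New form: failure code kept in a struct reached through the pointer foo. */
struct VMGlobals {
    int primFailCode;
};
struct VMGlobals vmGlobals;
struct VMGlobals *foo = &vmGlobals;

void initPrimCall(void) { foo->primFailCode = 0; }

/* Records a generic failure (code 1) unless a failure code is already set. */
void primitiveFail(void) {
    if (foo->primFailCode == 0) {
        foo->primFailCode = 1;
    }
}

int successful(void) { return foo->primFailCode == 0; }
```

In the new form every test and every store goes through foo, and setting
the failure status adds a compare before the store.  Whether that accounts
for the measured difference would need to be checked against the generated
assembly.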