[Vm-dev] Performance of primitiveFailFor: and use of primFailCode

Eliot Miranda eliot.miranda at gmail.com
Tue May 24 16:07:30 UTC 2011


On Mon, May 23, 2011 at 8:42 PM, David T. Lewis <lewis at mail.msen.com> wrote:

>
> On Mon, May 23, 2011 at 07:30:09PM -0400, David T. Lewis wrote:
> > On Mon, May 23, 2011 at 02:33:52PM -0700, Eliot Miranda wrote:
> > >
> > > > On Mon, May 23, 2011 at 2:08 PM, David T. Lewis <lewis at mail.msen.com> wrote:
> > > >
> > > >  Testing success status, original:
> > > >        if (foo->successFlag) { ... }
> > > >
> > > >  Testing success status, new:
> > > >        if (foo->primFailCode == 0) { ... }
> > > >
> > > >  Setting failure status, original:
> > > >         foo->successFlag = 0;
> > > >
> > > >  Setting failure status, new:
> > > >        if (foo->primFailCode == 0) {
> > > >                foo->primFailCode = 1;
> > > >        }
> > > >
> > > > > So in each case the global struct is being used, both for successFlag
> > > > > and primFailCode. Sorry for the confusion. In any case, I'm still left
> > > > > scratching my head over the size of the performance difference.
> > > >
> > >
> > > > One thought: where are successFlag and primFailCode in the struct? Perhaps
> > > > the size of the offset needed to access them makes a difference?
> >
> > In both cases they are the first element of the struct, so that
> > cannot be it.
> >
> > I think I had better circle back and redo my tests. Maybe I made
> > a mistake somewhere.
> >
>
> No mistake, the performance problem was real.
>
> Good news - I found the cause. Better news - this may be good for a
> performance boost on StackVM and possibly Cog also.
>

thanks!


>
> The performance hit was due almost entirely to InterpreterPrimitives>>failed,
> and perhaps a little bit to #successful and #success: also.
>
> This issue with #failed is due to "^primFailCode ~= 0" which, for purposes
> of C translation, can be recoded as "^primFailCode" with an override in
> the simulator as "^primFailCode ~= 0". This produces a significant speed
> improvement, at least as fast as for the original interpreter implementation
> using successFlag.
>

Note that with the Cog code generator and for the purposes of the simulator
this can read

failed
    <api>
    ^self cCode: [primFailCode] inSmalltalk: [primFailCode ~= 0]

The Cog inliner maps self cCode: aCBlock inSmalltalk: anStBlock to aCBlock
at TMethod creation time, hence avoiding the inability to inline
cCode:inSmalltalk:.  See MessageNode>>asTranslatorNode: in the Cog VMMaker.
I'll integrate as such in Cog.
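
To make the difference concrete, here is a minimal C sketch of the two
translations of #failed discussed above. It is an illustration only, not
actual VMMaker output: the struct layout, the `foo` pointer, and the names
`failedCompare`, `failedRaw` and `primitiveFail` are all assumptions made for
the example.

```c
/* Hypothetical stand-in for the interpreter's global state struct
   (the thread refers to it through a pointer named foo). */
struct InterpreterState {
    int primFailCode;   /* 0 means success; nonzero is a failure code */
};

static struct InterpreterState state;
static struct InterpreterState *foo = &state;

/* Literal translation of "^primFailCode ~= 0":
   the comparison has to produce a 0/1 value. */
static int failedCompare(void)
{
    return foo->primFailCode != 0;
}

/* Recoded as "^primFailCode": the raw code is returned as is,
   and callers that only branch on it treat nonzero as "failed". */
static int failedRaw(void)
{
    return foo->primFailCode;
}

/* Setting failure status as quoted earlier in the thread:
   only record a failure if none has been recorded yet. */
static void primitiveFail(void)
{
    if (foo->primFailCode == 0) {
        foo->primFailCode = 1;
    }
}
```

A caller written as `if (failedRaw()) { ... }` takes the same branch as one
written as `if (failedCompare()) { ... }`, which is presumably why the recoding
is safe in the translated C, while the simulator still needs the explicit
`~= 0` to get a Boolean.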


> I expect that the same change applied to StackInterpreter may give a similar
> 10% improvement (though I have not tried it). I don't know what to expect
> with Cog, but it may give a boost there as well.
>
> Changes attached, also included in VMMaker-dtl.237 on SqueakSource.
>
> Dave
>
>
>