Interpreter>>pushReceiverVariableBytecode

Ian Piumarta squeak-dev at lists.squeakfoundation.org
Sat Sep 7 02:06:07 UTC 2002


On Fri, 6 Sep 2002, Tommy Thorn wrote:

> Alas, I haven't had time to perform any detailed measurements to answer
> these questions.

In gnuify you might like to try changing the line

  print "			BREAK;";

to

  print "			break;";

then delete squeak and vm/gnu-interp.[co], and recompile.  This will turn
off the jump through the dispatch table (reinstating the branch back to
the loop head and the redundant range check in the switch) while leaving
all the other GNU-specific optimisations in place, making a before/after
comparison more-or-less valid.  (Avoiding the range check in the switch is
trickier if you want to eliminate just the indirect jump through the
table.)
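
For anyone who has not looked at what gnuify produces, the two dispatch
styles being compared look roughly like the sketch below.  This is not the
VM's code: the three opcodes and the names run_switch, run_goto and NEXT
are invented for illustration, and the second function relies on the GCC
labels-as-values extension (also accepted by Clang).

```c
/* Hypothetical three-opcode instruction set, for illustration only. */
enum { OP_INC = 0, OP_DEC = 1, OP_HALT = 2 };

/* Portable dispatch: every case breaks back to the loop head, where the
 * switch range-checks the opcode and jumps through its own table again. */
long run_switch(const unsigned char *code)
{
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        default:      return acc;   /* unknown opcode: stop */
        }
    }
}

/* Threaded dispatch: each opcode body ends with its own indirect jump to
 * the next opcode's label, so there is no branch back to a loop head and
 * no range check.  The input must contain only valid opcodes. */
long run_goto(const unsigned char *code)
{
    /* GCC extension: &&label yields the address of a label. */
    static void *const table[] = { &&op_inc, &&op_dec, &&op_halt };
    long acc = 0;

#define NEXT() goto *table[*code++]
    NEXT();
op_inc:
    acc++;
    NEXT();
op_dec:
    acc--;
    NEXT();
op_halt:
    return acc;
#undef NEXT
}
```

Feeding both functions the same bytecode array should give the same
result; only the path taken between bytecodes differs.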

> Once you use computed gotos you need to hold bytecodeDispatchTable in
> a register

Registers are cheap (and often plentiful).

> or (worse still) load the constant each time.

Memory accesses are undesirable.

> Does the
> saving of one (unconditional) jump back to the interpreter loop pay
> off the added register pressure?

Branches are disastrous.

On a (rather slow) PowerPC 740:

  before: '40790312 bytecodes/sec; 1356217 sends/sec'
  after:  '31714568 bytecodes/sec; 1239999 sends/sec'

I can't be bothered recompiling on a Pentium (but in general it takes
branches from the "merely disastrous" to "entirely new heights of utterly
catastrophic" and I would expect the difference to be more pronounced).

Admittedly, to make the comparison totally honest, the range check in the
switch should be defeated too when the jump table is turned off.
(I'm too lazy to bother with this, sorry.)

I can't comment at all on ARM-specific issues since I've never used one.

Over-and-out,

Ian



