On the effect of branch mispredictions in the Squeak VM
Tim Olson
tim at io.com
Mon Jul 7 12:45:59 UTC 2003
"Andreas Raab" <andreas.raab at gmx.de> wrote:
| Hi John,
|
| > My numbers using mac vm 3.5.2b1 (500Mhz G3) are:
| >
| > 10 0 0
| > 100 0 0
| > 1000 1 1
| > 10000 8 7
| > 100000 77 78
| > 1000000 797 781
| > 10000000 7817 7861
|
| Just curious: Do you "gnuify" the VM before compiling it[*]? The above looks
| as if you might still be using the switch-based dispatch (in which case
| there should be no difference between the two versions as the branch
| prediction will _always_ be wrong no matter how you arrange the bytecodes).
| If you don't gnuify it, I'd recommend doing so - like I said it bought me a
| factor of two in speed.
I suspect gnuifying will give an improvement, but not as much as in Andreas'
results (was that a Pentium 4?). The PowerPC 750 (G3) was designed with
this kind of table-driven dispatch code in mind, since that was the
basis for the 68K emulator which Apple used. The pipeline is only 5
stages, so branch misprediction penalties are quite small compared to
the 20 cycles of the P4.
The 750 does have a 64-entry BTIC (Branch Target Instruction Cache), so
it will see some benefit from the "goto indirect label" feature of GCC.
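For anyone who hasn't looked at the two dispatch styles side by side, here
is a minimal sketch in C. It is not the Squeak interpreter: the four
opcodes, the accumulator machine, and the names run_switch and
run_threaded are all made up for illustration. The first function sends
every bytecode through the one indirect branch generated for the switch;
the second uses GCC's labels-as-values extension ("goto *"), so each
bytecode body ends in its own indirect branch, which gives a per-branch
target cache such as the BTIC something to work with. The threaded
version assumes every opcode is valid and does no range check.

```c
#include <stddef.h>

/* Hypothetical toy opcodes for this sketch; not Squeak bytecodes. */
enum { OP_INC, OP_DOUBLE, OP_DEC, OP_HALT };

/* Switch-based dispatch: one shared indirect branch for all bytecodes. */
static int run_switch(const unsigned char *code)
{
    int acc = 0;
    size_t pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_DEC:    acc -= 1; break;
        case OP_HALT:   return acc;
        default:        return -1;   /* unknown opcode */
        }
    }
}

/* Threaded dispatch using GCC's "labels as values" extension.
   Each bytecode body ends with its own indirect jump.
   Assumes all opcodes in code[] are valid (no range check). */
static int run_threaded(const unsigned char *code)
{
#if defined(__GNUC__)
    /* Table order must match the enum above. */
    static void *const labels[] = { &&do_inc, &&do_double, &&do_dec, &&do_halt };
    int acc = 0;
    size_t pc = 0;
#define DISPATCH() goto *labels[code[pc++]]
    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
#else
    /* Without the GCC extension, fall back to the switch version. */
    return run_switch(code);
#endif
}
```

Both functions should return the same value for the same program, e.g. for
the byte sequence { OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT }; only the
shape of the generated branches differs.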
-- Tim Olson