On the effect of branch mispredictions in the Squeak VM

Tim Olson tim at io.com
Mon Jul 7 12:45:59 UTC 2003


"Andreas Raab" <andreas.raab at gmx.de> wrote:
| Hi John,
| 
| > My numbers using mac vm 3.5.2b1 (500Mhz G3) are:
| > 
| > 10	0	0
| > 100	0	0
| > 1000	1	1
| > 10000	8	7
| > 100000	77	78
| > 1000000	797	781
| > 10000000	7817	7861
| 
| Just curious: Do you "gnuify" the VM before compiling it[*]? The above looks
| as if you might still be using the switch-based dispatch (in which case
| there should be no difference between the two versions as the branch
| prediction will _always_ be wrong no matter how you arrange the bytecodes).
| If you don't gnuify it, I'd recommend doing so - like I said it bought me a
| factor of two in speed.

I suspect there will be an improvement, but not as much as in Andreas'
results (was that a Pentium-4?).  The PowerPC 750 (G3) was designed with
this kind of table-driven dispatch code in mind, since that was the
basis for the 68K emulator which Apple used.  The pipeline is only 5
stages, so branch misprediction penalties are quite small compared to
the 20 cycles of the P4.

The 750 does have a 64-entry BTIC (Branch Target Instruction Cache), so
it will see some benefit from the "goto indirect label" feature of GCC.
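
For anyone who hasn't looked at what gnuifying changes, here is a rough
sketch of the two dispatch styles side by side.  This is a toy interpreter,
not the Squeak VM source: the opcodes, run_switch, run_goto and self_check
are all made-up names for illustration.  The second loop relies on GCC's
"labels as values" extension (&&label and goto *ptr), so it will not build
with a compiler that lacks it.

```c
#include <assert.h>   /* used only by self_check() below */

/* Hypothetical toy bytecodes, NOT Squeak's real bytecode set. */
enum { OP_PUSH1, OP_ADD, OP_DOUBLE, OP_HALT };

/* Switch-based dispatch: every bytecode goes back through the one
   shared indirect branch generated for the switch statement. */
static int run_switch(const unsigned char *code)
{
    int acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH1:  acc = 1;  break;
        case OP_ADD:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;   /* unknown opcode */
        }
    }
}

/* Threaded dispatch with GCC's labels-as-values extension: each
   handler ends in its own indirect jump to the next handler.
   Sketch only: opcodes are not range-checked here. */
static int run_goto(const unsigned char *code)
{
    static void *const table[] = {
        &&do_push1, &&do_add, &&do_double, &&do_halt
    };
    int acc = 0;

#define DISPATCH() goto *table[*code++]
    DISPATCH();
do_push1:  acc = 1;  DISPATCH();
do_add:    acc += 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
}

/* Both loops should compute the same result for the same program:
   1, then +1 = 2, then *2 = 4, then +1 = 5. */
static void self_check(void)
{
    static const unsigned char prog[] = {
        OP_PUSH1, OP_ADD, OP_DOUBLE, OP_ADD, OP_HALT
    };
    assert(run_switch(prog) == 5);
    assert(run_goto(prog) == 5);
}
```

The point of the second form is that the hardware sees one indirect branch
per handler instead of a single shared one, which is what gives any
target-caching or prediction logic something to work with.  How much that
buys on a given CPU is exactly the open question above.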

	-- Tim Olson


