A few low-level Pentium II performance measurements

J Chapman carolan at pmail.net
Thu Feb 18 16:20:10 UTC 1999


> Just did it :
> 22727272 bytecodes/sec; 1123996 sends/sec
> 
> Scales pretty well :-)
>
>
> Could it be that there is a timing problem with the bytecodes/sec
> benchmark?
> This is exactly the same result as for the Mac.

Same number of bytecodes/second, but more message sends. That's
interesting. Has anyone seen greater than 22727272 bytecodes/second?

I had some fun last night looking at the performance registers on a 180
MHz PowerPC 604e running Squeak 2.3. I also used the 100 timesRepeat: [0
tinyBenchmarks] test.

The processor completed 212,000,000 instructions every 180,000,000 clock
cycles (one second at 180 MHz), for 0.85 cycles/instruction or 1.18
instructions/clock cycle. This was slightly more than Jan's Pentium II,
but the difference was less than expected. The VM executed 12,500,000
bytecodes/second and 580,000 message sends/second, for about 17 machine
instructions/bytecode. This
was slightly fewer instructions per bytecode than the Pentium II
results, which was surprising to me, given the RISC/CISC difference, and
that Pentium compilers are thought to generate better-optimized code
than PowerPC compilers.
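
For anyone who wants to recheck the ratios above, here is a rough sketch
of the arithmetic in Python. The variable names are my own, and the
inputs are the rounded figures from this message, so the results are
only approximate.

```python
# Rounded figures from the 604e measurements above.
# All names here are illustrative, not taken from any tool.
CLOCK_HZ = 180_000_000              # 180 MHz PowerPC 604e
INSTRUCTIONS_PER_SEC = 212_000_000  # machine instructions completed
BYTECODES_PER_SEC = 12_500_000      # from tinyBenchmarks
SENDS_PER_SEC = 580_000             # from tinyBenchmarks

cycles_per_instruction = CLOCK_HZ / INSTRUCTIONS_PER_SEC
instructions_per_cycle = INSTRUCTIONS_PER_SEC / CLOCK_HZ
instructions_per_bytecode = INSTRUCTIONS_PER_SEC / BYTECODES_PER_SEC
cycles_per_bytecode = CLOCK_HZ / BYTECODES_PER_SEC

print(f"cycles/instruction:    {cycles_per_instruction:.2f}")
print(f"instructions/cycle:    {instructions_per_cycle:.2f}")
print(f"instructions/bytecode: {instructions_per_bytecode:.1f}")
print(f"cycles/bytecode:       {cycles_per_bytecode:.1f}")
```

The last line is not in the measurements themselves; it just divides
the clock rate by the bytecode rate to give the average cycle cost of
one bytecode.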

Squeak's instruction profile is noticeably integer-heavy, which didn't
come as a surprise.

The on-chip caches seemed to work effectively, with only about 14,000
data cache misses and 5,000 instruction cache misses per second
during the test run. ITLB misses were down around 250 most of the time,
with brief spikes into the 600-1000 range. The processor spent about
1,300,000 cycles per second loading cache, and made about the same
number of incorrect branch predictions per second. I don't know enough
about hardware to interpret this data, but there it is for anyone who
can.
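
One way to make these counts easier to compare is to normalize them
against the instruction and clock rates. The sketch below (Python,
names of my own choosing) assumes that the cache-miss and
branch-misprediction figures are all per-second rates, and takes
"about the same number" of mispredictions to mean roughly 1,300,000.
The ITLB figures are left out because their unit isn't stated.

```python
# Rounded figures from this message; every name is illustrative.
# Assumption: all event counts below are per-second rates.
CLOCK_HZ = 180_000_000
INSTRUCTIONS_PER_SEC = 212_000_000
DCACHE_MISSES_PER_SEC = 14_000
ICACHE_MISSES_PER_SEC = 5_000
CACHE_LOAD_CYCLES_PER_SEC = 1_300_000
MISPREDICTS_PER_SEC = 1_300_000     # "about the same number"


def per_thousand_instructions(events_per_sec):
    """Events per 1000 completed instructions."""
    return events_per_sec / INSTRUCTIONS_PER_SEC * 1000


dcache_mpki = per_thousand_instructions(DCACHE_MISSES_PER_SEC)
icache_mpki = per_thousand_instructions(ICACHE_MISSES_PER_SEC)
mispredict_pki = per_thousand_instructions(MISPREDICTS_PER_SEC)
# Share of all clock cycles spent loading cache.
cache_load_fraction = CACHE_LOAD_CYCLES_PER_SEC / CLOCK_HZ

print(f"D-cache misses / 1000 instr: {dcache_mpki:.3f}")
print(f"I-cache misses / 1000 instr: {icache_mpki:.3f}")
print(f"mispredicts / 1000 instr:    {mispredict_pki:.2f}")
print(f"cycles loading cache:        {cache_load_fraction:.2%}")
```

Under those assumptions both caches miss well under once per thousand
instructions, and cache loading takes under 1% of the cycles, which
fits the impression that the caches were working effectively.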




