[squeak-dev] Re: jitter (was: The Old Man)

bryce at kampjes.demon.co.uk bryce at kampjes.demon.co.uk
Thu Apr 3 20:19:04 UTC 2008


I had the wrong benchmarks for 0.13. This post fixes my copy/paste
error. Thanks Goran.

bryce at kampjes.demon.co.uk writes:
 > Andreas Raab writes:
 > 
 >  > One of my problems with Exupery is that I've only seen claims about byte 
 >  > code speed and if you know where the time goes in a real-life 
 >  > environment then you know it ain't bytecodes. In other words, it seems 
 >  > to me that Exupery is optimizing the least significant portion of the 
 >  > VM. I'd be rather more impressed if it did double the send speed.
 > 
 > Then be impressed. Exupery has had double Squeak's send performance
 > since March 2005.
 > 
 >  http://people.squeakfoundation.org/person/willembryce/diary.html?start=23
 > 
 > That's done by using polymorphic inline caches which are also used to
 > drive dynamic primitive inlining. It is true that further send
 > performance gains are not planned before 1.0. Doubling send
 > performance should be enough to provide a practical performance
 > improvement. It's better to solve all the problems standing in the way
 > of a practical performance improvement before starting work on full
 > method inlining which should provide serious send performance.
 > 
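As a rough illustration of the idea (not Exupery's actual code, which is written in Smalltalk and generates machine code), a polymorphic inline cache can be sketched as a small per-send-site table mapping receiver classes to already looked-up methods. Every name below (CallSite, lookup, MAX_ENTRIES, the Circle and Square classes) is hypothetical and chosen only for this sketch.

```python
# Illustrative sketch only: a per-call-site polymorphic inline cache (PIC).
# All names here are hypothetical; this is not how Exupery is implemented.

MAX_ENTRIES = 4  # assumed cap on cached receiver classes per send site


def lookup(cls, selector):
    """Slow path: full method lookup along the class hierarchy."""
    for klass in cls.__mro__:
        if selector in vars(klass):
            return vars(klass)[selector]
    raise AttributeError(selector)


class CallSite:
    """One send site with its own small cache keyed by receiver class."""

    def __init__(self, selector):
        self.selector = selector
        self.cache = []   # list of (receiver class, method) pairs
        self.misses = 0   # number of times the slow path was taken

    def send(self, receiver, *args):
        cls = type(receiver)
        # Fast path: compare the receiver's class against cached classes.
        for cached_cls, method in self.cache:
            if cached_cls is cls:
                return method(receiver, *args)
        # Slow path: do the full lookup, then remember the result.
        self.misses += 1
        method = lookup(cls, self.selector)
        if len(self.cache) < MAX_ENTRIES:
            self.cache.append((cls, method))
        return method(receiver, *args)


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # crude integer stand-in for pi


class Square:
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s * self.s


site = CallSite("area")
shapes = [Circle(1), Square(2), Circle(3), Square(4)]
total = sum(site.send(shape) for shape in shapes)
```

Only the first Circle and the first Square take the slow path; later sends at the same site hit the cache. Because the cache records which receiver classes a site actually sees, a compiler can also use it to decide which primitives are worth inlining there.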
 > Here are the current benchmarks:
 >   Executing Code
 >   ==============
 >   arithmaticLoopBenchmark 1397 compiled  138 ratio: 10.122
 >   bytecodeBenchmark       2183 compiled  435 ratio:  5.017
 >   sendBenchmark           1657 compiled  741 ratio:  2.236
 >   doLoopsBenchmark        1100 compiled  813 ratio:  1.353
 >   pointCreation            988 compiled  968 ratio:  1.021
 >   largeExplorers           729 compiled  780 ratio:  0.935
 >   compilerBenchmark        529 compiled  480 ratio:  1.102
 >   Cumulative Time 1113.161 compiled 538.355 ratio 2.068
 > 
 >   Compile Time
 >   ============
 >   ExuperyBenchmarks>>arithmeticLoop 199ms
 >   SmallInteger>>benchmark 791ms
 >   InstructionStream>>interpretExtension:in:for: 14266ms
 >   Average 1309.515
 > 
 > The bottom two executing code benchmarks are macro benchmarks. They
 > compile a few methods based on a profile run then re-run the
 > benchmark.
 > 
 > There are several primitives that are inlined into the main interpret()
 > loop in the interpreter but require full worst-case dispatching in
 > Exupery. They'll need to be implemented to prevent slowdowns in
 > code that relies on them. Also, there are a few limitations that can
 > cause Exupery to produce slow code in some situations. There
 > are also bugs: the last release would run for about an hour of
 > development before crashing. These are the issues that are currently
 > being worked on.
 > 
 > Here are the benchmarks from the 0.13 release:
  arithmaticLoopBenchmark 1396 compiled  128 ratio: 10.906
  bytecodeBenchmark       2111 compiled  460 ratio:  4.589
  sendBenchmark           1637 compiled  668 ratio:  2.451
  doLoopsBenchmark        1081 compiled  715 ratio:  1.512
  pointCreation           1245 compiled 1317 ratio:  0.945
  largeExplorers           728 compiled  715 ratio:  1.018
  compilerBenchmark        483 compiled  489 ratio:  0.988
  Cumulative Time         1125 compiled  537 ratio   2.093

  ExuperyBenchmarks>>arithmeticLoop 249ms 
  SmallInteger>>benchmark 1112ms 
  InstructionStream>>interpretExtension:in:for: 113460ms 
  Average 3155.360 
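
The "Average" and "Cumulative Time" figures are not labelled, but they appear to be geometric means rather than arithmetic means. That is an inference from the numbers, not something stated in the post. A minimal sketch to check it, using the rounded compile times listed above (the function and variable names are made up for this example):

```python
import math


def geometric_mean(values):
    """Geometric mean via logs: exp(mean(log(v)))."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Compile times in milliseconds, copied from the two listings.
compile_ms_0_13 = [249, 1112, 113460]
compile_ms_current = [199, 791, 14266]

gm_0_13 = geometric_mean(compile_ms_0_13)
gm_current = geometric_mean(compile_ms_current)

# How much faster compilation got overall, as a ratio of the two means.
speedup = gm_0_13 / gm_current

print(round(gm_0_13, 3), round(gm_current, 3), round(speedup, 2))
```

Both results land within a fraction of a millisecond of the reported averages; small differences are expected because the listed times are rounded. A geometric mean keeps the one very slow method (interpretExtension:in:for:) from dominating the summary the way it would in an arithmetic mean.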

 >
 > The major gains are in the compilerBenchmark macro benchmark and in
 > compilation time. Both are due to work on the register allocator.
 > 
 > Exupery from the beginning has been an attempt to combine serious
 > optimisation with full method inlining similar to Self while having the
 > entire compiler written in Smalltalk. It's an ambitious goal that's
 > best tackled in smaller steps.
 > 
 > Bryce
 > 
