[squeak-dev] Re: jitter (was: The Old Man)
bryce at kampjes.demon.co.uk
Thu Apr 3 20:19:04 UTC 2008
I had the wrong benchmarks for 0.13. This post fixes my copy/paste
error. Thanks Goran.
bryce at kampjes.demon.co.uk writes:
> Andreas Raab writes:
>
> > One of my problems with Exupery is that I've only seen claims about byte
> > code speed and if you know where the time goes in a real-life
> > environment then you know it ain't bytecodes. In other words, it seems
> > to me that Exupery is optimizing the least significant portion of the
> > VM. I'd be rather more impressed if it did double the send speed.
>
> Then be impressed. Exupery has had double Squeak's send performance
> since March 2005.
>
> http://people.squeakfoundation.org/person/willembryce/diary.html?start=23
>
> That's done by using polymorphic inline caches which are also used to
> drive dynamic primitive inlining. It is true that further send
> performance gains are not planned before 1.0. Doubling send
> performance should be enough to provide a practical performance
> improvement. It's better to solve all the problems standing in the way
> of a practical performance improvement before starting work on full
> method inlining which should provide serious send performance.
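
For readers unfamiliar with the term, here is a toy sketch of a polymorphic inline cache. It is not Exupery's code (Exupery is written in Smalltalk and emits machine code); the class name, the `max_entries` limit, and the example shape classes are all hypothetical. It only illustrates the idea of remembering, per call site, which method each receiver class resolved to so that repeated sends skip the full lookup.

```python
class PolymorphicInlineCache:
    """Toy per-call-site cache: receiver class -> resolved method."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector
        self.max_entries = max_entries  # hypothetical size limit
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def send(self, receiver, *args):
        cls = type(receiver)
        method = self.entries.get(cls)
        if method is None:
            # Cache miss: do the full lookup, then remember the result.
            self.misses += 1
            method = getattr(cls, self.selector)
            if len(self.entries) < self.max_entries:
                self.entries[cls] = method
        else:
            self.hits += 1
        return method(receiver, *args)


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # crude pi, keeps the arithmetic exact


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


site = PolymorphicInlineCache("area")  # one cache per call site
areas = [site.send(shape) for shape in (Circle(1), Square(2), Circle(3))]
```

The cached class set at a call site is also the kind of type information a compiler could consult when deciding which primitives are worth inlining there, which is the link the paragraph above draws.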
>
> Here are the current benchmarks:
> Executing Code
> ==============
> arithmaticLoopBenchmark 1397 compiled 138 ratio: 10.122
> bytecodeBenchmark 2183 compiled 435 ratio: 5.017
> sendBenchmark 1657 compiled 741 ratio: 2.236
> doLoopsBenchmark 1100 compiled 813 ratio: 1.353
> pointCreation 988 compiled 968 ratio: 1.021
> largeExplorers 729 compiled 780 ratio: 0.935
> compilerBenchmark 529 compiled 480 ratio: 1.102
> Cumulative Time 1113.161 compiled 538.355 ratio 2.068
>
> Compile Time
> ============
> ExuperyBenchmarks>>arithmeticLoop 199ms
> SmallInteger>>benchmark 791ms
> InstructionStream>>interpretExtension:in:for: 14266ms
> Average 1309.515
>
> The bottom two executing code benchmarks are macro benchmarks. They
> compile a few methods based on a profile run then re-run the
> benchmark.
>
> There are several primitives that are inlined into the main interpret()
> loop in the interpreter but require full worst-case dispatching in
> Exupery. They'll need to be implemented to prevent slowdowns in the
> code that benefits from them. Also, there are a few limitations that can
> cause Exupery to produce slow code in some situations. There are
> also bugs: the last release would run for about an hour of
> development before crashing. These are the issues that are currently
> being worked on.
>
> Here are the benchmarks from the 0.13 release:
Executing Code
==============
arithmaticLoopBenchmark 1396 compiled 128 ratio: 10.906
bytecodeBenchmark 2111 compiled 460 ratio: 4.589
sendBenchmark 1637 compiled 668 ratio: 2.451
doLoopsBenchmark 1081 compiled 715 ratio: 1.512
pointCreation 1245 compiled 1317 ratio: 0.945
largeExplorers 728 compiled 715 ratio: 1.018
compilerBenchmark 483 compiled 489 ratio: 0.988
Cumulative Time 1125 compiled 537 ratio 2.093

Compile Time
============
ExuperyBenchmarks>>arithmeticLoop 249ms
SmallInteger>>benchmark 1112ms
InstructionStream>>interpretExtension:in:for: 113460ms
Average 3155.360
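
The post does not say how the "Cumulative Time" and "Average" lines are computed. The figures look consistent with a geometric mean of the individual results, so the sketch below checks that reading against the numbers quoted in this message. The function and variable names are made up for illustration, and the per-benchmark times are rounded in the listings, so small differences in the last digits are expected.

```python
import math


def geometric_mean(values):
    """Geometric mean computed via logs to avoid huge intermediate products."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Times copied from the "current" listing (interpreted, then compiled).
current_interpreted = [1397, 2183, 1657, 1100, 988, 729, 529]
current_compiled = [138, 435, 741, 813, 968, 780, 480]

# Compile times in ms: current listing, then the 0.13 release.
current_compile_ms = [199, 791, 14266]
release_013_compile_ms = [249, 1112, 113460]

interp_mean = geometric_mean(current_interpreted)
compiled_mean = geometric_mean(current_compiled)

print(f"interpreted {interp_mean:.1f} compiled {compiled_mean:.1f} "
      f"ratio {interp_mean / compiled_mean:.2f}")
print(f"compile time, current: {geometric_mean(current_compile_ms):.1f}")
print(f"compile time, 0.13:    {geometric_mean(release_013_compile_ms):.1f}")
```

If this reading is right, the summary rows are not sums, and the very slow compile of InstructionStream>>interpretExtension:in:for: pulls the average up less than it would in an arithmetic mean.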
>
> The major gains are in the compilerBenchmark macro benchmark and in
> compilation time. Both are due to work on the register allocator.
>
> Exupery from the beginning has been an attempt to combine serious
> optimisation with full method inlining similar to Self while having the
> entire compiler written in Smalltalk. It's an ambitious goal that's
> best tackled in smaller steps.
>
> Bryce
>