Igor Stasenko writes:
On 17/12/2007, bryce@kampjes.demon.co.uk <bryce@kampjes.demon.co.uk> wrote:
arithmaticLoopBenchmark 1396 compiled  128 ratio: 10.906
bytecodeBenchmark       2111 compiled  460 ratio:  4.589
sendBenchmark           1637 compiled  668 ratio:  2.451
doLoopsBenchmark        1081 compiled  715 ratio:  1.512
pointCreation           1245 compiled 1317 ratio:  0.945
largeExplorers           728 compiled  715 ratio:  1.018
compilerBenchmark        483 compiled  489 ratio:  0.988
Cumulative Time         1125 compiled  537 ratio   2.093
ExuperyBenchmarks>>arithmeticLoop                  249ms
SmallInteger>>benchmark                           1112ms
InstructionStream>>interpretExtension:in:for:   113460ms
Average                                         3155.360
First, from the numbers above, I'd say that having a method that takes 2 minutes to compile is currently the biggest practical problem. The second set of numbers is a compilation time benchmark. The second biggest problem is that a 2.4 times increase in send speed is not transferring through to the two macro-benchmarks (largeExplorers and compilerBenchmark).
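As a cross-check on the first set of figures, here is a minimal Python sketch that recomputes the per-benchmark ratios (interpreted time divided by compiled time). The post does not say how the "Cumulative Time" line is derived; the sketch assumes a geometric mean, which happens to reproduce the quoted 1125 and 537 after rounding. That is an inference from the numbers, not something stated in the thread.

```python
import math

# (interpreted ms, compiled ms), copied from the figures quoted above.
results = {
    "arithmaticLoopBenchmark": (1396, 128),
    "bytecodeBenchmark": (2111, 460),
    "sendBenchmark": (1637, 668),
    "doLoopsBenchmark": (1081, 715),
    "pointCreation": (1245, 1317),
    "largeExplorers": (728, 715),
    "compilerBenchmark": (483, 489),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-benchmark speedup: interpreted time divided by compiled time.
ratios = {name: interp / comp for name, (interp, comp) in results.items()}

# Assumption: the cumulative line is a geometric mean over the seven benchmarks.
interp_gm = geometric_mean(interp for interp, _ in results.values())
comp_gm = geometric_mean(comp for _, comp in results.values())

for name, ratio in ratios.items():
    print(f"{name:24s} ratio: {ratio:.3f}")
print(f"geometric means: {interp_gm:.0f} vs {comp_gm:.0f}, "
      f"ratio {interp_gm / comp_gm:.3f}")
```

Read this way, the two macro-benchmarks pull the cumulative ratio down even though the small loops speed up by 4x to 10x.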
Do you make any distinction between calling a compiled method and, for instance, a primitive function?
The sender doesn't know if it's sending to a primitive or to a full method. If Exupery compiles a primitive, then it executes in the sender's context, just like the interpreter.
As I remember, you compile methods to some form of routine that can be called using the cdecl convention. On top of that, when you know you are calling a compiled method, you can use register optimizations such as passing arguments in registers. More generally, by knowing where you change registers, you can predict which of them change across a call and which stay unchanged. And, of course, nothing stops you from using your own calling convention to make the code run faster. There are also the MMX/SSE registers, which can be used for different purposes. All of the above, depending on the choices made, can greatly improve send speed. I just want to know what you think about it.
Currently Exupery uses C's calling conventions combined with the interpreter's handling of contexts. There's plenty of room to improve this, but I doubt that raw send speed is why the macro benchmarks aren't performing.
Also, full method inlining will change the value of other send optimisations by removing most of the common sends. It's the best optimisation for common sends. 1.0 is a base to add full method inlining to.
And a small trick when compiling SmallInteger methods: you already know that the receiver is a SmallInteger, so some tests can be omitted. In the same manner you can deal with compiling methods for classes which have byte- or reference-indexed instances.
Exupery compiles a method for each receiver, so this is possible but not done yet. It'll get even more interesting when combined with full method inlining; then common self sends will become completely free.
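To make the omitted-test idea concrete, here is a toy Python model of it. It is not Exupery code: the low-bit tagging scheme, the `NeedsFullSend` exception, and all function names are hypothetical, and overflow handling is left out. It only shows that a routine compiled for a known SmallInteger receiver can skip the receiver's tag test and still agree with the generic version.

```python
# Toy model of tagged object pointers: low bit 1 marks a SmallInteger.
# All names here are hypothetical and exist only for illustration.

def tag(n):
    """Encode a Python int as a tagged SmallInteger."""
    return (n << 1) | 1


def untag(oop):
    """Decode a tagged SmallInteger back to a Python int."""
    return oop >> 1


def is_small_int(oop):
    return (oop & 1) == 1


class NeedsFullSend(Exception):
    """Raised when the fast path does not apply and a real send is needed."""


def generic_add(rcvr, arg):
    # Generic code knows nothing about its operands: two tag tests.
    if not is_small_int(rcvr) or not is_small_int(arg):
        raise NeedsFullSend
    return tag(untag(rcvr) + untag(arg))  # overflow check omitted


def small_integer_add(rcvr, arg):
    # Compiled for SmallInteger receivers only, so method lookup has
    # already established the receiver's class: one tag test remains.
    if not is_small_int(arg):
        raise NeedsFullSend
    return tag(untag(rcvr) + untag(arg))  # overflow check omitted


# One compiled routine per (receiver class, selector) pair.
compiled = {("SmallInteger", "+"): small_integer_add}

result = compiled[("SmallInteger", "+")](tag(3), tag(4))
print(untag(result))
```

The saving per operation is a single test, which is why the bigger win is expected once inlining removes the send around it as well.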
Bryce