On 17/12/2007, bryce@kampjes.demon.co.uk wrote:
The primary goal for the next releases will be making the following benchmarks more compelling. I've added a compile time benchmark as there are a few performance bugs in the compiler that should be removed.
  arithmaticLoopBenchmark 1396 compiled  128 ratio: 10.906
  bytecodeBenchmark       2111 compiled  460 ratio:  4.589
  sendBenchmark           1637 compiled  668 ratio:  2.451
  doLoopsBenchmark        1081 compiled  715 ratio:  1.512
  pointCreation           1245 compiled 1317 ratio:  0.945
  largeExplorers           728 compiled  715 ratio:  1.018
  compilerBenchmark        483 compiled  489 ratio:  0.988
  Cumulative Time         1125 compiled  537 ratio   2.093

  ExuperyBenchmarks>>arithmeticLoop 249ms
  SmallInteger>>benchmark 1112ms
  InstructionStream>>interpretExtension:in:for: 113460ms
  Average 3155.360
First, I'll get the register allocator to allocate each section of a method separately. After that, I'll probably do some work on further optimising the register allocator, but I might work on improving the generated native code instead.
Register allocating each section separately will allow for both better and faster allocation. It will make it easy to avoid dealing with registers and interference from other sections of the code, and it will reduce the size of the problem. A well-written colouring register allocator should take n log n time on average, but the performance bugs probably raise that to n^2.
It's possible that just allocating each section of the method separately will be enough to bring allocation down to a reasonable time. It should definitely help for the larger methods but is unlikely to do anything for the arithmaticLoop and will only help the bytecode benchmark slightly. Compiling quicker will make it easier to run more extensive tests.
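To illustrate the point about allocating each section separately, here is a minimal sketch in Python. It is not Exupery's allocator: the data layout (each section as a dict of live ranges) and the names `build_interference`, `colour_section` and `allocate_method` are hypothetical, and a simple greedy colouring stands in for whatever the real allocator does. Note that the pairwise interference build below is itself quadratic in the number of values, which is exactly why splitting one large problem into several small ones helps.

```python
from itertools import combinations

def build_interference(live_ranges):
    """live_ranges maps a value name to a half-open (start, end) interval.
    Two values interfere when their intervals overlap."""
    graph = {name: set() for name in live_ranges}
    for a, b in combinations(live_ranges, 2):
        (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
        if s1 < e2 and s2 < e1:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def colour_section(live_ranges, registers):
    """Greedy colouring of one section. Returns name -> register,
    or None where no register is free (the value would be spilled)."""
    graph = build_interference(live_ranges)
    assignment = {}
    # Colour the most constrained values first; ties broken by name.
    for name in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        taken = {assignment[m] for m in graph[name] if m in assignment}
        free = [r for r in registers if r not in taken]
        assignment[name] = free[0] if free else None
    return assignment

def allocate_method(sections, registers):
    """Each section is coloured on its own, so values in one section
    never interfere with values in another."""
    return [colour_section(section, registers) for section in sections]

# Hypothetical method with two sections and two available registers.
sections = [
    {"a": (0, 3), "b": (1, 4), "c": (3, 5)},
    {"x": (0, 2), "y": (1, 3)},
]
allocation = allocate_method(sections, ["eax", "ebx"])
```

With k sections of roughly n/k values each, the pairwise work drops from about n^2 to about n^2/k, at the cost of handling values that are live across section boundaries, which this sketch ignores.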
Do you make any distinction between calling a compiled method and calling, for instance, a primitive function? As I remember, you compile methods to some form of routine that can be called using the cdecl convention. But on top of that, knowing that you are calling a compiled method, you can use some register optimizations, such as passing arguments in registers. And in general, by knowing where you change registers, you can predict which of them change after a call and which stay unchanged. And, of course, nothing stops you from using your own calling convention to make the code run faster. There are also the MMX/SSE registers, which can be used for different purposes. All of the above, depending on the choices made, can greatly improve send speed. I just want to know what you think about it.
And a small trick when compiling SmallInteger methods: you already know that the receiver is a SmallInteger, so some tests can be omitted. In the same manner, you can deal with compiling methods for classes which have byte- or reference-indexed instances.
Too bad it's not Slate, where you know the types of all method arguments. In ST it's just the receiver :)
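The SmallInteger trick can be sketched as follows. This is a Python model only, assuming the tagged representation used by 32-bit Squeak images, where a SmallInteger carries a 1 in its low bit. The function names are hypothetical, overflow handling is left out, and nothing here reflects the code Exupery actually generates.

```python
class NotSmallInteger(Exception):
    """Signals that the fast path failed and a real send is needed."""

def tag(value):
    # Encode an integer as a tagged SmallInteger (low bit set).
    return (value << 1) | 1

def untag(oop):
    # Recover the integer value from a tagged SmallInteger.
    return oop >> 1

def is_small_integer(oop):
    return (oop & 1) == 1

def generic_add(receiver, argument):
    # Compiled without class knowledge: both operands must be checked.
    if not (is_small_integer(receiver) and is_small_integer(argument)):
        raise NotSmallInteger
    return tag(untag(receiver) + untag(argument))

def small_integer_add(receiver, argument):
    # Compiled inside a SmallInteger method: the receiver is known to be
    # a SmallInteger, so only the argument check remains.
    if not is_small_integer(argument):
        raise NotSmallInteger
    return tag(untag(receiver) + untag(argument))
```

The specialised version saves one tag test per operation. The argument still has to be checked, since only the receiver's class is known at compile time.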
Bryce
_______________________________________________
Exupery mailing list
Exupery@lists.squeakfoundation.org
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/exupery