[squeak-dev] squeak profiling

Levente Uzonyi leves at elte.hu
Sat Apr 19 13:38:13 UTC 2008


Hi!

Quoting Riccardo Lucchese <riccardo.lucchese at gmail.com>:

> Hi,
>
> I'm working on profiling squeak/etoys for the olpc project.
>
> It seems to me that bytecodes cases in the interpret function
> (in /platform/unix/src/vm/interp.c) are not ordered in respect to
> the probability of their execution.
>

A dense switch-case statement in C is normally compiled into a jump table
(http://en.wikipedia.org/wiki/Jump_table), so the order of the branches
shouldn't affect the speed of execution (you can even start with the
default branch).
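
To illustrate the point, here is a minimal sketch of switch-based dispatch
with the default branch listed first and the other cases in reverse order.
The four opcodes and the name run_switch are made up for this example and
have nothing to do with the real Squeak bytecode set. Whether a compiler
actually emits a jump table depends on the compiler and on how many cases
there are (four may be too few, 256 dense cases normally is enough); the
generated assembly can be inspected with something like gcc -O2 -S.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical four-opcode machine; NOT the Squeak bytecode set. */
enum { OP_PUSH1 = 0, OP_ADD = 1, OP_DUP = 2, OP_HALT = 3 };

/* Runs `code` on a small stack and returns the top of the stack when
   OP_HALT is reached. Returns -1 on an unknown opcode or if the code
   ends without OP_HALT. */
int run_switch(const unsigned char *code, size_t len)
{
    int stack[64];
    int sp = 0;

    for (size_t pc = 0; pc < len; pc++) {
        switch (code[pc]) {
        default:                /* deliberately listed first */
            return -1;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        case OP_DUP:
            assert(sp > 0 && sp < 64);
            stack[sp] = stack[sp - 1];
            sp++;
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_PUSH1:
            assert(sp < 64);
            stack[sp++] = 1;
            break;
        }
    }
    return -1;
}
```

The source order of the case labels only changes how the code is laid out,
not which label a given opcode value selects.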

> Here is a graph for reference (more games in Etoys have
> the same pattern):
> http://www.bodhidharma.info/instructions.pdf
>
> What I did so far is changing the `256 cases switch' statement to an array
> of function pointers like (*exec_bytecode_funcs)[bytecode_id](
> ...shared data... );

Actually it should be slower, because function calls are generally slower
than simple jumps or arithmetic instructions.
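
For comparison, a rough sketch of the function-pointer style of dispatch
might look like the code below. Only the name exec_bytecode_funcs comes from
your mail; struct vm, the handlers, the four opcodes and run_table are
assumptions made up for the example. Note that each bytecode now costs an
indirect call and a return, and the shared state has to live in memory
rather than in local variables of a single function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical four-opcode machine; NOT the Squeak bytecode set. */
enum { OP_PUSH1 = 0, OP_ADD = 1, OP_DUP = 2, OP_HALT = 3, OP_COUNT = 4 };

/* Shared interpreter state handed to every handler. */
struct vm {
    int stack[64];
    int sp;
    int halted;
};

typedef void (*op_fn)(struct vm *);

static void op_push1(struct vm *vm)
{
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = 1;
}

static void op_add(struct vm *vm)
{
    assert(vm->sp >= 2);
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
}

static void op_dup(struct vm *vm)
{
    assert(vm->sp > 0 && vm->sp < 64);
    vm->stack[vm->sp] = vm->stack[vm->sp - 1];
    vm->sp++;
}

static void op_halt(struct vm *vm)
{
    vm->halted = 1;
}

/* One handler per opcode, indexed by the opcode value. */
static const op_fn exec_bytecode_funcs[OP_COUNT] = {
    [OP_PUSH1] = op_push1,
    [OP_ADD]   = op_add,
    [OP_DUP]   = op_dup,
    [OP_HALT]  = op_halt,
};

/* Returns the top of the stack at OP_HALT, or -1 on an unknown opcode
   or if the code ends without OP_HALT. */
int run_table(const unsigned char *code, size_t len)
{
    struct vm vm = { .sp = 0, .halted = 0 };

    for (size_t pc = 0; pc < len && !vm.halted; pc++) {
        if (code[pc] >= OP_COUNT)
            return -1;
        exec_bytecode_funcs[code[pc]](&vm);   /* indirect call per bytecode */
    }
    if (!vm.halted)
        return -1;
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}
```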

> The process was automated with a python script and that needed a little bit
> of code cleaning; after some testing I couldn't trigger any sort of
> bug in the new code.
>

AFAIK interp.c is generated by VMMaker, so you have to run your script
again for every new VM. A pure Squeak solution might be better.

>
> if we assume a constant time T both for the execution of every
> [if !(right_case) jump next_case] in the old code and for the
> function pointer dereference in the new code there is a 10000% gain
> for the task of calling the right action for a given bytecode
> (also given the distribution showed in the graph linked above).

Because of the jump table, this gain can't come from the dispatch itself:
the switch doesn't test the cases one after another.

>
> In my tests this is a 20/30% win over the interpret routine timings
> for different games
> in etoys.

Are you sure that the performance gain came from these changes? If so,
then the only reason for the speedup I can think of is that the code for
the most common bytecodes is scattered across memory pages, so there are
more page faults with the switch-case implementation. I wonder if you
could check how many page faults the two implementations incur.
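
One possible way to check this from inside the process, sketched under the
assumption of a POSIX system where getrusage() fills in ru_minflt and
ru_majflt (Linux does): read the counters before and after the workload and
compare the differences for the two builds. The names fault_counts,
read_fault_counts, count_faults and touch_buffer are made up for the
example; touch_buffer is only a stand-in for a real interpreter run.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

struct fault_counts {
    long minor;   /* faults served without I/O */
    long major;   /* faults that needed I/O */
};

/* Reads the page fault counters of the calling process.
   Returns 0 on success, -1 if getrusage() fails. */
int read_fault_counts(struct fault_counts *out)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->minor = ru.ru_minflt;
    out->major = ru.ru_majflt;
    return 0;
}

/* Runs work(arg) and stores the page faults incurred meanwhile in `delta`.
   Returns 0 on success, -1 if the counters could not be read. */
int count_faults(void (*work)(void *), void *arg, struct fault_counts *delta)
{
    struct fault_counts before, after;

    if (read_fault_counts(&before) != 0)
        return -1;
    work(arg);
    if (read_fault_counts(&after) != 0)
        return -1;
    delta->minor = after.minor - before.minor;
    delta->major = after.major - before.major;
    return 0;
}

/* Placeholder workload: allocates 1 MiB and writes to every byte. */
void touch_buffer(void *arg)
{
    size_t size = 1024 * 1024;
    char *buf = malloc(size);

    (void)arg;
    if (buf != NULL) {
        memset(buf, 1, size);
        free(buf);
    }
}
```

Calling count_faults(touch_buffer, NULL, &delta) once for each build of the
VM, with the real interpreter loop in place of touch_buffer, would give two
pairs of numbers to compare.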

Cheers,
Levente

>
> I appreciate any comments on this work.
> Maybe it could be done better than this ?
>
> Thanks,
> Riccardo
>
>





