Performance profiling results...

Anthony Hannan ajh18 at cornell.edu
Tue Sep 25 17:07:00 UTC 2001


The methodCache entry size needs to be a power of 2 and is currently 8,
but only 5 slots are used.  What if we take primitiveIndex out, so that
the entry size can be 4 with all slots used?  It will be almost as quick
to fetch the primitiveIndex from the method once the primitive bits are
put together in the header.  Granted, this will bring the method into the
CPU cache, but most of the time (for primitive = 0) that happens anyway
when we start executing it.
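
To make the idea concrete, here is a rough sketch in C of a 4-slot
entry with the primitive index read back from the method header.  This
is only an illustration: the cache size, the hash, the slot order, the
Method struct and the 10-bit primitive field in the low header bits are
all assumptions of mine, not the actual VM layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All sizes and names below are hypothetical, not taken from the VM source. */
#define ENTRY_SIZE    4u      /* slots per entry; a power of 2, all used */
#define CACHE_ENTRIES 512u    /* assumed number of entries; a power of 2 */
#define CACHE_SLOTS   (ENTRY_SIZE * CACHE_ENTRIES)

/* Slot order inside one entry; primitiveIndex is no longer stored. */
enum { SLOT_SELECTOR, SLOT_CLASS, SLOT_METHOD, SLOT_NATIVE };

/* Assumed header layout: primitive index in the low 10 bits. */
#define PRIM_BITS 10u
#define PRIM_MASK ((1u << PRIM_BITS) - 1u)

typedef struct { uintptr_t header; } Method;  /* stand-in for a compiled method */

static uintptr_t methodCache[CACHE_SLOTS];

/* Hash selector and class to the first slot of an entry.  The mask
   clears the low two bits, so the result is always a multiple of 4. */
static unsigned probeFor(uintptr_t selector, uintptr_t cls)
{
    return (unsigned)((selector ^ cls) & (CACHE_SLOTS - ENTRY_SIZE));
}

static void cacheStore(uintptr_t selector, uintptr_t cls,
                       const Method *m, uintptr_t nativeMethod)
{
    unsigned probe = probeFor(selector, cls);
    assert(probe % ENTRY_SIZE == 0 && probe + SLOT_NATIVE < CACHE_SLOTS);
    methodCache[probe + SLOT_SELECTOR] = selector;
    methodCache[probe + SLOT_CLASS]    = cls;
    methodCache[probe + SLOT_METHOD]   = (uintptr_t)m;
    methodCache[probe + SLOT_NATIVE]   = nativeMethod;
}

/* Return the cached method, or NULL on a miss (single probe only). */
static const Method *cacheLookup(uintptr_t selector, uintptr_t cls)
{
    unsigned probe = probeFor(selector, cls);
    if (methodCache[probe + SLOT_SELECTOR] == selector &&
        methodCache[probe + SLOT_CLASS] == cls)
        return (const Method *)methodCache[probe + SLOT_METHOD];
    return NULL;
}

/* The extra cost of the proposal: one load from the method itself. */
static unsigned primitiveIndexOf(const Method *m)
{
    return (unsigned)(m->header & PRIM_MASK);
}
```

The only additional work on a hit is the load in primitiveIndexOf,
which touches the method header.  With 4 slots per entry, the same
number of entries takes half the memory of the 8-slot layout.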

Scott A Crosby <crosby at qwes.math.cmu.edu> wrote:
>         methodCache[probe + 1] = messageSelector;
>         methodCache[probe + 2] = lkupClass;
>         methodCache[probe + 3] = newMethod;
>         methodCache[probe + 4] = primitiveIndex;
>         methodCache[probe + 5] = newNativeMethod;
> 
> Note first that offsets 'probe + 0', 'probe + 6' and 'probe + 7' are not
> used, which means that 3/8 of the slots (37.5%) are wasted. Secondly,
> note that it is not using 'probe + 0' at all. Every access has an offset!
> 
> Which also means that the cachelines holding it will have a lot of
> unused/worthless data. This is why I suggested split-tables, with say 4
> probings:
> 
>  probe1 probe1+1
>  probe2 probe2+1



