More Profiling

Jan Bottorff janb at pmatrix.com
Thu Sep 27 12:33:12 UTC 2001


>First, try to avoid the
>branch misprediction penalty by having fewer collisions and making sure
>the first entry is *good*.  Second, even a misprediction costs about 1/10
>the cost of a full lookup, so probing 1-3 entries is still cheaper than a
>full lookup, though I am not sure if it is worth the complexity

On x86 processors (perhaps others?), it seems possible you could write the 
cache probe code to use conditional move instructions instead of branches 
(which might be mispredicted). Basically, you would have code that loaded 
all three probes, and only keeps the answer for the correct one, or else 
branches if it's a cache miss. The downside is you run instructions that 
are wasted if an early probe hits, but save the major hit from 
mispredictions. It's anybody's guess how this trade off actually performs, 
but though I'd toss out the thought. Some x86 C compilers will emit 
conditional move instructions under the right conditions. Code like the 
following (or some similar idea) might have a better chance of producing 
conditional move instructions that the original 
Interpreter:lookupInMethodCacheSel:class:. I believe you also could code 
this up as MMX instructions, which could compare both the class and 
selector in a single compare, and giving a handy mask value result. Non-x86 
(and the x86 SETcc instruction) processors may also have comparison 
instructions that give masks, which should allow avoiding the branches, so 
this may be a trick that works on a number of processors.

- Jan

         | hash probe probe2 probe3 |
         self inline: true.

         hash := selector bitXor: class.

         probe := hash bitAnd: MethodCacheMask.  "first probe"
         probe2 := (hash >> 1) bitAnd: MethodCacheMask.  "second probe"
         probe3 := (hash >> 2) bitAnd: MethodCacheMask.  "third probe"

         ((((methodCache at: probe + MethodCacheSelector) bitXor: selector) 
bitOr:
                  [(methodCache at: probe + MethodCacheClass) bitXor: 
class]) = 0) ifTrue:
                         [probe2 := probe "found entry in cache; done"].

         probe := probe2.
         ((((methodCache at: probe + MethodCacheSelector) bitXor: selector) 
bitOr:
                  [(methodCache at: probe + MethodCacheClass) bitXor: 
class]) = 0) ifTrue:
                         [probe3 := probe2 "found entry in cache; done"].

         probe := probe3.
         ((((methodCache at: probe + MethodCacheSelector) bitXor: selector) 
bitOr:
                  [(methodCache at: probe + MethodCacheClass) bitXor: 
class]) = 0) ifFalse:
                         [^ false].

         newMethod := methodCache at: probe + MethodCacheMethod.
         primitiveIndex := methodCache at: probe + MethodCachePrim.
         newNativeMethod := methodCache at: probe + MethodCacheNative.

         ^true









More information about the Squeak-dev mailing list