[Vm-dev] CogVM Benchmarking

Mon May 16 15:42:18 UTC 2011

On Mon, May 16, 2011 at 8:17 AM, Stefan Marr <squeak at stefan-marr.de> wrote:

>
> Hi Eliot:
>
> When exactly is the JIT compilation triggered in Cog?
>

There are currently four triggers and one caveat.

1. when a method is found in the method lookup cache.  This translates to
the second send of a message.  Of course there are caveats.  Cache
collisions could cause one method to mask another etc.  But the method cache
uses three probes to avoid collisions and is less used in the JIT because
inline caches relieve cache pressure.

2. when two successive block evaluations are of blocks in the same method
(i.e. the interpreted value prim remembers the method for the previous block
evaluation and if the next block evaluation is to the same method it
compiles the method).

3. evaluation of a method via withArgs:executeMethod: (i.e. doits).  Hence a
doit (but not necessarily the methods it calls) will always run jitted (but
see caveat).

4. the Nth evaluation of a loop in an interpreted method, where N defaults
to 10.  This has a command-line argument to alter the value (-cogminjumps).
The value should be tuned, e.g. to optimize start-up time.  But that's work
that remains to be done.

The caveat is that no method will be jitted if it has too many literals.
The literal count is used to avoid jitting very large methods.  The limit
defaults to 60.  There is a command-line switch to control it
(-cogmaxlits).
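
As a rough illustration only, the four triggers and the literal-count caveat
can be written as a toy model.  This is Python rather than VM code, and every
name in it (ToyVM, Method, send, evaluate_block, execute_doit, loop_iteration)
is made up for the sketch; only the defaults of 60 literals and 10 loop
evaluations come from the description above.  The real method cache, with its
probing and collisions, is not modelled.

```python
from dataclasses import dataclass, field

MAX_LITERALS = 60  # default literal limit from the text (-cogmaxlits)
MIN_JUMPS = 10     # default loop-evaluation count from the text (-cogminjumps)


@dataclass
class Method:
    """Hypothetical stand-in for a compiled method."""
    name: str
    num_literals: int
    jitted: bool = False


@dataclass
class ToyVM:
    """Toy model of the policy described above, not the actual Cog logic."""
    lookup_cache: set = field(default_factory=set)
    last_block_method: object = None
    loop_counts: dict = field(default_factory=dict)

    def _maybe_jit(self, method):
        # Caveat: a method whose literal count exceeds the limit is never jitted.
        if method.num_literals <= MAX_LITERALS:
            method.jitted = True

    def send(self, method):
        # Trigger 1: the method is already in the lookup cache,
        # which in this model means the second send.
        if method.name in self.lookup_cache:
            self._maybe_jit(method)
        else:
            self.lookup_cache.add(method.name)

    def evaluate_block(self, home_method):
        # Trigger 2: two successive block evaluations in the same method.
        if self.last_block_method is home_method:
            self._maybe_jit(home_method)
        self.last_block_method = home_method

    def execute_doit(self, method):
        # Trigger 3: a doit is compiled straight away (subject to the caveat).
        self._maybe_jit(method)

    def loop_iteration(self, method):
        # Trigger 4: the Nth evaluation of a loop in an interpreted method.
        count = self.loop_counts.get(method.name, 0) + 1
        self.loop_counts[method.name] = count
        if count >= MIN_JUMPS:
            self._maybe_jit(method)
```

In this model a small method becomes jitted on its second send, while a
method with 61 literals stays interpreted however often it is sent.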

So to be sure that you're measuring pure performance, and not any overhead,
you really need to measure the third evaluation of any benchmark.  The first
evaluation will load the method lookup caches.  The second will compile
everything, but compilation and linking might introduce some overhead.  The
third evaluation should be running at full speed.  But you might want to
time the successive evaluations.  In my experience compilation and linking
overhead is very low so I expect you'll be hard-pressed to see much
difference between the second and third runs.

Of course, running the GC before each evaluation is necessary to sync the GC
and avoid unequal GC activity in each run.
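
The shape of that protocol (collect garbage, then time, repeated three times,
keeping every timing so the runs can be compared) could be sketched roughly as
follows.  This is a Python analog, not Squeak code: timed_runs and its
parameters are hypothetical names, and gc.collect() merely stands in for
'Smalltalk garbageCollect'.

```python
import gc
import time


def timed_runs(benchmark, runs=3, clock=time.perf_counter):
    """Run `benchmark` `runs` times and return one duration per run.

    A full collection is forced before each run so that every run starts
    from a comparable GC state.  `clock` is injectable for testing.
    """
    durations = []
    for _ in range(runs):
        gc.collect()              # stand-in for 'Smalltalk garbageCollect'
        start = clock()
        benchmark()
        durations.append(clock() - start)
    return durations


if __name__ == "__main__":
    times = timed_runs(lambda: sum(range(100_000)))
    # Under the reasoning above, the last entry is the steady-state figure;
    # comparing it with the second shows how much compilation cost there was.
    print(times)
```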

Tricky :)

> I would like to provide a configuration for the SMark benchmarking
> framework that ensures that everything got jitted, without doing too much
> unnecessary warmup.
>
> All the benchmark code is implemented in standard methods and executed very
> similarly to how SUnit works, thus, they are called like 'suite perform:
> aBenchmarkSelector'.
>
>
> How many iterations should I run the code before I can be sure that it is
> jitted?
> From your blog, I assume, it should be jitted after its second execution.
> (see snippet below)
>
> And I suppose a 'Smalltalk garbageCollect' should not interfere with this,
> and can be safely performed before a timed run?
> Is there anything else I should cover in the framework? Any particular
> heuristics/mechanism that should be taken into account when trying to reach
> a stable state?
>

Forcing finalization?  Again in my experience having a still machine is very
important.  You'll see variations in timing caused by other activities on
the machine (Time Machine, your mailer uploading mail from the server etc).
So this area is now quite difficult.  These are issues that Alexandre
Bergel has articulated well and are good reasons for going with his method
count approach.  Of course method counting doesn't apply to trying to
profile specific activities in the VM.  For that you do need a traditional
profiler.  But for tuning Smalltalk applications Alexandre's approach seems
to make most sense.

HTH
Eliot

>
>
> What I found on your blog is the following:
> <<<
> So a simple way to implement the interpret on single-use policy is to only
> compile to machine code when finding a method in the first-level method
> lookup cache. We avoid compiling large methods, which are typically
> initializers, and rarely performance-critical inner loops, by refusing to
> compile methods whose number of literals exceeds a limit, settable via the
> command line that defaults to 60 literals, which excludes a mere 0.3% of
> methods in my image.
> >>>
>
> Thanks a lot
> Stefan
>
>
>
>
> --
> Stefan Marr
> Software Languages Lab
> Vrije Universiteit Brussel
> Pleinlaan 2 / B-1050 Brussels / Belgium
> http://soft.vub.ac.be/~smarr
> Phone: +32 2 629 2974
> Fax:   +32 2 629 3525
>
>