<br><br><div class="gmail_quote">On Mon, May 16, 2011 at 8:17 AM, Stefan Marr <span dir="ltr">&lt;<a href="mailto:squeak@stefan-marr.de">squeak@stefan-marr.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<br>

Hi Eliot:<br>

<br>

When exactly is the JIT compilation triggered in Cog?<br></blockquote><div><br></div><div>There are currently four triggers and one caveat.</div><div><br></div><div>1. when a method is found in the method lookup cache.  This translates to the second send of a message.  Of course there are caveats.  Cache collisions could cause one method to mask another etc.  But the method cache uses three probes to avoid collisions and is less used in the JIT because inline caches relieve cache pressure.</div>

<div><br></div><div>2. when two successive block evaluations are of blocks in the same method (i.e. the interpreted value prim remembers the method for the previous block evaluation and if the next block evaluation is to the same method it compiles the method).</div>

<div><br></div><div>3. evaluation of a method via withArgs:executeMethod: (i.e. doits).  Hence a doit (but not necessarily the methods it calls) will always run jitted (but see caveat).</div><div><br></div><div>4. the Nth evaluation of a loop in an interpreted method, where N defaults to 10.  This has a command-line argument to alter the value (-cogminjumps). The value should be tuned, e.g. to optimize start-up time.  But that&#39;s work that remains to be done.</div>

<div><br></div><div>The caveat is that no method will be jitted if it has too many literals.  The literal count is used to avoid jitting very large methods.  The limit defaults to 60 literals.  There is a command-line switch to control it (-cogmaxlits).</div>

<div><br></div><div>So to be sure that you&#39;re measuring pure performance, and not any overhead, you really need to measure the third evaluation of any benchmark. The first evaluation will load the method lookup caches.  The second will compile everything, but compilation and linking might introduce some overhead.  The third evaluation should be running at full speed.  But you might want to time the successive evaluations.  In my experience compilation and linking overhead is very low so I expect you&#39;ll be hard-pressed to see much difference between the second and third runs.</div>
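<div><br></div><div>A minimal sketch of timing three successive evaluations, with a garbage collection before each one. It assumes that suite and aBenchmarkSelector from the question are already bound in the workspace and uses Time millisecondsToRun: for the measurement; the variable name times is only illustrative.</div>
<div><pre>
| times |
"Collect garbage before each evaluation, then time that evaluation."
times := (1 to: 3) collect: [:run |
    Smalltalk garbageCollect.
    Time millisecondsToRun: [suite perform: aBenchmarkSelector]].
"Print all three timings; by the reasoning above the third should be the steady-state one."
Transcript show: times printString; cr.
</pre></div>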

<div><br></div><div>Of course, running the GC before each evaluation is necessary to sync the GC and avoid unequal GC activity in each run.</div><div><br></div><div>Tricky :)</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<br>

I would like to provide a configuration for the SMark benchmarking framework that ensures that everything got jitted, without doing too much unnecessary warmup.<br>

<br>

All the benchmark code is implemented in standard methods and executed very similarly to how SUnit works, thus, they are called like &#39;suite perform: aBenchmarkSelector&#39;.<br>

<br>

<br>

How many iterations should I run the code before I can be sure that it is jitted?<br>

From your blog, I assume, it should be jitted after its second execution. (see snippet below)<br>

<br>

And I suppose a &#39;Smalltalk garbageCollect&#39; should not interfere with this, and can be safely performed before a timed run?<br>

Is there anything else I should cover in the framework? Any particular heuristics/mechanism that should be taken into account when trying to reach a stable state?<br></blockquote><div><br></div><div>Forcing finalization?  Again in my experience having a still machine is very important.  You&#39;ll see variations in timing caused by other activities on the machine (Time Machine, your mailer uploading mail from the server, etc.).  So this area is now quite difficult.  These are issues that Alexandre Bergel has articulated well and are good reasons for going with his method count approach.  Of course method counting doesn&#39;t apply to trying to profile specific activities in the VM.  For that you do need a traditional profiler.  But for tuning Smalltalk applications Alexandre&#39;s approach seems to make most sense.</div>

<div><br></div><div>HTH</div><div>Eliot</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<br>

<br>

What I found on your blog is the following:<br>

&lt;&lt;&lt;<br>

So a simple way to implement the interpret on single-use policy is to only compile to machine code when finding a method in the first-level method lookup cache. We avoid compiling large methods, which are typically initializers, and rarely performance-critical inner loops, by refusing to compile methods whose number of literals exceeds a limit, settable via the command line that defaults to 60 literals, which excludes a mere 0.3% of methods in my image.<br>


&gt;&gt;&gt;<br>

<br>

Thanks a lot<br>

Stefan<br>

<font color="#888888"><br>

<br>

<br>

<br>

--<br>

Stefan Marr<br>

Software Languages Lab<br>

Vrije Universiteit Brussel<br>

Pleinlaan 2 / B-1050 Brussels / Belgium<br>

<a href="http://soft.vub.ac.be/~smarr" target="_blank">http://soft.vub.ac.be/~smarr</a><br>

Phone: <a href="tel:%2B32%202%20629%202974" value="+3226292974">+32 2 629 2974</a><br>

Fax:   <a href="tel:%2B32%202%20629%203525" value="+3226293525">+32 2 629 3525</a><br>

<br>

</font></blockquote></div><br>