<br><br><div class="gmail_quote">On Thu, Jun 24, 2010 at 3:19 PM, Levente Uzonyi <span dir="ltr">&lt;<a href="mailto:leves@elte.hu">leves@elte.hu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

On Tue, 22 Jun 2010, Eliot Miranda wrote:<br>

<br>

&lt;snip&gt;<div class="im"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I can&#39;t say for sure without profiling (you&#39;ll find a good VM profiler<br>

QVMProfiler in the image in the tarball, which as yet works on MacOS only).<br>

</blockquote>

<br></div>

This looks promising, I (or someone else :)) just have to implement #primitiveExecutableModulesAndOffsets under win32 (and un*x), but that doesn&#39;t seem to be easy (at least the win32 part).</blockquote><div><br></div>

<div>If you look at platforms/win32/vm/sqWin32Backtrace.c you&#39;ll find code that extracts symbols from DLLs for constructing a symbolic backtrace on crashes.  The code also uses a Teleplace.map file generated by the VM makefile which contains the symbols for the VM.  From this code you ought to be able to implement a QVMProfilerWin32SymbolsManager almost entirely out of primitives.</div>

<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

But I expect that the reason is the cost of invoking interpreter primitives<br>

from machine code.  Cog only implements a few primitives in machine code<br>

(arithmetic, at: &amp; block value) and for all others (e.g. nextPut: above) it<br>

executes the interpreter primitives.  lcsFor:and: uses at:put: heavily and<br>

Cog is using the interpreter version.  But the cost of invoking an<br>

interpreter primitive from machine code is higher than invoking it from the<br>

interpreter because of the system-call-like glue between the machine-code<br>

stack pages and the C stack on which the interpreter primitive runs.<br>

<br>

Three primitives that are currently interpreter primitives but must be<br>

implemented in machine code for better performance are new/basicNew,<br>

new:/basicNew: and at:put:.  I&#39;ve avoided implementing these in machine code<br>

because the object representation is so complex and am instead about to<br>

start work on a simpler object representation.  When I have that I&#39;ll<br>

implement these primitives and then the speed difference should tilt the<br>

other way.<br>

</blockquote>

<br></div>

This sounds reasonable. #lcsFor:and: uses #at:put: twice in the inner loop. One of them (lcss at: max + k + 1 put: lcs) can be eliminated without affecting the computation, because it just stores the results. With only one #at:put: left, it took me 2423ms to run the benchmark, which is still a bit too high. I think only the profiler can help here.<br>


<br>

Btw, is MessageTally less accurate with CogVM than with the SqueakVM?<br></blockquote><div><br></div><div>I&#39;m not sure.  We use a variant written by Andreas that is more accurate than MessageTally but that may use different plumbing.  </div>

<div><br></div><div>best</div><div>Eliot</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><font color="#888888">

<br>

<br>

Levente</font><div><div></div><div class="h5"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Of course if anyone would like to implement these in the context of the<br>

current object representation be my guest and report back asap...<br>

<br>

best<br>

Eliot<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

<br>

Levente<br>

<br>

<br>

</blockquote>

<br>

</blockquote>

<br>

</div></div></blockquote></div><br>