[Vm-dev] New Cog VMs available

Thu Jul 7 20:04:18 UTC 2016

Yes. Number of samples and percentage of overall samples. You're right it's
not that many. Here is the result when I run 5 times the bench (I usually
use the profiler just as a hint to tweak performance, then the in-image
results matter).

New VM:

% of vanilla vm code (% of total) (samples) (cumulative)
20.72%    (  9.98%) scavengeReferentsOf (2943) (20.72%)
*10.52%    (  5.07%) lookupOrdinaryNoMNUEtcInClass (1494) (31.24%)*
  6.63%    (  3.19%) processWeakSurvivor (941) (37.87%)
  5.62%    (  2.71%) copyAndForward (798) (43.49%)
 * 4.94%    (  2.38%) addNewMethodToCache (702) (48.43%)*
  4.86%    (  2.34%) doScavenge (690) (53.29%)
  3.20%    (  1.54%) primitiveStringReplace (455) (56.50%)
  3.06%    (  1.48%) moveFramesInthroughtoPage (435) (59.56%)
  2.94%    (  1.42%) compact (418) (62.50%)
 * 2.87%    (  1.38%) interpret (407) (65.37%)*
 * 2.52%    (  1.21%) ceSendFromInLineCacheMiss (358) (67.89%)*

Old VM:

% of vanilla vm code (% of total) (samples) (cumulative)
27.19%    (10.97%) scavengeReferentsOf (2615) (27.19%)
  9.80%    (  3.95%) processWeakSurvivor (943) (36.99%)
  8.76%    (  3.54%) copyAndForward (843) (45.75%)
  7.06%    (  2.85%) doScavenge (679) (52.81%)
  4.81%    (  1.94%) primitiveStringReplace (463) (57.63%)
  4.28%    (  1.73%) moveFramesInthroughtoPage (412) (61.91%)
  *4.15%    (  1.67%) lookupOrdinaryNoMNUEtcInClass (399) (66.06%)*
  2.44%    (  0.99%) marryFrameSP (235) (68.50%)
  2.30%    (  0.93%) isWidowedContext (221) (70.80%)
  2.19%    (  0.88%) handleStackOverflow (211) (72.99%)
  1.64%    (  0.66%) primitiveCompareString (158) (74.63%)
  1.50%    (  0.60%) findMethodWithPrimitiveFromContextUpToContext (144)
(76.13%)
  1.50%    (  0.60%) ceBaseFrameReturn (144) (77.63%)
  1.45%    (  0.58%) ceStackOverflow (139) (79.07%)
  1.16%    (  0.47%) ceNonLocalReturn (112) (80.24%)
  1.11%    (  0.45%) allocateNewSpaceSlotsformatclassIndex (107) (81.35%)
  1.00%    (  0.40%) isBytes (96) (82.35%)
  0.95%    (  0.38%) fetchClassOfNonImm (91) (83.29%)
  0.93%    (  0.37%) stackValue (89) (84.22%)
  0.90%    (  0.36%) returnToExecutivepostContextSwitch (87) (85.12%)
 * 0.90%    (  0.36%) interpret (87) (86.03%)*
 * 0.84%    (  0.34%) addNewMethodToCache (81) (86.87%)*
*  0.83%    (  0.34%) ceSendFromInLineCacheMiss (80) (87.70%)*

Now if I look at the 4 methods highlighted, instead of the 3 I looked at
before, I see ~7% overhead. I'm not sure if compact got really slower or if
it used to be inlined somewhere during slang compilation. I still think we
should use these results as a hit and rely on the real benchmark for
performance evaluation.

If it's really a slang inlining problem, it will have side-effect on other
functions, many will get a little bit faster, and it really seems this is
the problem.

On Thu, Jul 7, 2016 at 8:50 PM, Holger Freyther <holger at freyther.de> wrote:

>
> > On 07 Jul 2016, at 19:41, Clément Bera <bera.clement at gmail.com> wrote:
> >
> > Hi Holger,
>
>
> Hi!
>
>
> > I'm sorry for the delay since you reported the bug. Everyone working on
> the VM is very busy (In my case, my time is consumed by my phd and sista,
> I'm trying to open an alpha of the image+runtime before ESUG running at
> ~1.5x).
>
> great.
>
>
> >
> > Now that the VM profiler is fully working I was able to look into the
> regression.
> >
> > Old VM vanilla code:
> >
> > 27.94%    (11.54%)    scavengeReferentsOf
>                              (528)   (27.94%)
> > 10.21%    (  4.22%)   processWeakSurvivor
>                              (193)   (38.15%)
> >   8.15%    (  3.36%)  copyAndForward
>                               (154)   (46.30%)
> >   7.41%    (  3.06%)  doScavenge
>                                       (140)   (53.70%)
> >   5.19%    (  2.14%)  lookupOrdinaryNoMNUEtcInClass
>              (98)            (58.89%)
> >   4.71%    (  1.94%)  primitiveStringReplace
>                               (89)            (63.60%)
> >   4.29%    (  1.77%)  moveFramesInthroughtoPage
>                      (81)            (67.88%)
> >   2.06%    (  0.85%)  isWidowedContext
>                                       (39)            (69.95%)
> > [...]
> >   0.90%    (  0.37%)  ceSendFromInLineCacheMiss
>                      (17)            (85.61%)
> >   0.85%    (  0.35%)  addNewMethodToCache
>                      (16)            (86.46%)
> >
> > New VM vanilla code:
> >
> > % of vanilla vm code (% of total)
>                                              (samples) (cumulative)
> > 22.41%    (10.44%)    scavengeReferentsOf
>                              (609)   (22.41%)
> > 14.46%    (  6.74%)   lookupOrdinaryNoMNUEtcInClass
>              (393)   (36.87%)
> >
>
>
> > I highlighted the problematic methods. It is likely that it has to do
> with slang inlining.
> >
> > The 3 methods highlighted seems to be responsible for an overhead of
> ~8.5% in the overall runtime. You seem to have an overhead of ~14% on my
> machine. There's quite a difference. The slang inlining overhead may have
> impacted other functions, and C profilers are never 100% accurate, so we
> will see when the problem is fixed if something else is also problematic.
> Now that we know what the problem is, I believe it will be fixed in the
> incoming weeks when someone (likely Eliot, maybe me) has time. Thanks for
> reporting the regression.
>
> What does (393) and (98) mean? Is it number of samples?
>
> holger
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20160707/7331edaa/attachment-0001.htm