[Vm-dev] New Cog VMs available
Clément Bera
bera.clement at gmail.com
Thu Jul 7 20:04:18 UTC 2016
Yes, it's the number of samples, next to the percentage of overall samples.
You're right, it's not that many. Here are the results when I run the
benchmark 5 times (I usually use the profiler just as a hint for tuning
performance; the in-image results are what matter).
New VM:
% of vanilla vm code (% of total) (samples) (cumulative)
20.72% ( 9.98%) scavengeReferentsOf (2943) (20.72%)
*10.52% ( 5.07%) lookupOrdinaryNoMNUEtcInClass (1494) (31.24%)*
6.63% ( 3.19%) processWeakSurvivor (941) (37.87%)
5.62% ( 2.71%) copyAndForward (798) (43.49%)
* 4.94% ( 2.38%) addNewMethodToCache (702) (48.43%)*
4.86% ( 2.34%) doScavenge (690) (53.29%)
3.20% ( 1.54%) primitiveStringReplace (455) (56.50%)
3.06% ( 1.48%) moveFramesInthroughtoPage (435) (59.56%)
2.94% ( 1.42%) compact (418) (62.50%)
* 2.87% ( 1.38%) interpret (407) (65.37%)*
* 2.52% ( 1.21%) ceSendFromInLineCacheMiss (358) (67.89%)*
Old VM:
% of vanilla vm code (% of total) (samples) (cumulative)
27.19% (10.97%) scavengeReferentsOf (2615) (27.19%)
9.80% ( 3.95%) processWeakSurvivor (943) (36.99%)
8.76% ( 3.54%) copyAndForward (843) (45.75%)
7.06% ( 2.85%) doScavenge (679) (52.81%)
4.81% ( 1.94%) primitiveStringReplace (463) (57.63%)
4.28% ( 1.73%) moveFramesInthroughtoPage (412) (61.91%)
*4.15% ( 1.67%) lookupOrdinaryNoMNUEtcInClass (399) (66.06%)*
2.44% ( 0.99%) marryFrameSP (235) (68.50%)
2.30% ( 0.93%) isWidowedContext (221) (70.80%)
2.19% ( 0.88%) handleStackOverflow (211) (72.99%)
1.64% ( 0.66%) primitiveCompareString (158) (74.63%)
1.50% ( 0.60%) findMethodWithPrimitiveFromContextUpToContext (144) (76.13%)
1.50% ( 0.60%) ceBaseFrameReturn (144) (77.63%)
1.45% ( 0.58%) ceStackOverflow (139) (79.07%)
1.16% ( 0.47%) ceNonLocalReturn (112) (80.24%)
1.11% ( 0.45%) allocateNewSpaceSlotsformatclassIndex (107) (81.35%)
1.00% ( 0.40%) isBytes (96) (82.35%)
0.95% ( 0.38%) fetchClassOfNonImm (91) (83.29%)
0.93% ( 0.37%) stackValue (89) (84.22%)
0.90% ( 0.36%) returnToExecutivepostContextSwitch (87) (85.12%)
* 0.90% ( 0.36%) interpret (87) (86.03%)*
* 0.84% ( 0.34%) addNewMethodToCache (81) (86.87%)*
* 0.83% ( 0.34%) ceSendFromInLineCacheMiss (80) (87.70%)*
Now if I look at the 4 methods highlighted, instead of the 3 I looked at
before, I see ~7% overhead. I'm not sure if compact really got slower or if
it used to be inlined somewhere during slang compilation. I still think we
should use these results as a hint and rely on the real benchmark for
performance evaluation.
If it's really a slang inlining problem, it will have side effects on other
functions: many will get a little bit faster. It really seems this is
the problem.
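As a rough cross-check of the ~7% figure, the "% of total" column for the 4 highlighted functions can be summed for each VM and the two sums subtracted. A minimal Python sketch, with the numbers copied from the two tables above; the dictionary and function names are only illustrative, not part of any VM tooling:

```python
# "% of total" values for the four highlighted functions, copied from the
# New VM and Old VM tables above. Names here are illustrative only.
NEW_VM = {
    "lookupOrdinaryNoMNUEtcInClass": 5.07,
    "addNewMethodToCache": 2.38,
    "interpret": 1.38,
    "ceSendFromInLineCacheMiss": 1.21,
}
OLD_VM = {
    "lookupOrdinaryNoMNUEtcInClass": 1.67,
    "addNewMethodToCache": 0.34,
    "interpret": 0.36,
    "ceSendFromInLineCacheMiss": 0.34,
}


def share_of_total(profile):
    """Sum the '% of total' column over the given functions."""
    return sum(profile.values())


new_share = share_of_total(NEW_VM)
old_share = share_of_total(OLD_VM)
overhead = new_share - old_share  # in percentage points of total samples

print(f"new VM: {new_share:.2f}% of total samples")
print(f"old VM: {old_share:.2f}% of total samples")
print(f"difference: {overhead:.2f} percentage points")
```

The difference comes out a little above 7 percentage points. It compares shares from two separate profiling runs, so it is only a rough indication, in line with treating the profiler as a hint.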
On Thu, Jul 7, 2016 at 8:50 PM, Holger Freyther <holger at freyther.de> wrote:
>
> > On 07 Jul 2016, at 19:41, Clément Bera <bera.clement at gmail.com> wrote:
> >
> > Hi Holger,
>
>
> Hi!
>
>
> > I'm sorry for the delay since you reported the bug. Everyone working on
> > the VM is very busy (in my case, my time is consumed by my PhD and Sista;
> > I'm trying to open an alpha of the image+runtime before ESUG running at
> > ~1.5x).
>
> great.
>
>
> >
> > Now that the VM profiler is fully working I was able to look into the
> > regression.
> >
> > Old VM vanilla code:
> >
> > 27.94% (11.54%) scavengeReferentsOf (528) (27.94%)
> > 10.21% ( 4.22%) processWeakSurvivor (193) (38.15%)
> > 8.15% ( 3.36%) copyAndForward (154) (46.30%)
> > 7.41% ( 3.06%) doScavenge (140) (53.70%)
> > 5.19% ( 2.14%) lookupOrdinaryNoMNUEtcInClass (98) (58.89%)
> > 4.71% ( 1.94%) primitiveStringReplace (89) (63.60%)
> > 4.29% ( 1.77%) moveFramesInthroughtoPage (81) (67.88%)
> > 2.06% ( 0.85%) isWidowedContext (39) (69.95%)
> > [...]
> > 0.90% ( 0.37%) ceSendFromInLineCacheMiss (17) (85.61%)
> > 0.85% ( 0.35%) addNewMethodToCache (16) (86.46%)
> >
> > New VM vanilla code:
> >
> > % of vanilla vm code (% of total) (samples) (cumulative)
> > 22.41% (10.44%) scavengeReferentsOf (609) (22.41%)
> > 14.46% ( 6.74%) lookupOrdinaryNoMNUEtcInClass (393) (36.87%)
> >
>
>
> > I highlighted the problematic methods. It is likely that it has to do
> > with slang inlining.
> >
> > The 3 methods highlighted seem to be responsible for an overhead of
> > ~8.5% in the overall runtime. You seem to have an overhead of ~14% on my
> > machine. There's quite a difference. The slang inlining overhead may have
> > impacted other functions, and C profilers are never 100% accurate, so we
> > will see when the problem is fixed if something else is also problematic.
> > Now that we know what the problem is, I believe it will be fixed in the
> > coming weeks when someone (likely Eliot, maybe me) has time. Thanks for
> > reporting the regression.
>
> What do (393) and (98) mean? Is it the number of samples?
>
> holger