[Vm-dev] New Cog VMs available
Eliot Miranda
eliot.miranda at gmail.com
Thu Jul 14 00:53:29 UTC 2016
Hi Henrik,
On Tue, Jul 12, 2016 at 1:07 AM, Henrik Johansen <
henrik.s.johansen at veloxit.no> wrote:
>
> Could also be something gone wonky with the interpreters method cache,
> causing excessive misses and the fallback code (lookupOrdinary... and
> addNewMethod..) to run.
> Initially I'd suspect there might've been changes that'd affect hashing
> (such as classTag - index in class table, being used instead of class
> pointer), but I guess that happened before the first version.
>
> I would note, in the generated code for spur in the git repo, there's a
> case on top of addNewMethodToCache that doesn't actually add the found
> method to cache:
> if (!(((GIV(newMethod) & (tagMask())) == 0)
> && (((((usqInt) (longAt(GIV(newMethod)))) >> (formatShift())) &
> (formatMask())) >= (firstCompiledMethodFormat())))) {
> primitiveFunctionPointer = primitiveInvokeObjectAsMethod;
> return;
> }
>
> I don't know exactly what firstCompiledMethodFormat() entails, but if it
> means found method that is a CompiledMethod, that would pretty much mean
> the MethodCache is entirely unused, no?
>
The expression is ! (not) "is it non-immediate with compiled method
format". So this expression adds the primitiveInvokeObjectAsMethod as the
primitive whenever what the lookup finds is /not/ a compiled method. This
is the implementation of "objects as methods". All compiled methods are
non-immediates and have a format >= firstCompiledMethodFormat:
Behavior>>instSpec
"Answer the instance specification part of the format that defines what
kind of object
an instance of the receiver is. The formats are
0 = 0 sized objects (UndefinedObject True False et al)
1 = non-indexable objects with inst vars (Point et al)
2 = indexable objects with no inst vars (Array et al)
3 = indexable objects with inst vars (MethodContext AdditionalMethodState
et al)
4 = weak indexable objects with inst vars (WeakArray et al)
5 = weak non-indexable objects with inst vars (ephemerons) (Ephemeron)
6 = unused
7 = immediates (SmallInteger, Character)
8 = unused
9 = 64-bit indexable
10-11 = 32-bit indexable (Bitmap) (plus one odd bit, unused in 32-bits)
12-15 = 16-bit indexable (plus two odd bits, one unused in 32-bits)
16-23 = 8-bit indexable (plus three odd bits, one unused in 32-bits)
24-31 = compiled methods (CompiledMethod) (plus three odd bits, one unused
in 32-bits)
Note that in the VM instances also have a 5 bit format field that relates
to their class's format.
Formats 11, 13-15, 17-23 & 25-31 are unused in classes but used in
instances to define the
number of elements missing up to the slot size. For example, a 2-byte
ByteString instance
has format 18 in 32-bits, since its size is one 32-bit slot - 2 bytes ((18
bitAnd: 3) = 2), and
22 in 64 bits, since its size is one 64-bit slot - 6 bytes ((22 bitAnd: 7)
= 6)."
^(format bitShift: -16) bitAnd: 16r1F
So the format encodes both "it is a compiled method" and the number of
unused trailing elements.
> Cheers,
> Henry
>
> On 07 Jul 2016, at 10:04 , Clément Bera <bera.clement at gmail.com> wrote:
>
> Yes. Number of samples and percentage of overall samples. You're right
> it's not that many. Here is the result when I run 5 times the bench (I
> usually use the profiler just as a hint to tweak performance, then the
> in-image results matter).
>
> New VM:
>
> % of vanilla vm code (% of total) (samples) (cumulative)
> 20.72% ( 9.98%) scavengeReferentsOf (2943) (20.72%)
> *10.52% ( 5.07%) lookupOrdinaryNoMNUEtcInClass (1494) (31.24%)*
> 6.63% ( 3.19%) processWeakSurvivor (941) (37.87%)
> 5.62% ( 2.71%) copyAndForward (798) (43.49%)
> * 4.94% ( 2.38%) addNewMethodToCache (702) (48.43%)*
> 4.86% ( 2.34%) doScavenge (690) (53.29%)
> 3.20% ( 1.54%) primitiveStringReplace (455) (56.50%)
> 3.06% ( 1.48%) moveFramesInthroughtoPage (435) (59.56%)
> 2.94% ( 1.42%) compact (418) (62.50%)
> * 2.87% ( 1.38%) interpret (407) (65.37%)*
> * 2.52% ( 1.21%) ceSendFromInLineCacheMiss (358) (67.89%)*
>
> Old VM:
>
> % of vanilla vm code (% of total) (samples) (cumulative)
> 27.19% (10.97%) scavengeReferentsOf (2615) (27.19%)
> 9.80% ( 3.95%) processWeakSurvivor (943) (36.99%)
> 8.76% ( 3.54%) copyAndForward (843) (45.75%)
> 7.06% ( 2.85%) doScavenge (679) (52.81%)
> 4.81% ( 1.94%) primitiveStringReplace (463) (57.63%)
> 4.28% ( 1.73%) moveFramesInthroughtoPage (412) (61.91%)
> *4.15% ( 1.67%) lookupOrdinaryNoMNUEtcInClass (399) (66.06%)*
> 2.44% ( 0.99%) marryFrameSP (235) (68.50%)
> 2.30% ( 0.93%) isWidowedContext (221) (70.80%)
> 2.19% ( 0.88%) handleStackOverflow (211) (72.99%)
> 1.64% ( 0.66%) primitiveCompareString (158) (74.63%)
> 1.50% ( 0.60%) findMethodWithPrimitiveFromContextUpToContext (144)
> (76.13%)
> 1.50% ( 0.60%) ceBaseFrameReturn (144) (77.63%)
> 1.45% ( 0.58%) ceStackOverflow (139) (79.07%)
> 1.16% ( 0.47%) ceNonLocalReturn (112) (80.24%)
> 1.11% ( 0.45%) allocateNewSpaceSlotsformatclassIndex (107) (81.35%)
> 1.00% ( 0.40%) isBytes (96) (82.35%)
> 0.95% ( 0.38%) fetchClassOfNonImm (91) (83.29%)
> 0.93% ( 0.37%) stackValue (89) (84.22%)
> 0.90% ( 0.36%) returnToExecutivepostContextSwitch (87) (85.12%)
> * 0.90% ( 0.36%) interpret (87) (86.03%)*
> * 0.84% ( 0.34%) addNewMethodToCache (81) (86.87%)*
> * 0.83% ( 0.34%) ceSendFromInLineCacheMiss (80) (87.70%)*
>
> Now if I look at the 4 methods highlighted, instead of the 3 I looked at
> before, I see ~7% overhead. I'm not sure if compact got really slower or if
> it used to be inlined somewhere during slang compilation. I still think we
> should use these results as a hit and rely on the real benchmark for
> performance evaluation.
>
> If it's really a slang inlining problem, it will have side-effect on other
> functions, many will get a little bit faster, and it really seems this is
> the problem.
>
>
>
> On Thu, Jul 7, 2016 at 8:50 PM, Holger Freyther <holger at freyther.de>
> wrote:
>
>>
>> > On 07 Jul 2016, at 19:41, Clément Bera <bera.clement at gmail.com> wrote:
>> >
>> > Hi Holger,
>>
>>
>> Hi!
>>
>>
>> > I'm sorry for the delay since you reported the bug. Everyone working on
>> the VM is very busy (In my case, my time is consumed by my phd and sista,
>> I'm trying to open an alpha of the image+runtime before ESUG running at
>> ~1.5x).
>>
>> great.
>>
>>
>> >
>> > Now that the VM profiler is fully working I was able to look into the
>> regression.
>> >
>> > Old VM vanilla code:
>> >
>> > 27.94% (11.54%) scavengeReferentsOf
>> (528) (27.94%)
>> > 10.21% ( 4.22%) processWeakSurvivor
>> (193) (38.15%)
>> > 8.15% ( 3.36%) copyAndForward
>> (154) (46.30%)
>> > 7.41% ( 3.06%) doScavenge
>> (140) (53.70%)
>> > 5.19% ( 2.14%) lookupOrdinaryNoMNUEtcInClass
>> (98) (58.89%)
>> > 4.71% ( 1.94%) primitiveStringReplace
>> (89) (63.60%)
>> > 4.29% ( 1.77%) moveFramesInthroughtoPage
>> (81) (67.88%)
>> > 2.06% ( 0.85%) isWidowedContext
>> (39) (69.95%)
>> > [...]
>> > 0.90% ( 0.37%) ceSendFromInLineCacheMiss
>> (17) (85.61%)
>> > 0.85% ( 0.35%) addNewMethodToCache
>> (16) (86.46%)
>> >
>> > New VM vanilla code:
>> >
>> > % of vanilla vm code (% of total)
>> (samples) (cumulative)
>> > 22.41% (10.44%) scavengeReferentsOf
>> (609) (22.41%)
>> > 14.46% ( 6.74%) lookupOrdinaryNoMNUEtcInClass
>> (393) (36.87%)
>> >
>>
>>
>> > I highlighted the problematic methods. It is likely that it has to do
>> with slang inlining.
>> >
>> > The 3 methods highlighted seems to be responsible for an overhead of
>> ~8.5% in the overall runtime. You seem to have an overhead of ~14% on my
>> machine. There's quite a difference. The slang inlining overhead may have
>> impacted other functions, and C profilers are never 100% accurate, so we
>> will see when the problem is fixed if something else is also problematic.
>> Now that we know what the problem is, I believe it will be fixed in the
>> incoming weeks when someone (likely Eliot, maybe me) has time. Thanks for
>> reporting the regression.
>>
>> What does (393) and (98) mean? Is it number of samples?
>>
>> holger
>>
>>
>>
>>
>>
>
>
>
--
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20160713/45b26150/attachment-0001.htm
More information about the Vm-dev
mailing list