[Vm-dev] VM Maker: VMMaker.oscog-cb.1271.mcz

Clément Bera bera.clement at gmail.com
Wed Apr 29 17:19:30 UTC 2015


I think the point you want to make, which is probably relevant, is that the
case where the counter trips is rare (probably once every 10000 executions
in the future). Spending 37 or 58 bytes of instructions on an inlined #==
that pushes a boolean on the simulated stack and is used only in this case
(37 or 58 depending on whether one operand is an unannotatable constant),
instead of 16 instructions for a send (marshalling + inline cache) plus a
bytecode annotation in the cog method header, may therefore not be worth it.
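
To make "the counter trips rarely" concrete, here is a rough Python model of the
per-branch counter as the genJumpIf: comment quoted below describes it: a 32-bit
word whose upper 16 bits count executions and whose lower 16 bits count untaken
branches. The class name, the method names and the choice to seed both halves
with the same value are assumptions made for illustration, not the VM's code.

```python
EXECUTION_UNIT = 0x00010000  # subtracting this decrements the upper 16 bits
LOW_MASK = 0x0000FFFF


class BranchCounter:
    """Hypothetical model of one Sista conditional-branch counter."""

    def __init__(self, initial_count):
        # Assumption: both halves are seeded with the same initial count.
        self.value = (initial_count << 16) | initial_count

    def execute(self, branch_taken):
        """Run the branch once; return True if the counter trips.

        Mirrors the disassembly: `subl $0x10000` followed by `jb` trips
        before the counter is written back, so a trip leaves it unchanged.
        """
        if self.value < EXECUTION_UNIT:  # the subtraction would borrow
            return True
        self.value -= EXECUTION_UNIT     # execution count - 1
        if not branch_taken:
            self.value -= 1              # untaken (fall-through) count - 1
        return False

    def executions_left(self):
        return self.value >> 16

    def untaken_left(self):
        return self.value & LOW_MASK


counter = BranchCounter(3)
trips = [counter.execute(taken) for taken in (True, False, False, True)]
```

In this model a counter seeded with N survives N executions and trips on
execution N + 1, so the trip-only code path runs once per N + 1 executions.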

I don't know. We can change that if you want. I don't mind. When I did it I
thought it was simpler like that. Maybe not.

2015-04-29 19:04 GMT+02:00 Clément Bera <bera.clement at gmail.com>:

> Hey Eliot,
>
> I answered in the thread where you asked the question.
>
> The answer was:
>
> *Basically #== tries to use the inlined version merged with the following
> branch, but if the counter trips, #== is executed in an inlined version
> that ignores the branch, pushing true or false on the simulated stack, and
> then the code generated for the branch uses the result of #== pushed on the
> simulated stack, which calls the trampoline for the tripping counter.*
>
> *Conceptually, this is the code generated for #==, if followed by branch
> true, for this example:*
>
> *foo*
> * ^ instVar1 == instVar2 ifTrue: [ 1 ] ifFalse: [ 2 ]*
>
> *Loading the operands*
> *0000146a: movl -12(%ebp), %edx : 8B 55 F4 *
> *0000146d: movl %ds:0xc(%edx), %edi : 8B 7A 0C *
> *00001470: movl %ds:0x8(%edx), %esi : 8B 72 08 *
> *%edi <- instVar1*
> *%esi <- instVar2*
>
> *counter logic (execution count)*
> *00001473: movl %ds:0x40090, %ebx : 8B 1D 90 00 04 00 *
> *%ebx <- counter value for this branch*
> *00001479: subl $0x00010000, %ebx : 81 EB 00 00 01 00 *
> *execution count - 1*
> *0000147f: jb .+0x00000046 (0x000014c7) : 72 46 *
> *if the counter trips, jump to the alternative #==, which pushes true or
> false on the stack for the branch code*
> *00001481: movl %ebx, %ds:0x40090 : 89 1D 90 00 04 00 *
> *write back the counter to the memory location*
>
> *#== followed by inlined branch version*
> *00001487: cmpl %edi, %esi : 39 FE *
> *00001489: jz .+0x000000a6 (0x00001535) : 0F 84 A6 00 00 00 *
> *compare instVar1 and instVar2 and, if they are equal, jump to the branch
> that pushes 1*
> *0000148f: movl %edi, %eax : 89 F8 *
> *00001491: andl $0x00000003, %eax : 83 E0 03 *
> *00001494: jnz .+0x0000000e (0x000014a4) : 75 0E *
> *00001496: movl %ds:(%edi), %eax : 8B 07 *
> *00001498: andl $0x003ffff7, %eax : 25 F7 FF 3F 00 *
> *0000149d: jnz .+0x00000005 (0x000014a4) : 75 05 *
> *0000149f: movl %ds:0x8(%edi), %edi : 8B 7F 08 *
> *000014a2: jmp .+0xffffffe3 (0x00001487) : EB E3 *
> *000014a4: movl %esi, %eax : 89 F0 *
> *000014a6: andl $0x00000003, %eax : 83 E0 03 *
> *000014a9: jnz .+0x0000000e (0x000014b9) : 75 0E *
> *000014ab: movl %ds:(%esi), %eax : 8B 06 *
> *000014ad: andl $0x003ffff7, %eax : 25 F7 FF 3F 00 *
> *000014b2: jnz .+0x00000005 (0x000014b9) : 75 05 *
> *000014b4: movl %ds:0x8(%esi), %esi : 8B 76 08 *
> *000014b7: jmp .+0xffffffce (0x00001487) : EB CE *
>
> *forwarder checks*
>
> *counter logic (branch count)*
> *000014b9: subl $0x00000001, %ebx : 83 EB 01 *
> *branch count - 1*
> *000014bc: movl %ebx, %ds:0x40090 : 89 1D 90 00 04 00 *
>
> *write back the counter to the memory location*
>
> *000014c2: jmp .+0x00000075 (0x0000153c) : E9 75 00 00 00 *
> *The result of #== was false let's jump to the branch that pushes 2.*
>
> *We arrive here only if the counter has tripped*
> *000014c7: cmpl %edi, %esi : 39 FE *
> *000014c9: jz .+0x00000031 (0x000014fc) : 74 31 *
> *000014cb: movl %edi, %eax : 89 F8 *
> *000014cd: andl $0x00000003, %eax : 83 E0 03 *
> *000014d0: jnz .+0x0000000e (0x000014e0) : 75 0E *
> *000014d2: movl %ds:(%edi), %eax : 8B 07 *
> *000014d4: andl $0x003ffff7, %eax : 25 F7 FF 3F 00 *
> *000014d9: jnz .+0x00000005 (0x000014e0) : 75 05 *
> *000014db: movl %ds:0x8(%edi), %edi : 8B 7F 08 *
> *000014de: jmp .+0xffffffe7 (0x000014c7) : EB E7 *
> *000014e0: movl %esi, %eax : 89 F0 *
> *000014e2: andl $0x00000003, %eax : 83 E0 03 *
> *000014e5: jnz .+0x0000000e (0x000014f5) : 75 0E *
> *000014e7: movl %ds:(%esi), %eax : 8B 06 *
> *000014e9: andl $0x003ffff7, %eax : 25 F7 FF 3F 00 *
> *000014ee: jnz .+0x00000005 (0x000014f5) : 75 05 *
> *000014f0: movl %ds:0x8(%esi), %esi : 8B 76 08 *
> *000014f3: jmp .+0xffffffd2 (0x000014c7) : EB D2 *
> *forwarder checks*
> *000014f5: movl $0x00100008=false, %esi : BE 08 00 10 00 *
> *000014fa: jmp .+0x00000005 (0x00001501) : EB 05 *
> *000014fc: movl $0x00100010=true, %esi : BE 10 00 10 00 *
> *Inlined version of #== that answers true or false on the simulated stack.*
>
> *Now we're in the branch logic (genJumpIf), not the #== logic
> (genSpecialSelectorEqualsEquals)*
> *00001501: movl %esi, %eax : 89 F0 *
> *Here we get true or false, or some other result if this instruction is a
> fixup.*
> *00001503: movl %ds:0x40090, %edi : 8B 3D 90 00 04 00 *
> *00001509: subl $0x00010000, %edi : 81 EF 00 00 01 00 *
> *0000150f: jb .+0x0000001d (0x0000152e) : 72 1D *
> *00001511: movl %edi, %ds:0x40090 : 89 3D 90 00 04 00 *
> *Here is the counter logic again; if we tripped in #==, we trip again.*
> *00001517: subl $0x00100008=false, %eax : 2D 08 00 10 00 *
> *0000151c: jz .+0x0000001e (0x0000153c) : 74 1E *
> *0000151e: subl $0x00000001, %edi : 83 EF 01 *
> *00001521: movl %edi, %ds:0x40090 : 89 3D 90 00 04 00 *
> *00001527: cmpl $0x00000008, %eax : 83 F8 08 *
> *0000152a: jz .+0x00000009 (0x00001535) : 74 09 *
> *0000152c: xorl %edi, %edi : 31 FF *
> *0000152e: call .+0xfffff9e5
> (0x00000f18=ceSendMustBeBooleanAddFalseTrampoline) : E8 E5 F9 FF FF *
> *HasBytecodePC bc 19/20:*
> *This is the trampoline for mustBeBoolean and counterTrip.*
> *00001533: jmp .+0xffffffce (0x00001503) : EB CE *
> *After the counter trips, we assume the counters have been reset and we jump
> back to the comparison. We have to jump back to the execution count logic
> so as not to confuse the branch count logic.*
>
> *Now this is the push 1, jump and push 2 logic*
> *00001535: pushl $0x00000003 : 68 03 00 00 00 *
> *0000153a: jmp .+0x00000005 (0x00001541) : EB 05 *
> *0000153c: pushl $0x00000005 : 68 05 00 00 00 *
> *Here the two branches push 1 and 2 on stack*
>
> *00001541: popl %edx : 5A *
> *00001542: movl %ebp, %esp : 89 EC *
> *00001544: popl %ebp : 5D *
> *00001545: ret $0x0004 : C2 04 00 *
> *Return*
>
> *Everything looks correct to me. The case you showed includes an and:, so
> there's an additional fixup. I can't explain it here because the code is
> twice as long and hard to follow, but I believe it is also correct.*
>
> 2015-04-29 16:44 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>
>>
>> Hi Clément,
>>
>>     but what happens if the counter trips on a #==?  There are two cases,
>> with and without the callback installed.  AFAICT the code will continue
>> after the jump and so won't reflect the values compared.  Can you explain
>> what actually happens?
>>
>> Eliot (phone)
>>
>> On Apr 29, 2015, at 7:33 AM, commits at source.squeak.org wrote:
>>
>> >
>> > Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
>> > http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1271.mcz
>> >
>> > ==================== Summary ====================
>> >
>> > Name: VMMaker.oscog-cb.1271
>> > Author: cb
>> > Time: 29 April 2015, 10:33:32.261 am
>> > UUID: d1d8e36f-01bc-4d38-b9a2-6a5188c5e3cc
>> > Ancestors: VMMaker.oscog-eem.1270
>> >
>> > In fact in genJumpIf we always reload the counterReg after the
>> trampoline after the jump back to retry. No need to add logic there.
>> >
>> > Fix the bug that #== was marked as mapped whereas it should not thanks
>> to Eliot.
>> >
>> > The SistaSimulator runs without assertion failures. I'm going to try to
>> recompile it now.
>> >
>> > =============== Diff against VMMaker.oscog-eem.1270 ===============
>> >
>> > Item was removed:
>> > - ----- Method: SistaStackToRegisterMappingCogit
>> class>>generatorTableFrom: (in category 'class initialization') -----
>> > - generatorTableFrom: anArray
>> > -    "Override to replace the unmapped, non-counting inlined #== with a
>> mapped counting inlined #==."
>> > -    | table |
>> > -    table := super generatorTableFrom: anArray.
>> > -    table object do:
>> > -        [:descriptor|
>> > -         descriptor generator == #genSpecialSelectorEqualsEquals
>> ifTrue:
>> > -            [descriptor
>> > -                isMapped: true;
>> > -                isMappedInBlock: true;
>> > -                needsFrameFunction: nil]].
>> > -    ^table!
>> >
>> > Item was changed:
>> >  ----- Method: SistaStackToRegisterMappingCogit>>genJumpIf:to: (in
>> category 'bytecode generator support') -----
>> >  genJumpIf: boolean to: targetBytecodePC
>> >      "The heart of performance counting in Sista.  Conditional branches
>> are 6 times less
>> >       frequent than sends and can provide basic block frequencies (send
>> counters can't).
>> >       Each conditional has a 32-bit counter split into an upper 16 bits
>> counting executions
>> >       and a lower half counting untaken executions of the branch.
>> Executing the branch
>> >       decrements the upper half, tripping if the count goes negative.
>> Not taking the branch
>> >       decrements the lower half.  N.B. We *do not* eliminate dead
>> branches (true ifTrue:/true ifFalse:)
>> >       so that scanning for send and branch data is simplified and that
>> branch data is correct."
>> >      <inline: false>
>> >      | desc ok counterAddress countTripped retry counterReg |
>> >      <var: #ok type: #'AbstractInstruction *'>
>> >      <var: #desc type: #'CogSimStackEntry *'>
>> >      <var: #retry type: #'AbstractInstruction *'>
>> >      <var: #countTripped type: #'AbstractInstruction *'>
>> >
>> >      (coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super
>> genJumpIf: boolean to: targetBytecodePC ].
>> >
>> >      self ssFlushTo: simStackPtr - 1.
>> >      desc := self ssTop.
>> >      self ssPop: 1.
>> >      desc popToReg: TempReg.
>> >
>> >      "We prefer calleeSaved to avoid saving it across the trap trip
>> trampoline"
>> >      counterReg := self
>> allocateRegPreferringCalleeSavedNotConflictingWith: 0.
>> >      retry := self Label.
>> >      self
>> >          genExecutionCountLogicInto: [ :cAddress :countTripBranch |
>> >              counterAddress := cAddress.
>> >              countTripped := countTripBranch ]
>> >          counterReg: counterReg.
>> >      counterIndex := counterIndex + 1.
>> >
>> >      "Cunning trick by LPD.  If true and false are contiguous subtract
>> the smaller.
>> >       Correct result is either 0 or the distance between them.  If
>> result is not 0 or
>> >       their distance send mustBeBoolean."
>> >      self assert: (objectMemory objectAfter: objectMemory falseObject)
>> = objectMemory trueObject.
>> >      self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
>> >      self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
>> >
>> >      self genFallsThroughCountLogicCounterReg: counterReg
>> counterAddress: counterAddress.
>> >
>> >      self CmpCq: (boolean == objectMemory falseObject
>> >                      ifTrue: [objectMemory trueObject - objectMemory
>> falseObject]
>> >                      ifFalse: [objectMemory falseObject - objectMemory
>> trueObject])
>> >          R: TempReg.
>> >      ok := self JumpZero: 0.
>> >      self MoveCq: 0 R: counterReg. "if counterReg is 0 this is a
>> mustBeBoolean, not a counter trip."
>> >
>> > -    self flag: 'Hi Clément.  You can''t just save things to the
>> Smalltalk stack.  You can /only/ save things that execution expects to be
>> there on a context''s stack, because this frame may get mapped to a context
>> object and then back, and gc''ed etc.  The counter reg does not contain an
>> object so is a complete no-no on the Smalltalk stack.  On the C stack in
>> the trampoline is OK, not on the Smalltalk stack in method execution.  So
>> instead of saving and restoring the counterReg around the call, something
>> we can''t do, we can reload it after the call'.
>> > -    false ifTrue:
>> > -        ["If counterReg is caller saved then save it"
>> > -        (self register: counterReg isInMask: callerSavedRegMask)
>> ifTrue: [ self PushR: counterReg ]].
>> > -
>> >      countTripped jmpTarget:
>> >          (self CallRT: (boolean == objectMemory falseObject
>> >                          ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
>> >                          ifFalse:
>> [ceSendMustBeBooleanAddTrueTrampoline])).
>> >
>> >      "If we're in an image which hasn't got the Sista code loaded then
>> the ceCounterTripped:
>> >       trampoline will return directly to machine code, returning the
>> boolean.  So the code should
>> >       jump back to the retry point. The trampoline makes sure that
>> TempReg has been reloaded."
>> >      self annotateBytecode: self Label.
>> >
>> > -    self flag: 'see above'.
>> > -    false ifTrue:
>> > -        ["If counterReg is caller saved then restore it"
>> > -        (self register: counterReg isInMask: callerSavedRegMask)
>> ifTrue: [ self PopR: counterReg ]].
>> > -
>> > -    "Note we /can't/ save and restore the counterReg's contents around
>> the call since the stack can
>> > -     only contain what an interpreted context's stack would contain at
>> the corresponding point.  The
>> > -     counter is not an object, so can't be written to the stack. Hence
>> we reload it after the call."
>> > -    self MoveAw: counterAddress R: counterReg.
>> > -
>> >      self Jump: retry.
>> > +
>> >      ok jmpTarget: self Label.
>> >      ^0!
>> >
>>
>
>
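
The boolean test in the genJumpIf: diff quoted above (the "cunning trick by
LPD") can be sketched in Python as follows. It assumes, as the method asserts,
that false and true are adjacent objects, and it reuses the two oops that
appear in the disassembly. The function name and the string results are
illustrative only.

```python
FALSE_OOP = 0x00100008  # oop of false in the disassembly above
TRUE_OOP = 0x00100010   # oop of true, the object right after false


def jump_if(boolean_oop, value_oop):
    """Model of the test generated by genJumpIf: boolean to: target.

    One subtraction serves both comparisons: a result of 0 means the value
    is `boolean_oop` (take the jump); a result equal to the distance to the
    other boolean means fall through; anything else is not a boolean.
    """
    other_oop = TRUE_OOP if boolean_oop == FALSE_OOP else FALSE_OOP
    difference = value_oop - boolean_oop       # SubCw: boolean R: TempReg
    if difference == 0:                        # JumpZero: target
        return "jump"
    if difference == other_oop - boolean_oop:  # CmpCq: ... R: TempReg
        return "fall through"
    return "mustBeBoolean"
```

For example, `jump_if(FALSE_OOP, 0x3)` models testing the tagged integer 1 as
a condition, which ends up in the mustBeBoolean trampoline.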