[Vm-dev] VM Maker: VMMaker.oscog-cb.1284.mcz
commits at source.squeak.org
commits at source.squeak.org
Tue May 5 11:59:26 UTC 2015
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1284.mcz
==================== Summary ====================
Name: VMMaker.oscog-cb.1284
Author: cb
Time: 5 May 2015, 1:58:03.991 pm
UUID: 37a41c7d-771e-4cee-9b78-94dc22070682
Ancestors: VMMaker.oscog-cb.1283
Reduced by 24 bytes the instructions generated by #== in SistaCogit in most cases.
=============== Diff against VMMaker.oscog-cb.1283 ===============
Item was changed:
StackToRegisterMappingCogit subclass: #SistaStackToRegisterMappingCogit
+ instanceVariableNames: 'picDataIndex picData numCounters counters counterIndex initialCounterValue ceTrapTrampoline branchReachedOnlyForCounterTrip'
- instanceVariableNames: 'picDataIndex picData numCounters counters counterIndex initialCounterValue ceTrapTrampoline'
classVariableNames: 'CounterBytes MaxCounterValue'
poolDictionaries: 'VMSqueakClassIndices'
category: 'VMMaker-JIT'!
!SistaStackToRegisterMappingCogit commentStamp: 'eem 4/7/2014 12:23' prior: 0!
A SistaStackToRegisterMappingCogit is a refinement of StackToRegisterMappingCogit that generates code suitable for dynamic optimization by Sista, the Speculative Inlining Smalltalk Architecture, a project by Clément Bera and Eliot Miranda. Sista is an optimizer that exists in the Smalltalk image, /not/ in the VM, and optimizes by substituting normal bytecoded methods by optimized bytecoded methods that may use special bytecodes for which the Cogit can generate faster code. These bytecodes eliminate overheads such as bounds checks or polymorphic code (indexing Array, ByteArray, String etc). But the bulk of the optimization performed is in inlining blocks and sends for the common path.
The basic scheme is that SistaStackToRegisterMappingCogit generates code containing performance counters. When these counters trip, a callback into the image is performed, at which point Sista analyses some portion of the stack, looking at performance data for the methods on the stack, and optimises based on the stack and performance data. Execution then resumes in the optimized code.
SistaStackToRegisterMappingCogit adds counters to conditional branches. Each branch has an executed and a taken count, implemented at the two 16-bit halves of a single 32-bit word. Each counter pair is initialized with initialCounterValue. On entry to the branch the executed count is decremented and if the count goes below zero the ceMustBeBooleanAdd[True|False] trampoline called. The trampoline distinguishes between true mustBeBoolean and counter trips because in the former the register temporarily holding the counter value will contain zero. Then the condition is tested, and if the branch is taken the taken count is decremented. The two counter values allow an optimizer to collect basic block execution paths and to know what are the "hot" paths through execution that are worth agressively optimizing. Since conditional branches are about 1/6 as frequent as sends, and since they can be used to determine the hot path through code, they are a better choice to count than, for example, method or block entry.
SistaStackToRegisterMappingCogit implements picDataFor:into: that fills an Array with the state of the counters in a method and the state of each linked send in a method. This is used to implement a primitive used by the optimizer to answer the branch and send data for a method as an Array.
Instance Variables
counterIndex: <Integer>
counterMethodCache: <CogMethod>
counters: <Array of AbstractInstruction>
initialCounterValue: <Integer>
numCounters: <Integer>
picData: <Integer Oop>
picDataIndex: <Integer>
prevMapAbsPCMcpc: <Integer>
counterIndex
- xxxxx
counterMethodCache
- xxxxx
counters
- xxxxx
initialCounterValue
- xxxxx
numCounters
- xxxxx
picData
- xxxxx
picDataIndex
- xxxxx
prevMapAbsPCMcpc
- xxxxx
!
Item was added:
+ ----- Method: SistaStackToRegisterMappingCogit>>genCounterTripOnlyJumpIf:to: (in category 'bytecode generator support') -----
+ genCounterTripOnlyJumpIf: boolean to: targetBytecodePC
+ "Specific version if the branch is only reached while falling through if the counter trips after an inlined #== branch. We do not regenerate the counter logic in this case to avoid 24 bytes instructions."
+
+ <var: #ok type: #'AbstractInstruction *'>
+ <var: #mustBeBooleanTrampoline type: #'AbstractInstruction *'>
+
+ | ok mustBeBooleanTrampoline |
+
+ self ssFlushTo: simStackPtr - 1.
+
+ self ssTop popToReg: TempReg.
+
+ self ssPop: 1.
+
+ counterIndex := counterIndex + 1. "counters are increased / decreased in the inlined branch"
+
+ "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
+ self ssAllocateRequiredReg: SendNumArgsReg.
+ self MoveCq: 1 R: SendNumArgsReg.
+
+ "The first time this is reached, it calls necessarily the counter trip for the trampoline because SendNumArgsReg is non zero"
+ mustBeBooleanTrampoline := self CallRT: (boolean == objectMemory falseObject
+ ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
+ ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
+
+ self annotateBytecode: self Label.
+
+ "Cunning trick by LPD. If true and false are contiguous subtract the smaller.
+ Correct result is either 0 or the distance between them. If result is not 0 or
+ their distance send mustBeBoolean."
+ self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
+ self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
+ self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
+
+ self CmpCq: (boolean == objectMemory falseObject
+ ifTrue: [objectMemory trueObject - objectMemory falseObject]
+ ifFalse: [objectMemory falseObject - objectMemory trueObject])
+ R: TempReg.
+
+ ok := self JumpZero: 0.
+ self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+
+ self Jump: mustBeBooleanTrampoline.
+
+ ok jmpTarget: self Label.
+ ^0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
genJumpIf: boolean to: targetBytecodePC
"The heart of performance counting in Sista. Conditional branches are 6 times less
frequent than sends and can provide basic block frequencies (send counters can't).
Each conditional has a 32-bit counter split into an upper 16 bits counting executions
and a lower half counting untaken executions of the branch. Executing the branch
decrements the upper half, tripping if the count goes negative. Not taking the branch
decrements the lower half. N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
so that scanning for send and branch data is simplified and that branch data is correct."
<inline: false>
+ | ok counterAddress countTripped retry |
- | desc ok counterAddress countTripped retry |
<var: #ok type: #'AbstractInstruction *'>
- <var: #desc type: #'CogSimStackEntry *'>
<var: #retry type: #'AbstractInstruction *'>
<var: #countTripped type: #'AbstractInstruction *'>
(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
+
+ branchReachedOnlyForCounterTrip ifTrue:
+ [ branchReachedOnlyForCounterTrip := false.
+ ^ self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC ].
self ssFlushTo: simStackPtr - 1.
+ self ssTop popToReg: TempReg.
- desc := self ssTop.
self ssPop: 1.
- desc popToReg: TempReg.
"We need SendNumArgsReg because of the mustBeBooleanTrampoline"
self ssAllocateRequiredReg: SendNumArgsReg.
retry := self Label.
self
genExecutionCountLogicInto: [ :cAddress :countTripBranch |
counterAddress := cAddress.
countTripped := countTripBranch ]
counterReg: SendNumArgsReg.
counterIndex := counterIndex + 1.
"Cunning trick by LPD. If true and false are contiguous subtract the smaller.
Correct result is either 0 or the distance between them. If result is not 0 or
their distance send mustBeBoolean."
self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
self CmpCq: (boolean == objectMemory falseObject
ifTrue: [objectMemory trueObject - objectMemory falseObject]
ifFalse: [objectMemory falseObject - objectMemory trueObject])
R: TempReg.
ok := self JumpZero: 0.
self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+
-
countTripped jmpTarget:
(self CallRT: (boolean == objectMemory falseObject
ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])).
"If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
trampoline will return directly to machine code, returning the boolean. So the code should
jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
self annotateBytecode: self Label.
self Jump: retry.
ok jmpTarget: self Label.
^0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>genSpecialSelectorEqualsEqualsWithForwarders (in category 'bytecode generators') -----
genSpecialSelectorEqualsEqualsWithForwarders
"Override to count inlined branches if followed by a conditional branch.
We borrow the following conditional branch's counter and when about to
inline the comparison we decrement the counter (without writing it back)
and if it trips simply abort the inlining, falling back to the normal send which
will then continue to the conditional branch which will trip and enter the abort."
| nextPC postBranchPC targetBytecodePC branchDescriptor label counterReg fixup
counterAddress countTripped unforwardArg unforwardRcvr argReg rcvrReg regMask |
<var: #fixup type: #'BytecodeFixup *'>
<var: #countTripped type: #'AbstractInstruction *'>
<var: #label type: #'AbstractInstruction *'>
<var: #primDescriptor type: #'BytecodeDescriptor *'>
<var: #branchDescriptor type: #'BytecodeDescriptor *'>
((coInterpreter isOptimizedMethod: methodObj) or: [needsFrame not]) ifTrue: [ ^ self genSpecialSelectorEqualsEqualsWithForwardersWithoutCounters ].
regMask := 0.
self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
unforwardRcvr := (objectRepresentation isUnannotatableConstant: (self ssValue: 1)) not.
unforwardArg := (objectRepresentation isUnannotatableConstant: self ssTop) not.
"If an operand is an annotable constant, it may be forwarded, so we need to store it into a
register so the forwarder check can jump back to the comparison after unforwarding the constant.
However, if one of the operand is an unnanotable constant, does not allocate a register for it
(machine code will use operations on constants)."
self
allocateEqualsEqualsRegistersArgNeedsReg: unforwardArg
rcvrNeedsReg: unforwardRcvr
into: [ :rcvr :arg | rcvrReg:= rcvr. argReg := arg ].
argReg ifNotNil: [ regMask := self registerMaskFor: argReg ].
rcvrReg ifNotNil: [ regMask := regMask bitOr: (self registerMaskFor: rcvrReg) ].
"Only interested in inlining if followed by a conditional branch."
(branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse]) ifFalse:
[^ self genEqualsEqualsNoBranchArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg].
counterReg := self allocateRegNotConflictingWith: regMask.
self
genExecutionCountLogicInto: [ :cAddress :countTripBranch |
counterAddress := cAddress.
countTripped := countTripBranch ]
counterReg: counterReg.
self assert: (unforwardArg or: [ unforwardRcvr ]).
"If branching the stack must be flushed for the merge"
self ssFlushTo: simStackPtr - 2.
label := self Label.
self genEqualsEqualsComparisonArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
self ssPop: 2. "pop by 2 temporarily for the fixups"
branchDescriptor isBranchTrue
ifTrue:
[ fixup := self ensureNonMergeFixupAt: postBranchPC - initialPC.
self JumpZero: (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger. ]
ifFalse:
[ fixup := self ensureNonMergeFixupAt: targetBytecodePC - initialPC.
self JumpZero: (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger. ].
unforwardArg ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: argReg scratchReg: TempReg jumpBackTo: label ].
unforwardRcvr ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: rcvrReg scratchReg: TempReg jumpBackTo: label ].
self ssPop: -2.
self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
self Jump: fixup.
countTripped jmpTarget: self Label.
"inlined version of #== ignoring the branchDescriptor if the counter trips to have normal state for the optimizer"
self genEqualsEqualsNoBranchArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
+
+ (self fixupAt: nextPC - initialPC) targetInstruction = 0 ifTrue: [ branchReachedOnlyForCounterTrip := true ].
+
^ 0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>initialize (in category 'initialization') -----
initialize
super initialize.
+ branchReachedOnlyForCounterTrip := false.
cogMethodSurrogateClass := (objectMemory ifNil: [self class objectMemoryClass]) wordSize = 4
ifTrue: [CogSistaMethodSurrogate32]
ifFalse: [CogSistaMethodSurrogate64]!
More information about the Vm-dev
mailing list