... at http://www.mirandabanda.org/files/Cog/VM/VM.r3319.
CogVM binaries as per VMMaker.oscog-eem.1288/r3319
Spur:
Implement remembered table pruning via ref counts. The algorithm selectively
tenures objects to reduce the remembered table, instead of merely tenuring
everything.
Be selective about remembering tenured objects; actually scan their contents
before remembering willy-nilly.
StackVM:
Allow setting preemptionYields in the stack VM.
Cogit:
Use new register allocation in #== with V3 in order to limit register moves.
--
best,
Eliot
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.1291.mcz
==================== Summary ====================
Name: VMMaker.oscog-eem.1291
Author: eem
Time: 6 May 2015, 10:41:54.305 am
UUID: 522d9b4a-a938-488b-ae43-6b4ccdd13d5c
Ancestors: VMMaker.oscog-cb.1290
Change computeRefCountToShrinkRT to
- compute the ref counts and population in a single
pass over the RT
- determine the ref count for tenuring based on
half the population of remembered objects, /not/
half the size of the RT.
=============== Diff against VMMaker.oscog-cb.1290 ===============
Item was changed:
----- Method: SpurGenerationScavenger>>computeRefCountToShrinkRT (in category 'remembered set') -----
computeRefCountToShrinkRT
"Some time in every scavenger's life there may come a time when someone writes code that stresses
the remembered table. One might conclude that if the remembered table is full, then the right thing
to do is simply to tenure everything, emptying the remembered table. But in some circumstances this
can be counter-productive, and result in the same situation arising soon after tenuring everything.
Instead, we can try and selectively prune the remembered table, tenuring only those objects that
are referenced by many objects in the remembered table. That's what this algorithm does. It
reference counts young objects referenced from the remembered set, and then sets a threshold
used to tenure objects oft referenced from the remembered set, thereby allowing the remembered
set to shrink, while not tenuring everything.
Once in a network monitoring application in a galaxy not dissimilar from the one this code inhabits,
a tree of nodes referring to large integers was in precisely this situation. The nodes were old, and
the integers were in new space. Some of the nodes referred to shared numbers, some their own
unique numbers. The numbers were updated frequently. Were new space simply tenured when the
remembered table was full, the remembered table would soon fill up as new numbers were computed.
Only by selectively pruning the remembered table of nodes that shared data, was a balance achieved
whereby the remembered table population was kept small, and tenuring rates were low."
<inline: #never>
+ | population |
+ <var: 'population' declareC: 'long population[MaxRTRefCount + 1]'>
+ self cCode: [self me: population ms: 0 et: (self sizeof: #long) * (MaxRTRefCount + 1)]
+ inSmalltalk: [population := CArrayAccessor on: (Array new: MaxRTRefCount + 1 withAll: 0)].
- | counts |
- <var: 'counts' declareC: 'long counts[MaxRTRefCount + 1]'>
- self cCode: '' inSmalltalk: [counts := CArrayAccessor on: (Array new: MaxRTRefCount + 1)].
self assert: self allNewSpaceObjectsHaveZeroRTRefCount.
+ self referenceCountRememberedReferents: population.
+ self setRefCountToShrinkRT: population
+
+ "For debugging:
+ (manager allNewSpaceObjectsDo: [:o| manager rtRefCountOf: o put: 0])"!
- self referenceCountRememberedReferents.
- self totalRememberedReferentCounts: counts.
- self setRefCountToShrinkRT: counts!
Item was removed:
- ----- Method: SpurGenerationScavenger>>referenceCountRememberedReferents (in category 'remembered set') -----
- referenceCountRememberedReferents
- "Reference count each referent of the remembered table using the rtRefCount
- field comprised of isGrey,isPinned,isRemembered. i.e. produce a reference
- count from 1 to 7 in all objects accessible from the RT."
- <inline: true>
- 0 to: rememberedSetSize - 1 do:
- [:i| | elephant |
- elephant := rememberedSet at: i.
- 0 to: (manager numPointerSlotsOf: elephant) - 1 do:
- [:j| | referent refCount |
- referent := manager fetchPointer: j ofObject: elephant.
- (manager isReallyYoung: referent) ifTrue:
- [refCount := manager rtRefCountOf: referent.
- (refCount := refCount + 1) <= MaxRTRefCount ifTrue:
- [manager rtRefCountOf: referent put: refCount]]]]!
Item was added:
+ ----- Method: SpurGenerationScavenger>>referenceCountRememberedReferents: (in category 'remembered set') -----
+ referenceCountRememberedReferents: population
+ "Both reference count young objects reachable from the RT,
+ and count the populations of each ref count, in a single pass."
+ <var: 'population' declareC: 'long population[MaxRTRefCount + 1]'>
+ <inline: true>
+ 0 to: rememberedSetSize - 1 do:
+ [:i| | elephant |
+ elephant := rememberedSet at: i.
+ 0 to: (manager numPointerSlotsOf: elephant) - 1 do:
+ [:j| | referent refCount |
+ referent := manager fetchPointer: j ofObject: elephant.
+ (manager isReallyYoung: referent) ifTrue:
+ [refCount := manager rtRefCountOf: referent.
+ refCount < MaxRTRefCount ifTrue:
+ [refCount > 0 ifTrue:
+ [population at: refCount put: (population at: refCount) - 1].
+ refCount := refCount + 1.
+ manager rtRefCountOf: referent put: refCount.
+ population at: refCount put: (population at: refCount) + 1]]]].!
Item was changed:
----- Method: SpurGenerationScavenger>>setRefCountToShrinkRT: (in category 'remembered set') -----
+ setRefCountToShrinkRT: population
+ "Choose a refCount that should shrink the rt by at least half.
+ i.e. find the maximum reference count that half the population have at least."
+ <var: 'population' declareC: 'long population[MaxRTRefCount + 1]'>
- setRefCountToShrinkRT: counts
- "Choose a refCount that will shrink the rt by at least half."
- <var: 'counts' declareC: 'long counts[MaxRTRefCount + 1]'>
<inline: true>
+ | entirePopulation i count |
+ self assert: (population at: 0) = 0.
+ entirePopulation := 0.
+ 1 to: MaxRTRefCount do:
+ [:j| entirePopulation := entirePopulation + (population at: j)].
- | i count |
count := 0.
i := MaxRTRefCount + 1.
+ [count < (entirePopulation // 2) and: [(i := i - 1) >= 0]] whileTrue:
+ [count := count + (population at: i)].
- [count < (rememberedSetSize // 2) and: [(i := i - 1) >= 0]] whileTrue:
- [count := count + (counts at: i)].
refCountToShrinkRT := i max: 0!
Item was removed:
- ----- Method: SpurGenerationScavenger>>totalRememberedReferentCounts: (in category 'remembered set') -----
- totalRememberedReferentCounts: counts
- <var: 'counts' declareC: 'long counts[MaxRTRefCount + 1]'>
- <inline: true>
- 0 to: MaxRTRefCount do:
- [:i| counts at: i put: 0].
- 0 to: rememberedSetSize - 1 do:
- [:i| | elephant |
- elephant := rememberedSet at: i.
- 0 to: (manager numPointerSlotsOf: elephant) - 1 do:
- [:j| | referent refCount |
- referent := manager fetchPointer: j ofObject: elephant.
- (manager isReallyYoung: referent) ifTrue:
- [refCount := manager rtRefCountOf: referent.
- counts at: refCount put: (counts at: refCount) + 1]]]!
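The single-pass counting and the threshold choice described in the 1291 commit can be modelled outside the VM. The following is only a sketch in Python, not the Slang code: objects are plain strings, `young` is a hypothetical set standing in for `isReallyYoung:`, and `MAX_RT_REF_COUNT = 7` is an assumption based on the removed comment's "reference count from 1 to 7".

```python
MAX_RT_REF_COUNT = 7  # assumption: counts saturate at 7 (a 3-bit field)


def count_referents(remembered_set, young):
    """One pass over the remembered set: bump each young referent's
    saturating ref count and keep population[n] = number of young
    objects whose current count is n."""
    ref_counts = {}
    population = [0] * (MAX_RT_REF_COUNT + 1)
    for elephant in remembered_set:
        for referent in elephant:
            if referent not in young:
                continue
            count = ref_counts.get(referent, 0)
            if count < MAX_RT_REF_COUNT:
                if count > 0:
                    # The object leaves its old bucket before joining the next.
                    population[count] -= 1
                count += 1
                ref_counts[referent] = count
                population[count] += 1
    return ref_counts, population


def ref_count_to_shrink_rt(population):
    """Walk down from the highest count until at least half of the
    counted young objects have been covered; answer that count."""
    assert population[0] == 0
    entire_population = sum(population[1:])
    count = 0
    i = MAX_RT_REF_COUNT + 1
    while count < entire_population // 2:
        i -= 1
        if i < 0:
            break
        count += population[i]
    return max(i, 0)


# Four old nodes; "a" is shared by all of them, the rest are unique.
nodes = [["a", "b"], ["a", "c"], ["a", "d"], ["a", "old"]]
ref_counts, population = count_referents(nodes, young={"a", "b", "c", "d"})
threshold = ref_count_to_shrink_rt(population)
```

Because the histogram is indexed by ref count and summed over young objects, the half-way point is measured against the population of referents, not against the size of the remembered table, which is the change the commit message describes.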
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1290.mcz
==================== Summary ====================
Name: VMMaker.oscog-cb.1290
Author: cb
Time: 6 May 2015, 11:05:39.237 am
UUID: bfb69f82-a7f2-42e8-b375-32586dc89270
Ancestors: VMMaker.oscog-cb.1289
Removed the storeCheck in inlined pointer at:put: if the value stored is an unannotatable constant. I believe this is important in some cases I found where an array was initialized with 0 or false in all its slots.
=============== Diff against VMMaker.oscog-cb.1289 ===============
Item was changed:
----- Method: StackToRegisterMappingCogit>>genTrinaryInlinePrimitive: (in category 'inline primitive generators') -----
genTrinaryInlinePrimitive: prim
"Trinary inline primitives."
"SistaV1: 248 11111000 iiiiiiii mjjjjjjj Call Primitive #iiiiiiii + (jjjjjjj * 256) m=1 means inlined primitive, no hard return after execution.
See EncoderForSistaV1's class comment and StackInterpreter>>#trinaryInlinePrimitive:"
+ | ra1 ra2 rr adjust needsStoreCheck |
- | ra1 ra2 rr adjust |
"The store check requires rr to be ReceiverResultReg"
self
allocateRegForStackTopThreeEntriesInto: [:rTop :rNext :rThird | ra2 := rTop. ra1 := rNext. rr := rThird ]
thirdIsReceiver: prim = 0.
self assert: (rr ~= ra1 and: [rr ~= ra2 and: [ra1 ~= ra2]]).
+ needsStoreCheck := (objectRepresentation isUnannotatableConstant: self ssTop) not.
self ssTop popToReg: ra2.
self ssPop: 1.
self ssTop popToReg: ra1.
self ssPop: 1.
self ssTop popToReg: rr.
self ssPop: 1.
objectRepresentation genConvertSmallIntegerToIntegerInReg: ra1.
"Now: ra is the variable object, rr is long, TempReg holds the value to store."
prim caseOf: {
"0 - 1 pointerAt:put: and byteAt:Put:"
[0] -> [ adjust := (objectMemory baseHeaderSize >> objectMemory shiftForWord) - 1. "shift by baseHeaderSize and then move from 1 relative to zero relative"
adjust ~= 0 ifTrue: [ self AddCq: adjust R: ra1. ].
self MoveR: ra2 Xwr: ra1 R: rr.
+ "I added needsStoreCheck so if you initialize an array with a Smi such as 0 or a boolean you don't need the store check"
+ needsStoreCheck ifTrue:
+ [ self assert: needsFrame.
+ objectRepresentation genStoreCheckReceiverReg: rr valueReg: ra2 scratchReg: TempReg inFrame: true] ].
- objectRepresentation genStoreCheckReceiverReg: rr valueReg: ra2 scratchReg: TempReg inFrame: true].
[1] -> [ objectRepresentation genConvertSmallIntegerToIntegerInReg: ra2.
adjust := objectMemory baseHeaderSize - 1. "shift by baseHeaderSize and then move from 1 relative to zero relative"
self AddCq: adjust R: ra1.
self MoveR: ra2 Xbr: ra1 R: rr.
objectRepresentation genConvertIntegerToSmallIntegerInReg: ra2. ]
}
otherwise: [^EncounteredUnknownBytecode].
self ssPushRegister: ra2.
^0!
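The effect of the 1290 change can be illustrated with a toy code generator. This is a hedged sketch in Python, not the Cogit: `StackEntry`, the instruction strings, and the rule used in `is_unannotatable_constant` (small integers and booleans known at compile time) are all assumptions made for illustration, chosen to match the commit message's examples of 0 and false.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackEntry:
    """Hypothetical stand-in for a simulation stack entry."""
    is_constant: bool
    value: object = None


def is_unannotatable_constant(entry):
    # Assumption: a compile-time small integer or boolean can never be a
    # young object, so storing it cannot create an old-to-young reference.
    return entry.is_constant and isinstance(entry.value, (bool, int))


def gen_pointer_at_put(value_entry):
    """Emit a pretend instruction list for an inlined pointer at:put:."""
    # Decide before the value is popped into a register, as the diff does.
    needs_store_check = not is_unannotatable_constant(value_entry)
    code = ["store value into receiver[index]"]
    if needs_store_check:
        code.append("store check receiver, value")
    return code


filling_with_zero = gen_pointer_at_put(StackEntry(is_constant=True, value=0))
storing_unknown = gen_pointer_at_put(StackEntry(is_constant=False))
```

In this model, initializing every slot with a constant emits only the store, while a value of unknown origin still gets the store check.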
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1284.mcz
==================== Summary ====================
Name: VMMaker.oscog-cb.1284
Author: cb
Time: 5 May 2015, 1:58:03.991 pm
UUID: 37a41c7d-771e-4cee-9b78-94dc22070682
Ancestors: VMMaker.oscog-cb.1283
Reduced the machine code generated for #== in SistaCogit by 24 bytes in most cases.
=============== Diff against VMMaker.oscog-cb.1283 ===============
Item was changed:
StackToRegisterMappingCogit subclass: #SistaStackToRegisterMappingCogit
+ instanceVariableNames: 'picDataIndex picData numCounters counters counterIndex initialCounterValue ceTrapTrampoline branchReachedOnlyForCounterTrip'
- instanceVariableNames: 'picDataIndex picData numCounters counters counterIndex initialCounterValue ceTrapTrampoline'
classVariableNames: 'CounterBytes MaxCounterValue'
poolDictionaries: 'VMSqueakClassIndices'
category: 'VMMaker-JIT'!
!SistaStackToRegisterMappingCogit commentStamp: 'eem 4/7/2014 12:23' prior: 0!
A SistaStackToRegisterMappingCogit is a refinement of StackToRegisterMappingCogit that generates code suitable for dynamic optimization by Sista, the Speculative Inlining Smalltalk Architecture, a project by Clément Bera and Eliot Miranda. Sista is an optimizer that exists in the Smalltalk image, /not/ in the VM, and optimizes by substituting normal bytecoded methods by optimized bytecoded methods that may use special bytecodes for which the Cogit can generate faster code. These bytecodes eliminate overheads such as bounds checks or polymorphic code (indexing Array, ByteArray, String etc). But the bulk of the optimization performed is in inlining blocks and sends for the common path.
The basic scheme is that SistaStackToRegisterMappingCogit generates code containing performance counters. When these counters trip, a callback into the image is performed, at which point Sista analyses some portion of the stack, looking at performance data for the methods on the stack, and optimises based on the stack and performance data. Execution then resumes in the optimized code.
SistaStackToRegisterMappingCogit adds counters to conditional branches. Each branch has an executed and a taken count, implemented as the two 16-bit halves of a single 32-bit word. Each counter pair is initialized with initialCounterValue. On entry to the branch the executed count is decremented and if the count goes below zero the ceMustBeBooleanAdd[True|False] trampoline is called. The trampoline distinguishes between true mustBeBoolean and counter trips because in the former the register temporarily holding the counter value will contain zero. Then the condition is tested, and if the branch is taken the taken count is decremented. The two counter values allow an optimizer to collect basic block execution paths and to know what are the "hot" paths through execution that are worth aggressively optimizing. Since conditional branches are about 1/6 as frequent as sends, and since they can be used to determine the hot path through code, they are a better choice to count than, for example, method or block entry.
SistaStackToRegisterMappingCogit implements picDataFor:into: that fills an Array with the state of the counters in a method and the state of each linked send in a method. This is used to implement a primitive used by the optimizer to answer the branch and send data for a method as an Array.
Instance Variables
counterIndex: <Integer>
counterMethodCache: <CogMethod>
counters: <Array of AbstractInstruction>
initialCounterValue: <Integer>
numCounters: <Integer>
picData: <Integer Oop>
picDataIndex: <Integer>
prevMapAbsPCMcpc: <Integer>
counterIndex
- xxxxx
counterMethodCache
- xxxxx
counters
- xxxxx
initialCounterValue
- xxxxx
numCounters
- xxxxx
picData
- xxxxx
picDataIndex
- xxxxx
prevMapAbsPCMcpc
- xxxxx
!
Item was added:
+ ----- Method: SistaStackToRegisterMappingCogit>>genCounterTripOnlyJumpIf:to: (in category 'bytecode generator support') -----
+ genCounterTripOnlyJumpIf: boolean to: targetBytecodePC
+ "Specific version if the branch is only reached while falling through if the counter trips after an inlined #== branch. We do not regenerate the counter logic in this case to avoid 24 bytes of instructions."
+
+ <var: #ok type: #'AbstractInstruction *'>
+ <var: #mustBeBooleanTrampoline type: #'AbstractInstruction *'>
+
+ | ok mustBeBooleanTrampoline |
+
+ self ssFlushTo: simStackPtr - 1.
+
+ self ssTop popToReg: TempReg.
+
+ self ssPop: 1.
+
+ counterIndex := counterIndex + 1. "counters are increased / decreased in the inlined branch"
+
+ "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
+ self ssAllocateRequiredReg: SendNumArgsReg.
+ self MoveCq: 1 R: SendNumArgsReg.
+
+ "The first time this is reached, it necessarily calls the trampoline for the counter trip because SendNumArgsReg is non-zero"
+ mustBeBooleanTrampoline := self CallRT: (boolean == objectMemory falseObject
+ ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
+ ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
+
+ self annotateBytecode: self Label.
+
+ "Cunning trick by LPD. If true and false are contiguous subtract the smaller.
+ Correct result is either 0 or the distance between them. If result is not 0 or
+ their distance send mustBeBoolean."
+ self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
+ self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
+ self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
+
+ self CmpCq: (boolean == objectMemory falseObject
+ ifTrue: [objectMemory trueObject - objectMemory falseObject]
+ ifFalse: [objectMemory falseObject - objectMemory trueObject])
+ R: TempReg.
+
+ ok := self JumpZero: 0.
+ self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+
+ self Jump: mustBeBooleanTrampoline.
+
+ ok jmpTarget: self Label.
+ ^0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
genJumpIf: boolean to: targetBytecodePC
"The heart of performance counting in Sista. Conditional branches are 6 times less
frequent than sends and can provide basic block frequencies (send counters can't).
Each conditional has a 32-bit counter split into an upper 16 bits counting executions
and a lower half counting untaken executions of the branch. Executing the branch
decrements the upper half, tripping if the count goes negative. Not taking the branch
decrements the lower half. N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
so that scanning for send and branch data is simplified and that branch data is correct."
<inline: false>
+ | ok counterAddress countTripped retry |
- | desc ok counterAddress countTripped retry |
<var: #ok type: #'AbstractInstruction *'>
- <var: #desc type: #'CogSimStackEntry *'>
<var: #retry type: #'AbstractInstruction *'>
<var: #countTripped type: #'AbstractInstruction *'>
(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
+
+ branchReachedOnlyForCounterTrip ifTrue:
+ [ branchReachedOnlyForCounterTrip := false.
+ ^ self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC ].
self ssFlushTo: simStackPtr - 1.
+ self ssTop popToReg: TempReg.
- desc := self ssTop.
self ssPop: 1.
- desc popToReg: TempReg.
"We need SendNumArgsReg because of the mustBeBooleanTrampoline"
self ssAllocateRequiredReg: SendNumArgsReg.
retry := self Label.
self
genExecutionCountLogicInto: [ :cAddress :countTripBranch |
counterAddress := cAddress.
countTripped := countTripBranch ]
counterReg: SendNumArgsReg.
counterIndex := counterIndex + 1.
"Cunning trick by LPD. If true and false are contiguous subtract the smaller.
Correct result is either 0 or the distance between them. If result is not 0 or
their distance send mustBeBoolean."
self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
self CmpCq: (boolean == objectMemory falseObject
ifTrue: [objectMemory trueObject - objectMemory falseObject]
ifFalse: [objectMemory falseObject - objectMemory trueObject])
R: TempReg.
ok := self JumpZero: 0.
self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
+
-
countTripped jmpTarget:
(self CallRT: (boolean == objectMemory falseObject
ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])).
"If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
trampoline will return directly to machine code, returning the boolean. So the code should
jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
self annotateBytecode: self Label.
self Jump: retry.
ok jmpTarget: self Label.
^0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>genSpecialSelectorEqualsEqualsWithForwarders (in category 'bytecode generators') -----
genSpecialSelectorEqualsEqualsWithForwarders
"Override to count inlined branches if followed by a conditional branch.
We borrow the following conditional branch's counter and when about to
inline the comparison we decrement the counter (without writing it back)
and if it trips simply abort the inlining, falling back to the normal send which
will then continue to the conditional branch which will trip and enter the abort."
| nextPC postBranchPC targetBytecodePC branchDescriptor label counterReg fixup
counterAddress countTripped unforwardArg unforwardRcvr argReg rcvrReg regMask |
<var: #fixup type: #'BytecodeFixup *'>
<var: #countTripped type: #'AbstractInstruction *'>
<var: #label type: #'AbstractInstruction *'>
<var: #primDescriptor type: #'BytecodeDescriptor *'>
<var: #branchDescriptor type: #'BytecodeDescriptor *'>
((coInterpreter isOptimizedMethod: methodObj) or: [needsFrame not]) ifTrue: [ ^ self genSpecialSelectorEqualsEqualsWithForwardersWithoutCounters ].
regMask := 0.
self extractMaybeBranchDescriptorInto: [ :descr :next :postBranch :target |
branchDescriptor := descr. nextPC := next. postBranchPC := postBranch. targetBytecodePC := target ].
unforwardRcvr := (objectRepresentation isUnannotatableConstant: (self ssValue: 1)) not.
unforwardArg := (objectRepresentation isUnannotatableConstant: self ssTop) not.
"If an operand is an annotatable constant, it may be forwarded, so we need to store it into a
register so the forwarder check can jump back to the comparison after unforwarding the constant.
However, if one of the operands is an unannotatable constant, we do not allocate a register for it
(machine code will use operations on constants)."
self
allocateEqualsEqualsRegistersArgNeedsReg: unforwardArg
rcvrNeedsReg: unforwardRcvr
into: [ :rcvr :arg | rcvrReg:= rcvr. argReg := arg ].
argReg ifNotNil: [ regMask := self registerMaskFor: argReg ].
rcvrReg ifNotNil: [ regMask := regMask bitOr: (self registerMaskFor: rcvrReg) ].
"Only interested in inlining if followed by a conditional branch."
(branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse]) ifFalse:
[^ self genEqualsEqualsNoBranchArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg].
counterReg := self allocateRegNotConflictingWith: regMask.
self
genExecutionCountLogicInto: [ :cAddress :countTripBranch |
counterAddress := cAddress.
countTripped := countTripBranch ]
counterReg: counterReg.
self assert: (unforwardArg or: [ unforwardRcvr ]).
"If branching the stack must be flushed for the merge"
self ssFlushTo: simStackPtr - 2.
label := self Label.
self genEqualsEqualsComparisonArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
self ssPop: 2. "pop by 2 temporarily for the fixups"
branchDescriptor isBranchTrue
ifTrue:
[ fixup := self ensureNonMergeFixupAt: postBranchPC - initialPC.
self JumpZero: (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger. ]
ifFalse:
[ fixup := self ensureNonMergeFixupAt: targetBytecodePC - initialPC.
self JumpZero: (self ensureNonMergeFixupAt: postBranchPC - initialPC) asUnsignedInteger. ].
unforwardArg ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: argReg scratchReg: TempReg jumpBackTo: label ].
unforwardRcvr ifTrue: [ objectRepresentation genEnsureOopInRegNotForwarded: rcvrReg scratchReg: TempReg jumpBackTo: label ].
self ssPop: -2.
self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
self Jump: fixup.
countTripped jmpTarget: self Label.
"inlined version of #== ignoring the branchDescriptor if the counter trips to have normal state for the optimizer"
self genEqualsEqualsNoBranchArgIsConstant: unforwardArg not rcvrIsConstant: unforwardRcvr not argReg: argReg rcvrReg: rcvrReg.
+
+ (self fixupAt: nextPC - initialPC) targetInstruction = 0 ifTrue: [ branchReachedOnlyForCounterTrip := true ].
+
^ 0!
Item was changed:
----- Method: SistaStackToRegisterMappingCogit>>initialize (in category 'initialization') -----
initialize
super initialize.
+ branchReachedOnlyForCounterTrip := false.
cogMethodSurrogateClass := (objectMemory ifNil: [self class objectMemoryClass]) wordSize = 4
ifTrue: [CogSistaMethodSurrogate32]
ifFalse: [CogSistaMethodSurrogate64]!
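The counter layout described in the genJumpIf:to: comment (a 32-bit word whose upper 16 bits count executions and whose lower 16 bits count untaken executions, both counting down) can be modelled as follows. This is a sketch under assumptions: the function names are hypothetical, the trip is modelled as "the executed half is already 0, so decrementing would go negative", and nothing here reflects the actual machine code the Cogit emits.

```python
def make_counter(initial):
    """Pack the same initial value into both 16-bit halves."""
    assert 0 <= initial <= 0xFFFF
    return (initial << 16) | initial


def execute_branch(counter, taken):
    """Answer (new_counter, tripped) for one execution of the branch."""
    executed = counter >> 16
    untaken = counter & 0xFFFF
    if executed == 0:
        # Decrementing would go negative: report a trip, leave the word alone.
        return counter, True
    executed -= 1
    if not taken:
        untaken = (untaken - 1) & 0xFFFF
    return (executed << 16) | untaken, False


def branch_stats(counter, initial):
    """Recover (executions, untaken executions) from a counter word."""
    return initial - (counter >> 16), initial - (counter & 0xFFFF)


counter = make_counter(2)
counter, tripped_1 = execute_branch(counter, taken=True)
counter, tripped_2 = execute_branch(counter, taken=False)
stats = branch_stats(counter, initial=2)
counter, tripped_3 = execute_branch(counter, taken=True)
```

An optimizer reading such a word could derive the taken count as executions minus untaken executions, which is what makes a single pair sufficient to estimate both paths.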
OK, this fixes the bug; the SistaCogit now runs fine.
It could be better to put some values on the stack instead of using registers
when calling some trampolines, but I am a bit confused about which registers a
trampoline can use in its implementation, so I am postponing that for later.
2015-05-05 11:48 GMT+02:00 <commits(a)source.squeak.org>:
>
> Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
> http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1283.mcz
>
> ==================== Summary ====================
>
> Name: VMMaker.oscog-cb.1283
> Author: cb
> Time: 5 May 2015, 11:48:13.804 am
> UUID: 3000cd43-f84f-4b75-9fdf-aff579a5105b
> Ancestors: VMMaker.oscog-eem.1282
>
> Fixed a bug in SistaCogit where a trampoline needed a specific register
> instead of an allocated one.
>
> =============== Diff against VMMaker.oscog-eem.1282 ===============
>
> Item was removed:
> - ----- Method: SistaStackToRegisterMappingCogit>>allocateRegPreferringCalleeSavedNotConflictingWith: (in category 'simulation stack') -----
> - allocateRegPreferringCalleeSavedNotConflictingWith: regMask
> - "If there are multiple free registers, choose one which is callee saved,
> - else just allocate a register not conflicting with regMask"
> - | reg |
> - reg := backEnd availableRegisterOrNilFor: ((self liveRegisters bitOr: regMask) bitOr: callerSavedRegMask).
> - ^ reg
> - ifNil: [ self allocateRegNotConflictingWith: regMask ]
> - ifNotNil: [ reg ]!
>
> Item was changed:
> ----- Method: SistaStackToRegisterMappingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
> genJumpIf: boolean to: targetBytecodePC
> "The heart of performance counting in Sista. Conditional branches are 6 times less
> frequent than sends and can provide basic block frequencies (send counters can't).
> Each conditional has a 32-bit counter split into an upper 16 bits counting executions
> and a lower half counting untaken executions of the branch. Executing the branch
> decrements the upper half, tripping if the count goes negative. Not taking the branch
> decrements the lower half. N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
> so that scanning for send and branch data is simplified and that branch data is correct."
> <inline: false>
> + | desc ok counterAddress countTripped retry |
> - | desc ok counterAddress countTripped retry counterReg |
> <var: #ok type: #'AbstractInstruction *'>
> <var: #desc type: #'CogSimStackEntry *'>
> <var: #retry type: #'AbstractInstruction *'>
> <var: #countTripped type: #'AbstractInstruction *'>
>
> (coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
>
> self ssFlushTo: simStackPtr - 1.
> desc := self ssTop.
> self ssPop: 1.
> desc popToReg: TempReg.
>
> + "We need SendNumArgsReg because of the mustBeBooleanTrampoline"
> + self ssAllocateRequiredReg: SendNumArgsReg.
> +
> - "We prefer calleeSaved to avoid saving it across the trap trip trampoline"
> - counterReg := self allocateRegPreferringCalleeSavedNotConflictingWith: 0.
> retry := self Label.
> self
> genExecutionCountLogicInto: [ :cAddress :countTripBranch |
> counterAddress := cAddress.
> countTripped := countTripBranch ]
> + counterReg: SendNumArgsReg.
> - counterReg: counterReg.
> counterIndex := counterIndex + 1.
>
> "Cunning trick by LPD. If true and false are contiguous subtract the smaller.
> Correct result is either 0 or the distance between them. If result is not 0 or
> their distance send mustBeBoolean."
> self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
> self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
> self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
>
> + self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
> - self genFallsThroughCountLogicCounterReg: counterReg counterAddress: counterAddress.
>
> self CmpCq: (boolean == objectMemory falseObject
> ifTrue: [objectMemory trueObject - objectMemory falseObject]
> ifFalse: [objectMemory falseObject - objectMemory trueObject])
> R: TempReg.
> ok := self JumpZero: 0.
> + self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
> - self MoveCq: 0 R: counterReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
>
> countTripped jmpTarget:
> (self CallRT: (boolean == objectMemory falseObject
> ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
> ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])).
>
> "If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
> trampoline will return directly to machine code, returning the boolean. So the code should
> jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
> self annotateBytecode: self Label.
>
> self Jump: retry.
>
> ok jmpTarget: self Label.
> ^0!
>
>
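The "cunning trick by LPD" mentioned in the genJumpIf:to: comments amounts to one subtraction and two comparisons. Here is a minimal sketch in Python, assuming made-up addresses for the two boolean objects (`FALSE_OOP` and `TRUE_OOP` are hypothetical values, with true placed directly after false as the assert in the diff requires); the function name `classify` is also an invention for illustration.

```python
# Hypothetical object addresses; only their contiguity matters.
FALSE_OOP = 0x1000
TRUE_OOP = 0x1010


def classify(value, boolean):
    """Model of the branch test for a jump-if-`boolean` bytecode.

    Subtract the expected boolean from the value once, then compare the
    difference against 0 and against the distance to the other boolean.
    """
    difference = value - boolean            # SubCw: boolean R: TempReg
    if difference == 0:
        return "jump"                       # JumpZero: to the branch target
    if boolean == FALSE_OOP:
        other_distance = TRUE_OOP - FALSE_OOP
    else:
        other_distance = FALSE_OOP - TRUE_OOP
    if difference == other_distance:        # CmpCq: ... R: TempReg
        return "fall through"
    return "mustBeBoolean"


results = [
    classify(FALSE_OOP, FALSE_OOP),
    classify(TRUE_OOP, FALSE_OOP),
    classify(FALSE_OOP, TRUE_OOP),
    classify(0x2000, TRUE_OOP),
]
```

The point of the trick is that the value is loaded and adjusted only once; both outcomes for a genuine boolean are then decided by comparing against constants.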