[Vm-dev] VM Maker: VMMaker.oscog-eem.1332.mcz
commits at source.squeak.org
commits at source.squeak.org
Tue Jun 2 00:08:01 UTC 2015
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.1332.mcz
==================== Summary ====================
Name: VMMaker.oscog-eem.1332
Author: eem
Time: 1 June 2015, 5:02:18.369 pm
UUID: 528336b3-25ab-41f7-b788-2b9155e5d980
Ancestors: VMMaker.oscog-eem.1331
Fix the regression in implicit receiver sends caused
by VMMaker.oscog-eem.1317 (fix to performance
regression caused by using XCHG on x86).
The implicit receiver cache uses SendNumArgsReg
to refer to the cache object. Hence we must use
TempReg for genPushRegisterArgsForNumArgs: in
this case. So refactor to
genPushRegisterArgsForNumArgs:scratchReg:, passing
either TempReg or SendNumArgsReg as approprate.
=============== Diff against VMMaker.oscog-eem.1331 ===============
Item was removed:
- ----- Method: CogARMCompiler>>genPushRegisterArgsForNumArgs: (in category 'smalltalk calling convention') -----
- genPushRegisterArgsForNumArgs: numArgs
- "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs."
- "This is easy on a RISC like ARM because the return address is in the link register. Putting
- the receiver and args above the return address means the CoInterpreter has a single
- machine-code frame format which saves us a lot of work
- NOTA BENE: we do NOT push the return address here, which means it must be dealt with later."
- numArgs <= cogit numRegArgs ifTrue:
- [self assert: cogit numRegArgs <= 2.
- cogit PushR: ReceiverResultReg.
- numArgs > 0 ifTrue:
- [cogit PushR: Arg0Reg.
- numArgs > 1 ifTrue:
- [cogit PushR: Arg1Reg]]]!
Item was added:
+ ----- Method: CogARMCompiler>>genPushRegisterArgsForNumArgs:scratchReg: (in category 'smalltalk calling convention') -----
+ genPushRegisterArgsForNumArgs: numArgs scratchReg: ignored
+ "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs."
+ "This is easy on a RISC like ARM because the return address is in the link register. Putting
+ the receiver and args above the return address means the CoInterpreter has a single
+ machine-code frame format which saves us a lot of work
+ NOTA BENE: we do NOT push the return address here, which means it must be dealt with later."
+ numArgs <= cogit numRegArgs ifTrue:
+ [self assert: cogit numRegArgs <= 2.
+ cogit PushR: ReceiverResultReg.
+ numArgs > 0 ifTrue:
+ [cogit PushR: Arg0Reg.
+ numArgs > 1 ifTrue:
+ [cogit PushR: Arg1Reg]]]!
Item was removed:
- ----- Method: CogAbstractInstruction>>genPushRegisterArgsForNumArgs: (in category 'smalltalk calling convention') -----
- genPushRegisterArgsForNumArgs: numArgs
- "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs."
- "This won't be as clumsy on a RISC. But putting the receiver and
- args above the return address means the CoInterpreter has a
- single machine-code frame format which saves us a lot of work."
-
- "This won't be as clumsy on a RISC. But putting the receiver and
- args above the return address means the CoInterpreter has a
- single machine-code frame format which saves us a lot of work."
-
- "Iff there are register args convert
- sp -> retpc (send site retpc)
- to
- base -> receiver
- (arg0)
- (arg1)
- sp -> retpc (send site retpc)"
- self subclassResponsibility!
Item was added:
+ ----- Method: CogAbstractInstruction>>genPushRegisterArgsForNumArgs:scratchReg: (in category 'smalltalk calling convention') -----
+ genPushRegisterArgsForNumArgs: numArgs scratchReg: scratchReg
+ "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs.
+ This isn't as clumsy on a RISC. But putting the receiver and
+ args above the return address means the CoInterpreter has a
+ single machine-code frame format which saves us a lot of work."
+
+ "Iff there are register args convert
+ sp -> retpc (send site retpc)
+ to
+ base -> receiver
+ (arg0)
+ (arg1)
+ sp -> retpc (send site retpc)"
+ self subclassResponsibility!
Item was removed:
- ----- Method: CogIA32Compiler>>genPushRegisterArgsForNumArgs: (in category 'smalltalk calling convention') -----
- genPushRegisterArgsForNumArgs: numArgs
- "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs. This
- won't be as clumsy on a RISC. But putting the receiver and args above the return address
- means the CoInterpreter has a single machine-code frame format which saves us a lot of work.
- N.B. Take great care to /not/ smash TempReg, which is used in directed send marshalling.
- We could use XCHG to swap the ReceiverResultReg and top-of-stack return address, pushing the
- the ret pc (now in ReceiverResultReg) later, but XCHG is very slow. We can use SendNumArgsReg
- because it is only live in sends of arity >= (NumSendTrampolines - 1)."
- self assert: cogit numRegArgs < (NumSendTrampolines - 1).
- numArgs <= cogit numRegArgs ifTrue:
- [self assert: cogit numRegArgs <= 2.
- false "these two variants show the same performance on Intel Core i7, but the second one may be shorter."
- ifTrue:
- [cogit MoveMw: 0 r: SPReg R: SendNumArgsReg. "Save return pc"
- numArgs > 0 ifTrue:
- [cogit PushR: Arg0Reg.
- numArgs > 1 ifTrue:
- [cogit PushR: Arg1Reg]].
- cogit PushR: SendNumArgsReg.
- cogit MoveR: ReceiverResultReg Mw: objectMemory wordSize * (1 + numArgs) r: SPReg]
- ifFalse:
- ["a.k.a.
- cogit gen: XCHGMwrR operand: 0 operand: SPReg operand: ReceiverResultReg.
- but XCHG is slow."
- cogit MoveMw: 0 r: SPReg R: SendNumArgsReg. "Save return pc"
- cogit MoveR: ReceiverResultReg Mw: 0 r: SPReg.
- numArgs > 0 ifTrue:
- [cogit PushR: Arg0Reg.
- numArgs > 1 ifTrue:
- [cogit PushR: Arg1Reg]].
- cogit PushR: SendNumArgsReg]] "Restore return address"!
Item was added:
+ ----- Method: CogIA32Compiler>>genPushRegisterArgsForNumArgs:scratchReg: (in category 'smalltalk calling convention') -----
+ genPushRegisterArgsForNumArgs: numArgs scratchReg: scratchReg
+ "Ensure that the register args are pushed before the retpc for arity <= self numRegArgs. This
+ isn't as clumsy on a RISC. But putting the receiver and args above the return address
+ means the CoInterpreter has a single machine-code frame format which saves us a lot of work.
+ N.B. Take great care to /not/ smash TempReg, which is used in directed send marshalling.
+ We could use XCHG to swap the ReceiverResultReg and top-of-stack return address, pushing the
+ the ret pc (now in ReceiverResultReg) later, but XCHG is very slow. We can use SendNumArgsReg
+ because it is only live in sends of arity >= (NumSendTrampolines - 1)."
+ self assert: cogit numRegArgs < (NumSendTrampolines - 1).
+ numArgs <= cogit numRegArgs ifTrue:
+ [self assert: cogit numRegArgs <= 2.
+ false "these two variants show the same performance on Intel Core i7, but the second one may be shorter."
+ ifTrue:
+ [cogit MoveMw: 0 r: SPReg R: scratchReg. "Save return pc"
+ numArgs > 0 ifTrue:
+ [cogit PushR: Arg0Reg.
+ numArgs > 1 ifTrue:
+ [cogit PushR: Arg1Reg]].
+ cogit PushR: scratchReg.
+ cogit MoveR: ReceiverResultReg Mw: objectMemory wordSize * (1 + numArgs) r: SPReg]
+ ifFalse:
+ ["a.k.a.
+ cogit gen: XCHGMwrR operand: 0 operand: SPReg operand: ReceiverResultReg.
+ but XCHG is slow."
+ cogit MoveMw: 0 r: SPReg R: scratchReg. "Save return pc"
+ cogit MoveR: ReceiverResultReg Mw: 0 r: SPReg.
+ numArgs > 0 ifTrue:
+ [cogit PushR: Arg0Reg.
+ numArgs > 1 ifTrue:
+ [cogit PushR: Arg1Reg]].
+ cogit PushR: scratchReg]] "Restore return address"!
Item was changed:
----- Method: Cogit>>genNSSendTrampolineFor:numArgs:called: (in category 'initialization') -----
genNSSendTrampolineFor: aRoutine numArgs: numArgs called: aString
"ReceiverResultReg: method receiver
SendNumArgsReg: the NSSendCache cache"
<option: #NewspeakVM>
<var: #aRoutine type: #'void *'>
<var: #aString type: #'char *'>
| jumpMiss jumpItsTheReceiverStupid |
<var: #jumpMiss type: #'AbstractInstruction *'>
<var: #jumpItsTheReceiverStupid type: #'AbstractInstruction *'>
opcodeIndex := 0.
objectRepresentation
genGetInlineCacheClassTagFrom: ReceiverResultReg
into: ClassReg
forEntry: false.
self MoveMw: NSCClassTagIndex * objectMemory wordSize r: SendNumArgsReg R: TempReg.
self CmpR: ClassReg R: TempReg.
jumpMiss := self JumpNonZero: 0.
self MoveMw: NSCEnclosingObjectIndex * objectMemory wordSize r: SendNumArgsReg R: TempReg.
self CmpCq: 0 R: TempReg.
jumpItsTheReceiverStupid := self JumpZero: 0.
self MoveR: TempReg R: ReceiverResultReg.
"We don't patch stack(-numArgs). See comment in ceImplicitReceiverSend:receiver:"
jumpItsTheReceiverStupid jmpTarget: self Label.
self MoveMw: NSCTargetIndex * objectMemory wordSize r: SendNumArgsReg R: TempReg.
self JumpR: TempReg.
jumpMiss jmpTarget: self Label.
objectRepresentation
genEnsureObjInRegNotForwarded: ReceiverResultReg
scratchReg: TempReg
updatingMw: FoxMFReceiver
r: FPReg.
+ self numRegArgs > 0 ifTrue:
+ [backEnd genPushRegisterArgsForNumArgs: numArgs scratchReg: TempReg].
- backEnd genPushRegisterArgsForNumArgs: numArgs.
^self
genTrampolineFor: aRoutine
called: aString
numArgs: 2
arg: SendNumArgsReg "The NSSendCache"
arg: ReceiverResultReg
arg: nil
arg: nil
saveRegs: false
pushLinkReg: true
resultReg: ReceiverResultReg "Never happens?"
appendOpcodes: true!
Item was changed:
----- Method: StackToRegisterMappingCogit>>compileOpenPIC:numArgs: (in category 'in-line cacheing') -----
compileOpenPIC: selector numArgs: numArgs
"Compile the code for an open PIC. Perform a probe of the first-level method
lookup cache followed by a call of ceSendFromInLineCacheMiss: if the probe fails.
Override to push the register args when calling ceSendFromInLineCacheMiss:"
| jumpSelectorMiss jumpClassMiss itsAHit jumpBCMethod |
<var: #jumpSelectorMiss type: #'AbstractInstruction *'>
<var: #jumpClassMiss type: #'AbstractInstruction *'>
<var: #itsAHit type: #'AbstractInstruction *'>
<var: #jumpBCMethod type: #'AbstractInstruction *'>
self compilePICAbort: numArgs.
entry := objectRepresentation genGetClassTagOf: ReceiverResultReg into: ClassReg scratchReg: TempReg.
"Do first of three probes. See CoInterpreter>>lookupInMethodCacheSel:classTag:"
self flag: #lookupInMethodCacheSel:classTag:. "so this method shows up as a sender of lookupInMethodCacheSel:class:"
self MoveR: ClassReg R: SendNumArgsReg.
self annotate: (self XorCw: selector R: ClassReg) objRef: selector.
self LogicalShiftLeftCq: objectMemory shiftForWord R: ClassReg.
self AndCq: MethodCacheMask << objectMemory shiftForWord R: ClassReg.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheSelector << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self annotate: (self CmpCw: selector R: TempReg) objRef: selector.
jumpSelectorMiss := self JumpNonZero: 0.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheClass << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self CmpR: SendNumArgsReg R: TempReg.
jumpClassMiss := self JumpNonZero: 0.
itsAHit := self Label.
"Fetch the method. The interpret trampoline requires the bytecoded method in SendNumArgsReg"
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheMethod << objectMemory shiftForWord)
r: ClassReg
R: SendNumArgsReg.
"If the method is compiled jump to its unchecked entry-point, otherwise interpret it."
objectRepresentation
genLoadSlot: HeaderIndex sourceReg: SendNumArgsReg destReg: TempReg.
self MoveR: TempReg R: ClassReg.
jumpBCMethod := objectRepresentation genJumpSmallIntegerInScratchReg: TempReg.
jumpBCMethod jmpTarget: picInterpretAbort.
self AddCq: cmNoCheckEntryOffset R: ClassReg.
self JumpR: ClassReg.
"First probe missed. Do second of three probes. Shift hash right one and retry."
jumpSelectorMiss jmpTarget: (jumpClassMiss jmpTarget: self Label).
self MoveR: SendNumArgsReg R: ClassReg.
self annotate: (self XorCw: selector R: ClassReg) objRef: selector.
self LogicalShiftLeftCq: objectMemory shiftForWord - 1 R: ClassReg.
self AndCq: MethodCacheMask << objectMemory shiftForWord R: ClassReg.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheSelector << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self annotate: (self CmpCw: selector R: TempReg) objRef: selector.
jumpSelectorMiss := self JumpNonZero: 0.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheClass << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self CmpR: SendNumArgsReg R: TempReg.
self JumpZero: itsAHit.
"Second probe missed. Do last probe. Shift hash right two and retry."
jumpSelectorMiss jmpTarget: self Label.
self MoveR: SendNumArgsReg R: ClassReg.
self annotate: (self XorCw: selector R: ClassReg) objRef: selector.
objectMemory shiftForWord > 2 ifTrue:
[self LogicalShiftLeftCq: objectMemory shiftForWord - 1 R: ClassReg].
self AndCq: MethodCacheMask << objectMemory shiftForWord R: ClassReg.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheSelector << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self annotate: (self CmpCw: selector R: TempReg) objRef: selector.
jumpSelectorMiss := self JumpNonZero: 0.
self MoveMw: coInterpreter methodCacheAddress asUnsignedInteger + (MethodCacheClass << objectMemory shiftForWord)
r: ClassReg
R: TempReg.
self CmpR: SendNumArgsReg R: TempReg.
self JumpZero: itsAHit.
"Last probe missed. Call ceSendFromInLineCacheMiss: to do the full lookup."
jumpSelectorMiss jmpTarget: self Label.
+ backEnd genPushRegisterArgsForNumArgs: numArgs scratchReg: SendNumArgsReg.
- backEnd genPushRegisterArgsForNumArgs: numArgs.
self genSmalltalkToCStackSwitch: true.
methodLabel addDependent: (self annotateAbsolutePCRef: (self MoveCw: methodLabel asInteger R: SendNumArgsReg)).
self
compileCallFor: #ceSendFromInLineCacheMiss:
numArgs: 1
arg: SendNumArgsReg
arg: nil
arg: nil
arg: nil
resultReg: nil
saveRegs: false
"Note that this call does not return."!
Item was changed:
----- Method: StackToRegisterMappingCogit>>genDoubleArithmetic:preOpCheck: (in category 'primitive generators') -----
genDoubleArithmetic: arithmeticOperator preOpCheck: preOpCheckOrNil
"Receiver and arg in registers.
Stack looks like
return address"
<var: #preOpCheckOrNil declareC: 'AbstractInstruction *(*preOpCheckOrNil)(int rcvrReg, int argReg)'>
| jumpFailClass jumpFailAlloc jumpFailCheck jumpImmediate jumpNonInt doOp |
<var: #jumpFailClass type: #'AbstractInstruction *'>
<var: #jumpFailAlloc type: #'AbstractInstruction *'>
<var: #jumpImmediate type: #'AbstractInstruction *'>
<var: #jumpNonInt type: #'AbstractInstruction *'>
<var: #jumpFailCheck type: #'AbstractInstruction *'>
<var: #doOp type: #'AbstractInstruction *'>
self MoveR: Arg0Reg R: TempReg.
objectRepresentation genGetDoubleValueOf: ReceiverResultReg into: DPFPReg0.
self MoveR: Arg0Reg R: ClassReg.
jumpImmediate := objectRepresentation genJumpImmediateInScratchReg: TempReg.
objectRepresentation genGetCompactClassIndexNonImmOf: Arg0Reg into: SendNumArgsReg.
objectRepresentation genCmpClassFloatCompactIndexR: SendNumArgsReg.
jumpFailClass := self JumpNonZero: 0.
objectRepresentation genGetDoubleValueOf: Arg0Reg into: DPFPReg1.
doOp := self Label.
preOpCheckOrNil ifNotNil:
[jumpFailCheck := self perform: preOpCheckOrNil with: DPFPReg0 with: DPFPReg1].
self gen: arithmeticOperator operand: DPFPReg1 operand: DPFPReg0.
jumpFailAlloc := objectRepresentation
genAllocFloatValue: DPFPReg0
into: SendNumArgsReg
scratchReg: ClassReg
scratchReg: TempReg.
self MoveR: SendNumArgsReg R: ReceiverResultReg.
self RetN: 0.
"We need to push the register args on two paths; this one and the interpreter primitive path.
But the interpreter primitive path won't unless regArgsHaveBeenPushed is false."
self assert: methodOrBlockNumArgs <= self numRegArgs.
jumpFailClass jmpTarget: self Label.
preOpCheckOrNil ifNotNil:
[jumpFailCheck jmpTarget: jumpFailClass getJmpTarget].
+ backEnd genPushRegisterArgsForNumArgs: methodOrBlockNumArgs scratchReg: SendNumArgsReg.
- backEnd genPushRegisterArgsForNumArgs: methodOrBlockNumArgs.
jumpFailClass := self Jump: 0.
jumpImmediate jmpTarget: self Label.
objectRepresentation smallIntegerIsOnlyImmediateType ifFalse:
[jumpNonInt := objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg].
objectRepresentation genConvertSmallIntegerToIntegerInReg: ClassReg.
self ConvertR: ClassReg Rd: DPFPReg1.
self Jump: doOp.
jumpFailAlloc jmpTarget: self Label.
self compileFallbackToInterpreterPrimitive.
jumpFailClass jmpTarget: self Label.
objectRepresentation smallIntegerIsOnlyImmediateType ifFalse:
[jumpNonInt jmpTarget: jumpFailClass getJmpTarget].
^0!
Item was changed:
----- Method: StackToRegisterMappingCogit>>genPICMissTrampolineFor: (in category 'initialization') -----
genPICMissTrampolineFor: numArgs
<inline: false>
| startAddress |
startAddress := methodZoneBase.
opcodeIndex := 0.
"N.B. a closed PIC jumps to the miss routine, not calls it, so there is only one retpc on the stack."
+ backEnd genPushRegisterArgsForNumArgs: numArgs scratchReg: SendNumArgsReg.
- backEnd genPushRegisterArgsForNumArgs: numArgs.
self genTrampolineFor: #ceCPICMiss:receiver:
called: (self trampolineName: 'cePICMiss' numArgs: (numArgs <= self numRegArgs ifTrue: [numArgs] ifFalse: [-1]))
numArgs: 2
arg: ClassReg
arg: ReceiverResultReg
arg: nil
arg: nil
saveRegs: false
pushLinkReg: true
resultReg: nil
appendOpcodes: true.
^startAddress!
Item was changed:
----- Method: StackToRegisterMappingCogit>>genPushRegisterArgs (in category 'compile abstract instructions') -----
genPushRegisterArgs
"Ensure that the register args are pushed before the retpc for methods with arity <= self numRegArgs."
"This won't be as clumsy on a RISC. But putting the receiver and
args above the return address means the CoInterpreter has a
single machine-code frame format which saves us a lot of work."
(regArgsHaveBeenPushed
or: [methodOrBlockNumArgs > self numRegArgs]) ifFalse:
+ [backEnd genPushRegisterArgsForNumArgs: methodOrBlockNumArgs scratchReg: SendNumArgsReg.
- [backEnd genPushRegisterArgsForNumArgs: methodOrBlockNumArgs.
regArgsHaveBeenPushed := true]!
Item was changed:
----- Method: StackToRegisterMappingCogit>>genSendTrampolineFor:numArgs:called:arg:arg:arg: (in category 'initialization') -----
genSendTrampolineFor: aRoutine numArgs: numArgs called: aString arg: regOrConst0 arg: regOrConst1 arg: regOrConst2
"Generate a trampoline with three arguments.
Hack: a negative value indicates an abstract register, a non-negative value indicates a constant."
<var: #aRoutine type: #'void *'>
<var: #aString type: #'char *'>
| startAddress |
<inline: false>
startAddress := methodZoneBase.
opcodeIndex := 0.
+ backEnd genPushRegisterArgsForNumArgs: numArgs scratchReg: SendNumArgsReg.
- backEnd genPushRegisterArgsForNumArgs: numArgs.
self genTrampolineFor: aRoutine
called: aString
numArgs: 3
arg: regOrConst0
arg: regOrConst1
arg: regOrConst2
arg: nil
saveRegs: false
pushLinkReg: true
resultReg: nil
appendOpcodes: true.
^startAddress!
Item was changed:
----- Method: StackToRegisterMappingCogit>>genSendTrampolineFor:numArgs:called:arg:arg:arg:arg: (in category 'initialization') -----
genSendTrampolineFor: aRoutine numArgs: numArgs called: aString arg: regOrConst0 arg: regOrConst1 arg: regOrConst2 arg: regOrConst3
"Generate a trampoline with four arguments.
Hack: a negative value indicates an abstract register, a non-negative value indicates a constant."
<var: #aRoutine type: #'void *'>
<var: #aString type: #'char *'>
| startAddress |
<inline: false>
startAddress := methodZoneBase.
opcodeIndex := 0.
+ backEnd genPushRegisterArgsForNumArgs: numArgs scratchReg: SendNumArgsReg.
- backEnd genPushRegisterArgsForNumArgs: numArgs.
self genTrampolineFor: aRoutine
called: aString
numArgs: 4
arg: regOrConst0
arg: regOrConst1
arg: regOrConst2
arg: regOrConst3
saveRegs: false
pushLinkReg: true
resultReg: nil
appendOpcodes: true.
^startAddress!
More information about the Vm-dev
mailing list