<div dir="ltr">Eliot,<div><br></div><div>The recent commits are very exciting. Context and closure creation is now inlined in machine code :-)</div><div><br></div><div>Have you already done at:put: and stringAt:put:, or is that your next step?</div>
<div><br></div><div>Please tell us about the new benchmark results with these features.</div><div><br></div><div>Clément</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-06-02 16:14 GMT+02:00 <span dir="ltr"><<a href="mailto:commits@source.squeak.org" target="_blank">commits@source.squeak.org</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:<br>
<a href="http://source.squeak.org/VMMaker/VMMaker.oscog-eem.746.mcz" target="_blank">http://source.squeak.org/VMMaker/VMMaker.oscog-eem.746.mcz</a><br>
<br>
==================== Summary ====================<br>
<br>
Name: VMMaker.oscog-eem.746<br>
Author: eem<br>
Time: 1 June 2014, 6:05:30.694 pm<br>
UUID: cc4961d3-e629-4e28-b308-88eab314a8c9<br>
Ancestors: VMMaker.oscog-eem.745<br>
<br>
Implement a peephole in the Spur Cogit for an indirection<br>
vector initialized with a single value. Avoid initializing the<br>
slot in the array to nil and instead initialize it with the value.<br>
<br>
Refactor setting byte1, byte2 & byte3 into<br>
loadSubsequentBytesForDescriptor:at: for the peephole<br>
tryCollapseTempVectorInitializationOfSize:.<br>
<br>
No longer inline CoInterpreter>>pre/postGCAction: for VM profiling.<br>
<br>
Increase the number of trampoline table slots.<br>
<br>
Simulator:<br>
Fix CurrentImageCoInterpreterFacade for the new Spur<br>
inline instantiation code.<br>
<br>
=============== Diff against VMMaker.oscog-eem.745 ===============<br>
<br>
Item was changed:<br>
----- Method: CoInterpreter>>postGCAction: (in category 'object memory support') -----<br>
postGCAction: gcModeArg<br>
"Attempt to shrink free memory, signal the gc semaphore and let the Cogit do its post GC thang"<br>
+ <inline: false><br>
self assert: gcModeArg = gcMode.<br>
super postGCAction: gcModeArg.<br>
cogit cogitPostGCAction: gcModeArg.<br>
lastCoggableInterpretedBlockMethod := lastUncoggableInterpretedBlockMethod := nil.<br>
gcMode := 0!<br>
<br>
Item was changed:<br>
----- Method: CoInterpreter>>preGCAction: (in category 'object memory support') -----<br>
preGCAction: gcModeArg<br>
+ <inline: false><br>
- <inline: true><br>
"Need to write back the frame pointers unless all pages are free (as in snapshot).<br>
Need to set gcMode var (to avoid passing the flag through a lot of the updating code)"<br>
super preGCAction: gcModeArg.<br>
<br>
gcMode := gcModeArg.<br>
<br>
cogit recordEventTrace ifTrue:<br>
[| traceType |<br>
traceType := gcModeArg == GCModeFull ifTrue: [TraceFullGC] ifFalse: [TraceIncrementalGC].<br>
self recordTrace: traceType thing: traceType source: 0].<br>
<br>
cogit recordPrimTrace ifTrue:<br>
[| traceType |<br>
traceType := gcModeArg == GCModeFull ifTrue: [TraceFullGC] ifFalse: [TraceIncrementalGC].<br>
self fastLogPrim: traceType]!<br>
<br>
Item was added:<br>
+ ----- Method: CogObjectRepresentation>>createsArraysInline (in category 'bytecode generator support') -----<br>
+ createsArraysInline<br>
+ "Answer if the object representation allocates arrays inline. By<br>
+ default answer false. Better code can be generated when creating<br>
+ arrays inline if values are /not/ flushed to the stack."<br>
+ ^false!<br>
<br>
Item was removed:<br>
- ----- Method: CogObjectRepresentationFor32BitSpur>>createsClosuresInline (in category 'bytecode generator support') -----<br>
- createsClosuresInline<br>
- "Answer if the object representation allocates closures inline. By<br>
- default answer false. Better code can be generated when creating<br>
- closures inline if copied values are /not/ flushed to the stack."<br>
- ^true!<br>
<br>
Item was added:<br>
+ ----- Method: CogObjectRepresentationForSpur>>createsArraysInline (in category 'bytecode generator support') -----<br>
+ createsArraysInline<br>
+ "Answer if the object representation allocates arrays inline. By<br>
+ default answer false. Better code can be generated when creating<br>
+ arrays inline if values are /not/ flushed to the stack."<br>
+ ^true!<br>
<br>
Item was added:<br>
+ ----- Method: CogObjectRepresentationForSpur>>createsClosuresInline (in category 'bytecode generator support') -----<br>
+ createsClosuresInline<br>
+ "Answer if the object representation allocates closures inline. By<br>
+ default answer false. Better code can be generated when creating<br>
+ closures inline if copied values are /not/ flushed to the stack."<br>
+ ^true!<br>
<br>
Item was changed:<br>
----- Method: Cogit>>compileAbstractInstructionsFrom:through: (in category 'compile abstract instructions') -----<br>
compileAbstractInstructionsFrom: start through: end<br>
"Loop over bytecodes, dispatching to the generator for each bytecode, handling fixups in due course."<br>
| nextOpcodeIndex descriptor fixup result nExts |<br>
<var: #descriptor type: #'BytecodeDescriptor *'><br>
<var: #fixup type: #'BytecodeFixup *'><br>
bytecodePC := start.<br>
nExts := 0.<br>
[byte0 := (objectMemory fetchByte: bytecodePC ofObject: methodObj) + bytecodeSetOffset.<br>
descriptor := self generatorAt: byte0.<br>
+ self loadSubsequentBytesForDescriptor: descriptor at: bytecodePC.<br>
- descriptor numBytes > 1 ifTrue:<br>
- [byte1 := objectMemory fetchByte: bytecodePC + 1 ofObject: methodObj.<br>
- descriptor numBytes > 2 ifTrue:<br>
- [byte2 := objectMemory fetchByte: bytecodePC + 2 ofObject: methodObj.<br>
- descriptor numBytes > 3 ifTrue:<br>
- [byte3 := objectMemory fetchByte: bytecodePC + 3 ofObject: methodObj.<br>
- descriptor numBytes > 4 ifTrue:<br>
- [self notYetImplemented]]]].<br>
nextOpcodeIndex := opcodeIndex.<br>
result := self perform: descriptor generator.<br>
descriptor isExtension ifFalse: "extended bytecodes must consume their extensions"<br>
[self assert: (extA = 0 and: [extB = 0])].<br>
fixup := self fixupAt: bytecodePC - initialPC.<br>
fixup targetInstruction ~= 0 ifTrue:<br>
["There is a fixup for this bytecode. It must point to the first generated<br>
instruction for this bytecode. If there isn't one we need to add a label."<br>
opcodeIndex = nextOpcodeIndex ifTrue:<br>
[self Label].<br>
fixup targetInstruction: (self abstractInstructionAt: nextOpcodeIndex)].<br>
bytecodePC := self nextBytecodePCFor: descriptor at: bytecodePC exts: nExts in: methodObj.<br>
result = 0 and: [bytecodePC <= end]]<br>
whileTrue:<br>
[nExts := descriptor isExtension ifTrue: [nExts + 1] ifFalse: [0]].<br>
self checkEnoughOpcodes.<br>
^result!<br>
<br>
Item was added:<br>
+ ----- Method: Cogit>>loadSubsequentBytesForDescriptor:at: (in category 'compile abstract instructions') -----<br>
+ loadSubsequentBytesForDescriptor: descriptor at: pc<br>
+ <var: #descriptor type: #'BytecodeDescriptor *'><br>
+ descriptor numBytes > 1 ifTrue:<br>
+ [byte1 := objectMemory fetchByte: pc + 1 ofObject: methodObj.<br>
+ descriptor numBytes > 2 ifTrue:<br>
+ [byte2 := objectMemory fetchByte: pc + 2 ofObject: methodObj.<br>
+ descriptor numBytes > 3 ifTrue:<br>
+ [byte3 := objectMemory fetchByte: pc + 3 ofObject: methodObj.<br>
+ descriptor numBytes > 4 ifTrue:<br>
+ [self notYetImplemented]]]]!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacade class>>objectMemoryClass (in category 'accessing') -----<br>
+ objectMemoryClass<br>
+ ^self subclassResponsibility!<br>
<br>
Item was changed:<br>
----- Method: CurrentImageCoInterpreterFacade>>cogit: (in category 'initialize-release') -----<br>
cogit: aCogit<br>
cogit := aCogit.<br>
coInterpreter cogit: aCogit.<br>
+ (objectMemory respondsTo: #cogit:) ifTrue:<br>
+ [objectMemory cogit: aCogit]!<br>
- objectMemory cogit: aCogit!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacade>>indexablePointersFormat (in category 'accessing') -----<br>
+ indexablePointersFormat<br>
+ ^objectMemory indexablePointersFormat!<br>
<br>
Item was changed:<br>
----- Method: CurrentImageCoInterpreterFacade>>initialize (in category 'initialize-release') -----<br>
initialize<br>
memory := ByteArray new: 262144.<br>
+ objectMemory := self class objectMemoryClass new.<br>
- objectMemory := NewCoObjectMemory new.<br>
coInterpreter := CoInterpreter new.<br>
coInterpreter<br>
instVarNamed: 'objectMemory'<br>
put: objectMemory;<br>
instVarNamed: 'primitiveTable'<br>
put: (CArrayAccessor on: CoInterpreter primitiveTable copy).<br>
variables := Dictionary new.<br>
#('stackLimit') do:<br>
[:l| self addressForLabel: l].<br>
self initializeObjectMap!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacade>>methodNeedsLargeContext: (in category 'accessing') -----<br>
+ methodNeedsLargeContext: aMethodOop<br>
+ ^(self objectForOop: aMethodOop) frameSize > CompiledMethod smallFrameSize!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation class>>objectMemoryClass (in category 'accessing') -----<br>
+ objectMemoryClass<br>
+ ^Spur32BitCoMemoryManager!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>arrayFormat (in category 'accessing') -----<br>
+ arrayFormat<br>
+ ^objectMemory arrayFormat!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>getScavengeThreshold (in category 'accessing') -----<br>
+ getScavengeThreshold<br>
+ ^objectMemory getScavengeThreshold ifNil: [16r24680]!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>headerForSlots:format:classIndex: (in category 'accessing') -----<br>
+ headerForSlots: numSlots format: formatField classIndex: classIndex<br>
+ ^objectMemory headerForSlots: numSlots format: formatField classIndex: classIndex!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>numSlotsMask (in category 'accessing') -----<br>
+ numSlotsMask<br>
+ ^objectMemory numSlotsMask!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>rememberedBitShift (in category 'accessing') -----<br>
+ rememberedBitShift<br>
+ ^objectMemory rememberedBitShift!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>smallObjectBytesForSlots: (in category 'accessing') -----<br>
+ smallObjectBytesForSlots: numSlots<br>
+ ^objectMemory smallObjectBytesForSlots: numSlots!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSpurObjectRepresentation>>storeCheckBoundary (in category 'accessing') -----<br>
+ storeCheckBoundary<br>
+ ^objectMemory storeCheckBoundary ifNil: [16r12345678]!<br>
<br>
Item was added:<br>
+ ----- Method: CurrentImageCoInterpreterFacadeForSqueakV3ObjectRepresentation class>>objectMemoryClass (in category 'accessing') -----<br>
+ objectMemoryClass<br>
+ ^NewObjectMemory!<br>
<br>
Item was changed:<br>
----- Method: SimpleStackBasedCogit class>>initializeMiscConstants (in category 'class initialization') -----<br>
initializeMiscConstants<br>
super initializeMiscConstants.<br>
MaxLiteralCountForCompile := initializationOptions at: #MaxLiteralCountForCompile ifAbsent: [60].<br>
NumTrampolines := NewspeakVM<br>
+ ifTrue: [50]<br>
+ ifFalse: [42]!<br>
- ifTrue: [46]<br>
- ifFalse: [38]!<br>
<br>
Item was changed:<br>
----- Method: StackToRegisterMappingCogit class>>initializeMiscConstants (in category 'class initialization') -----<br>
initializeMiscConstants<br>
super initializeMiscConstants.<br>
NumTrampolines := NewspeakVM<br>
+ ifTrue: [60]<br>
+ ifFalse: [52]!<br>
- ifTrue: [58]<br>
- ifFalse: [50]!<br>
<br>
Item was changed:<br>
----- Method: StackToRegisterMappingCogit>>compileAbstractInstructionsFrom:through: (in category 'compile abstract instructions') -----<br>
compileAbstractInstructionsFrom: start through: end<br>
"Loop over bytecodes, dispatching to the generator for each bytecode, handling fixups in due course."<br>
| nextOpcodeIndex descriptor nExts fixup result |<br>
<var: #descriptor type: #'BytecodeDescriptor *'><br>
<var: #fixup type: #'BytecodeFixup *'><br>
self traceSimStack.<br>
bytecodePC := start.<br>
nExts := 0.<br>
descriptor := nil.<br>
deadCode := false.<br>
[self cCode: '' inSmalltalk:<br>
[(debugBytecodePointers includes: bytecodePC) ifTrue: [self halt]].<br>
fixup := self fixupAt: bytecodePC - initialPC.<br>
fixup targetInstruction asUnsignedInteger > 0<br>
ifTrue:<br>
[deadCode := false.<br>
fixup targetInstruction asUnsignedInteger >= 2 ifTrue:<br>
[self merge: fixup<br>
afterContinuation: (descriptor notNil<br>
and: [descriptor isUnconditionalBranch<br>
or: [descriptor isReturn]]) not]]<br>
ifFalse: "If there's no fixup following a return there's no jump to that code and it is dead."<br>
[(descriptor notNil and: [descriptor isReturn]) ifTrue:<br>
[deadCode := true]].<br>
self cCode: '' inSmalltalk:<br>
[deadCode ifFalse:<br>
[self assert: simStackPtr + (needsFrame ifTrue: [0] ifFalse: [1])<br>
= (self debugStackPointerFor: bytecodePC)]].<br>
byte0 := (objectMemory fetchByte: bytecodePC ofObject: methodObj) + bytecodeSetOffset.<br>
descriptor := self generatorAt: byte0.<br>
+ self loadSubsequentBytesForDescriptor: descriptor at: bytecodePC.<br>
- descriptor numBytes > 1 ifTrue:<br>
- [byte1 := objectMemory fetchByte: bytecodePC + 1 ofObject: methodObj.<br>
- descriptor numBytes > 2 ifTrue:<br>
- [byte2 := objectMemory fetchByte: bytecodePC + 2 ofObject: methodObj.<br>
- descriptor numBytes > 3 ifTrue:<br>
- [byte3 := objectMemory fetchByte: bytecodePC + 3 ofObject: methodObj.<br>
- descriptor numBytes > 4 ifTrue:<br>
- [self notYetImplemented]]]].<br>
nextOpcodeIndex := opcodeIndex.<br>
result := deadCode<br>
ifTrue: "insert nops for dead code that is mapped so that bc to mc mapping is not many to one"<br>
[(descriptor isMapped<br>
or: [inBlock and: [descriptor isMappedInBlock]]) ifTrue:<br>
[self annotateBytecode: self Nop].<br>
0]<br>
ifFalse:<br>
[self perform: descriptor generator].<br>
descriptor isExtension ifFalse: "extended bytecodes must consume their extensions"<br>
[self assert: (extA = 0 and: [extB = 0])].<br>
self traceDescriptor: descriptor; traceSimStack.<br>
(fixup targetInstruction asUnsignedInteger between: 1 and: 2) ifTrue:<br>
["There is a fixup for this bytecode. It must point to the first generated<br>
instruction for this bytecode. If there isn't one we need to add a label."<br>
opcodeIndex = nextOpcodeIndex ifTrue:<br>
[self Label].<br>
fixup targetInstruction: (self abstractInstructionAt: nextOpcodeIndex)].<br>
bytecodePC := self nextBytecodePCFor: descriptor at: bytecodePC exts: nExts in: methodObj.<br>
result = 0 and: [bytecodePC <= end]] whileTrue:<br>
[nExts := descriptor isExtension ifTrue: [nExts + 1] ifFalse: [0]].<br>
self checkEnoughOpcodes.<br>
^result!<br>
<br>
Item was added:<br>
+ ----- Method: StackToRegisterMappingCogit>>evaluate:at: (in category 'peephole optimizations') -----<br>
+ evaluate: descriptor at: pc<br>
+ <var: #descriptor type: #'BytecodeDescriptor *'><br>
+ byte0 := objectMemory fetchByte: pc ofObject: methodObj.<br>
+ self assert: descriptor = (self generatorAt: bytecodeSetOffset + byte0).<br>
+ self loadSubsequentBytesForDescriptor: descriptor at: pc.<br>
+ self perform: descriptor generator!<br>
<br>
Item was changed:<br>
----- Method: StackToRegisterMappingCogit>>genPushNewArrayBytecode (in category 'bytecode generators') -----<br>
genPushNewArrayBytecode<br>
| size popValues |<br>
self assert: needsFrame.<br>
optStatus isReceiverResultRegLive: false.<br>
(popValues := byte1 > 127)<br>
ifTrue: [self ssFlushTo: simStackPtr]<br>
ifFalse: [self ssAllocateCallReg: SendNumArgsReg and: ReceiverResultReg].<br>
size := byte1 bitAnd: 127.<br>
+ popValues ifFalse:<br>
+ [(self tryCollapseTempVectorInitializationOfSize: size) ifTrue:<br>
+ [^0]].<br>
objectRepresentation genNewArrayOfSize: size initialized: popValues not.<br>
popValues ifTrue:<br>
[size - 1 to: 0 by: -1 do:<br>
[:i|<br>
self PopR: TempReg.<br>
objectRepresentation<br>
genStoreSourceReg: TempReg<br>
slotIndex: i<br>
intoNewObjectInDestReg: ReceiverResultReg].<br>
self ssPop: size].<br>
^self ssPushRegister: ReceiverResultReg!<br>
<br>
Item was added:<br>
+ ----- Method: StackToRegisterMappingCogit>>tryCollapseTempVectorInitializationOfSize: (in category 'peephole optimizations') -----<br>
+ tryCollapseTempVectorInitializationOfSize: slots<br>
+ "Try and collapse<br>
+ push: (Array new: 1)<br>
+ popIntoTemp: tempIndex<br>
+ pushConstant: const or pushTemp: n<br>
+ popIntoTemp: 0 inVectorAt: tempIndex<br>
+ into<br>
+ tempAt: tempIndex put: {const}.<br>
+ One might think that we should look for a sequence of more than<br>
+ one push and pop, but this is extremely rare."<br>
+ | pushArrayDesc storeArrayDesc pushValueDesc storeValueDesc reg |<br>
+ <var: #pushArrayDesc type: #'BytecodeDescriptor *'><br>
+ <var: #pushValueDesc type: #'BytecodeDescriptor *'><br>
+ <var: #storeArrayDesc type: #'BytecodeDescriptor *'><br>
+ <var: #storeValueDesc type: #'BytecodeDescriptor *'><br>
+ slots ~= 1 ifTrue:<br>
+ [^false].<br>
+ pushArrayDesc := self generatorAt: bytecodeSetOffset<br>
+ + (objectMemory<br>
+ fetchByte: bytecodePC<br>
+ ofObject: methodObj).<br>
+ self assert: pushArrayDesc generator == #genPushNewArrayBytecode.<br>
+ storeArrayDesc := self generatorAt: bytecodeSetOffset<br>
+ + (objectMemory<br>
+ fetchByte: bytecodePC<br>
+ + pushArrayDesc numBytes<br>
+ ofObject: methodObj).<br>
+ storeArrayDesc generator ~~ #genStoreAndPopTemporaryVariableBytecode ifTrue:<br>
+ [^false].<br>
+ pushValueDesc := self generatorAt: bytecodeSetOffset<br>
+ + (objectMemory<br>
+ fetchByte: bytecodePC<br>
+ + pushArrayDesc numBytes<br>
+ + storeArrayDesc numBytes<br>
+ ofObject: methodObj).<br>
+ (pushValueDesc generator ~~ #genPushLiteralConstantBytecode<br>
+ and: [pushValueDesc generator ~~ #genPushQuickIntegerConstantBytecode<br>
+ and: [pushValueDesc generator ~~ #genPushTemporaryVariableBytecode]]) ifTrue:<br>
+ [^false].<br>
+ storeValueDesc := self generatorAt: bytecodeSetOffset<br>
+ + (objectMemory<br>
+ fetchByte: bytecodePC<br>
+ + pushArrayDesc numBytes<br>
+ + storeArrayDesc numBytes<br>
+ + pushValueDesc numBytes<br>
+ ofObject: methodObj).<br>
+ storeValueDesc generator ~~ #genStoreAndPopRemoteTempLongBytecode ifTrue:<br>
+ [^false].<br>
+<br>
+ objectRepresentation genNewArrayOfSize: 1 initialized: false.<br>
+ self evaluate: pushValueDesc at: bytecodePC + pushArrayDesc numBytes + storeArrayDesc numBytes.<br>
+ reg := self ssStorePop: true toPreferredReg: TempReg.<br>
+ objectRepresentation<br>
+ genStoreSourceReg: reg<br>
+ slotIndex: 0<br>
+ intoNewObjectInDestReg: ReceiverResultReg.<br>
+ self ssPushRegister: ReceiverResultReg.<br>
+ self evaluate: storeArrayDesc at: bytecodePC + pushArrayDesc numBytes.<br>
+ bytecodePC := bytecodePC<br>
+ "+ pushArrayDesc numBytes this gets added by nextBytecodePCFor:at:exts:in:"<br>
+ + storeArrayDesc numBytes<br>
+ + pushValueDesc numBytes<br>
+ + storeValueDesc numBytes.<br>
+ ^true!<br>
<br>
</blockquote></div><br></div>