[Vm-dev] VM Maker: VMMaker.oscog-eem.276.mcz

commits at source.squeak.org commits at source.squeak.org
Thu Mar 21 23:45:50 UTC 2013


Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.276.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.276
Author: eem
Time: 21 March 2013, 4:43:39.056 pm
UUID: 52d03a8f-11ec-4454-9c4a-ea1c383f339a
Ancestors: VMMaker.oscog-eem.275

Cogit:
Stop reporting EncounteredUnknownBytecode with an error message.

Fix slip in ceSICMiss: that didn't link new PIC if an MNU case.

Better document the calling conventions.

Add an assert to genGetImplicitReceiverFor:forPush: that checks
the appropriate use of the IsNSSend annotation.

Fix simulation bug in CogMethodSurrogate>>nextOpenPIC.

Slang:
Nuke unused InterpreterSimulationObject.

=============== Diff against VMMaker.oscog-eem.275 ===============

Item was changed:
  ----- Method: CogMethodSurrogate>>nextOpenPIC (in category 'accessing') -----
  nextOpenPIC
  	| moField |
  	moField := self methodObject.
  	^moField ~= 0 ifTrue:
+ 		[cogit cogMethodSurrogateAt: moField]!
- 		[cogit cogMethodSurrogateAt: moField - self homeOffset]!

Item was changed:
  CogClass subclass: #Cogit
  	instanceVariableNames: 'coInterpreter objectMemory objectRepresentation processor threadManager methodZone methodZoneBase codeBase minValidCallAddress lastNInstructions simulatedAddresses simulatedTrampolines simulatedVariableGetters simulatedVariableSetters printRegisters printInstructions compilationTrace clickConfirm breakPC breakBlock singleStep guardPageSize traceFlags traceStores breakMethod methodObj initialPC endPC methodOrBlockNumArgs inBlock needsFrame hasYoungReferent primitiveIndex backEnd postCompileHook primInvokeLabel methodLabel stackCheckLabel blockEntryLabel blockEntryNoContextSwitch blockNoContextSwitchOffset stackOverflowCall sendMissCall missOffset entryPointMask checkedEntryAlignment cmEntryOffset entry cmNoCheckEntryOffset noCheckEntry dynSuperEntry dynSuperEntryAlignment cmDynSuperEntryOffset mnuCall interpretCall endCPICCase0 endCPICCase1 numPICCases firstCPICCaseOffset cPICCaseSize cPICEndSize closedPICSize openPICSize fixups abstractOpcodes annotations generatorTable primitiveGeneratorTable byte0 byte1 byte2 byte3 bytecodePC bytecodeSetOffset opcodeIndex numAbstractOpcodes annotationIndex blockStarts blockCount labelCounter cStackAlignment expectedSPAlignment expectedFPAlignment codeModified maxMethodBefore maxLitIndex ceMethodAbortTrampoline cePICAbortTrampoline ceCheckForInterruptTrampoline ceCPICMissTrampoline ceStoreCheckTrampoline ceReturnToInterpreterTrampoline ceBaseFrameReturnTrampoline ceSendMustBeBooleanAddTrueTrampoline ceSendMustBeBooleanAddFalseTrampoline ceCannotResumeTrampoline ceClosureCopyTrampoline ceCreateNewArrayTrampoline ceEnterCogCodePopReceiverReg ceEnterCogCodePopReceiverAndClassRegs cePrimReturnEnterCogCode cePrimReturnEnterCogCodeProfiling ceNonLocalReturnTrampoline ceActiveContextTrampoline ceFetchContextInstVarTrampoline ceStoreContextInstVarTrampoline cePositive32BitIntegerTrampoline ceImplicitReceiverTrampoline ceExplicitReceiverTrampoline ceCaptureCStackPointers ceFlushICache ceCheckFeaturesFunction ceTraceLinkedSendTrampoline ceTraceBlockActivationTrampoline ceTraceStoreTrampoline ceGetSP sendTrampolines superSendTrampolines dynamicSuperSendTrampolines firstSend lastSend realCEEnterCogCodePopReceiverReg realCEEnterCogCodePopReceiverAndClassRegs trampolineTableIndex trampolineAddresses objectReferencesInRuntime runtimeObjectRefIndex cFramePointerInUse debugPrimCallStackOffset ceTryLockVMOwner ceUnlockVMOwner cogMethodSurrogateClass cogBlockMethodSurrogateClass extA extB'
+ 	classVariableNames: 'AltBlockCreationBytecodeSize AltNSSendIsPCAnnotated AnnotationConstantNames AnnotationShift BlockCreationBytecodeSize Debug DisplacementMask DisplacementX2N EagerInstructionDecoration EncounteredUnknownBytecode FirstAnnotation HasBytecodePC InsufficientCodeSpace IsAbsPCReference IsDisplacement IsDisplacementX2N IsNSSendCall IsObjectReference IsRelativeCall IsSendCall MapEnd MaxCompiledPrimitiveIndex MaxNegativeErrorCode MaxUnitDisplacement MaxX2NDisplacement MethodTooBig NSSendIsPCAnnotated NotFullyInitialized NumObjRefsInRuntime NumSendTrampolines NumTrampolines ProcessorClass YoungSelectorInPIC'
- 	classVariableNames: 'AltBlockCreationBytecodeSize AltNSSendIsPCAnnotated AnnotationConstantNames AnnotationShift BlockCreationBytecodeSize Debug DisplacementMask DisplacementX2N EagerInstructionDecoration EncounteredUnknownBytecode FirstAnnotation HasBytecodePC InsufficientCodeSpace IsAbsPCReference IsDisplacement IsDisplacementX2N IsNSSendCall IsObjectReference IsRelativeCall IsSendCall MapEnd MaxCompiledPrimitiveIndex MaxNegativeErrorCode MaxUnitDisplacement MaxUnreportableError MaxX2NDisplacement MethodTooBig NSSendIsPCAnnotated NotFullyInitialized NumObjRefsInRuntime NumSendTrampolines NumTrampolines ProcessorClass YoungSelectorInPIC'
  	poolDictionaries: 'CogMethodConstants CogRTLOpcodes VMBasicConstants VMObjectIndices VMStackFrameOffsets'
  	category: 'VMMaker-JIT'!
  Cogit class
  	instanceVariableNames: 'generatorTable primitiveTable'!
  
  !Cogit commentStamp: 'eem 2/13/2013 15:37' prior: 0!
  I am the code generator for the Cog VM.  My job is to produce machine code versions of methods for faster execution and to manage inline caches for faster send performance.  I can be tested in the current image using my class-side in-image compilation facilities.  e.g. try
  
  	StackToRegisterMappingCogit genAndDis: (Integer >> #benchFib)
  
  I have concrete subclasses that implement different levels of optimization:
  	SimpleStackBasedCogit is the simplest code generator.
  
  	StackToRegisterMappingCogit is the current production code generator  It defers pushing operands
  	to the stack until necessary and implements a register-based calling convention for low-arity sends.
  
  	StackToRegisterMappingCogit is an experimental code generator with support for counting
  	conditional branches, intended to support adaptive optimization.
  
  coInterpreter <CoInterpreterSimulator>
  	the VM's interpreter with which I cooperate
  methodZoneManager <CogMethodZoneManager>
  	the manager of the machine code zone
  objectRepresentation <CogObjectRepresentation>
  	the object used to generate object accesses
  processor <BochsIA32Alien|?>
  	the simulator that executes the IA32/x86 machine code I generate when simulating execution in Smalltalk
  simulatedTrampolines <Dictionary of Integer -> MessageSend>
  	the dictionary mapping trap jump addresses to run-time routines used to warp from simulated machine code in to the Smalltalk run-time.
  simulatedVariableGetters <Dictionary of Integer -> MessageSend>
  	the dictionary mapping trap read addresses to variables in run-time objects used to allow simulated machine code to read variables in the Smalltalk run-time.
  simulatedVariableSetters <Dictionary of Integer -> MessageSend>
  	the dictionary mapping trap write addresses to variables in run-time objects used to allow simulated machine code to write variables in the Smalltalk run-time.
  printRegisters printInstructions clickConfirm <Boolean>
  	flags controlling debug printing and code simulation
  breakPC <Integer>
  	machine code pc breakpoint
  cFramePointer cStackPointer <Integer>
  	the variables representing the C stack & frame pointers, which must change on FFI callback and return
  selectorOop <sqInt>
  	the oop of the methodObj being compiled
  methodObj <sqInt>
  	the bytecode method being compiled
  initialPC endPC <Integer>
  	the start and end pcs of the methodObj being compiled
  methodOrBlockNumArgs <Integer>
  	argument count of current method or block being compiled
  needsFrame <Boolean>
  	whether methodObj or block needs a frame to execute
  primitiveIndex <Integer>
  	primitive index of current method being compiled
  methodLabel <CogAbstractOpcode>
  	label for the method header
  blockEntryLabel <CogAbstractOpcode>
  	label for the start of the block dispatch code
  stackOverflowCall <CogAbstractOpcode>
  	label for the call of ceStackOverflow in the method prolog
  sendMissCall <CogAbstractOpcode>
  	label for the call of ceSICMiss in the method prolog
  entryOffset <Integer>
  	offset of method entry code from start (header) of method
  entry <CogAbstractOpcode>
  	label for the first instruction of the method entry code
  noCheckEntryOffset <Integer>
  	offset of the start of a method proper (after the method entry code) from start (header) of method
  noCheckEntry <CogAbstractOpcode>
  	label for the first instruction of start of a method proper
  fixups <Array of <AbstractOpcode Label | nil>>
  	the labels for forward jumps that will be fixed up when reaching the relevant bytecode.  fixup shas one element per byte in methodObj's bytecode
  abstractOpcodes <Array of <AbstractOpcode>>
  	the code generated when compiling methodObj
  byte0 byte1 byte2 byte3 <Integer>
  	individual bytes of current bytecode being compiled in methodObj
  bytecodePointer <Integer>
  	bytecode pc (same as Smalltalk) of the current bytecode being compiled
  opcodeIndex <Integer>
  	the index of the next free entry in abstractOpcodes (this code is translated into C where OrderedCollection et al do not exist)
  numAbstractOpcodes <Integer>
  	the number of elements in abstractOpcocdes
  blockStarts <Array of <BlockStart>>
  	the starts of blocks in the current method
  blockCount
  	the index into blockStarts as they are being noted, and hence eventuakly teh total number of blocks in the current method
  labelCounter <Integer>
  	a nicety for numbering labels not needed in the production system but probably not expensive enough to worry about
  ceStackOverflowTrampoline <Integer>
  ceSend0ArgsTrampoline <Integer>
  ceSend1ArgsTrampoline <Integer>
  ceSend2ArgsTrampoline <Integer>
  ceSendNArgsTrampoline <Integer>
  ceSendSuper0ArgsTrampoline <Integer>
  ceSendSuper1ArgsTrampoline <Integer>
  ceSendSuper2ArgsTrampoline <Integer>
  ceSendSuperNArgsTrampoline <Integer>
  ceSICMissTrampoline <Integer>
  ceCPICMissTrampoline <Integer>
  ceStoreCheckTrampoline <Integer>
  ceReturnToInterpreterTrampoline <Integer>
  ceBaseFrameReturnTrampoline <Integer>
  ceSendMustBeBooleanTrampoline <Integer>
  ceClosureCopyTrampoline <Integer>
  	the various trampolines (system-call-like jumps from machine code to the run-time).
  	See Cogit>>generateTrampolines for the mapping from trampoline to run-time
  	routine and then read the run-time routine for a funcitonal description.
  ceEnterCogCodePopReceiverReg <Integer>
  	the enilopmart (jump from run-time to machine-code)
  methodZoneBase <Integer>
  !
  Cogit class
  	instanceVariableNames: 'generatorTable primitiveTable'!

Item was changed:
  ----- Method: Cogit class>>initializeErrorCodes (in category 'class initialization') -----
  initializeErrorCodes
- 	self flag: 'these should be positive quantities and the check for error code should be a comparison against minCogMethodAddress/methodZoneBase'.
  	NotFullyInitialized := -1.
  	InsufficientCodeSpace := -2.
  	MethodTooBig := -4.
  	YoungSelectorInPIC := -5.
- 	MaxUnreportableError := YoungSelectorInPIC.
  	EncounteredUnknownBytecode := -6.
  	MaxNegativeErrorCode := EncounteredUnknownBytecode!

Item was changed:
  ----- Method: Cogit>>ceSICMiss: (in category 'in-line cacheing') -----
  ceSICMiss: receiver
  	"An in-line cache check in a method has failed.  The failing entry check has jumped
  	 to the ceMethodAbort abort call at the start of the method which has called this routine.
  	 If possible allocate a closed PIC for the current and existing classes.
  	 The stack looks like:
  			receiver
  			args
  			sender return address
  	  sp=>	ceMethodAbort call return address
  	 So we can find the method that did the failing entry check at
  		ceMethodAbort call return address - missOffset
  	 and we can find the send site from the outer return address."
  	<api>
  	| pic innerReturn outerReturn entryPoint targetMethod newTargetMethodOrNil errorSelectorOrNil cacheTag extent result |
  	<var: #pic type: #'CogMethod *'>
  	<var: #targetMethod type: #'CogMethod *'>
  	"Whether we can relink to a PIC or not we need to pop off the inner return and identify the target method."
  	innerReturn := coInterpreter popStack.
  	targetMethod := self cCoerceSimple: innerReturn - missOffset to: #'CogMethod *'.
  	outerReturn := coInterpreter stackTop.
  	self assert: (outerReturn between: methodZoneBase and: methodZone freeStart).
  	entryPoint := backEnd callTargetFromReturnAddress: outerReturn.
  
  	self assert: targetMethod selector ~= objectMemory nilObject.
  	self cppIf: NewspeakVM ifTrue:
  		[self assert: (targetMethod asInteger + cmEntryOffset = entryPoint
  					or: [targetMethod asInteger + cmDynSuperEntryOffset = entryPoint]).
  		 "Avoid the effort of implementing PICs for the relatively low dynamic frequency
  		  dynamic super send and simply rebind the send site."
  		 targetMethod asInteger + cmDynSuperEntryOffset = entryPoint ifTrue:
  			[^coInterpreter
  				ceDynamicSuperSend: targetMethod selector
  				to: receiver
  				numArgs: targetMethod cmNumArgs]].
  	self assert: targetMethod asInteger + cmEntryOffset = entryPoint.
  
  	self lookup: targetMethod selector
  		for: receiver
  		methodAndErrorSelectorInto:
  			[:method :errsel|
  			newTargetMethodOrNil := method.
  			errorSelectorOrNil := errsel].
  	"We assume lookupAndCog:for: will *not* reclaim the method zone"
  	self assert: outerReturn = coInterpreter stackTop.
  	cacheTag := objectRepresentation inlineCacheTagForInstance: receiver.
  	((errorSelectorOrNil notNil and: [errorSelectorOrNil ~= SelectorDoesNotUnderstand])
  	 or: [(objectRepresentation inlineCacheTagIsYoung: cacheTag)
  	 or: [newTargetMethodOrNil isNil
  	 or: [objectMemory isYoung: newTargetMethodOrNil]]]) ifTrue:
  		[result := self patchToOpenPICFor: targetMethod selector
  					numArgs: targetMethod cmNumArgs
  					receiver: receiver.
  		 self assert: result not. "If patchToOpenPICFor:.. returns we're out of code memory"
  		 ^coInterpreter ceSendFromInLineCacheMiss: targetMethod].
  	"See if an Open PIC is already available."
  	pic := methodZone openPICWithSelector: targetMethod selector.
  	pic isNil ifTrue:
  		["otherwise attempt to create a closed PIC for the two cases."
  		 pic := self cogPICSelector: targetMethod selector
  					numArgs: targetMethod cmNumArgs
  					Case0Method: targetMethod
  					Case1Method: newTargetMethodOrNil
  					tag: cacheTag
  					isMNUCase: errorSelectorOrNil = SelectorDoesNotUnderstand.
+ 		 pic asInteger < 0 ifTrue:
+ 			["For some reason the PIC couldn't be generated, most likely a lack of code memory.
+ 			  Continue as if this is an unlinked send."
- 		 (errorSelectorOrNil notNil
- 		  or: [pic asInteger between: MaxNegativeErrorCode and: -1]) ifTrue:
- 			["If for some reason the PIC couldn't be generated, most likely a lack of code memory.
- 			  Continue as if this is an unlinked send.  If this is an error case continue as if this is an
- 			  unlinked send to invoke the appropriate error behavior (MNU, cannot interpret et al)."
  			 pic asInteger = InsufficientCodeSpace ifTrue:
  				[coInterpreter callForCogCompiledCodeCompaction].
+ 			^coInterpreter ceSendFromInLineCacheMiss: targetMethod].
+ 		 processor flushICacheFrom: pic asInteger to: pic asInteger + closedPICSize].
+ 	"Relink the send site to the pic."
- 			^coInterpreter ceSendFromInLineCacheMiss: targetMethod]].
  	extent := backEnd
  				rewriteCallAt: outerReturn
  				target: pic asInteger + cmEntryOffset.
+ 	processor flushICacheFrom: outerReturn - 1 - extent to: outerReturn - 1.
- 	processor
- 		flushICacheFrom: outerReturn - 1 - extent to: outerReturn - 1;
- 		flushICacheFrom: pic asInteger to: pic asInteger + closedPICSize.
  	"Jump back into the pic at its entry in case this is an MNU (newTargetMethodOrNil is nil)"
  	coInterpreter
  		executeCogMethodFromLinkedSend: pic
  		withReceiver: receiver
  		andCacheTag: (backEnd inlineCacheTagAt: outerReturn).
  	"NOTREACHED"
  	^nil!

Item was changed:
  ----- Method: Cogit>>cog:selector: (in category 'jit - api') -----
  cog: aMethodObj selector: aSelectorOop
  	"Attempt to produce a machine code method for the bytecode method
  	 object aMethodObj.  N.B. If there is no code memory available do *NOT*
  	 attempt to reclaim the method zone.  Certain clients (e.g. ceSICMiss:)
  	 depend on the zone remaining constant across method generation."
  	<api>
  	<returnTypeC: #'CogMethod *'>
  	| cogMethod |
  	<var: #cogMethod type: #'CogMethod *'>
  	self assert: ((coInterpreter methodHasCogMethod: aMethodObj) not
  				or: [(self noAssertMethodClassAssociationOf: aMethodObj) = objectMemory nilObject]).
  	"coInterpreter stringOf: aSelectorOop"
  	coInterpreter
  		compilationBreak: aSelectorOop
  		point: (objectMemory lengthOf: aSelectorOop).
  	aMethodObj = breakMethod ifTrue: [self halt: 'Compilation of breakMethod'].
  	self cppIf: NewspeakVM
  		ifTrue: [cogMethod := methodZone findPreviouslyCompiledVersionOf: aMethodObj with: aSelectorOop.
  				cogMethod ifNotNil:
  					[(coInterpreter methodHasCogMethod: aMethodObj) not ifTrue:
  						[self assert: (coInterpreter rawHeaderOf: aMethodObj) = cogMethod methodHeader.
  						 cogMethod methodObject: aMethodObj.
  						 coInterpreter rawHeaderOf: aMethodObj put: cogMethod asInteger].
  					^cogMethod]].
  	"If the generators for the alternate bytecode set are missing then interpret."
  	(coInterpreter methodUsesAlternateBytecodeSet: aMethodObj)
  		ifTrue:
  			[(self numElementsIn: generatorTable) <= 256 ifTrue:
  				[^nil].
  			 bytecodeSetOffset := 256]
  		ifFalse:
  			[bytecodeSetOffset := 0].
  	extA := extB := 0.
  	methodObj := aMethodObj.
  	cogMethod := self compileCogMethod: aSelectorOop.
  	(cogMethod asInteger between: MaxNegativeErrorCode and: -1) ifTrue:
+ 		[cogMethod asUnsignedInteger = InsufficientCodeSpace ifTrue:
+ 			[coInterpreter callForCogCompiledCodeCompaction].
+ 		"Right now no errors should be reported, so nothing more to do."
+ 		"self reportError: (self cCoerceSimple: cogMethod to: #sqInt)."
- 		[cogMethod asInteger >= MaxUnreportableError
- 			ifTrue:
- 				[cogMethod asInteger = InsufficientCodeSpace ifTrue:
- 					[coInterpreter callForCogCompiledCodeCompaction]]
- 			ifFalse:
- 				[self reportError: (self cCoerceSimple: cogMethod to: #sqInt)].
  		 ^nil].
  	"self cCode: ''
  		inSmalltalk:
  			[coInterpreter printCogMethod: cogMethod.
  			 ""coInterpreter symbolicMethod: aMethodObj.""
  			 self assertValidMethodMap: cogMethod."
  			 "self disassembleMethod: cogMethod."
  			 "printInstructions := clickConfirm := true""]."
  	^cogMethod!

Item was changed:
  ----- Method: InterpreterPlugin class>>pluginClassesUpTo: (in category 'translation') -----
  pluginClassesUpTo: aPluginClass
  	"Answer the classes to include for translation of aPluginClass, superclasses first, aPluginClass last."
  	| theClass classes |
  
  	classes := OrderedCollection new.
  	theClass := self.
  	[theClass == Object
+ 	 or: [theClass == VMClass]] whileFalse:
- 	 or: [theClass == InterpreterSimulationObject
- 	 or: [theClass == VMClass]]] whileFalse:
  		[classes addLast: theClass.
  		theClass := theClass superclass].
  	^classes reverse!

Item was removed:
- Object subclass: #InterpreterSimulationObject
- 	instanceVariableNames: ''
- 	classVariableNames: ''
- 	poolDictionaries: ''
- 	category: 'VMMaker-Plugins'!

Item was removed:
- ----- Method: InterpreterSimulationObject>>cCoerce:to: (in category 'simulation') -----
- cCoerce: value to: cTypeString
- 	"Here the Simulator has a chance to create properly typed flavors of CArray access."
- 
- 	value isCObjectAccessor ifTrue:
- 		[^ self getInterpreter cCoerce: value to: cTypeString].
- 	(value isMemberOf: CArray) ifTrue:
- 		[^ self getInterpreter cCoerce: value to: cTypeString].
- 	^ value!

Item was removed:
- ----- Method: InterpreterSimulationObject>>long32At: (in category 'memory access') -----
- long32At: byteAddress
- 	"Simulation support.  Answer the 32-bit word at byteAddress which must be 0 mod 4."
- 
- 	^self getInterpreter long32At: byteAddress!

Item was removed:
- ----- Method: InterpreterSimulationObject>>long32At:put: (in category 'memory access') -----
- long32At: byteAddress put: a32BitValue
- 	"Simulation support.  Store the 32-bit value at byteAddress which must be 0 mod 4."
- 
- 	^self getInterpreter long32At: byteAddress put: a32BitValue!

Item was removed:
- ----- Method: InterpreterSimulationObject>>oopForPointer: (in category 'memory access') -----
- oopForPointer: aPointer
- 	"Simulation support.  Pointers and oops are the same when simulating; answer aPointer."
- 
- 	^aPointer!

Item was removed:
- ----- Method: InterpreterSimulationObject>>pointerForOop: (in category 'memory access') -----
- pointerForOop: anOop
- 	"Simulation support.  Pointers and oops are the same when simulating; answer anOop."
- 
- 	^anOop!

Item was added:
+ ----- Method: SimpleStackBasedCogit class>>callingConvention (in category 'documentation') -----
+ callingConvention
+ 	"The Smalltalk-to-Smalltalk calling convention for SimpleStackBasedCogit is
+ 	 designed to be congruent with the interpreter and convenient for inline cacheing.
+ 	 For inline cacheing it is convenient if the receiver is in a register.
+ 
+ 	 Hence the calling convention is:
+ 	
+ 		On call ReceiverResultReg (edx on x86) contains the receiver, and the receiver
+ 		and arguments are all on the stack, receiver furthest from top-of-stack.
+ 	
+ 		If the number of arguments is 3 or greater then the argument count is passed in
+ 		SendNumArgsReg (this is for the linking run-time routine; it is ignored in linked sends).
+ 
+ 		On return result is in ReceiverResultReg.  The callee removes arguments from the stack.
+ 		The caller pushes the result if the result is used."!

Item was removed:
- ----- Method: SimpleStackBasedCogit>>genGetImplicitReceiverFor: (in category 'bytecode generators') -----
- genGetImplicitReceiverFor: selector
- 	"Cached implicit receiver implementation.  Caller looks like
- 		mov selector, ClassReg
- 				call ceImplicitReceiverTrampoline
- 				br continue
- 		Lclass	.word
- 		Lmixin:	.word
- 		continue:
- 	 If class matches class of receiver then mixin contains either 0 or the implicit receiver.
- 	 If 0, answer the actual receiver.  This is done in the trampoline.
- 	 See generateNewspeakRuntime.
- 
- 	 N.B. For PC mapping either this is used for SendAbsentImplicit or for PushAbsentReceiver
- 	 but not both.  So any Newspeak instruction set has to choose either SendAbsentImplicit
- 	 or PushAbsentReceiver.  See isPCMappedAnnotation:alternateInstructionSet:"
- 	| skip |
- 	<var: #skip type: #'AbstractInstruction *'>
- 	(objectMemory isYoung: selector) ifTrue:
- 		[hasYoungReferent := true].
- 	self assert: needsFrame.
- 	self MoveCw: selector R: SendNumArgsReg.
- 	self CallNewspeakSend: ceImplicitReceiverTrampoline.
- 	skip := self Jump: 0.
- 	self Fill32: 0.
- 	self Fill32: 0.
- 	skip jmpTarget: self Label.
- 	^0!

Item was added:
+ ----- Method: SimpleStackBasedCogit>>genGetImplicitReceiverFor:forPush: (in category 'bytecode generators') -----
+ genGetImplicitReceiverFor: selector forPush: forPushSendBar
+ 	"Cached implicit receiver implementation.  Caller looks like
+ 		mov selector, ClassReg
+ 				call ceImplicitReceiverTrampoline
+ 				br continue
+ 		Lclass	.word
+ 		Lmixin:	.word
+ 		continue:
+ 	 If class matches class of receiver then mixin contains either 0 or the implicit receiver.
+ 	 If 0, answer the actual receiver.  This is done in the trampoline.
+ 	 See generateNewspeakRuntime."
+ 
+ 	| skip |
+ 	<var: #skip type: #'AbstractInstruction *'>
+ 	"N.B. For PC mapping either this is used for SendAbsentImplicit or for PushAbsentReceiver
+ 	 but not both.  So any Newspeak instruction set has to choose either SendAbsentImplicit
+ 	 or PushImplicitReceiver.  See isPCMappedAnnotation:alternateInstructionSet:"
+ 	self assert: forPushSendBar = (self isPCMappedAnnotation: IsNSSendCall
+ 										alternateInstructionSet: bytecodeSetOffset > 0).
+ 	(objectMemory isYoung: selector) ifTrue:
+ 		[hasYoungReferent := true].
+ 	self assert: needsFrame.
+ 	self MoveCw: selector R: SendNumArgsReg.
+ 	self CallNewspeakSend: ceImplicitReceiverTrampoline.
+ 	skip := self Jump: 0.
+ 	self Fill32: 0.
+ 	self Fill32: 0.
+ 	skip jmpTarget: self Label.
+ 	^0!

Item was changed:
  ----- Method: SimpleStackBasedCogit>>genPushImplicitReceiverBytecode (in category 'bytecode generators') -----
  genPushImplicitReceiverBytecode
  	| result |
+ 	result := self genGetImplicitReceiverFor: (self getLiteral: byte1) forPush: true.
- 	result := self genGetImplicitReceiverFor: (self getLiteral: byte1).
  	result ~= 0 ifTrue:
  		[^result].
  	self PushR: ReceiverResultReg.
  	^0!

Item was changed:
  ----- Method: SimpleStackBasedCogit>>genSendAbsentImplicit:numArgs: (in category 'bytecode generators') -----
  genSendAbsentImplicit: selector numArgs: numArgs
  	"Get the implicit receiver and shuffle arguments if necessary.
  	 Then send."
  	<inline: false>
  	| result |
+ 	result := self genGetImplicitReceiverFor: selector forPush: false.
- 	result := self genGetImplicitReceiverFor: selector.
  	result ~= 0 ifTrue:
  		[^result].
  	numArgs = 0
  		ifTrue:
  			[self PushR: ReceiverResultReg]
  		ifFalse:
  			[self MoveMw: 0 r: SPReg R: TempReg.
  			self PushR: TempReg.
  			2 to: numArgs do:
  				[:index|
  				self MoveMw: index * BytesPerWord r: SPReg R: TempReg.
  				self MoveR: TempReg Mw: index - 1 * BytesPerWord r: SPReg].
  			"if we copied the code in genSend:numArgs: we could save an instruction.
  			But we care not; the smarts are in StackToRegisterMappingCogit et al"
  			self MoveR: ReceiverResultReg Mw: numArgs * BytesPerWord r: SPReg].
  	^self genSend: selector numArgs: numArgs!

Item was changed:
  ----- Method: StackToRegisterMappingCogit class>>callingConvention (in category 'documentation') -----
  callingConvention
+ 	"The Smalltalk-to-Smalltalk calling convention aims to trade simplicity of compilation against
+ 	 effectiveness of optimization.  Most Smalltalk methods, and certainly most performance-
+ 	 critical primitives have two or less arguments.  So arranging that the receiver and up to two
+ 	 args args are in registers arranges that performance-critical primitives can access their
+ 	 arguments in registers.  So if the argument count is <= numRegArgs nothing is passed on
+ 	 the stack and everything is passed in ReceiverResultReg, Arg0Reg et al.  Above numRegArgs
- 	"The calling convention aims to trade simplicity of compilation against effectiveness of optimization.
- 	 Most Smalltalk methods, and certainly most performance-critical primitives have two or less arguments.
- 	 So arranging that the receiver and up to two args args are in registers arranges that performance-critical
- 	 primitives can access their arguments in registers.  So if the argument count is <= numRegArgs nothing
- 	 is passed on the stack and everything is passed in ReceiverResultReg, Arg0Reg et al.  Above numRegArgs
  	 everything is passed on the stack.
  
+ 	 To save the CoInterpreter from change we shuffle the retpc and push the register args in
+ 	 the prolog so that the frame format is unchanged by register args.  Also, the trampolines for
+ 	 unlinked sends do the same, as does the code preceeding an interpreter primitive.  It turns
+ 	 out that this protocol is faster than always pushing arguments.  Comparing benchFib with the
+ 	 shuffling protocol against an always-push protocol on a 2.66 GHz Core i7 (MacBook Pro) , the
+ 	 shuffling protocol is 6.3% faster than the always push protocol.
- 	 To save the CoInterpreter from change we shuffle the retpc and push the register args in the prolog so
- 	 that the frame format is unchanged by register args.  Also, the trampolines for unlinked sends do the same,
- 	 as does the code preceeding an interpreter primitive.  It turns out that this protocol is faster than always
- 	 pushing arguments.  Comparing benchFib with the shuffling protocol against an always-push protocol on a
- 	 2.66 GHz Core i7 (MacBook Pro) , the shuffling protocol is 6.3% faster than the always push protocol.
  
+ 	 Not shuffling the stack and pushing register arguments after frame build is faster yet again,
+ 	 5.8% faster that the stack shuffle.  So it might be worth-while to change the CoInterpreter's
+ 	 frame management to allow numArgs <= numRegArgs frames to push receiver and arguments
+ 	 after saving the return pc.  This implies changes in stack-to-context mapping, GC,
+ 	 interpreter-to-machine code frame conversion and no doubt else where.
+ 
+ 	 Hence the calling convention is
+ 
+ 		- if the number of arguments is less than or equal to numRegArgs then the receiver and arguments
+ 		  are passed in registers.  numRegArgs is currently 1, but will become 2 once the code generator
+ 		  generates machine code primitives which take 2 arguments (i.e. once the object representation
+ 		  makes it feasible to implement at:put: in machine code numRegArgs will be raised to 2).  The receiver
+ 		  is passed in ReceiverResultReg, the first argument in Arg0Reg (esi on x86) and the second argument
+ 		  (if numRegArgs = 2) in Arg1Reg (edi on x86).
+ 
+ 		- if the number of arguments is greater than numRegArgs then the calling convention is as for
+ 		  SimpleStackBasedCogIt; ReceiverResultReg contains the receiver, and the receiver and arguments
+ 		  are all on the stack, receiver furthest from top-of-stack.  If the argument count is > 2 then argument
+ 		  count is passed in SendNumArgsReg (for the benefit of the run-time linking routines; it is ignored in
+ 		  linked sends).
+ 
+ 		On return the result is in ReceiverResultReg.  The callee removes arguments from the stack.
+ 
+ 		Note that if a machine code method contains a call to an interpreter primitive it will push any register
+ 		arguments on the stack before calling the primitive so that to the primitive the stack looks the same
+ 		as it does in the interpreter.
+ 
+ 		Within all machine code primitives except primitiveClosureValue all arguments are taken form registers
+ 	 	since no machine code primitiver has more than numRegArgs arguments.  primitiveClosureValue pushes
+ 		its register arguments immedately only for laziness to be able to reuse SimpleStackBasedCogit's code.
+ 
+ 		Within machine code methods with interpreter primtiives the register arguments are pushed before calling
+ 		the interpreter primitive.  In normal methods and if not already done so in [primitive code, the register
+ 		arguments are pushed during frame build.  If a method is compiled frameless it will access its arguments
+ 		 in registers."!
- 	 Not shuffling the stack and pushing register arguments after frame build is faster yet again, 5.8% faster
- 	 that the stack shuffle.  So it could be worth-while to change the CoInterpreter's frame management to
- 	 allow numArgs <= numRegArgs frames to push receiver and arguments after saving the return pc.  This
- 	 implies changes in stack-to-context mapping, GC, interpreter-to-machine code frame conversion and no
- 	 doubt else where."!

Item was removed:
- ----- Method: StackToRegisterMappingCogit>>genGetImplicitReceiverFor: (in category 'bytecode generators') -----
- genGetImplicitReceiverFor: selector
- 	"Cached implicit receiver implementation.  Caller looks like
- 		mov selector, ClassReg
- 				call cePushImplicitReceiver
- 				br continue
- 		Lclass	.word
- 		Lmixin:	.word
- 		continue:
- 	 If class matches class of receiver then mixin contains either 0 or the implicit receiver.
- 	 If 0, answer the actual receiver.  This is done in the trampoline.
- 	 See generateNewspeakRuntime."
- 	self ssAllocateCallReg: SendNumArgsReg and: ReceiverResultReg and: ClassReg and: Arg1Reg.
- 	^super genGetImplicitReceiverFor: selector!

Item was added:
+ ----- Method: StackToRegisterMappingCogit>>genGetImplicitReceiverFor:forPush: (in category 'bytecode generators') -----
+ genGetImplicitReceiverFor: selector forPush: forPushSendBar
+ 	"Cached implicit receiver implementation.  Caller looks like
+ 		mov selector, ClassReg
+ 				call cePushImplicitReceiver
+ 				br continue
+ 		Lclass	.word
+ 		Lmixin:	.word
+ 		continue:
+ 	 If class matches class of receiver then mixin contains either 0 or the implicit receiver.
+ 	 If 0, answer the actual receiver.  This is done in the trampoline.
+ 	 See generateNewspeakRuntime."
+ 	self ssAllocateCallReg: SendNumArgsReg and: ReceiverResultReg and: ClassReg and: Arg1Reg.
+ 	^super genGetImplicitReceiverFor: selector forPush: forPushSendBar!

Item was changed:
  ----- Method: StackToRegisterMappingCogit>>genPushImplicitReceiverBytecode (in category 'bytecode generators') -----
  genPushImplicitReceiverBytecode
  	| result |
+ 	result := self genGetImplicitReceiverFor: (self getLiteral: byte1) forPush: true.
- 	result := self genGetImplicitReceiverFor: (self getLiteral: byte1).
  	result ~= 0 ifTrue:
  		[^result].
  	^self ssPushRegister: ReceiverResultReg!

Item was changed:
  ----- Method: StackToRegisterMappingCogit>>genSendAbsentImplicit:numArgs: (in category 'bytecode generators') -----
  genSendAbsentImplicit: selector numArgs: numArgs
  	"Get the implicit receiver and marshall arguments, shuffling the
  	 stack to push the implicit receiver if necessary. Then send."
  	<inline: false>
  	| result |
+ 	"This must not be PC-mapped"
+ 	result := self genGetImplicitReceiverFor: selector forPush: false.
- 	result := self genGetImplicitReceiverFor: selector.
  	result ~= 0 ifTrue:
  		[^result].
  	self marshallImplicitReceiverSendArguments: numArgs.
  	^self genMarshalledSend: selector numArgs: numArgs!



More information about the Vm-dev mailing list