[Vm-dev] VM Maker: VMMaker.oscog-cb.1019.mcz

commits at source.squeak.org
Thu Jan 15 16:16:06 UTC 2015


ClementBera uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-cb.1019.mcz

==================== Summary ====================

Name: VMMaker.oscog-cb.1019
Author: cb
Time: 15 January 2015, 5:14:07.437 pm
UUID: 9f3b07dd-ec85-455f-b8a4-2dc0f27868ed
Ancestors: VMMaker.oscog-cb.1018

added super calls in conditional jump generation so as not to generate counters for optimized methods.

=============== Diff against VMMaker.oscog-cb.1018 ===============

Item was changed:
  ----- Method: SistaStackToRegisterMappingCogit>>genJumpIf:to: (in category 'bytecode generators') -----
  genJumpIf: boolean to: targetBytecodePC
  	"The heart of performance counting in Sista.  Conditional branches are 6 times less
  	 frequent than sends and can provide basic block frequencies (send counters can't).
  	 Each conditional has a 32-bit counter split into an upper 16 bits counting executions
  	 and a lower half counting untaken executions of the branch.  Executing the branch
  	 decrements the upper half, tripping if the count goes negative.  Not taking the branch
  	 decrements the lower half.  N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
  	 so that scanning for send and branch data is simplified and that branch data is correct."
  	<inline: false>
  	| desc ok counterAddress countTripped retry |
  	<var: #ok type: #'AbstractInstruction *'>
  	<var: #desc type: #'CogSimStackEntry *'>
  	<var: #retry type: #'AbstractInstruction *'>
  	<var: #countTripped type: #'AbstractInstruction *'>
+ 
+ 	(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
+ 
  	self ssFlushTo: simStackPtr - 1.
  	desc := self ssTop.
  	self ssPop: 1.
  	desc popToReg: TempReg.
  
  	self ssAllocateRequiredReg: SendNumArgsReg. "Use this as the count reg."
  	counterAddress := counters + ((self sizeof: #sqInt) * counterIndex).
  	counterIndex := counterIndex + 1.
  	self flag: 'will need to use MoveAw32:R: if 64 bits'.
  	self assert: objectMemory wordSize = CounterBytes.
  	retry := self MoveAw: counterAddress R: SendNumArgsReg.
  	self SubCq: 16r10000 R: SendNumArgsReg. "Count executed"
  	"Don't write back if we trip; avoids wrapping count back to initial value, and if we trip we don't execute."
  	countTripped := self JumpCarry: 0.
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  
  	"Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  	 Correct result is either 0 or the distance between them.  If result is not 0 or
  	 their distance send mustBeBoolean."
  	self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  	self annotate: (self SubCw: boolean R: TempReg) objRef: boolean.
  	self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
  
  	self SubCq: 1 R: SendNumArgsReg. "Count untaken"
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  
  	self CmpCq: (boolean == objectMemory falseObject
  					ifTrue: [objectMemory trueObject - objectMemory falseObject]
  					ifFalse: [objectMemory falseObject - objectMemory trueObject])
  		R: TempReg.
  	ok := self JumpZero: 0.
  	self MoveCq: 0 R: SendNumArgsReg. "if SendNumArgsReg is 0 this is a mustBeBoolean, not a counter trip."
  	countTripped jmpTarget:
  		(self CallRT: (boolean == objectMemory falseObject
  						ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
  						ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])).
  	"If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
  	 trampoline will return directly to machine code, returning the boolean.  So the code should
  	 jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
  	self annotateBytecode: self Label.
  	self Jump: retry.
  	ok jmpTarget: self Label.
  	^0!
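
A note on the counter format described in the comment above (an illustrative sketch, not part of the diff): each Sista branch counter is a single 32-bit word whose upper 16 bits count executions down towards a trip and whose lower 16 bits count untaken executions, which is why the generated code subtracts 16r10000 per execution and 1 per untaken branch. Reading the two halves of an arbitrary example value:

	| counter upperHalf lowerHalf |
	counter := 16r00FA3FF3.	"arbitrary illustrative counter word"
	upperHalf := counter >> 16.	"execution half, decremented by SubCq: 16r10000 R: SendNumArgsReg"
	lowerHalf := counter bitAnd: 16rFFFF.	"untaken half, decremented by SubCq: 1 R: SendNumArgsReg"

Skipping the write-back on a trip matters because the 32-bit subtraction wraps: with the upper half already at zero, 16r00003FF3 - 16r10000 is 16rFFFF3FF3 modulo 2^32, which would look like an almost full execution budget again. The "cunning trick by LPD" relies on the same plain subtraction: given hypothetical contiguous oops falseObject = 16rB00 and trueObject = 16rB10, a receiver of 16rB00 leaves 0 in TempReg after the SubCw: (jump taken), a receiver of 16rB10 leaves 16r10 = trueObject - falseObject (recognized as the other boolean, fall through), and any other value matches neither and falls into the mustBeBoolean trampoline.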

Item was changed:
  ----- Method: SistaStackToRegisterMappingCogit>>genSpecialSelectorComparison (in category 'bytecode generators') -----
  genSpecialSelectorComparison
  	"Override to count inlined branches if followed by a conditional branch.
  	 We borrow the following conditional branch's counter and when about to
  	 inline the comparison we decrement the counter (without writing it back)
  	 and if it trips simply abort the inlining, falling back to the normal send which
  	 will then continue to the conditional branch which will trip and enter the abort."
  	| nextPC postBranchPC targetBytecodePC primDescriptor branchDescriptor nExts
  	  rcvrIsInt argIsInt rcvrInt argInt result jumpNotSmallInts inlineCAB annotateInst
  	  counterAddress countTripped |
  	<var: #countTripped type: #'AbstractInstruction *'>
  	<var: #primDescriptor type: #'BytecodeDescriptor *'>
  	<var: #jumpNotSmallInts type: #'AbstractInstruction *'>
  	<var: #branchDescriptor type: #'BytecodeDescriptor *'>
+ 
+ 	(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genSpecialSelectorComparison ].
+ 
  	self ssFlushTo: simStackPtr - 2.
  	primDescriptor := self generatorAt: byte0.
  	argIsInt := self ssTop type = SSConstant
  				 and: [objectMemory isIntegerObject: (argInt := self ssTop constant)].
  	rcvrIsInt := (self ssValue: 1) type = SSConstant
  				 and: [objectMemory isIntegerObject: (rcvrInt := (self ssValue: 1) constant)].
  
  	"short-cut the jump if operands are SmallInteger constants."
  	(argIsInt and: [rcvrIsInt]) ifTrue:
  		[self cCode: '' inSmalltalk: "In Simulator ints are unsigned..."
  				[rcvrInt := objectMemory integerValueOf: rcvrInt.
  				argInt := objectMemory integerValueOf: argInt].
  		 primDescriptor opcode caseOf: {
  			[JumpLess]				-> [result := rcvrInt < argInt].
  			[JumpLessOrEqual]		-> [result := rcvrInt <= argInt].
  			[JumpGreater]			-> [result := rcvrInt > argInt].
  			[JumpGreaterOrEqual]	-> [result := rcvrInt >= argInt].
  			[JumpZero]				-> [result := rcvrInt = argInt].
  			[JumpNonZero]			-> [result := rcvrInt ~= argInt] }.
  		 "Must enter any annotatedConstants into the map"
  		 self annotateBytecodeIfAnnotated: (self ssValue: 1).
  		 self annotateBytecodeIfAnnotated: self ssTop.
  		 "Must annotate the bytecode for correct pc mapping."
  		 self ssPop: 2.
  		 ^self ssPushAnnotatedConstant: (result
  											ifTrue: [objectMemory trueObject]
  											ifFalse: [objectMemory falseObject])].
  
  	nextPC := bytecodePC + primDescriptor numBytes.
  	nExts := 0.
  	[branchDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + (byte0 bitAnd: 256).
  	 branchDescriptor isExtension] whileTrue:
  		[nExts := nExts + 1.
  		 nextPC := nextPC + branchDescriptor numBytes].
  	"Only interested in inlining if followed by a conditional branch."
  	inlineCAB := branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse].
  	"Further, only interested in inlining = and ~= if there's a SmallInteger constant involved.
  	 The relational operators successfully statically predict SmallIntegers; the equality operators do not."
  	(inlineCAB and: [primDescriptor opcode = JumpZero or: [primDescriptor opcode = JumpNonZero]]) ifTrue:
  		[inlineCAB := argIsInt or: [rcvrIsInt]].
  	inlineCAB ifFalse:
  		[^self genSpecialSelectorSend].
  
  	targetBytecodePC := nextPC
  							+ branchDescriptor numBytes
  							+ (self spanFor: branchDescriptor at: nextPC exts: nExts in: methodObj).
  	postBranchPC := nextPC + branchDescriptor numBytes.
  	argIsInt
  		ifTrue:
  			[(self ssValue: 1) popToReg: ReceiverResultReg.
  			 annotateInst := self ssTop annotateUse.
  			 self ssPop: 2.
  			 self MoveR: ReceiverResultReg R: TempReg]
  		ifFalse:
  			[self marshallSendArguments: 1.
  			 self MoveR: Arg0Reg R: TempReg.
  			 rcvrIsInt ifFalse:
  				[objectRepresentation isSmallIntegerTagNonZero
  					ifTrue: [self AndR: ReceiverResultReg R: TempReg]
  					ifFalse: [self OrR: ReceiverResultReg R: TempReg]]].
  	jumpNotSmallInts := objectRepresentation genJumpNotSmallIntegerInScratchReg: TempReg.
  
  	self ssAllocateRequiredReg: SendNumArgsReg. "Use this as the count reg."
  	counterAddress := counters + ((self sizeof: #sqInt) * counterIndex).
  	self flag: 'will need to use MoveAw32:R: if 64 bits'.
  	self assert: objectMemory wordSize = CounterBytes.
  	self MoveAw: counterAddress R: SendNumArgsReg.
  	self SubCq: 16r10000 R: SendNumArgsReg. "Count executed"
  	"If counter trips simply abort the inlined comparison and send continuing to the following
  	 branch *without* writing back.  A double decrement will not trip the second time."
  	countTripped := self JumpCarry: 0.
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  
  	argIsInt
  		ifTrue: [annotateInst
  					ifTrue: [self annotateBytecode: (self CmpCq: argInt R: ReceiverResultReg)]
  					ifFalse: [self CmpCq: argInt R: ReceiverResultReg]]
  		ifFalse: [self CmpR: Arg0Reg R: ReceiverResultReg].
  	"Cmp is weird/backwards so invert the comparison.  Further since there is a following conditional
  	 jump bytecode define non-merge fixups and leave the cond bytecode to set the mergeness."
  	self gen: (branchDescriptor isBranchTrue
  				ifTrue: [primDescriptor opcode]
  				ifFalse: [self inverseBranchFor: primDescriptor opcode])
  		operand: (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
  	self SubCq: 1 R: SendNumArgsReg. "Count untaken"
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  	self Jump: (self ensureNonMergeFixupAt: postBranchPC - initialPC).
  	countTripped jmpTarget: (jumpNotSmallInts jmpTarget: self Label).
  	argIsInt ifTrue:
  		[self MoveCq: argInt R: Arg0Reg].
  	^self genMarshalledSend: (coInterpreter specialSelector: byte0 - self firstSpecialSelectorBytecodeOffset)
  		numArgs: 1!
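
An illustrative note on the trip detection (not part of the diff): the counter trip is observed through the processor's carry/borrow flag. Once the execution half (the upper 16 bits) is exhausted, SubCq: 16r10000 borrows, JumpCarry: is taken, the decremented value is *not* written back, and the inlined comparison is abandoned in favor of the ordinary send; the following conditional branch then repeats the same decrement on the untouched counter, trips there, and enters the abort path, as the method comment above describes. A minimal sketch of the borrow condition, assuming 32-bit counter words:

	| counter tripped |
	counter := 16r00002A51.	"arbitrary value whose upper half is already zero"
	tripped := counter - 16r10000 < 0.	"a borrow on the 32-bit subtract corresponds to the carry flag"
	tripped	"=> true: abort the inlining, fall back to the normal send"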

Item was changed:
  ----- Method: SistaStackToRegisterMappingCogit>>genSpecialSelectorEqualsEquals (in category 'bytecode generators') -----
  genSpecialSelectorEqualsEquals
  	"Override to count inlined branches if followed by a conditional branch.
  	 We borrow the following conditional branch's counter and when about to
  	 inline the comparison we decrement the counter (without writing it back)
  	 and if it trips simply abort the inlining, falling back to the normal send which
  	 will then continue to the conditional branch which will trip and enter the abort."
  	| nextPC postBranchPC targetBytecodePC primDescriptor branchDescriptor nExts
  	  counterAddress countTripped unforwardArg unforwardRcvr |
  	<var: #countTripped type: #'AbstractInstruction *'>
  	<var: #primDescriptor type: #'BytecodeDescriptor *'>
  	<var: #branchDescriptor type: #'BytecodeDescriptor *'>
+ 
+ 	(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genSpecialSelectorEqualsEquals ].
+ 
  	self ssFlushTo: simStackPtr - 2.
  	primDescriptor := self generatorAt: byte0.
  
  	nextPC := bytecodePC + primDescriptor numBytes.
  	nExts := 0.
  	[branchDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + (byte0 bitAnd: 256).
  	 branchDescriptor isExtension] whileTrue:
  		[nExts := nExts + 1.
  		 nextPC := nextPC + branchDescriptor numBytes].
  	"Only interested in inlining if followed by a conditional branch."
  	(branchDescriptor isBranchTrue or: [branchDescriptor isBranchFalse]) ifFalse:
  		[^self genSpecialSelectorSend].
  
  	targetBytecodePC := nextPC
  							+ branchDescriptor numBytes
  							+ (self spanFor: branchDescriptor at: nextPC exts: nExts in: methodObj).
  	postBranchPC := nextPC + branchDescriptor numBytes.
  	unforwardRcvr := (self ssValue: 1) type ~= SSConstant
  						or: [objectRepresentation shouldAnnotateObjectReference: (self ssValue: 1) constant].
  	unforwardArg := self ssTop type ~= SSConstant
  						or: [objectRepresentation shouldAnnotateObjectReference: self ssTop constant].
  	self marshallSendArguments: 1.
  
  	self ssAllocateRequiredReg: SendNumArgsReg. "Use this as the count reg."
  	counterAddress := counters + ((self sizeof: #sqInt) * counterIndex).
  	self flag: 'will need to use MoveAw32:R: if 64 bits'.
  	self assert: objectMemory wordSize = CounterBytes.
  	self MoveAw: counterAddress R: SendNumArgsReg.
  	self SubCq: 16r10000 R: SendNumArgsReg. "Count executed"
  	"If counter trips simply abort the inlined comparison and send continuing to the following
  	 branch *without* writing back.  A double decrement will not trip the second time."
  	countTripped := self JumpCarry: 0.
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  	unforwardRcvr ifTrue:
  		[objectRepresentation genEnsureOopInRegNotForwarded: ReceiverResultReg scratchReg: TempReg].
  	unforwardArg ifTrue:
  		[objectRepresentation genEnsureOopInRegNotForwarded: Arg0Reg scratchReg: TempReg].
  	self CmpR: Arg0Reg R: ReceiverResultReg.
  	"Cmp is weird/backwards so invert the comparison.  Further since there is a following conditional
  	 jump bytecode define non-merge fixups and leave the cond bytecode to set the mergeness."
  	self gen: (branchDescriptor isBranchTrue ifTrue: [JumpZero] ifFalse: [JumpNonZero])
  		operand: (self ensureNonMergeFixupAt: targetBytecodePC - initialPC) asUnsignedInteger.
  	self SubCq: 1 R: SendNumArgsReg. "Count untaken"
  	self MoveR: SendNumArgsReg Aw: counterAddress. "write back"
  	self Jump: (self ensureNonMergeFixupAt: postBranchPC - initialPC).
  	countTripped jmpTarget: self Label.
  	^self genMarshalledSend: (coInterpreter specialSelector: byte0 - self firstSpecialSelectorBytecodeOffset)
  		numArgs: 1!

Item was changed:
  InterpreterPrimitives subclass: #StackInterpreter
  	instanceVariableNames: 'currentBytecode bytecodeSetSelector localFP localIP localSP stackLimit stackPage stackPages method instructionPointer stackPointer framePointer localReturnValue extA extB primitiveFunctionPointer methodCache atCache lkupClassTag lkupClass methodDictLinearSearchLimit highestRunnableProcessPriority nextWakeupUsecs nextPollUsecs inIOProcessEvents interruptKeycode interruptPending savedWindowSize imageHeaderFlags fullScreenFlag deferDisplayUpdates pendingFinalizationSignals extraVMMemory interpreterProxy showSurfaceFn primitiveTable primitiveAccessorDepthTable externalPrimitiveTable externalPrimitiveTableFirstFreeIndex overflowedPage extraFramesToMoveOnOverflow globalSessionID jmpBuf jmpDepth suspendedCallbacks suspendedMethods numStackPages desiredNumStackPages desiredEdenBytes classNameIndex thisClassIndex metaclassNumSlots interruptCheckChain suppressHeartbeatFlag breakSelector breakSelectorLength longRunningPrimitiveCheckMethod longRunningPrimitiveCheckSemaphore longRunningPrimitiveStartUsecs longRunningPrimitiveStopUsecs longRunningPrimitiveGCUsecs longRunningPrimitiveCheckSequenceNumber longRunningPrimitiveSignalUndelivered checkAllocFiller tempOop tempOop2 theUnknownShort the2ndUnknownShort imageFloatsBigEndian maxExtSemTabSizeSet lastMethodCacheProbeWrite gcSemaphoreIndex classByteArrayCompactIndex statForceInterruptCheck statStackOverflow statStackPageDivorce statCheckForEvents statProcessSwitch statIOProcessEvents statPendingFinalizationSignals'
+ 	classVariableNames: 'AltBytecodeEncoderClassName AltLongStoreBytecode AlternateHeaderHasPrimFlag AtCacheEntries AtCacheFixedFields AtCacheFmt AtCacheMask AtCacheOop AtCacheSize AtCacheTotalSize AtPutBase BytecodeEncoderClassName BytecodeTable CacheProbeMax DirBadPath DirEntryFound DirNoMoreEntries DumpStackOnLowSpace EnclosingMixinIndex EnclosingObjectIndex FailImbalancedPrimitives HeaderFlagBitPosition LongStoreBytecode MaxExternalPrimitiveTableSize MaxJumpBuf MaxPrimitiveIndex MaxQuickPrimitiveIndex MethodHeaderArgCountShift MethodHeaderOptimizedBitShift MethodHeaderTempCountShift MixinIndex PrimitiveExternalCallIndex PrimitiveTable StackPageReachedButUntraced StackPageTraceInvalid StackPageTraced StackPageUnreached V3PrimitiveBitsMask'
- 	classVariableNames: 'AltBytecodeEncoderClassName AltLongStoreBytecode AlternateHeaderHasPrimFlag AtCacheEntries AtCacheFixedFields AtCacheFmt AtCacheMask AtCacheOop AtCacheSize AtCacheTotalSize AtPutBase BytecodeEncoderClassName BytecodeTable CacheProbeMax DirBadPath DirEntryFound DirNoMoreEntries DumpStackOnLowSpace EnclosingMixinIndex EnclosingObjectIndex FailImbalancedPrimitives HeaderFlagBitPosition LongStoreBytecode MaxExternalPrimitiveTableSize MaxJumpBuf MaxPrimitiveIndex MaxQuickPrimitiveIndex MethodHeaderArgCountShift MethodHeaderTempCountShift MixinIndex PrimitiveExternalCallIndex PrimitiveTable StackPageReachedButUntraced StackPageTraceInvalid StackPageTraced StackPageUnreached V3PrimitiveBitsMask'
  	poolDictionaries: 'VMBasicConstants VMMethodCacheConstants VMObjectIndices VMSpurObjectRepresentationConstants VMSqueakClassIndices VMSqueakV3BytecodeConstants VMStackFrameOffsets'
  	category: 'VMMaker-Interpreter'!
  
  !StackInterpreter commentStamp: 'eem 12/5/2014 11:32' prior: 0!
  This class is a complete implementation of the Smalltalk-80 virtual machine, derived originally from the Blue Book specification but quite different in some areas.  This VM supports Closures but *not* old-style BlockContexts.
  
  It has been modernized with 32-bit pointers, better management of Contexts (see next item), and attention to variable use that allows the CCodeGenerator (qv) to keep, eg, the instruction pointer and stack pointer in registers as well as keeping most simple variables in a global array that seems to improve performance for most platforms.
  
  The VM does not use Contexts directly.  Instead Contexts serve as proxies for a more conventional stack format that is invisible to the image.  There is considerable explanation at http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-big-frame-up.  The VM maintains a fixed-size stack zone divided into pages, each page being capable of holding several method/block activations.  A send establishes a new frame in the current stack page, a return returns to the previous frame.  This eliminates allocation/deallocation of contexts and the moving of receiver and arguments from caller to callee on each send/return.  Contexts are created lazily when an activation needs a context (creating a block, explicit use of thisContext, access to sender when sender is a frame, or linking of stack pages together).  Contexts are either conventional and heap-resident ("single") or "married" and serve as proxies for their corresponding frame or "widowed", meaning that their spouse frame has been returned from (died).  A married context is specially marked (more details in the code) and refers to its frame.  Likewise a married frame is specially marked and refers to its context.
  
  In addition to SmallInteger arithmetic and Floats, the VM supports logic on 32-bit PositiveLargeIntegers, thus allowing it to simulate itself much more effectively than would otherwise be the case.
  
  StackInterpreter and subclasses support multiple memory managers.  Currently there are two.  NewMemoryManager is a slightly refined version of ObjectMemory, and is the memory manager and garbage collector for the original Squeak object representation as described in "Back to the Future The Story of Squeak, A Practical Smalltalk Written in Itself", see http://ftp.squeak.org/docs/OOPSLA.Squeak.html.  Spur is a faster, more regular object representation that is designed for more performance and functionality, and to have a common header format for both 32-bit and 64-bit versions.  You can read about it in SpurMemoryManager's class comment.  There is also a video of a presentation at ESUG 2014 (https://www.youtube.com/watch?v=k0nBNS1aHZ4), along with slides (http://www.slideshare.net/esug/spur-a-new-object-representation-for-cog?related=1).!

Item was changed:
  ----- Method: StackInterpreter class>>initializeMethodIndices (in category 'initialization') -----
  initializeMethodIndices
  	| tagBits |
  	"Class CompiledMethod"
  	HeaderIndex := 0.
  	LiteralStart := 1.
  
  	tagBits := self objectMemoryClass numSmallIntegerTagBits.
  	LargeContextBit := 16r20000 << tagBits.  "This bit set in method headers if large context is needed."
+ 	MethodHeaderOptimizedBitShift := 28 + tagBits.
  	MethodHeaderTempCountShift := 18 + tagBits.
  	MethodHeaderArgCountShift := 24 + tagBits.
  	V3PrimitiveBitsMask := 16r100001FF << tagBits.
  	AlternateHeaderHasPrimFlag := 16r10000 << tagBits.
  	"The position of the unused flag bit in the method header, not including tag bit(s)"
  	HeaderFlagBitPosition := 29!

Item was added:
+ ----- Method: StackInterpreter>>isOptimizedMethod: (in category 'compiled methods') -----
+ isOptimizedMethod: header
+ 	<api>
+ 	^ (header >> MethodHeaderOptimizedBitShift bitAnd: 1) = 1 !
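
An illustrative check of the new header bit (not part of the diff): assuming, purely for the example, one SmallInteger tag bit (the real value comes from objectMemoryClass numSmallIntegerTagBits), MethodHeaderOptimizedBitShift is 28 + 1 = 29, and a header word with that bit set answers true from isOptimizedMethod:.

	| tagBits shift header |
	tagBits := 1.	"assumed tag-bit count, illustration only"
	shift := 28 + tagBits.	"MethodHeaderOptimizedBitShift as initialized in initializeMethodIndices"
	header := 1 << shift.	"a header word with only the optimized bit set"
	((header >> shift) bitAnd: 1) = 1	"=> true, the test performed by isOptimizedMethod:"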


