[Vm-dev] VM Maker: VMMaker.oscog-eem.2029.mcz

commits at source.squeak.org
Wed Dec 7 23:04:33 UTC 2016


Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.2029.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.2029
Author: eem
Time: 7 December 2016, 3:03:48.468304 pm
UUID: dced5a65-92cd-44de-9d84-41bb46cb09a7
Ancestors: VMMaker.oscog-eem.2028

Fix occasional start-up crashes on linux x64.  The constant 16rBADA550 used in closed PIC prototype generation (to compute offsets in PICs) occasionally coincides with the code address chosen by address space randomization; this confuses the Cogit into generating pc-relative addressing to produce the constant, and hence causes mayhem later on when the wrong instruction sizes are used to decode closed PIC contents.  Also beef up the asserts that distinguish the constants planted in code on X64 for MoveCwR, PushCw and ArithCwR.
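
Why the collision is fatal: CogInLineLiteralsX64Compiler>>concretizeMoveCwR (below) emits a 7-byte pc-relative LEA when the constant lies inside the current compilation, and an 11-byte MOV imm64 plus a trailing nop otherwise, so a colliding placeholder changes the instruction length that literalBeforeFollowingAddress: assumes when decoding the PIC.  A workspace-style sketch of just that length choice (byte counts taken from the generator in the diff; the block is illustrative only, not VMMaker code):

	| moveCwRLengthFor |
	moveCwRLengthFor := [:constantIsInCurrentCompilation |
		constantIsInCurrentCompilation
			ifTrue: [7]		"rex, 16r8D (LoadEffectiveAddress), ModRM, disp32"
			ifFalse: [11]].	"rex, 16rB8+reg, imm64, 16r90 nop"
	{ moveCwRLengthFor value: false. moveCwRLengthFor value: true }	"#(11 7)"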

Implement the Cogit side of mustBeBoolean processing in the RegisterAllocatingCogit.
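
Condensed, the new Cogit side picks one of four trampolines according to the expected boolean and the byte size of the conditional branch, so the run-time knows how far to back the interpreter up.  The following is a behaviour-equivalent restatement of RegisterAllocatingCogit>>genCallMustBeBooleanFor: from the diff below, shown here only to make the selection explicit:

	genCallMustBeBooleanFor: boolean
		"Call the mustBeBoolean trampoline matching the branch's byte size (1 or 2) and its polarity."
		| shortBranch |
		shortBranch := (self generatorAt: byte0) numBytes = 1.
		^self CallRT:
			(boolean = objectMemory falseObject
				ifTrue: [shortBranch
							ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
							ifFalse: [ceSendMustBeBooleanAddFalseLongTrampoline]]
				ifFalse: [shortBranch
							ifTrue: [ceSendMustBeBooleanAddTrueTrampoline]
							ifFalse: [ceSendMustBeBooleanAddTrueLongTrampoline]])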

Add a variable, debugCallbackPath, to help debug the crashes in callbacks experienced in the Pharo VM.  The variable is a set of bit flags tracing the path taken through returnAs:ThroughCallback:Context:.
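
The flag bits follow the control flow of returnAs:ThroughCallback:Context: (see the instrumented method near the end of the diff).  A hypothetical workspace helper for reading a reported debugCallbackPath value; the bit meanings are read off that method, the helper itself is not part of the commit:

	| bitMeanings describe |
	bitMeanings := {
		1 -> 'invalid return type or dead callback context (answered false)'.
		2 -> 'dead callout context (answered false)'.
		4 -> 'passed the liveness checks'.
		8 -> 'single callback context marked dead'.
		16 -> 'callback context was married'.
		32 -> 'callback frame was the current frame'.
		64 -> 'freed the callback frame''s stack page'.
		128 -> 'returning to the callout frame on the same page'.
		256 -> 'divorced the callback frame'.
		512 -> 'callout context still married'.
		1024 -> 'rebuilt a base frame for the callout context'.
		2048 -> 'reached the final siglongjmp' }.
	describe := [:flags |
		(bitMeanings select: [:assoc | (flags bitAnd: assoc key) ~= 0])
			collect: [:assoc | assoc value]].
	describe value: 4 + 16 + 32 + 128	"the common same-page return path"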

Fix a compilation warning with the definition of sigsetjmp & siglongjmp in the interpreter.

=============== Diff against VMMaker.oscog-eem.2028 ===============

Item was removed:
- ----- Method: CoInterpreter>>ceSendMustBeBooleanInterpreting: (in category 'trampolines') -----
- ceSendMustBeBooleanInterpreting: anObject
- 	"For RegisterAllocatingCogit we want the address following a conditional branch not to be reachable, so we
- 	 don't have to generate code to reload registers.  Instead simply convert to an interpreter frame."
- 	<api>
- 	self shouldBeImplemented.
- 	instructionPointer := self popStack.
- 	self push: anObject.
- 	self push: instructionPointer.
- 	^self
- 		ceSendAbort: (objectMemory splObj: SelectorMustBeBoolean)
- 		to: anObject
- 		numArgs: 0!

Item was added:
+ ----- Method: CoInterpreter>>ceSendMustBeBooleanTo:interpretingAtDelta: (in category 'trampolines') -----
+ ceSendMustBeBooleanTo: anObject interpretingAtDelta: jumpSize
+ 	"For RegisterAllocatingCogit we want the address following a conditional branch not to be reachable, so we
+ 	 don't have to generate code to reload registers.  Instead simply convert to an interpreter frame."
+ 	<api>
+ 	self flag: 'A nice way to implement this is to
+ 		- provide an additional argument that is the size of the branch (have trampolines for 1 & 2 byte jumps).
+ 		  Note that there is no need to back up before any extensions since the branch will not be taken.
+ 			ceSendMustBeBooleanTo: anObject interpretingAtDelta: jumpSize
+ 		- map the frame to an interpreter frame
+ 		- push anObject
+ 		- back up the pc to position the interpreter at the jump
+ 		- enter the interpreter, hence reexecuting the mustBeBoolean'.
+ 	self shouldBeImplemented.
+ 	instructionPointer := self popStack.
+ 	self push: anObject.
+ 	self push: instructionPointer.
+ 	^self
+ 		ceSendAbort: (objectMemory splObj: SelectorMustBeBoolean)
+ 		to: anObject
+ 		numArgs: 0!

Item was changed:
  ----- Method: CogInLineLiteralsX64Compiler>>concretizeArithCwR: (in category 'generate machine code') -----
  concretizeArithCwR: x64opcode
  	| value reg reverse |
  	value := operands at: 0.
  	reg := operands at: 1.
  	reverse := x64opcode = 16r85 or: [x64opcode = 16r39]. "Tst & Cmp; backwards"
  	machineCode
  		at:  0 put: (self rexR: RISCTempReg x: 0 b: RISCTempReg);
  		at:  1 put: 16rB8 + (RISCTempReg bitAnd: 7);
  		at:  2 put: (value bitAnd: 16rFF);
  		at:  3 put: (value >> 8 bitAnd: 16rFF);
  		at:  4 put: (value >> 16 bitAnd: 16rFF);
  		at:  5 put: (value >> 24 bitAnd: 16rFF);
  		at:  6 put: (value >> 32 bitAnd: 16rFF);
  		at:  7 put: (value >> 40 bitAnd: 16rFF);
  		at:  8 put: (value >> 48 bitAnd: 16rFF);
  		at:  9 put: (value >> 56 bitAnd: 16rFF);
  		at: 10 put: (reverse
  					ifTrue: [self rexR: RISCTempReg x: 0 b: reg]
  					ifFalse: [self rexR: reg x: 0 b: RISCTempReg]);
  		at: 11 put: x64opcode;
  		at: 12 put: (reverse
  					ifTrue: [self mod: ModReg RM: reg RO: RISCTempReg]
  					ifFalse: [self mod: ModReg RM: RISCTempReg RO: reg]).
+ 	self assert: (machineCode at: 12) > 16r90. "See literalBeforeFollowingAddress:"
  	^machineCodeSize := 13!

Item was changed:
  ----- Method: CogInLineLiteralsX64Compiler>>concretizeMoveCwR (in category 'generate machine code') -----
  concretizeMoveCwR
  	"Will get inlined into concretizeAt: switch."
  	<inline: true>
  	| value reg offset |
  	value := operands at: 0.
  	reg := operands at: 1.
  	(self isAnInstruction: (cogit cCoerceSimple: value to: #'AbstractInstruction *')) ifTrue:
  		[value := (cogit cCoerceSimple: value to: #'AbstractInstruction *') address].
  	(cogit addressIsInCurrentCompilation: value) ifTrue:
  		[offset := value - (address + 7).
  		 machineCode
  			at: 0 put: (self rexR: reg x: 0 b: 0);
  			at: 1 put: 16r8D; "LoadEffectiveAddress"
  			at: 2 put: (self mod: ModRegInd RM: 5 RO: reg);
  			at: 3 put: (offset bitAnd: 16rFF);
  			at: 4 put: (offset >> 8 bitAnd: 16rFF);
  			at: 5 put: (offset >> 16 bitAnd: 16rFF);
  			at: 6 put: (offset >> 24 bitAnd: 16rFF).
  		^machineCodeSize := 7].
  	machineCode
  		at:  0 put: (self rexR: 0 x: 0 b: reg);
  		at:  1 put: 16rB8 + (reg bitAnd: 7);
  		at:  2 put: (value bitAnd: 16rFF);
  		at:  3 put: (value >> 8 bitAnd: 16rFF);
  		at:  4 put: (value >> 16 bitAnd: 16rFF);
  		at:  5 put: (value >> 24 bitAnd: 16rFF);
  		at:  6 put: (value >> 32 bitAnd: 16rFF);
  		at:  7 put: (value >> 40 bitAnd: 16rFF);
  		at:  8 put: (value >> 48 bitAnd: 16rFF);
  		at:  9 put: (value >> 56 bitAnd: 16rFF).
+ 	"Add a nop to disambiguate between MoveCwR/PushCw and ArithCwR, which ends with a (self mod: ModReg RM: 0 RO: 0)"
- 	opcode = MoveCqR ifTrue:
- 		[^machineCodeSize := 10].
- 	"Add a nop to disambiguate between MoveCwR/PushCwR and ArithCwR, which ends with a (self mod: ModReg RM: 0 RO: 0)"
  	machineCode at: 10 put: 16r90.
- 	self assert: (self mod: ModReg RM: 0 RO: 0) > 16r90.
  	^machineCodeSize := 11!

Item was changed:
  ----- Method: CogInLineLiteralsX64Compiler>>concretizePushCw (in category 'generate machine code') -----
  concretizePushCw
  	"Will get inlined into concretizeAt: switch."
  	<inline: true>
  	| value offset |
  	value := operands at: 0.
  	(self isAnInstruction: (cogit cCoerceSimple: value to: #'AbstractInstruction *')) ifTrue:
  		[value := (cogit cCoerceSimple: value to: #'AbstractInstruction *') address].
  	(cogit addressIsInCurrentCompilation: value) ifTrue:
  		[offset := value - (address + 7).
  		 machineCode
  			at: 0 put: (self rexR: RISCTempReg x: 0 b: 0);
  			at: 1 put: 16r8D; "LoadEffectiveAddress"
  			at: 2 put: (self mod: ModRegInd RM: 5 RO: RISCTempReg);
  			at: 3 put: (offset bitAnd: 16rFF);
  			at: 4 put: (offset >> 8 bitAnd: 16rFF);
  			at: 5 put: (offset >> 16 bitAnd: 16rFF);
  			at: 6 put: (offset >> 24 bitAnd: 16rFF);
  			at: 7 put: 16r41;
  			at: 8 put: 16r48 + RISCTempReg.
  		^machineCodeSize := 9].
  	machineCode
  		at:  0 put: (self rexR: RISCTempReg x: 0 b: RISCTempReg);
  		at:  1 put: 16rB8 + (RISCTempReg bitAnd: 7);
  		at:  2 put: (value bitAnd: 16rFF);
  		at:  3 put: (value >> 8 bitAnd: 16rFF);
  		at:  4 put: (value >> 16 bitAnd: 16rFF);
  		at:  5 put: (value >> 24 bitAnd: 16rFF);
  		at:  6 put: (value >> 32 bitAnd: 16rFF);
  		at:  7 put: (value >> 40 bitAnd: 16rFF);
  		at:  8 put: (value >> 48 bitAnd: 16rFF);
  		at:  9 put: (value >> 56 bitAnd: 16rFF);
  		at: 10 put: 16r41;
+ 		at: 11 put: 16r48 + RISCTempReg. "The 48 will disambiguate between MoveCwR, PushCw and ArithCwR, which ends with a (self mod: ModReg RM: 0 RO: 0)"
+ 	self assert: (machineCode at: 11) < 16r90. "see literalBeforeFollowingAddress:"
- 		at: 11 put: 16r48 + RISCTempReg. "The 48 will disambiguate between MoveCwR, PushCwR and ArithCwR, which ends with a (self mod: ModReg RM: 0 RO: 0)"
- 		self assert: RISCTempReg >= 8.
- 	self assert: (self mod: ModReg RM: 0 RO: 0) > 16r57.
  	^machineCodeSize := 12!

Item was changed:
  ----- Method: CogInLineLiteralsX64Compiler>>literalBeforeFollowingAddress: (in category 'inline cacheing') -----
  literalBeforeFollowingAddress: followingAddress
  	"Answer the literal embedded in the instruction immediately preceding followingAddress.
+ 	 This is used in the MoveCwR, PushCw and ArithCwR cases; these are distinguished by a
+ 	 nop following the literal load in MoveCwR, a 16r48 + reg ending the PushCw sequence, and
+ 	 a (self mod: ModReg RM: rX RO: rY) ending the ArithCwR sequence, which is at least 16rC0."
- 	 This is used in the MoveCwR, PushCwR and CmpCwR cases; these are distinguished by a
- 	 nop following the literal load in MoveCwR, a 16r50 + reg ending the PushCwR sequence, and
- 	 a (self mod: ModReg RM: rX RO: rY) ending the CmpCwR sequence, which is at least 16rC0."
  	| lastByte base |
  	lastByte := objectMemory byteAt: followingAddress - 1.
+ 	base := followingAddress - (lastByte = 16r90
+ 									ifTrue: [9]				"MoveCwR"
+ 									ifFalse:
+ 										[lastByte < 16r90
+ 											ifTrue: [10]		"PushCw"
+ 											ifFalse: [11]]).	"ArithCwR"
+ 	^objectMemory unalignedLongAt: base
+ 	
+ 	"(Symbol allSymbols select: [:s| '*Cw:R:' match: s]), {#PushCw:} collect: [:s| {s. (self systemNavigation allCallsOn: s localToPackage: #VMMaker) size}]"!
- 	base := followingAddress - (lastByte <= 16r90
- 									ifTrue:
- 										[lastByte = 16r90
- 											ifTrue: [9]		"MoveCwR"
- 											ifFalse: [10]]	"PushCwR"
- 									ifFalse: [11]).			"ArithCwR"
- 	^objectMemory unalignedLongAt: base!
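
The net effect of the discrimination above: the last byte of each sequence falls into one of three disjoint ranges, each implying a different distance back to the 64-bit literal.  A workspace-style check of that mapping (example bytes: 16r90 is MoveCwR's trailing nop, 16r50 stands in for 16r48 + RISCTempReg ending PushCw, 16rC3 stands in for a ModRM byte of at least 16rC0 ending ArithCwR; the block is illustrative only):

	| literalOffsetFor |
	literalOffsetFor := [:lastByte |
		lastByte = 16r90
			ifTrue: [9]				"MoveCwR"
			ifFalse: [lastByte < 16r90
						ifTrue: [10]		"PushCw"
						ifFalse: [11]]].	"ArithCwR"
	{ literalOffsetFor value: 16r90.
	  literalOffsetFor value: 16r50.
	  literalOffsetFor value: 16rC3 }	"#(9 10 11)"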

Item was changed:
  ----- Method: CogInLineLiteralsX64Compiler>>storeLiteral:beforeFollowingAddress: (in category 'inline cacheing') -----
  storeLiteral: literal beforeFollowingAddress: followingAddress
  	"Rewrite the literal in the instruction immediately preceding followingAddress.
+ 	 This is used in the MoveCwR, PushCw and CmpCwR cases; these are distinguished by a
+ 	 nop following the literal load in MoveCwR, a 16r50 + reg ending the PushCw sequence, and
- 	 This is used in the MoveCwR, PushCwR and CmpCwR cases; these are distinguished by a
- 	 nop following the literal load in MoveCwR, a 16r50 + reg ending the PushCwR sequence, and
  	 a (self mod: ModReg RM: rX RO: rY) ending the CmpCwR sequence, which is at least 16rC0."
  	| lastByte base |
  	lastByte := objectMemory byteAt: followingAddress - 1.
  	base := followingAddress - (lastByte <= 16r90
  									ifTrue:
  										[lastByte = 16r90
  											ifTrue: [9]		"MoveCwR"
+ 											ifFalse: [10]]	"PushCw"
- 											ifFalse: [10]]	"PushCwR"
  									ifFalse: [11]).			"ArithCwR"
  	objectMemory unalignedLongAt: base put: literal!

Item was changed:
  ----- Method: CogMIPSELCompiler>>relocateMethodReferenceBeforeAddress:by: (in category 'inline cacheing') -----
  relocateMethodReferenceBeforeAddress: pc by: delta
  	| oldValue newValue |
  	"cogit disassembleFrom: pc - 16 to: pc + 16 a StackToRegisterMappingCogit."
  	
  	((self opcodeAtAddress: pc - 8) = ADDIU and: [(self opcodeAtAddress: pc - 4) = SW]) ifTrue:
+ 		["PushCw"
- 		["PushCwR"
  		oldValue := self literalAtAddress: pc - 12.
  		newValue := oldValue + delta.
  		self literalAtAddress: pc - 12 put: newValue.	
  		self assert: (self literalAtAddress: pc - 12) = newValue.
  		^self].
  
  	"MoveCwR"
  	oldValue := self literalAtAddress: pc - 4.
  	newValue := oldValue + delta.
  	self literalAtAddress: pc - 4 put: newValue.
  	
  	"cogit disassembleFrom: pc - 8 to: pc."
  	self assert: (self literalAtAddress: pc - 4) = newValue.
  	!

Item was changed:
  ----- Method: CogMIPSELCompiler>>storeLiteral:beforeFollowingAddress: (in category 'inline cacheing') -----
  storeLiteral: literal beforeFollowingAddress: followingAddress
+ 	"Rewrite the long constant loaded by a MoveCwR or PushCw before the given address"
+ 	self flag: #bogus. "The caller ought to know what it is patching, and this should be split into separate methods with stricter checking. rmacnak 12/13/2015"
- 	"Rewrite the long constant loaded by a MoveCwR or PushCwR before the given address"
- 	self flag: #bogus. "The caller ought to know what it is patching, and this should be split into separate methods with stricter checking."
  	
  	"Cmp/MoveCwR
  	 pc-8	lui rx, uper
  	 pc-4	ori rx, rx, lower"
  	(self opcodeAtAddress: followingAddress - 4) = ORI ifTrue:
  		[^self literalAtAddress: followingAddress - 4 put: literal].
  
  	"PushCw
  	 pc-16	lui at, upper
  	 pc-12	ori at, at, lower
  	 pc-8	addiu sp, sp, -4
  	 pc-4	sw at, 0(sp)"
  	((self opcodeAtAddress: followingAddress - 4) = SW and:
  		[(self opcodeAtAddress: followingAddress - 8) = ADDIU]) ifTrue:
  			[^self literalAtAddress: followingAddress - 12 put: literal].
  	
  	self unreachable.
  	^0!

Item was changed:
  ----- Method: CogX64Compiler>>concretizeMoveCqR (in category 'generate machine code') -----
  concretizeMoveCqR
  	"Will get inlined into concretizeAt: switch.
+ 	 On x64 we can short-cut mov 0, reg using xor, and use signed 32-bit displacement, if possible."
- 	 On x64 we can short-cut mov 0, reg using xor, and use 32-bit displacement, signed or unsigned, if possible."
  	<inline: true>
  	| value reg |
  	value := operands at: 0.
- 	(self is32BitSignedImmediate: value) ifFalse:
- 		[^self concretizeMoveCwR].
  	reg := operands at: 1.
+ 	(self is32BitSignedImmediate: value) ifTrue:
+ 		[value = 0 ifTrue:
+ 			[machineCode
+ 				at: 0 put: (self rexR: reg x: 0 b: reg);
+ 				at: 1 put: 16r31;
+ 				at: 2 put: (self mod: ModReg RM: reg RO: reg).
+ 			 ^machineCodeSize := 3].
+ 		 machineCode
+ 			at: 0 put: (self rexR: 0 x: 0 b: reg);
+ 			at: 1 put: 16rC7;
+ 			at: 2 put: (self mod: ModReg RM: reg RO: 0);
+ 			at: 3 put: (value bitAnd: 16rFF);
+ 			at: 4 put: (value >> 8 bitAnd: 16rFF);
+ 			at: 5 put: (value >> 16 bitAnd: 16rFF);
+ 			at: 6 put: (value >> 24 bitAnd: 16rFF).
+ 		 ^machineCodeSize := 7].
+ 
- 	value = 0 ifTrue:
- 		[machineCode
- 			at: 0 put: (self rexR: reg x: 0 b: reg);
- 			at: 1 put: 16r31;
- 			at: 2 put: (self mod: ModReg RM: reg RO: reg).
- 		^machineCodeSize := 3].
  	machineCode
+ 		at:  0 put: (self rexR: 0 x: 0 b: reg);
+ 		at:  1 put: 16rB8 + (reg bitAnd: 7);
+ 		at:  2 put: (value bitAnd: 16rFF);
+ 		at:  3 put: (value >> 8 bitAnd: 16rFF);
+ 		at:  4 put: (value >> 16 bitAnd: 16rFF);
+ 		at:  5 put: (value >> 24 bitAnd: 16rFF);
+ 		at:  6 put: (value >> 32 bitAnd: 16rFF);
+ 		at:  7 put: (value >> 40 bitAnd: 16rFF);
+ 		at:  8 put: (value >> 48 bitAnd: 16rFF);
+ 		at:  9 put: (value >> 56 bitAnd: 16rFF).
+ 	^machineCodeSize := 10!
- 		at: 0 put: (self rexR: 0 x: 0 b: reg);
- 		at: 1 put: 16rC7;
- 		at: 2 put: (self mod: ModReg RM: reg RO: 0);
- 		at: 3 put: (value bitAnd: 16rFF);
- 		at: 4 put: (value >> 8 bitAnd: 16rFF);
- 		at: 5 put: (value >> 16 bitAnd: 16rFF);
- 		at: 6 put: (value >> 24 bitAnd: 16rFF).
- 	^machineCodeSize := 7!

Item was changed:
  ----- Method: Cogit>>addressIsInCurrentCompilation: (in category 'testing') -----
  addressIsInCurrentCompilation: address
+ 	<inline: true>
  	^address asUnsignedInteger >= methodLabel address
  	  and: [address asUnsignedInteger < methodZone youngReferrers]!

Item was changed:
  ----- Method: Cogit>>compileClosedPICPrototype (in category 'in-line cacheing') -----
  compileClosedPICPrototype
  	"Compile the abstract instructions for a full closed PIC, used to generate the chunk of code
  	 which is copied to form each closed PIC.  A Closed Polymorphic Inline Cache is a small jump
  	 table used to optimize sends with a limited degree of polymorphism (currently up to 6 cases).
  	 We call it closed because it deals only with a finite number of cases, as opposed to an Open PIC.
  	 When a monomorphic linked send (a send with a single case, linking direct to the checked entry
  	 point of a CogMethod) fails a class check, the Cogit attempts to create a two-entry PIC that will
  	 handle jumping to the original target for the original class and the relevant target for the new
  	 class.  This jump table will be extended on subsequent failures up to a limit (6).
  
  	 We avoid extending CPICs to Open PICs by linking the send site to an Open PIC if one already
  	 exists with the send's selector, a good policy since measurements show that sends of mega-
  	 morphic selectors usually become megamorphic at all send sites.  Hence the Open PIC list.
  
  	 A CPIC also optimizes MNUs and interpret-only methods.  Each case can load SendNumArgs with
  	 the oop of a method, or will load SendNumArgs with 0 if not.  MNUs are optimized by jumping to
  	 the mnuAbort in the CPIC, which calls code that creates the Message, thereby avoiding looking up
  	 the original message which will not be found, and either looks up doesNotUnderstand: or directly
  	 activates the method loaded into SendNumArgs, hence avoiding looking up doesNotUnderstand:.
  	 Interpret-only methods are handled by jumping to the picInterpretAbort, which enters the
  	 interpreter activating the method loaded in SendNumArgs.
  
  	 CPICs look like the following, where rClass is set at the original send site for the 1st case, and #Foo
  	 is some constant, either an oop, a class tag or an instruction address.
  
  		rTemp := (rReceiver bitAnd: TagMask) = 0 ifTrue: [rReceiver class] ifFalse: [rReceiver bitAnd: TagMask].
  		rTemp = rClass ifFalse:
  			[self goto: #Label].
  		rSendNumArgs := #MethodForCase1Or0.
  		self goto: #TargetForCase1.
  	 #Label
  		rTemp = #ClassTagForCase6 ifTrue:
  			[rSendNumArgs := #MethodForCase6Or0.
  			 self goto: #TargetForCase6].
  		...cases 5, 4 & 3
  		rTemp = #ClassTagForCase2 ifTrue:
  			[rSendNumArgs := #MethodForCase2Or0.
  			 self goto: #TargetForCase2].
  		self goto: #CPICMissTrampoline
  		literals (if out-of-line literals)
  
  	 where we short-cut as many cases as needed by making the self goto: #Label skip as many cases
  	 as needed."
  	<inline: true>
  	| numArgs jumpNext |
  	<var: #jumpNext type: #'AbstractInstruction *'>
  	self compilePICAbort: (numArgs := 0). "Will get rewritten to appropriate arity when configuring."
  	jumpNext := self compileCPICEntry.
  	"At the end of the entry code we need to jump to the first case code, which is actually the last chunk.
  	 On each extension we must update this jump to move back one case."
+ 	self MoveUniqueCw: self firstPrototypeMethodOop R: SendNumArgsReg.
- 	"16r5EAF00D is the method oop, or 0, for the 1st case."
- 	self MoveUniqueCw: 16r5EAF00D R: SendNumArgsReg.
  	self JumpLong: self cPICPrototypeCaseOffset + 16rCA5E10.
  	endCPICCase0 := self Label.
  	1 to: MaxCPICCases - 1 do:
  		[:h|
  		h = (MaxCPICCases - 1) ifTrue:
  			[jumpNext jmpTarget: self Label]. "this is where we jump to for the first case"
+ 		self MoveUniqueCw: self subsequentPrototypeMethodOop + h R: SendNumArgsReg.
- 		"16rBADA550+h is the method oop, or 0, for the Nth case."
- 		self MoveUniqueCw: 16rBADA550 + h R: SendNumArgsReg.
  		"16rBABE1F15+h is the class tag for the Nth case"
  		self CmpC32: 16rBABE1F15+h R: TempReg.
  		self JumpLongZero: self cPICPrototypeCaseOffset + 16rCA5E10 + (h * 16).
  		h = 1 ifTrue:
  			[endCPICCase1 := self Label]].
  	self MoveCw: methodLabel address R: ClassReg.
  	self JumpLong: (self cPICMissTrampolineFor: numArgs).	"Will get rewritten to appropriate arity when configuring."
  	cPICEndOfCodeLabel := self Label.
  	literalsManager dumpLiterals: false.
  	^0!

Item was changed:
  ----- Method: Cogit>>expectedClosedPICPrototype: (in category 'in-line cacheing') -----
  expectedClosedPICPrototype: cPIC
  	"Use asserts to check if the ClosedPICPrototype is as expected from compileClosedPICPrototype,
  	 and can be updated as required via rewriteCPICCaseAt:tag:objRef:target:.  If all asserts pass, answer
  	 0, otherwise answer a bit mask identifying all the errors."
  	"self disassembleFrom: methodZoneBase + (self sizeof: CogMethod) to: methodZoneBase + closedPICSize"
  	<var: #cPIC type: #'CogMethod *'>
  	| pc errors object classTag entryPoint |
  	errors := 0.
  	pc := cPIC asUnsignedInteger + firstCPICCaseOffset.
  	"First jump is unconditional; subsequent ones are conditional"
  	object := backEnd literalBeforeFollowingAddress: pc - backEnd jumpLongByteSize.
+ 	(self asserta: object = self firstPrototypeMethodOop) ifFalse:
- 	(self asserta: object = 16r5EAF00D) ifFalse:
  		[errors := 1].
  
  	entryPoint := backEnd jumpLongTargetBeforeFollowingAddress: pc.
  	(self asserta: entryPoint = (self cPICPrototypeCaseOffset + 16rCA5E10)) ifFalse:
  		[errors := errors + 2].
  
  	1 to: MaxCPICCases - 1 do:
  		[:i | | methodObjPC classTagPC |
  		pc := pc + cPICCaseSize.
  
  		"verify information in case is as expected."
  		methodObjPC := pc - backEnd jumpLongConditionalByteSize - backEnd cmpC32RTempByteSize.
  		object := backEnd literalBeforeFollowingAddress: methodObjPC.
+ 		(self asserta: object = (self subsequentPrototypeMethodOop+ i)) ifFalse:
- 		(self asserta: object = (16rBADA550 + i)) ifFalse:
  			[errors := errors bitOr: 4].
  
  		classTagPC := pc - backEnd jumpLongConditionalByteSize.
  		classTag := backEnd literal32BeforeFollowingAddress: classTagPC.
  		(self asserta: classTag = (16rBABE1F15 + i)) ifFalse:
  			[errors := errors bitOr: 8].
  
  		entryPoint := backEnd jumpLongConditionalTargetBeforeFollowingAddress: pc.
  		(self asserta: entryPoint = (self cPICPrototypeCaseOffset + 16rCA5E10 + (i * 16))) ifFalse:
  			[errors := errors bitOr: 16].
  
  		"change case via rewriteCPICCaseAt:tag:objRef:target:"
  		self rewriteCPICCaseAt: pc
  			tag: (classTag bitXor: 16r5A5A5A5A)
  			objRef: (object bitXor: 16rA5A5A5A5)
  			target: (entryPoint bitXor: 16r55AA50). "don't xor least 4 bits to leave instruction alignment undisturbed"
  
  		"verify information in case is as expected post update."
  		object := backEnd literalBeforeFollowingAddress: methodObjPC.
+ 		(self asserta: object = (self subsequentPrototypeMethodOop + i bitXor: 16rA5A5A5A5)) ifFalse:
- 		(self asserta: object = (16rBADA550 + i bitXor: 16rA5A5A5A5)) ifFalse:
  			[errors := errors bitOr: 32].
  		classTag := backEnd literal32BeforeFollowingAddress: classTagPC.
  		(self asserta: classTag = (16rBABE1F15 + i bitXor: 16r5A5A5A5A)) ifFalse:
  			[errors := errors bitOr: 64].
  		entryPoint := backEnd jumpLongConditionalTargetBeforeFollowingAddress: pc.
  		(self asserta: entryPoint = ((self cPICPrototypeCaseOffset + 16rCA5E10 + (i * 16)) bitXor: 16r55AA50)) ifFalse:
  			[errors := errors bitOr: 128].
  
  		"finally restore case to the original state"
  		self rewriteCPICCaseAt: pc
  			tag: (classTag bitXor: 16r5A5A5A5A)
  			objRef: (object bitXor: 16rA5A5A5A5)
  			target: (entryPoint bitXor: 16r55AA50)].
  
  	entryPoint := backEnd jumpLongTargetBeforeFollowingAddress: pc + cPICEndSize - literalsManager endSizeOffset.
  	(self asserta: entryPoint = (self cPICMissTrampolineFor: 0)) ifFalse:
  		[errors := errors + 256].
  	
  	^errors!

Item was added:
+ ----- Method: Cogit>>firstPrototypeMethodOop (in category 'in-line cacheing') -----
+ firstPrototypeMethodOop
+ 	"Answer a fake value for the first method oop in the PIC prototype.
+ 	 Since we use MoveUniqueCw:R: it must not be confused with a method-relative address."
+ 	<inline: false>
+ 	^(self addressIsInCurrentCompilation: 16r5EAF00D)
+ 		ifTrue: [16rCA7F00D]
+ 		ifFalse: [16r5EAF00D]!

Item was added:
+ ----- Method: Cogit>>genCallMustBeBooleanFor: (in category 'trampoline support') -----
+ genCallMustBeBooleanFor: boolean
+ 	^self CallRT: (boolean = objectMemory falseObject
+ 					ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
+ 					ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])!

Item was added:
+ ----- Method: Cogit>>subsequentPrototypeMethodOop (in category 'in-line cacheing') -----
+ subsequentPrototypeMethodOop
+ 	"Answer a fake value for the method oop in other than the first case in the PIC prototype.
+ 	 Since we use MoveUniqueCw:R: it must not be confused with a method-relative address."
+ 	<inline: false>
+ 	^(self addressIsInCurrentCompilation: 16rBADA550)
+ 		ifTrue: [16rDEADEAD]
+ 		ifFalse: [16rBADA550]!

Item was changed:
  StackToRegisterMappingCogit subclass: #RegisterAllocatingCogit
+ 	instanceVariableNames: 'numFixups mergeSimStacksBase nextFixup scratchSimStack scratchSpillBase scratchOptStatus ceSendMustBeBooleanAddTrueLongTrampoline ceSendMustBeBooleanAddFalseLongTrampoline'
- 	instanceVariableNames: 'numFixups mergeSimStacksBase nextFixup scratchSimStack scratchSpillBase scratchOptStatus'
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'VMMaker-JIT'!
  
  !RegisterAllocatingCogit commentStamp: 'cb 4/15/2016 14:58' prior: 0!
  RegisterAllocatingCogit is an optimizing code generator that is specialized in register allocation.
  
  On the contrary to StackToRegisterMappingCogit, RegisterAllocatingCogit keeps at each control flow merge point the state of the simulated stack to merge into and not only an integer fixup. Each branch and jump record the current state of the simulated stack, and each fixup is responsible for merging this state into the saved simulated stack.
  !

Item was added:
+ ----- Method: RegisterAllocatingCogit class>>numTrampolines (in category 'accessing') -----
+ numTrampolines
+ 	^super numTrampolines + 2 "includes long sendMustBeBoolean trampolines"
+ 
+ 	"Cogit withAllSubclasses, CogObjectRepresentation withAllSubclasses collect:
+ 		[:c| {c. (c instVarNames select: [:ea| ea beginsWith: 'ce']) size}]"
+ 	"self allInstVarNames select: [:ea| ea beginsWith: 'ce']"
+ 	"self instVarNames select: [:ea| ea beginsWith: 'ce']"!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>genCallMustBeBooleanFor: (in category 'trampoline support') -----
+ genCallMustBeBooleanFor: boolean
+ 	self assert: ((self generatorAt: byte0) numBytes between: 1 and: 2).
+ 	^self CallRT: ((self generatorAt: byte0) numBytes = 1
+ 					ifTrue:
+ 						[boolean = objectMemory falseObject
+ 							ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
+ 							ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]]
+ 					ifFalse:
+ 						[boolean = objectMemory falseObject
+ 							ifTrue: [ceSendMustBeBooleanAddFalseLongTrampoline]
+ 							ifFalse: [ceSendMustBeBooleanAddTrueLongTrampoline]])!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
  genJumpIf: boolean to: targetBytecodePC
  	<inline: false>
  	| desc reg fixup ok |
  	<var: #desc type: #'CogSimStackEntry *'>
  	<var: #fixup type: #'BytecodeFixup *'>
  	<var: #ok type: #'AbstractInstruction *'>
  	desc := self ssTop.
  	self ssPop: 1.
  	(desc type == SSConstant
  	 and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
  		["Must arrange there's a fixup at the target whether it is jumped to or
  		  not so that the simStackPtr can be kept correct."
  		 fixup := self ensureFixupAt: targetBytecodePC - initialPC.
  		 "Must annotate the bytecode for correct pc mapping."
  		 self annotateBytecode: (desc constant = boolean
  									ifTrue: [self Jump: fixup]
  									ifFalse: [self prevInstIsPCAnnotated
  												ifTrue: [self Nop]
  												ifFalse: [self Label]]).
  		 extA := 0.
  		 ^0].
  	"try and use the top entry's register if anty, but only if it can be destroyed."
  	reg := (desc type ~= SSRegister
  			or: [(self anyReferencesToRegister: desc register inAllButTopNItems: 0)
  			or: [(desc register = ReceiverResultReg and: [optStatus isReceiverResultRegLive])]])
  				ifTrue: [TempReg]
  				ifFalse: [desc register].
  	desc popToReg: reg.
  	"Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  	 Correct result is either 0 or the distance between them.  If result is not 0 or
  	 their distance send mustBeBoolean."
  	self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  	self genSubConstant: boolean R: reg.
  	self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
  	
  	self extASpecifiesNoMustBeBoolean ifTrue: 
  		[extA := 0. 
  		 self annotateBytecode: self lastOpcode.
  		 ^0].
  	extA := 0.
  	
  .	self CmpCq: (boolean = objectMemory falseObject
  					ifTrue: [objectMemory trueObject - objectMemory falseObject]
  					ifFalse: [objectMemory falseObject - objectMemory trueObject])
  		R: reg.
  	ok := self JumpZero: 0.
  	reg ~= TempReg ifTrue:
  		[self MoveR: reg R: TempReg].
  	self copySimStackToScratch: simSpillBase.
  	self ssFlushTo: simStackPtr.
+ 	self genCallMustBeBooleanFor: boolean.
- 	self CallRT: (boolean = objectMemory falseObject
- 					ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- 					ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
  	"NOTREACHED"
  	ok jmpTarget: (self annotateBytecode: self Label).
  	self restoreSimStackFromScratch.
  	^0!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>genMustBeBooleanLongTrampolineFor:called: (in category 'initialization') -----
+ genMustBeBooleanLongTrampolineFor: boolean called: trampolineName
+ 	<inline: true>
+ 	^self genMustBeBooleanTrampolineFor: boolean branchBytes: 2 called: trampolineName!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>genMustBeBooleanTrampolineFor:branchBytes:called: (in category 'initialization') -----
+ genMustBeBooleanTrampolineFor: boolean branchBytes: branchBytes called: trampolineName
+ 	<var: #trampolineName type: #'char *'>
+ 	"For RegisterAllocatingCogit we want the address following a conditional branch not to be reachable, so we
+ 	 don't have to generate code to reload registers.  Instead simply convert to an interpreter frame."
+ 	<inline: false>
+ 	self zeroOpcodeIndex.
+ 	"If the objectRepresentation does want true & false to be mobile then we need to record these addresses."
+ 	self assert: (objectRepresentation shouldAnnotateObjectReference: boolean) not.
+ 	self AddCq: boolean R: TempReg.
+ 	^self genTrampolineFor: #ceSendMustBeBooleanTo:interpretingAtDelta:
+ 		called: trampolineName
+ 		numArgs: 2
+ 		arg: TempReg
+ 		arg: (self trampolineArgConstant: branchBytes)
+ 		arg: nil
+ 		arg: nil
+ 		regsToSave: self emptyRegisterMask
+ 		pushLinkReg: true
+ 		resultReg: NoReg
+ 		appendOpcodes: true!

Item was changed:
  ----- Method: RegisterAllocatingCogit>>genMustBeBooleanTrampolineFor:called: (in category 'initialization') -----
  genMustBeBooleanTrampolineFor: boolean called: trampolineName
+ 	<inline: true>
+ 	^self genMustBeBooleanTrampolineFor: boolean branchBytes: 1 called: trampolineName!
- 	<var: #trampolineName type: #'char *'>
- 	"For RegisterAllocatingCogit we want the address following a conditional branch not to be reachable, so we
- 	 don't have to generate code to reload registers.  Instead simply convert to an interpreter frame."
- 	<inline: false>
- 	self zeroOpcodeIndex.
- 	"If the objectRepresentation does want true & false to be mobile then we need to record these addresses."
- 	self assert: (objectRepresentation shouldAnnotateObjectReference: boolean) not.
- 	self AddCq: boolean R: TempReg.
- 	^self genTrampolineFor: #ceSendMustBeBooleanInterpreting:
- 		called: trampolineName
- 		numArgs: 1
- 		arg: TempReg
- 		arg: nil
- 		arg: nil
- 		arg: nil
- 		regsToSave: self emptyRegisterMask
- 		pushLinkReg: true
- 		resultReg: NoReg
- 		appendOpcodes: true!

Item was added:
+ ----- Method: RegisterAllocatingCogit>>generateRunTimeTrampolines (in category 'initialization') -----
+ generateRunTimeTrampolines
+ 	"Generate the run-time entries at the base of the native code zone and update the base."
+ 	
+ 	ceSendMustBeBooleanAddFalseLongTrampoline := self genMustBeBooleanLongTrampolineFor: objectMemory falseObject
+ 														called: 'ceSendMustBeBooleanAddFalseTrampoline'.
+ 	ceSendMustBeBooleanAddTrueLongTrampoline := self genMustBeBooleanLongTrampolineFor: objectMemory trueObject
+ 														called: 'ceSendMustBeBooleanAddTrueTrampoline'.
+ 	super generateRunTimeTrampolines!

Item was changed:
  ----- Method: SimpleStackBasedCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
  genJumpIf: boolean to: targetBytecodePC
  	<inline: false>
  	"Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  	 Correct result is either 0 or the distance between them.  If result is not 0 or
  	 their distance send mustBeBoolean."
  	| ok |
  	<var: #ok type: #'AbstractInstruction *'>
  	extA := 0.
  	self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  	self PopR: TempReg.
  	self genSubConstant: boolean R: TempReg.
  	self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
  	self CmpCq: (boolean == objectMemory falseObject
  					ifTrue: [objectMemory trueObject - objectMemory falseObject]
  					ifFalse: [objectMemory falseObject - objectMemory trueObject])
  		R: TempReg.
  	ok := self JumpZero: 0.
+ 	self genCallMustBeBooleanFor: boolean.
- 	self CallRT: (boolean = objectMemory falseObject
- 					ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- 					ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
  	ok jmpTarget: (self annotateBytecode: self Label).
  	^0!

Item was changed:
  ----- Method: SistaCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
  genJumpIf: boolean to: targetBytecodePC
  	"The heart of performance counting in Sista.  Conditional branches are 6 times less
  	 frequent than sends and can provide basic block frequencies (send counters can't).
  	 Each conditional has a 32-bit counter split into an upper 16 bits counting executions
  	 and a lower half counting untaken executions of the branch.  Executing the branch
  	 decrements the upper half, tripping if the count goes negative.  Not taking the branch
  	 decrements the lower half.  N.B. We *do not* eliminate dead branches (true ifTrue:/true ifFalse:)
  	 so that scanning for send and branch data is simplified and that branch data is correct."
  	<inline: false>
  	| ok counterAddress countTripped retry nextPC nextDescriptor desc |
  	<var: #ok type: #'AbstractInstruction *'>
  	<var: #desc type: #'CogSimStackEntry *'>
  	<var: #retry type: #'AbstractInstruction *'>
  	<var: #countTripped type: #'AbstractInstruction *'>
  	<var: #nextDescriptor type: #'BytecodeDescriptor *'>
  
  	"In optimized code we don't generate counters to improve performance"
  	(coInterpreter isOptimizedMethod: methodObj) ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
  	
  	"If the branch is reached only for the counter trip trampoline 
  	(typically, var1 == var2 ifTrue: falls through to the branch only for the trampoline)
  	we generate a specific path to drastically reduce the number of machine instructions"
  	branchReachedOnlyForCounterTrip ifTrue: 
  		[ branchReachedOnlyForCounterTrip := false.
  		^ self genCounterTripOnlyJumpIf: boolean to: targetBytecodePC ].
  	
  	"We detect and: / or:, if found, we don't generate the counters to avoid pathological counter slow down"
  	boolean == objectMemory falseObject ifTrue:
  		[ nextPC := bytecodePC + (self generatorAt: byte0) numBytes.
  		  nextDescriptor := self generatorAt: (objectMemory fetchByte: nextPC ofObject: methodObj) + bytecodeSetOffset.
  		  nextDescriptor generator ==  #genPushConstantTrueBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].
  		  nextDescriptor := self generatorAt: (objectMemory fetchByte: targetBytecodePC ofObject: methodObj) + bytecodeSetOffset.
  		  nextDescriptor generator ==  #genPushConstantFalseBytecode ifTrue: [ ^ super genJumpIf: boolean to: targetBytecodePC ].  ].
  
  	extA := 0. "We ignore the noMustBeBoolean flag. It should not be present in methods with counters, and if it is we don't care."
  
  	"We don't generate counters on branches on true/false, the basicblock usage can be inferred"
  	desc := self ssTop.
  	(desc type == SSConstant
  	 and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
  		[ ^ super genJumpIf: boolean to: targetBytecodePC ].
  	
  	self ssFlushTo: simStackPtr - 1.
  	desc popToReg: TempReg.
  	self ssPop: 1.
  
  	"We need SendNumArgsReg because of the mustBeBooleanTrampoline"
  	self ssAllocateRequiredReg: SendNumArgsReg.
  
  	retry := self Label.
  	self 
  		genExecutionCountLogicInto: [ :cAddress :countTripBranch | 
  			counterAddress := cAddress. 
  			countTripped := countTripBranch ] 
  		counterReg: SendNumArgsReg.
  	counterIndex := counterIndex + 1.
  
  	"Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  	 Correct result is either 0 or the distance between them.  If result is not 0 or
  	 their distance send mustBeBoolean."
  	self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  	self genSubConstant: boolean R: TempReg.
  	self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
  
  	self genFallsThroughCountLogicCounterReg: SendNumArgsReg counterAddress: counterAddress.
  
  	self CmpCq: (boolean == objectMemory falseObject
  					ifTrue: [objectMemory trueObject - objectMemory falseObject]
  					ifFalse: [objectMemory falseObject - objectMemory trueObject])
  		R: TempReg.
  	ok := self JumpZero: 0.
  	self MoveCq: 0 R: SendNumArgsReg. "if counterReg is 0 this is a mustBeBoolean, not a counter trip."
  	
+ 	countTripped jmpTarget: (self genCallMustBeBooleanFor: boolean).
- 	countTripped jmpTarget:
- 		(self CallRT: (boolean == objectMemory falseObject
- 						ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- 						ifFalse: [ceSendMustBeBooleanAddTrueTrampoline])).
  						
  	"If we're in an image which hasn't got the Sista code loaded then the ceCounterTripped:
  	 trampoline will return directly to machine code, returning the boolean.  So the code should
  	 jump back to the retry point. The trampoline makes sure that TempReg has been reloaded."
  	self annotateBytecode: self Label. "For some reason if I write self annotateBytecode: (self Jump: retry) the annotation is not at the correct place."
  	self Jump: retry.
  	
  	ok jmpTarget: self Label.
  	^0!

Item was changed:
  InterpreterPrimitives subclass: #StackInterpreter
+ 	instanceVariableNames: 'currentBytecode bytecodeSetSelector localFP localIP localSP stackLimit stackPage stackPages method instructionPointer stackPointer framePointer localReturnValue localAbsentReceiver localAbsentReceiverOrZero extA extB primitiveFunctionPointer methodCache nsMethodCache atCache lkupClassTag lkupClass methodDictLinearSearchLimit highestRunnableProcessPriority nextWakeupUsecs nextPollUsecs inIOProcessEvents interruptKeycode interruptPending savedWindowSize imageHeaderFlags fullScreenFlag deferDisplayUpdates pendingFinalizationSignals extraVMMemory interpreterProxy showSurfaceFn primitiveTable primitiveAccessorDepthTable externalPrimitiveTable externalPrimitiveTableFirstFreeIndex overflowedPage extraFramesToMoveOnOverflow globalSessionID jmpBuf jmpDepth suspendedCallbacks suspendedMethods numStackPages desiredNumStackPages desiredEdenBytes classNameIndex thisClassIndex metaclassNumSlots interruptCheckChain suppressHeartbeatFlag breakSelector breakSelectorLength l
 ongRunningPrimitiveCheckMethod longRunningPrimitiveCheckSemaphore longRunningPrimitiveStartUsecs longRunningPrimitiveStopUsecs longRunningPrimitiveGCUsecs longRunningPrimitiveCheckSequenceNumber longRunningPrimitiveSignalUndelivered checkAllocFiller tempOop tempOop2 tempOop3 theUnknownShort the2ndUnknownShort imageFloatsBigEndian maxExtSemTabSizeSet lastMethodCacheProbeWrite gcSemaphoreIndex classByteArrayCompactIndex checkedPluginName statForceInterruptCheck statStackOverflow statStackPageDivorce statCheckForEvents statProcessSwitch statIOProcessEvents statPendingFinalizationSignals nativeSP nativeStackPointer lowcodeCalloutState shadowCallStackPointer debugCallbackPath'
- 	instanceVariableNames: 'currentBytecode bytecodeSetSelector localFP localIP localSP stackLimit stackPage stackPages method instructionPointer stackPointer framePointer localReturnValue localAbsentReceiver localAbsentReceiverOrZero extA extB primitiveFunctionPointer methodCache nsMethodCache atCache lkupClassTag lkupClass methodDictLinearSearchLimit highestRunnableProcessPriority nextWakeupUsecs nextPollUsecs inIOProcessEvents interruptKeycode interruptPending savedWindowSize imageHeaderFlags fullScreenFlag deferDisplayUpdates pendingFinalizationSignals extraVMMemory interpreterProxy showSurfaceFn primitiveTable primitiveAccessorDepthTable externalPrimitiveTable externalPrimitiveTableFirstFreeIndex overflowedPage extraFramesToMoveOnOverflow globalSessionID jmpBuf jmpDepth suspendedCallbacks suspendedMethods numStackPages desiredNumStackPages desiredEdenBytes classNameIndex thisClassIndex metaclassNumSlots interruptCheckChain suppressHeartbeatFlag breakSelector breakSelectorLength l
 ongRunningPrimitiveCheckMethod longRunningPrimitiveCheckSemaphore longRunningPrimitiveStartUsecs longRunningPrimitiveStopUsecs longRunningPrimitiveGCUsecs longRunningPrimitiveCheckSequenceNumber longRunningPrimitiveSignalUndelivered checkAllocFiller tempOop tempOop2 tempOop3 theUnknownShort the2ndUnknownShort imageFloatsBigEndian maxExtSemTabSizeSet lastMethodCacheProbeWrite gcSemaphoreIndex classByteArrayCompactIndex checkedPluginName statForceInterruptCheck statStackOverflow statStackPageDivorce statCheckForEvents statProcessSwitch statIOProcessEvents statPendingFinalizationSignals nativeSP nativeStackPointer lowcodeCalloutState shadowCallStackPointer'
  	classVariableNames: 'AccessModifierPrivate AccessModifierProtected AccessModifierPublic AltBytecodeEncoderClassName AltLongStoreBytecode AlternateHeaderHasPrimFlag AlternateHeaderIsOptimizedFlag AlternateHeaderNumLiteralsMask AtCacheFixedFields AtCacheFmt AtCacheMask AtCacheOop AtCacheSize AtCacheTotalSize AtPutBase BytecodeEncoderClassName BytecodeTable CacheProbeMax DirBadPath DirEntryFound DirNoMoreEntries DumpStackOnLowSpace EnclosingMixinIndex EnclosingObjectIndex EnforceAccessControl FailImbalancedPrimitives LongStoreBytecode MaxExternalPrimitiveTableSize MaxJumpBuf MaxPrimitiveIndex MaxQuickPrimitiveIndex MethodHeaderArgCountShift MethodHeaderFlagBitPosition MethodHeaderTempCountShift MixinIndex PrimNumberDoExternalCall PrimNumberExternalCall PrimNumberFFICall PrimitiveExternalCallIndex PrimitiveTable StackPageReachedButUntraced StackPageTraceInvalid StackPageTraced StackPageUnreached V3PrimitiveBitsMask'
  	poolDictionaries: 'VMBasicConstants VMBytecodeConstants VMMethodCacheConstants VMObjectIndices VMSpurObjectRepresentationConstants VMSqueakClassIndices VMStackFrameOffsets'
  	category: 'VMMaker-Interpreter'!
  
  !StackInterpreter commentStamp: 'eem 12/5/2014 11:32' prior: 0!
  This class is a complete implementation of the Smalltalk-80 virtual machine, derived originally from the Blue Book specification but quite different in some areas.  This VM supports Closures but *not* old-style BlockContexts.
  
  It has been modernized with 32-bit pointers, better management of Contexts (see next item), and attention to variable use that allows the CCodeGenerator (qv) to keep, eg, the instruction pointer and stack pointer in registers as well as keeping most simple variables in a global array that seems to improve performance for most platforms.
  
  The VM does not use Contexts directly.  Instead Contexts serve as proxies for a more conventional stack format that is invisible to the image.  There is considerable explanation at http://www.mirandabanda.org/cogblog/2009/01/14/under-cover-contexts-and-the-big-frame-up.  The VM maintains a fixed-size stack zone divided into pages, each page being capable of holding several method/block activations.  A send establishes a new frame in the current stack page, a return returns to the previous frame.  This eliminates allocation/deallocation of contexts and the moving of receiver and arguments from caller to callee on each send/return.  Contexts are created lazily when an activation needs a context (creating a block, explicit use of thisContext, access to sender when sender is a frame, or linking of stack pages together).  Contexts are either conventional and heap-resident ("single") or "married" and serve as proxies for their corresponding frame or "widowed", meaning that their spouse f
 rame has been returned from (died).  A married context is specially marked (more details in the code) and refers to its frame.  Likewise a married frame is specially marked and refers to its context.
  
  In addition to SmallInteger arithmetic and Floats, the VM supports logic on 32-bit PositiveLargeIntegers, thus allowing it to simulate itself much more effectively than would otherwise be the case.
  
  StackInterpreter and subclasses support multiple memory managers.  Currently there are two.  NewMemoryManager is a slightly refined version of ObjectMemory, and is the memory manager and garbage collector for the original Squeak object representation as described in "Back to the Future The Story of Squeak, A Practical Smalltalk Written in Itself", see http://ftp.squeak.org/docs/OOPSLA.Squeak.html.  Spur is a faster, more regular object representation that is designed for more performance and functionality, and to have a common header format for both 32-bit and 64-bit versions.  You can read about it in SpurMemoryManager's class comment.  There is also a video of a presentation at ESUG 2014 (https://www.youtube.com/watch?v=k0nBNS1aHZ4), along with slides (http://www.slideshare.net/esug/spur-a-new-object-representation-for-cog?related=1).!

Item was changed:
  ----- Method: StackInterpreter class>>mustBeGlobal: (in category 'translation') -----
  mustBeGlobal: var
  	"Answer if a variable must be global and exported.  Used for inst vars that are accessed from VM support code."
  
  	^(super mustBeGlobal: var)
  	   or: [(self objectMemoryClass mustBeGlobal: var)
  	   or: [(#('interpreterProxy' 'interpreterVersion' 'inIOProcessEvents'
  			'deferDisplayUpdates' 'extraVMMemory' 'showSurfaceFn'
  			'desiredNumStackPages' 'desiredEdenBytes'
  			'breakSelector' 'breakSelectorLength' 'sendTrace' 'checkAllocFiller' 'checkedPluginName'
+ 			'suppressHeartbeatFlag' 'debugCallbackPath') includes: var)
- 			'suppressHeartbeatFlag') includes: var)
  	   or: [ "This allows slow machines to define bytecodeSetSelector as 0
  			to avoid the interpretation overhead."
  			MULTIPLEBYTECODESETS not and: [var = 'bytecodeSetSelector']]]]!

Item was changed:
  ----- Method: StackInterpreter class>>preambleCCode (in category 'translation') -----
  preambleCCode
  	^	
  '/* Disable Intel compiler inlining of warning which is used for breakpoints */
  #pragma auto_inline(off)
  sqInt warnpid, erroronwarn;
  void
  warning(char *s) { /* Print an error message but don''t necessarily exit. */
  	if (erroronwarn) error(s);
  	if (warnpid)
  		printf("\n%s pid %ld\n", s, (long)warnpid);
  	else
  		printf("\n%s\n", s);
  }
  void
  warningat(char *s, int l) { /* ditto with line number. */
  	/* use alloca to call warning so one does not have to remember to set two breakpoints... */
  	char *sl = alloca(strlen(s) + 16);
  	sprintf(sl, "%s %d", s, l);
  	warning(sl);
  }
  #pragma auto_inline(on)
  
  void
  invalidCompactClassError(char *s) { /* Print a (compact) class index error message and exit. */
  #if SPURVM
  	printf("\nClass %s does not have the required class index\n", s);
  #else
  	printf("\nClass %s does not have the required compact class index\n", s);
  #endif
  	exit(-1);
  }
  
  /*
   * Define sigsetjmp and siglongjmp to be the most minimal setjmp/longjmp available on the platform.
   */
+ #undef sigsetjmp
+ #undef siglongjmp
  #if WIN32
  # define sigsetjmp(jb,ssmf) setjmp(jb)
  # define siglongjmp(jb,v) longjmp(jb,v)
  #else
  # define sigsetjmp(jb,ssmf) _setjmp(jb)
  # define siglongjmp(jb,v) _longjmp(jb,v)
  #endif
  
  #define odd(v) ((int)(v)&1)
  #define even(v) (!!odd(v))
  '!

Item was changed:
  ----- Method: StackInterpreter>>returnAs:ThroughCallback:Context: (in category 'callback support') -----
  returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
  	"callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
  	 Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
  	 and mark callbackMethodContext as dead."
  	<export: true>
  	<var: #vmCallbackContext type: #'VMCallbackContext *'>
  	| calloutMethodContext theFP thePage |
  	<var: #theFP type: #'char *'>
  	<var: #thePage type: #'StackPage *'>
+ 	debugCallbackPath := 0.
  	((self isIntegerObject: returnTypeOop)
  	 and: [self isLiveContext: callbackMethodContext]) ifFalse:
+ 		[debugCallbackPath := 1.
+ 		 ^false].
- 		[^false].
  	calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
  	(self isLiveContext: calloutMethodContext) ifFalse:
+ 		[debugCallbackPath := 2.
+ 		 ^false].
+ 	debugCallbackPath := 4.
- 		[^false].
  	"We're about to leave this stack page; must save the current frame's instructionPointer."
  	self push: instructionPointer.
  	self externalWriteBackHeadFramePointers.
  	"Mark callbackMethodContext as dead; the common case is that it is the current frame.
  	 We go the extra mile for the debugger."
  	(self isSingleContext: callbackMethodContext)
+ 		ifTrue: [self markContextAsDead: callbackMethodContext. debugCallbackPath := debugCallbackPath bitOr: 8]
- 		ifTrue: [self markContextAsDead: callbackMethodContext]
  		ifFalse:
+ 			[debugCallbackPath := debugCallbackPath bitOr: 16.
+ 			 theFP := self frameOfMarriedContext: callbackMethodContext.
- 			[theFP := self frameOfMarriedContext: callbackMethodContext.
  			 framePointer = theFP "common case"
  				ifTrue:
+ 					[debugCallbackPath := debugCallbackPath bitOr: 32.
+ 					 (self isBaseFrame: theFP)
+ 						ifTrue: [stackPages freeStackPage: stackPage. debugCallbackPath := debugCallbackPath bitOr: 64]
- 					[(self isBaseFrame: theFP)
- 						ifTrue: [stackPages freeStackPage: stackPage]
  						ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
+ 							[debugCallbackPath := debugCallbackPath bitOr: 128.
+ 							 instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
- 							[instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
  							 stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
  							 framePointer := self frameCallerFP: framePointer.
  							 self setMethod: (self frameMethodObject: framePointer).
  							 self restoreCStackStateForCallbackContext: vmCallbackContext.
  							 "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
  							  This matches the use of _setjmp in ia32abicc.c."
  							 self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
  							 ^true]]
  				ifFalse:
+ 					[debugCallbackPath := debugCallbackPath bitOr: 256.
+ 					 self externalDivorceFrame: theFP andContext: callbackMethodContext.
- 					[self externalDivorceFrame: theFP andContext: callbackMethodContext.
  					 self markContextAsDead: callbackMethodContext]].
  	"Make the calloutMethodContext the active frame.  The case where calloutMethodContext
  	 is immediately below callbackMethodContext on the same page is handled above."
  	(self isStillMarriedContext: calloutMethodContext)
  		ifTrue:
+ 			[debugCallbackPath := debugCallbackPath bitOr: 512.
+ 			 theFP := self frameOfMarriedContext: calloutMethodContext.
- 			[theFP := self frameOfMarriedContext: calloutMethodContext.
  			 thePage := stackPages stackPageFor: theFP.
  			 "findSPOf:on: points to the word beneath the instructionPointer, but
  			  there is no instructionPointer on the top frame of the current page."
  			 self assert: thePage ~= stackPage.
  			 stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
  			 framePointer := theFP]
  		ifFalse:
+ 			[debugCallbackPath := debugCallbackPath bitOr: 1024.
+ 			 thePage := self makeBaseFrameFor: calloutMethodContext.
- 			[thePage := self makeBaseFrameFor: calloutMethodContext.
  			 framePointer := thePage headFP.
  			 stackPointer := thePage headSP].
  	instructionPointer := self popStack.
  	self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
  	self setStackPageAndLimit: thePage.
  	self restoreCStackStateForCallbackContext: vmCallbackContext.
+ 	debugCallbackPath := debugCallbackPath bitOr: 2048.
  	 "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
  	  This matches the use of _setjmp in ia32abicc.c."
  	self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
  	"NOTREACHED"
  	^true!

Item was changed:
  ----- Method: StackToRegisterMappingCogit>>genJumpIf:to: (in category 'bytecode generator support') -----
  genJumpIf: boolean to: targetBytecodePC
  	<inline: false>
  	| desc fixup ok |
  	<var: #desc type: #'CogSimStackEntry *'>
  	<var: #fixup type: #'BytecodeFixup *'>
  	<var: #ok type: #'AbstractInstruction *'>
  	self ssFlushTo: simStackPtr - 1.
  	desc := self ssTop.
  	self ssPop: 1.
  	(desc type == SSConstant
  	 and: [desc constant = objectMemory trueObject or: [desc constant = objectMemory falseObject]]) ifTrue:
  		["Must arrange there's a fixup at the target whether it is jumped to or
  		  not so that the simStackPtr can be kept correct."
  		 fixup := self ensureFixupAt: targetBytecodePC - initialPC.
  		 "Must annotate the bytecode for correct pc mapping."
  		 self annotateBytecode: (desc constant = boolean
  									ifTrue: [self Jump: fixup]
  									ifFalse: [self prevInstIsPCAnnotated
  												ifTrue: [self Nop]
  												ifFalse: [self Label]]).
  		 extA := 0.
  		 ^0].
  	desc popToReg: TempReg.
  	"Cunning trick by LPD.  If true and false are contiguous subtract the smaller.
  	 Correct result is either 0 or the distance between them.  If result is not 0 or
  	 their distance send mustBeBoolean."
  	self assert: (objectMemory objectAfter: objectMemory falseObject) = objectMemory trueObject.
  	self genSubConstant: boolean R: TempReg.
  	self JumpZero: (self ensureFixupAt: targetBytecodePC - initialPC).
  	
  	self extASpecifiesNoMustBeBoolean ifTrue: 
  		[ extA := 0. 
  		self annotateBytecode: self lastOpcode.
  		^ 0].
  	extA := 0.
  	
  .	self CmpCq: (boolean = objectMemory falseObject
  					ifTrue: [objectMemory trueObject - objectMemory falseObject]
  					ifFalse: [objectMemory falseObject - objectMemory trueObject])
  		R: TempReg.
  	ok := self JumpZero: 0.
+ 	self genCallMustBeBooleanFor: boolean.
- 	self CallRT: (boolean = objectMemory falseObject
- 					ifTrue: [ceSendMustBeBooleanAddFalseTrampoline]
- 					ifFalse: [ceSendMustBeBooleanAddTrueTrampoline]).
  	ok jmpTarget: (self annotateBytecode: self Label).
  	^0!


