[Vm-dev] VM Maker: VMMaker.oscog-eem.1527.mcz

commits at source.squeak.org commits at source.squeak.org
Thu Nov 19 01:30:26 UTC 2015


Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.1527.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.1527
Author: eem
Time: 18 November 2015, 5:28:31.956 pm
UUID: f3bbc339-3ee1-4fbf-afc3-0d49b67a54f6
Ancestors: VMMaker.oscog-eem.1526

Cogit: Beef up the asserts in expectedClosedPICPrototype so that updating the class and method entries is tested.

Use CmpC32:R: to compare the class tag in CPICs to match in-line caches on 64-bit Spur.  Add CmpC32R generatoion for X64.

=============== Diff against VMMaker.oscog-eem.1526 ===============

Item was changed:
  ----- Method: CogX64Compiler>>computeMaximumSize (in category 'generate machine code') -----
(excessive size, no diff calculated)

Item was added:
+ ----- Method: CogX64Compiler>>concretizeCmpC32R (in category 'generate machine code') -----
+ concretizeCmpC32R
+ 	"Will get inlined into concretizeAt: switch."
+ 	<inline: true>
+ 	| value reg |
+ 	value := operands at: 0.
+ 	reg := self concreteRegister: (operands at: 1).
+ 	reg = RAX ifTrue:
+ 		[machineCode
+ 			at: 0 put: 16r48;
+ 			at: 1 put: 16r3D;
+ 			at: 2 put: (value bitAnd: 16rFF);
+ 			at: 3 put: (value >> 8 bitAnd: 16rFF);
+ 			at: 4 put: (value >> 16 bitAnd: 16rFF);
+ 			at: 5 put: (value >> 24 bitAnd: 16rFF).
+ 		 ^machineCodeSize := 6].
+ 	machineCode
+ 		at: 0 put: 16r49;
+ 		at: 1 put: 16r81;
+ 		at: 2 put: (self mod: ModReg RM: reg RO: 7);
+ 		at: 3 put: (value bitAnd: 16rFF);
+ 		at: 4 put: (value >> 8 bitAnd: 16rFF);
+ 		at: 5 put: (value >> 16 bitAnd: 16rFF);
+ 		at: 6 put: (value >> 24 bitAnd: 16rFF).
+ 	 ^machineCodeSize := 7!

Item was changed:
  ----- Method: CogX64Compiler>>dispatchConcretize (in category 'generate machine code') -----
  dispatchConcretize
  	"Attempt to generate concrete machine code for the instruction at address.
  	 This is the inner dispatch of concretizeAt: actualAddress which exists only
  	 to get around the branch size limits in the SqueakV3 (blue book derived)
  	 bytecode set."
  	<returnTypeC: #void>
  	opcode caseOf: {
  		"Noops & Pseudo Ops"
  		[Label]				-> [^self concretizeLabel].
  		[AlignmentNops]	-> [^self concretizeAlignmentNops].
  		[Fill16]				-> [^self concretizeFill16].
  		[Fill32]				-> [^self concretizeFill32].
  		[FillFromWord]		-> [^self concretizeFillFromWord].
  		[Nop]				-> [^self concretizeNop].
  		"Specific Control/Data Movement"
  		[CDQ]					-> [^self concretizeCDQ].
  		[IDIVR]					-> [^self concretizeIDIVR].
  		[IMULRR]				-> [^self concretizeMulRR].
  		[CPUID]					-> [^self concretizeCPUID].
  		[CMPXCHGAwR]			-> [^self concretizeCMPXCHGAwR].
  		[CMPXCHGMwrR]		-> [^self concretizeCMPXCHGMwrR].
  		[LFENCE]				-> [^self concretizeFENCE: 5].
  		[MFENCE]				-> [^self concretizeFENCE: 6].
  		[SFENCE]				-> [^self concretizeFENCE: 7].
  		[LOCK]					-> [^self concretizeLOCK].
  		[XCHGAwR]				-> [^self concretizeXCHGAwR].
  		[XCHGMwrR]			-> [^self concretizeXCHGMwrR].
  		[XCHGRR]				-> [^self concretizeXCHGRR].
  		"Control"
  		[Call]					-> [^self concretizeCall].
  		[CallFull]				-> [^self concretizeCallFull].
  		[JumpR]					-> [^self concretizeJumpR].
  		[JumpFull]				-> [^self concretizeJumpFull].
  		[JumpLong]				-> [^self concretizeJumpLong].
  		[JumpLongZero]		-> [^self concretizeConditionalJump: 16r4].
  		[JumpLongNonZero]	-> [^self concretizeConditionalJump: 16r5].
  		[Jump]					-> [^self concretizeJump].
  		"Table B-1 Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture"
  		[JumpZero]				-> [^self concretizeConditionalJump: 16r4].
  		[JumpNonZero]			-> [^self concretizeConditionalJump: 16r5].
  		[JumpNegative]			-> [^self concretizeConditionalJump: 16r8].
  		[JumpNonNegative]		-> [^self concretizeConditionalJump: 16r9].
  		[JumpOverflow]			-> [^self concretizeConditionalJump: 16r0].
  		[JumpNoOverflow]		-> [^self concretizeConditionalJump: 16r1].
  		[JumpCarry]			-> [^self concretizeConditionalJump: 16r2].
  		[JumpNoCarry]			-> [^self concretizeConditionalJump: 16r3].
  		[JumpLess]				-> [^self concretizeConditionalJump: 16rC].
  		[JumpGreaterOrEqual]	-> [^self concretizeConditionalJump: 16rD].
  		[JumpGreater]			-> [^self concretizeConditionalJump: 16rF].
  		[JumpLessOrEqual]		-> [^self concretizeConditionalJump: 16rE].
  		[JumpBelow]			-> [^self concretizeConditionalJump: 16r2].
  		[JumpAboveOrEqual]	-> [^self concretizeConditionalJump: 16r3].
  		[JumpAbove]			-> [^self concretizeConditionalJump: 16r7].
  		[JumpBelowOrEqual]	-> [^self concretizeConditionalJump: 16r6].
  		[JumpFPEqual]				-> [^self concretizeConditionalJump: 16r4].
  		[JumpFPNotEqual]			-> [^self concretizeConditionalJump: 16r5].
  		[JumpFPLess]				-> [^self concretizeConditionalJump: 16r2].
  		[JumpFPGreaterOrEqual]	-> [^self concretizeConditionalJump: 16r3].
  		[JumpFPGreater]			-> [^self concretizeConditionalJump: 16r7].
  		[JumpFPLessOrEqual]		-> [^self concretizeConditionalJump: 16r6].
  		[JumpFPOrdered]			-> [^self concretizeConditionalJump: 16rB].
  		[JumpFPUnordered]			-> [^self concretizeConditionalJump: 16rA].
  		[RetN]						-> [^self concretizeRetN].
  		[Stop]						-> [^self concretizeStop].
  		"Arithmetic"
  		[AddCqR]					-> [^self concretizeArithCqRWithRO: 0 raxOpcode: 15r05].
  		[AddCwR]					-> [^self concretizeArithCwR: 16r03].
  		[AddRR]						-> [^self concretizeAddRR].
  		[AddRdRd]					-> [^self concretizeSEE2OpRdRd: 16r58].
  		[AndCqR]					-> [^self concretizeArithCqRWithRO: 4 raxOpcode: 16r25].
  		[AndCwR]					-> [^self concretizeArithCwR: 16r23].
  		[AndRR]						-> [^self concretizeAndRR].
  		[TstCqR]					-> [^self concretizeTstCqR].
  		[CmpCqR]					-> [^self concretizeArithCqRWithRO: 7 raxOpcode: 16r3D].
  		[CmpCwR]					-> [^self concretizeArithCwR: 16r39].
+ 		[CmpC32R]					-> [^self concretizeCmpC32R].
  		[CmpRR]					-> [^self concretizeCmpRR].
  		[CmpRdRd]					-> [^self concretizeCmpRdRd].
  		[DivRdRd]					-> [^self concretizeSEE2OpRdRd: 16r5E].
  		[MulRdRd]					-> [^self concretizeSEE2OpRdRd: 16r59].
  		[OrCqR]						-> [^self concretizeArithCqRWithRO: 1 raxOpcode: 16r0D].
  		[OrCwR]					-> [^self concretizeArithCwR: 16r0B].
  		[OrRR]						-> [^self concretizeOrRR].
  		[SubCqR]					-> [^self concretizeArithCqRWithRO: 5 raxOpcode: 16r2D].
  		[SubCwR]					-> [^self concretizeArithCwR: 16r2B].
  		[SubRR]						-> [^self concretizeSubRR].
  		[SubRdRd]					-> [^self concretizeSEE2OpRdRd: 16r5C].
  		[SqrtRd]					-> [^self concretizeSqrtRd].
  		[XorCwR]					-> [^self concretizeArithCwR: 16r33].
  		[XorRR]						-> [^self concretizeXorRR].
  		[NegateR]					-> [^self concretizeNegateR].
  		[LoadEffectiveAddressMwrR]	-> [^self concretizeLoadEffectiveAddressMwrR].
  		[ArithmeticShiftRightCqR]		-> [^self concretizeShiftCqRegOpcode: 7].
  		[LogicalShiftRightCqR]			-> [^self concretizeShiftCqRegOpcode: 5].
  		[LogicalShiftLeftCqR]			-> [^self concretizeShiftCqRegOpcode: 4].
  		[ArithmeticShiftRightRR]			-> [^self concretizeShiftRegRegOpcode: 7].
  		[LogicalShiftLeftRR]				-> [^self concretizeShiftRegRegOpcode: 4].
  		"Data Movement"
  		[MoveCqR]			-> [^self concretizeMoveCqR].
  		[MoveCwR]			-> [^self concretizeMoveCwR].
  		[MoveC32R]		-> [^self concretizeMoveC32R].
  		[MoveRR]			-> [^self concretizeMoveRR].
  		[MoveAwR]			-> [^self concretizeMoveAwR].
  		[MoveRAw]			-> [^self concretizeMoveRAw].
  		[MoveAbR]			-> [^self concretizeMoveAbR].
  		[MoveRAb]			-> [^self concretizeMoveRAb].
  		[MoveMbrR]			-> [^self concretizeMoveMbrR].
  		[MoveRMbr]			-> [^self concretizeMoveRMbr].
  		[MoveM16rR]		-> [^self concretizeMoveM16rR].
  		[MoveM64rRd]		-> [^self concretizeMoveM64rRd].
  		[MoveMwrR]		-> [^self concretizeMoveMwrR].
  		[MoveXbrRR]		-> [^self concretizeMoveXbrRR].
  		[MoveRXbrR]		-> [^self concretizeMoveRXbrR].
  		[MoveXwrRR]		-> [^self concretizeMoveXwrRR].
  		[MoveRXwrR]		-> [^self concretizeMoveRXwrR].
  		[MoveX32rRR]		-> [^self concretizeMoveX32rRR].
  		[MoveRMwr]		-> [^self concretizeMoveRMwr].
  		[MoveRdM64r]		-> [^self concretizeMoveRdM64r].
  		[PopR]				-> [^self concretizePopR].
  		[PushR]				-> [^self concretizePushR].
  		[PushCq]			-> [^self concretizePushCq].
  		[PushCw]			-> [^self concretizePushCw].
  		[PrefetchAw]		-> [^self concretizePrefetchAw].
  		"Conversion"
  		[ConvertRRd]		-> [^self concretizeConvertRRd] }!

Item was changed:
  ----- Method: Cogit>>compileClosedPICPrototype (in category 'in-line cacheing') -----
  compileClosedPICPrototype
  	"Compile the abstract instructions for a full closed PIC, used to generate the chunk of code
  	 which is copied to form each closed PIC.  A Closed Polymorphic Inline Cache is a small jump
  	 table used to optimize sends with a limited degree of polymorphism (currently up to 6 cases).
  	 We call it closed because it deals only with a finite number of cases, as opposed to an Open PIC.
  	 When a monomorphic linked send (a send with a single case, linking direct to the checked entry
  	 point of a CogMethod) fails a class check, the Cogit attempts to create a two-entry PIC that will
  	 handle jumping to the original target for the original class and the relevant target for the new
  	 class.  This jump table will be extended on subsequent failures up to a limit (6).
  
  	 We avoid extending CPICs to Open PICs by linking the send site to an Open PIC if one already
  	 exists with the send's selector, a good policy since measurements show that sends of mega-
  	 morphic selectors usually become megamorphic at all send sites.  Hence the Open PIC list.
  
  	 A CPIC also optimizes MNUs and interpret-only methods.  Each case can load SendNumArgs with
  	 the oop of a method, or will load SendNumArgs with 0 if not.  MNUs are optimized by jumping to
  	 the mnuAbort in the CPIC, which calls code that creates the Message, thereby avoiding looking up
  	 the original message which will not be found, and either looks up doesNotUnderstand: or directly
  	 activates the method loaded into SendNumArgs, hence avoiding looking up doesNotUnderstand:.
  	 Interpret-only methods are handled by jumping to the picInterpretAbort, which enters the
  	 interpreter activating the method loaded in SendNumArgs.
  
  	 CPICs look like the following, where rClass is set at the original send site for the 1st case, and #Foo
  	 is some constant, either an oop, a class tag or an instruction address.
  
  		rTemp := (rRecever bitAnd: TagMask) = 0 ifTrue: [rReceiver class] ifFalse: [rRecever bitAnd: TagMask].
  		rTemp = rClass ifFalse:
  			[self goto: #Label].
  		rSendNumArgs := #MethodForCase1Or0.
  		self goto: #TargetForCase1.
  	 #Label
  		rTemp = #ClassTagForCase6 ifTrue:
  			[rSendNumArgs := #MethodForCase6Or0.
  			 self goto: #TargetForCase6].
  		...cases 5, 4 & 3
  		rTemp = #ClassTagForCase2 ifTrue:
  			[rSendNumArgs := #MethodForCase2Or0.
  			 self goto: #TargetForCase2].
  		self goto: #CPICMissTrampoline
  		literals (if out-of-line literals)
  
  	 where we short-cut as many cases as needed by making the self goto: #Label skip as many cases
  	 as needed."
  	<inline: true>
  	| numArgs jumpNext |
  	<var: #jumpNext type: #'AbstractInstruction *'>
  	self compilePICAbort: (numArgs := 0). "Will get rewritten to appropriate arity when configuring."
  	jumpNext := self compileCPICEntry.
  	"At the end of the entry code we need to jump to the first case code, which is actually the last chunk.
  	 On each entension we must update this jump to move back one case."
  	"16r5EAF00D is the method oop, or 0, for the 1st case."
  	self MoveUniqueCw: 16r5EAF00D R: SendNumArgsReg.
  	self JumpLong: self cPICPrototypeCaseOffset + 16rCA5E10.
  	endCPICCase0 := self Label.
  	1 to: maxCPICCases - 1 do:
  		[:h|
  		h = (maxCPICCases - 1) ifTrue:
  			[jumpNext jmpTarget: self Label]. "this is where we jump to for the first case"
  		"16rBABE1F15+h is the class tag for the Nth case"
+ 		self CmpC32: 16rBABE1F15+h R: TempReg.
- 		self CmpCw: 16rBABE1F15+h R: TempReg.
  		"16rBADA550+h is the method oop, or 0, for the Nth case."
  		self MoveUniqueCw: 16rBADA550 + h R: SendNumArgsReg.
  		self JumpLongZero: self cPICPrototypeCaseOffset + 16rCA5E10 + (h * 16).
  		h = 1 ifTrue:
  			[endCPICCase1 := self Label]].
  	self MoveCw: methodLabel address R: ClassReg.
  	self JumpLong: (self cPICMissTrampolineFor: numArgs).	"Will get rewritten to appropriate arity when configuring."
  	cPICEndOfCodeLabel := self Label.
  	literalsManager dumpLiterals: false.
  	^0!

Item was changed:
+ ----- Method: Cogit>>expectedClosedPICPrototype: (in category 'in-line cacheing') -----
- ----- Method: Cogit>>expectedClosedPICPrototype: (in category 'garbage collection') -----
  expectedClosedPICPrototype: cPIC
  	"Answer 0 if the ClosedPIC is as expected from compileClosedPICPrototype,
  	 otherwise answer an error code identifying the first discrepancy found."
  	"self disassembleFrom: methodZoneBase + (self sizeof: CogMethod) to: methodZoneBase + closedPICSize"
  	<var: #cPIC type: #'CogMethod *'>
  	| pc offsetToLiteral object entryPoint |
  	pc := cPIC asInteger + firstCPICCaseOffset.
  	"First jump is unconditional; subsequent ones are conditional"
  	offsetToLiteral := backEnd jumpLongByteSize.
  	
  	object := literalsManager objRefInClosedPICAt: pc - offsetToLiteral.
  	self assert: (object = 16r5EAF00D).
  
  	entryPoint := backEnd jumpLongTargetBeforeFollowingAddress: pc.
  	self assert: (entryPoint = (self cPICPrototypeCaseOffset + 16rCA5E10)).
  
  	1 to: maxCPICCases - 1 do:
  		[:i |
  		pc := pc + cPICCaseSize.
  		offsetToLiteral := backEnd jumpLongConditionalByteSize.
+ 
- 				
  		object := literalsManager classRefInClosedPICAt: pc - offsetToLiteral.
+ 		self assert: object = (16rBABE1F15 + i).
+ 		literalsManager storeClassRef: (object bitXor: 16r5A5A5A5A) inClosedPICAt: pc - offsetToLiteral.
+ 		object := literalsManager classRefInClosedPICAt: pc - offsetToLiteral.
+ 		self assert: object = ((16rBABE1F15 + i bitXor: 16r5A5A5A5A)).
+ 		literalsManager storeClassRef: (object bitXor: 16r5A5A5A5A) inClosedPICAt: pc - offsetToLiteral.
- 		self assert: (object = (16rBABE1F15 + i)).
  
  		object := literalsManager objRefInClosedPICAt: pc - offsetToLiteral.
+ 		self assert: object = (16rBADA550 + i).
+ 		literalsManager storeObjRef: (object bitXor: 16rA5A5A5A5) inClosedPICAt: pc - offsetToLiteral.
+ 		object := literalsManager objRefInClosedPICAt: pc - offsetToLiteral.
+ 		self assert: object = ((16rBADA550 + i) bitXor: 16rA5A5A5A5).
+ 		literalsManager storeObjRef: (object bitXor: 16rA5A5A5A5) inClosedPICAt: pc - offsetToLiteral.
- 		self assert: (object = (16rBADA550 + i)).
  
  		entryPoint := backEnd jumpLongConditionalTargetBeforeFollowingAddress: pc.
+ 		self assert: entryPoint = (self cPICPrototypeCaseOffset + 16rCA5E10 + (i * 16))].
- 		self assert: (entryPoint = (self cPICPrototypeCaseOffset + 16rCA5E10 + (i * 16)))].
  
  	entryPoint := backEnd jumpLongTargetBeforeFollowingAddress: pc + cPICEndSize - literalsManager endSizeOffset.
  	self assert: (entryPoint = (self cPICMissTrampolineFor: 0)).
  	
  	^0!



More information about the Vm-dev mailing list