[Vm-dev] VM Maker: VMMaker.oscog-eem.2795.mcz

commits at source.squeak.org commits at source.squeak.org
Sun Aug 30 01:12:00 UTC 2020


Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.2795.mcz

==================== Summary ====================

Name: VMMaker.oscog-eem.2795
Author: eem
Time: 29 August 2020, 6:11:51.399954 pm
UUID: 94ce8042-5549-46ab-b693-8d00f36d6d71
Ancestors: VMMaker.oscog-eem.2794

ThreadedARM64Plugin: Implement support for Homogenous Float Arrays (HVAs, structs with up to four float fields, or up to four double fields).  These are passed and returned in floating-point argument registers, on call if sufficient are available.  To implement this the ThreadedARM64Plugin uses a union of a struct containing four doubles, and a struct containing eight floats.  All float/double/HVA returns are handled by a call that expects a struct of four doubles.  hence Slang changes are needed (see below) to allow the struct to be conveniently defined with local methods.

Mark all methods required to be inlined to be in the same funcuton as the alloca as inline: #always.  Hence their code will only occur inlined, not a second time in an unused function.

Tidy up, pulling the unaligned accessor macros out of the preamble and explicitly into methods, whether Slang has a chance to generate code correctly given their presence.

Also make sure that all references to a type spec are typed as unsigned int/unsigned int *, including the callout state's ffiArgSpec.

This fixes about five test cases in the FFI tests.

Fix a warning by typing InterpreterProxy>>characterObjectOf:'s argument as int to agree with sqVirtualMachine.h.

Slang: Fix several issues with inlining and type inferrence to support the above.  Distinguish macros from struct accessors; previously isStructSend: could be confused.  Make sure that structTargetKindForDeclaration: answers #pointer only for types endign with a *; previously it could be confused by e.g. a struct containing pointers.  Make isTypePointerToStruct: more robust, answering false for anything that isn't a string and then analysing the string. emitCCodeAsFieldReferenceOn:level:generator: must also check for shouldGenerateAsInterpreterProxySend:. tryToInlineMethodsIn: must push the current method's declarations onto the scope stack to allow priper type inferrence while inlining.  Since these changes now allow e.g. a structure method to be inlined, extend node:typeCompatibleWith:inliningInto:in: to inline such arguments; it needs to take the address of the argument to derive the lined pointer to the actual argument.

=============== Diff against VMMaker.oscog-eem.2794 ===============

Item was added:
+ ----- Method: CCodeGenerator>>isKernelMacroSelector: (in category 'utilities') -----
+ isKernelMacroSelector: sel
+ 	"Answer if the given selector is one of the selectors implemented as a macro in platform header fiels."
+ 
+ 	^(self isKernelSelector: sel)
+ 	 or: [VMBasicConstants mostBasicConstantSelectors includes: sel]!

Item was added:
+ ----- Method: CCodeGenerator>>isMacroSelector: (in category 'utilities') -----
+ isMacroSelector: sel
+ 	"Answer if the given selector is one of the selectors implemented as a macro in platform header fiels."
+ 
+ 	^(self isKernelSelector: sel)
+ 	 or: [(VMBasicConstants mostBasicConstantSelectors includes: sel)
+ 	 or: [(self methodNamed: sel)
+ 			ifNil: [false]
+ 			ifNotNil: [:m| m definedAsMacro]]]!

Item was changed:
  ----- Method: CCodeGenerator>>node:typeCompatibleWith:inliningInto:in: (in category 'inlining') -----
  node: exprNode typeCompatibleWith: argName inliningInto: targetMethod in: aTMethod
  	"Answer either exprNode or, if required, a cast of exprNode to the type of argName.
  	 The cast is required if
  		- argName is typed and exprNode is untyped
  		- argName is untyped and exprNode is an arithmetic type of size > #sqInt
  		- both argName and exprNode are typed but they are incompatible"
  	| formalType actualType |
  	formalType := targetMethod typeFor: argName in: self.
  	actualType := self typeFor: exprNode in: aTMethod.
+ 	"First check for inlining a struct method/accessor"
+ 	((argName beginsWith: 'self_in_')
+ 	 and: [formalType last == $*
+ 	 and: [(formalType beginsWith: actualType)
+ 	 and: [(formalType allButFirst: actualType size) withBlanksTrimmed = '*']]]) ifTrue:
+ 		[^TSendNode new
+ 			setSelector: #addressOf:
+ 			receiver: (TVariableNode new setName: 'self')
+ 			arguments: { exprNode }
+ 			isBuiltInOp: false].
+ 	"Second check for arithmetic coercion"
+ 	((exprNode isSend or: [exprNode isVariable])
- 	^((exprNode isSend or: [exprNode isVariable])
  	   and: [(formalType notNil and: [actualType isNil])
  			or: [(formalType isNil and: [actualType notNil and: [(self isIntegralCType: actualType) and: [(self sizeOfIntegralCType: actualType) > (self sizeOfIntegralCType: #sqInt)]]])
  			or: [(self variableOfType: formalType acceptsValue: exprNode ofType: actualType) not]]])
+ 		ifTrue:
+ 			[^self nodeToCast: exprNode to: (formalType ifNil: [#sqInt])].
+ 	"Next check for type errors..."
+ 	((exprNode isSend or: [exprNode isVariable])
+ 	  and: [(self
+ 			variableOfType: (targetMethod typeFor: argName in: self)
+ 			acceptsValue: exprNode
+ 			ofType: (self typeFor: exprNode in: aTMethod)) not]) ifTrue:
+ 		[logger
+ 			nextPutAll:
+ 				'type mismatch for formal ', argName, ' and actual "', exprNode asString,
+ 				'" when inlining ', targetMethod selector, ' in ', aTMethod selector, '. Use a cast.';
+ 			cr; flush].
+ 	"No conversion is necessary, or there's a type error..."
+ 	^exprNode!
- 		ifTrue: [self nodeToCast: exprNode to: (formalType ifNil: [#sqInt])]
- 		ifFalse:
- 			[((exprNode isSend or: [exprNode isVariable])
- 			  and: [(self
- 					variableOfType: (targetMethod typeFor: argName in: self)
- 					acceptsValue: exprNode
- 					ofType: (self typeFor: exprNode in: aTMethod)) not]) ifTrue:
- 				[logger
- 					nextPutAll:
- 						'type mismatch for formal ', argName, ' and actual "', exprNode asString,
- 						'" when inlining ', targetMethod selector, ' in ', aTMethod selector, '. Use a cast.';
- 					cr; flush]. 
- 			exprNode]!

Item was changed:
  ----- Method: InterpreterProxy>>characterObjectOf: (in category 'object access') -----
  characterObjectOf: characterCode
  	<option: #(atLeastVMProxyMajor:minor: 1 13)>
+ 	<var: 'characterCode' type: #int>
  	^StackInterpreter objectMemoryClass characterObjectOf: characterCode!

Item was changed:
  ----- Method: TMethod>>checkForCompletenessIn: (in category 'inlining support') -----
  checkForCompletenessIn: aCodeGen
  	"Set the complete flag if the parse tree contains no further candidates for inlining."
  	| foundIncompleteSend incompleteSends |
  	aCodeGen maybeBreakForTestOfInliningOf: selector.
  
  	foundIncompleteSend := false.
  	incompleteSends := IdentitySet new.
  
  	parseTree
  		nodesDo:
  			[:node|
  			 node isSend ifTrue:
  				[(self methodIsEffectivelyComplete: node selector in: aCodeGen)
  					ifTrue:
  						[(self inlineableFunctionCall: node in: aCodeGen) ifTrue:
+ 							[incompleteSends add: node.
+ 							 complete := false.  "more inlining to do"
+ 							 ^self]]
- 							[complete := false.  "more inlining to do"
- 							^self]]
  					ifFalse:
  						[foundIncompleteSend := true.
  						 incompleteSends add: node]]]
  		unless:
  			[:node|
  			node isSend
  			and: [node selector == #cCode:inSmalltalk:
  				or: [aCodeGen isAssertSelector: node selector]]].
  
  	foundIncompleteSend ifFalse:
  		[complete := true]!

Item was changed:
  ----- Method: TMethod>>inlineableFunctionCall:in: (in category 'inlining') -----
  inlineableFunctionCall: aNode in: aCodeGen
  	"Answer if the given send node is a call to a 'functional' method--a method whose body is a single return statement of some expression and whose actual parameters can all be directly substituted."
  
  	aCodeGen maybeBreakForTestToInline: aNode in: self.
  	aNode isSend ifFalse:
  		[^false].
+ 	((aCodeGen shouldGenerateAsInterpreterProxySend: aNode)
+ 	 or: [aCodeGen isStructSend: aNode]) ifTrue:
+ 		[^false].
  	^(aCodeGen methodNamed: aNode selector)
  		ifNil:
  			[aNode asTransformedConstantPerform
  				ifNil: [self isInlineableConditional: aNode in: aCodeGen]
  				ifNotNil: [:n| self inlineableFunctionCall: n in: aCodeGen]]
  		ifNotNil:
  			[:m|
  			 (m ~~ self
  			  and: [((m isFunctionalIn: aCodeGen) or: [m mustBeInlined and: [m isComplete]])
  			  and: [m mayBeInlined
  			  and: [(aCodeGen mayInline: m selector)
  			  and: [aNode args allSatisfy: [:a| self isSubstitutableNode: a intoMethod: m in: aCodeGen]]]]])
  			 or: [m checkForRequiredInlinability]]!

Item was changed:
  ----- Method: TMethod>>methodIsEffectivelyComplete:in: (in category 'inlining support') -----
  methodIsEffectivelyComplete: selector in: aCodeGen
  	"Answer if selector is effectively not inlineable in the receiver.
  	 This is tricky because block inlining requires that certain methods must be inlined, which
  	 can be at odds wuth the opportunistic strategy the inliner takes.  Since the inliner only
  	 inlines complete methods and certain methods may never be marked as complete (e.g.
  	 recursive methods) we have to short-cut certain kinds of send.  In particular, short-cut
  	 sends that turn into jumps in the interpret routine (sharedCase and sharedLabel below)."
  	^(aCodeGen methodNamed: selector)
  		ifNil: [true] "builtins or externals are not inlineable"
  		ifNotNil:
  			[:m|
  			 m isComplete
  			 "unlinable methods can't be inlined"
  			 or: [m mayBeInlined not
  			 "Methods which are inlined as jumps don't need inlining"
+ 			 or: [m sharedCase notNil or: [m sharedLabel notNil
+ 			 "Macros and struct accessors don't need inlining"
+ 			 or: [m definedAsMacro or: [m isStructAccessor]]]]]]!
- 			 or: [m sharedCase notNil or: [m sharedLabel notNil]]]]!

Item was changed:
  ----- Method: TMethod>>tryToInlineMethodsIn: (in category 'inlining') -----
  tryToInlineMethodsIn: aCodeGen
  	"Expand any (complete) inline methods sent by this method.
  	 Set the complete flag when all inlining has been done.
  	 Answer if something was inlined."
  
- 	| didSomething statementLists |
- 	"complete ifTrue:
- 		[^false]."
- 
  	self definedAsMacro ifTrue:
  		[complete ifTrue:
  			[^false].
  		 ^complete := true].
  
+ 	^aCodeGen
+ 		pushScope: declarations
+ 		while:
+ 			[| didSomething statementLists |
+ 			self ensureConditionalAssignmentsAreTransformedIn: aCodeGen.
+ 			didSomething := self tryToInlineMethodStatementsIn: aCodeGen statementListsInto: [:stmtLists| statementLists := stmtLists].
+ 			didSomething := (self tryToInlineMethodExpressionsIn: aCodeGen) or: [didSomething].
- 	self ensureConditionalAssignmentsAreTransformedIn: aCodeGen.
- 	didSomething := self tryToInlineMethodStatementsIn: aCodeGen statementListsInto: [:stmtLists| statementLists := stmtLists].
- 	didSomething := (self tryToInlineMethodExpressionsIn: aCodeGen) or: [didSomething].
  
+ 			didSomething ifTrue:
+ 				[writtenToGlobalVarsCache := nil].
- 	didSomething ifTrue:
- 		[writtenToGlobalVarsCache := nil].
  
+ 			complete ifFalse:
+ 				[self checkForCompletenessIn: aCodeGen.
+ 				 complete ifTrue: [didSomething := true]].  "marking a method complete is progress"
+ 			didSomething]!
- 	complete ifFalse:
- 		[self checkForCompletenessIn: aCodeGen.
- 		 complete ifTrue: [didSomething := true]].  "marking a method complete is progress"
- 	^didSomething!

Item was changed:
  ----- Method: TSendNode>>emitCCodeAsFieldReferenceOn:level:generator: (in category 'C code generation') -----
  emitCCodeAsFieldReferenceOn: aStream level: level generator: aCodeGen
  	"If appropriate, translate this message send as a pointer dereference"
  
  	| parenCount |
  	(aCodeGen isStructSend: self) ifFalse:
  		[^false].
+ 	(aCodeGen shouldGenerateAsInterpreterProxySend: self) ifTrue:
+ 		[^false].
  
  	parenCount := receiver isSend ifTrue: [2] ifFalse: [1].
  	aStream next: parenCount put: $(.
  	receiver  emitCCodeAsExpressionOn: aStream level: 0 generator: aCodeGen.
  	parenCount > 1 ifTrue:
  		[aStream nextPut: $)].
  	(receiver structTargetKindIn: aCodeGen) caseOf: {
  		[#pointer] -> [aStream nextPut: $-; nextPut: $>].
  		[#struct] -> [aStream nextPut: $.] }.
  	aStream nextPutAll: (aCodeGen cFunctionNameFor: selector).
  	arguments isEmpty ifFalse:
  		[self assert: arguments size = 1.
  		 aStream nextPutAll: ' = '.
  		 arguments first emitCCodeAsExpressionOn: aStream level: level generator: aCodeGen].
  	aStream nextPut: $).
  	^true!

Item was changed:
  ----- Method: ThreadedARM32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
+ 	<inline: #always>
- 	<inline: true>
  	myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState).
  
  	calloutState floatRegisterIndex > 0 ifTrue:
  		[self 
  			load: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 0)) to: 'double *') at: 0)
  			Flo: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 2)) to: 'double *') at: 0)
  			a: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 4)) to: 'double *') at: 0)
  			t: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 6)) to: 'double *') at: 0)
  			R: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 8)) to: 'double *') at: 0)
  			e: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 10)) to: 'double *') at: 0)
  			g: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 12)) to: 'double *') at: 0)
  			s: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 14)) to: 'double *') at: 0)].
  
  	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
  		[atomicType = FFITypeSingleFloat
  			ifTrue:
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)]
  			ifFalse: "atomicType = FFITypeDoubleFloat"
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)].
  
  		 "undo any callee argument pops because it may confuse stack management with the alloca."
  		 (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  			[self setsp: calloutState argVector].
  		 interpreterProxy ownVM: myThreadIndex.
  
  		 ^interpreterProxy floatObjectOf: floatRet].
  
  	intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3).
  
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  	interpreterProxy ownVM: myThreadIndex.
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
  			[^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^self ffiReturnStruct: (self addressOf: intRet) ofType: (self ffiReturnType: specOnStack) in: calloutState].
  	
  	^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin class>>isStructType: (in category 'translation') -----
+ isStructType: typeName
+ 	| space |
+ 	^(space := typeName indexOf: Character space) > 0
+ 	 and: [#(union struct) includes: (typeName copyFrom: 1 to: space - 1)]!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin>>d (in category 'translation support') -----
+ d
+ 	"Hack for floatRet in ffiCalloutTo:SpecOnStack:in:"
+ 	<returnTypeC: 'struct { double doubles[4]; }'>
+ 	<inline: #always> "to make it disappear..."
+ 	^0!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin>>doubles (in category 'translation support') -----
+ doubles
+ 	"Hack for floatRet in ffiCalloutTo:SpecOnStack:in:"
+ 	<returnTypeC: #'double *'>
+ 	<inline: #always> "to make it disappear..."
+ 	^0!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin>>f (in category 'translation support') -----
+ f
+ 	"Hack for floatRet in ffiCalloutTo:SpecOnStack:in:"
+ 	<returnTypeC: 'struct { float floats[8]; }'>
+ 	<inline: #always> "to make it disappear..."
+ 	^0!

Item was changed:
  ----- Method: ThreadedARM64FFIPlugin>>ffiCall:ArgArrayOrNil:NumArgs: (in category 'callout support') -----
  ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
  	"Generic callout. Does the actual work.  If argArrayOrNil is nil it takes args from the stack
  	 and the spec from the method.  If argArrayOrNil is not nil takes args from argArrayOrNil
  	 and the spec from the receiver."
  	| flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result primNumArgs |
+ 	<inline: false>
- 	<inline: true>
  	<var: #theCalloutState type: #'CalloutState'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #allocation type: #'char *'>
  
  	primNumArgs := interpreterProxy methodArgumentCount.
  	(interpreterProxy is: externalFunction KindOfClass: interpreterProxy classExternalFunction) ifFalse:
  		[^self ffiFail: FFIErrorNotFunction].
  	"Load and check the values in the externalFunction before we call out"
  	flags := interpreterProxy fetchInteger: ExternalFunctionFlagsIndex ofObject: externalFunction.
  	interpreterProxy failed ifTrue:
  		[^self ffiFail: FFIErrorBadArgs].
  
  	"This must come early for compatibility with the old FFIPlugin.  Image-level code
  	 may assume the function pointer is loaded eagerly.  Thanks to Nicolas Cellier."
  	address := self ffiLoadCalloutAddress: externalFunction.
  	interpreterProxy failed ifTrue:
  		[^0 "error code already set by ffiLoadCalloutAddress:"].
  	
  	argTypeArray := interpreterProxy fetchPointer: ExternalFunctionArgTypesIndex ofObject: externalFunction.
  	"must be array of arg types"
  	((interpreterProxy isArray: argTypeArray)
  	and: [(interpreterProxy slotSizeOf: argTypeArray) = (nArgs + 1)]) ifFalse:
  		[^self ffiFail: FFIErrorBadArgs].
  	"check if the calling convention is supported"
  	self cppIf: COGMTVM
  		ifTrue:
  			[(self ffiSupportsCallingConvention: (flags bitAnd: FFICallTypesMask)) ifFalse:
  				[^self ffiFail: FFIErrorCallType]]
  		ifFalse: "not masking causes threaded calls to fail, which is as they should if the plugin is not threaded."
  			[(self ffiSupportsCallingConvention: flags) ifFalse:
  				[^self ffiFail: FFIErrorCallType]].
  		
  	requiredStackSize := self externalFunctionHasStackSizeSlot
  							ifTrue: [interpreterProxy
  										fetchInteger: ExternalFunctionStackSizeIndex
  										ofObject: externalFunction]
  							ifFalse: [-1].
  	interpreterProxy failed ifTrue:
  		[^interpreterProxy primitiveFailFor: (argArrayOrNil isNil
  												ifTrue: [PrimErrBadMethod]
  												ifFalse: [PrimErrBadReceiver])].
  	stackSize := requiredStackSize < 0 ifTrue: [DefaultMaxStackSize] ifFalse: [requiredStackSize].
  	self cCode: [] inSmalltalk: [theCalloutState := self class calloutStateClass new].
  	calloutState := self addressOf: theCalloutState.
  	self cCode: [self memset: calloutState _: 0 _: (self sizeof: #CalloutState)].
  	calloutState callFlags: flags.
  	"Fetch return type and args"
  	argType := interpreterProxy fetchPointer: 0 ofObject: argTypeArray.
  	argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  	argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  	(err := self ffiCheckReturn: argSpec With: argClass in: calloutState) ~= 0 ifTrue:
  		[^self ffiFail: err]. "cannot return"
  	"alloca the outgoing stack frame, leaving room for marshalling args, and including space for the return struct, if any.
  	Additional space reserved for saving register args like mandated by Win64 X64 or PPC ABI, will be managed by the call itself"
  	allocation := self alloca: stackSize + calloutState structReturnSize + self cStackAlignment.
  	self mustAlignStack ifTrue:
  		[allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1) to: #'char *'].
  	calloutState
  		argVector: allocation;
  		currentArg: allocation;
  		limit: allocation + stackSize.
  	1 to: nArgs do:
  		[:i|
  		argType := interpreterProxy fetchPointer: i ofObject: argTypeArray.
  		argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  		argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  		oop := argArrayOrNil isNil
  				ifTrue: [interpreterProxy stackValue: nArgs - i]
  				ifFalse: [interpreterProxy fetchPointer: i - 1 ofObject: argArrayOrNil].
  		err := self ffiArgument: oop Spec: argSpec Class: argClass in: calloutState.
  		err ~= 0 ifTrue:
  			[self cleanupCalloutState: calloutState.
  			 self cppIf: COGMTVM ifTrue:
  			 [err = PrimErrObjectMayMove negated ifTrue:
  				[^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
  			 ^self ffiFail: err]]. "coercion failed or out of stack space"
  	"Failures must be reported back from ffiArgument:Spec:Class:in:.
  	 Should not fail from here on in."
  	self assert: interpreterProxy failed not.
  	self ffiLogCallout: externalFunction.
  	(requiredStackSize < 0
  	 and: [self externalFunctionHasStackSizeSlot]) ifTrue:
  		[stackSize := calloutState currentArg - calloutState argVector.
  		 interpreterProxy storeInteger: ExternalFunctionStackSizeIndex ofObject: externalFunction withValue: stackSize].
  	"Go out and call this guy"
  	result := self ffiCalloutTo: address SpecOnStack: argArrayOrNil notNil in: calloutState.
  	self cleanupCalloutState: calloutState.
  	"Can not safely use argumentCount (via e.g. methodReturnValue:) since it may have been changed by a callback."
  	interpreterProxy pop: primNumArgs + 1 thenPush: result. 
  	^result!

Item was changed:
  ----- Method: ThreadedARM64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
+ 	| myThreadIndex atomicType floatRet intRet x1Ret specSize |
+ 	<var: #floatRet type: 'union { struct { float floats[8]; } f; struct dprr { double doubles[4]; } d; }'>
- 	| myThreadIndex atomicType floatRet intRet x1Ret |
- 	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
  	<var: #x1Ret type: #usqLong>
+ 	<inline: #always>
+ 	self cCode: [] inSmalltalk: [floatRet := ByteArray new: 32]. "Just a hack to placate the Smalltalk compiler; this shoudl be a proper struct..."
- 	<inline: true>
  	myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState).
  	
  	calloutState floatRegisterIndex > 0 ifTrue:
  		[self loadFloatRegs:
  			   (calloutState floatRegisters at: 0)
  			_: (calloutState floatRegisters at: 1)
  			_: (calloutState floatRegisters at: 2)
  			_: (calloutState floatRegisters at: 3)
  			_: (calloutState floatRegisters at: 4)
  			_: (calloutState floatRegisters at: 5)
  			_: (calloutState floatRegisters at: 6)
  			_: (calloutState floatRegisters at: 7)].
  
  	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
+ 	((atomicType >> 1) = (FFITypeSingleFloat >> 1)
+ 	or: [(calloutState ffiRetHeader bitAnd: FFIFlagPointer+FFIFlagStructure) = FFIFlagStructure
+ 		and: [self structIsHomogenousFloatArrayOfSize: (calloutState ffiRetHeader bitAnd: FFIStructSizeMask)
+ 				typeSpec: (self cCoerce: (interpreterProxy firstIndexableField: calloutState ffiRetSpec) to: #'unsigned int *')
+ 				ofLength: (specSize := interpreterProxy byteSizeOf: calloutState ffiRetSpec) / (self sizeof: #'unsigned int')]]) ifTrue:
+ 		[floatRet d: (self 
+ 					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'struct dprr (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- 	(atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
- 		[atomicType = FFITypeSingleFloat
- 			ifTrue:
- 				[floatRet := self 
- 					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5)
  					with: (calloutState integerRegisters at: 6)
+ 					with: (calloutState integerRegisters at: 7)).
- 					with: (calloutState integerRegisters at: 7)]
- 			ifFalse: "atomicType = FFITypeDoubleFloat"
- 				[floatRet := self 
- 					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
- 					with: (calloutState integerRegisters at: 0)
- 					with: (calloutState integerRegisters at: 1)
- 					with: (calloutState integerRegisters at: 2)
- 					with: (calloutState integerRegisters at: 3)
- 					with: (calloutState integerRegisters at: 4)
- 					with: (calloutState integerRegisters at: 5)
- 					with: (calloutState integerRegisters at: 6)
- 					with: (calloutState integerRegisters at: 7)].
  
  		 "undo any callee argument pops because it may confuse stack management with the alloca."
  		 (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  			[self setsp: calloutState argVector].
  		 interpreterProxy ownVM: myThreadIndex.
  
+ 		 atomicType = FFITypeDoubleFloat ifTrue:
+ 			[^interpreterProxy floatObjectOf: (floatRet d doubles at: 0)].
+ 		 atomicType = FFITypeSingleFloat ifTrue:
+ 			[^interpreterProxy floatObjectOf: (floatRet f floats at: 0)].
+ 		"If the struct is a vector of floats then move float[2] to float[1], float[4] to float[2] and float[6] to float[3],
+ 		 to pack the float data in the double fields.  We can tell if the struct is composed of floats if its size is less
+ 		 than the spec size, since the spec size is (1 + n fields) * 4 bytes, and the struct size is n fields * 4 bytes
+ 		 for floats and n fields * 8 bytes for doubles.  We can't access the spec post call because it may have moved."
+ 		specSize > calloutState structReturnSize ifTrue:
+ 			[floatRet f floats at: 1 put: (floatRet f floats at: 2).
+ 			 floatRet f floats at: 2 put: (floatRet f floats at: 4).
+ 			 floatRet f floats at: 3 put: (floatRet f floats at: 6)].
+ 		^self ffiReturnStruct: (self addressOf: floatRet) ofType: (self ffiReturnType: specOnStack) in: calloutState].
- 		 ^interpreterProxy floatObjectOf: floatRet].
  
  	"If struct address used for return value, call is special"
  	(self mustReturnStructOnStack: calloutState structReturnSize) 
  	ifTrue: [
  		intRet := 0.
+ 		self setReturnRegister: (self cCoerceSimple: calloutState limit to: #sqLong) "stack alloca'd struct"
+ 			 andCall: (self cCoerceSimple: procAddr to: #sqLong)
+ 			 withArgsArray: (self cCoerceSimple: (self addressOf: calloutState integerRegisters) to: #sqLong).
- 		self setReturnRegister: (self cCoerceSimple: calloutState limit to: 'sqLong') "stack alloca'd struct"
- 			 andCall: (self cCoerceSimple: procAddr to: 'sqLong')
- 			 withArgsArray: (self cCoerceSimple: (self addressOf: calloutState integerRegisters) to: 'sqLong').
  	] ifFalse: [
  		intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3)
  				with: (calloutState integerRegisters at: 4)
  				with: (calloutState integerRegisters at: 5)
  				with: (calloutState integerRegisters at: 6)
  				with: (calloutState integerRegisters at: 7).
  	
  	 x1Ret := self getX1register. "Capture x1 immediately. No problem if unused"
  	].
  	"If struct returned in registers, 
  	 place register values into calloutState integerRegisters"
  	(calloutState structReturnSize > 0
  	 and: [self returnStructInRegisters: calloutState]) ifTrue: 
  		["Only 2 regs used in ARMv8/Aarch64 current"
  		 calloutState integerRegisters at: 0 put: intRet. "X0"
  		 calloutState integerRegisters at: 1 put: x1Ret]. "X1"
  
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  	interpreterProxy ownVM: myThreadIndex.
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
  			[^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
+ 		 ^self ffiReturnStruct: (((self returnStructInRegisters: calloutState)
+ 								ifTrue: [self addressOf: calloutState integerRegisters]
+ 								ifFalse: [calloutState limit]))
+ 				ofType: (self ffiReturnType: specOnStack)
+ 				in: calloutState].
- 		 ^self ffiReturnStruct: (self addressOf: intRet) ofType: (self ffiReturnType: specOnStack) in: calloutState].
  	
  	^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!

Item was changed:
  ----- Method: ThreadedARM64FFIPlugin>>ffiPushStructure:ofSize:typeSpec:ofLength:in: (in category 'marshalling') -----
  ffiPushStructure: pointer ofSize: structSize typeSpec: argSpec ofLength: argSpecSize in: calloutState
  	<var: #pointer type: #'void *'>
+ 	<var: #argSpec type: #'unsigned int *'>
- 	<var: #argSpec type: #'sqInt *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<inline: true>
  	| availableRegisterSpace stackPartSize roundedSize |
+ 	"See IHI0055B_aapcs64.pdf sections 4.3.5 & 5.4.2 Stage C; we don't yet support HVA's"
+ 	(self structIsHomogenousFloatArrayOfSize: structSize typeSpec: argSpec ofLength: argSpecSize)
+ 		ifTrue:
+ 			[availableRegisterSpace := (NumFloatRegArgs - calloutState floatRegisterIndex) * self wordSize.
+ 			 structSize <= availableRegisterSpace ifTrue: "Stage C, step C.2, all in floating-point registers (!!!!)"
+ 				[self 
+ 					memcpy: (self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: calloutState floatRegisterIndex)) to: #'void *') 
+ 					_: pointer 
+ 					_: structSize.
+ 					"Round structSize up and divide by 8 ( NB: _not_ 4 !!)"
+ 				 calloutState floatRegisterIndex: calloutState floatRegisterIndex + (structSize + 7 bitShift: -3).
+ 				 ^0].
+ 			 "Stage C, step C.3"
+ 			 availableRegisterSpace := 0.
+ 			 calloutState floatRegisterIndex: 8]
+ 		ifFalse:
+ 			[availableRegisterSpace := (NumIntRegArgs - calloutState integerRegisterIndex) * self wordSize].
- 
- 	availableRegisterSpace := (NumIntRegArgs - calloutState integerRegisterIndex) * self wordSize.
  	stackPartSize := structSize.
+ 	availableRegisterSpace > 0 ifTrue: 
+ 		[structSize <= availableRegisterSpace ifTrue:"all in integer registers"
+ 			[self 
+ 				memcpy: (self cCoerceSimple: (self addressOf: (calloutState integerRegisters at: calloutState integerRegisterIndex)) to: #'void *') 
+ 				_: pointer 
+ 				_: structSize.
+ 				"Round structSize up and divide by 8 ( NB: _not_ 4 !!)"
+ 			 calloutState integerRegisterIndex: calloutState integerRegisterIndex + (structSize + 7 bitShift: -3).
+ 			 ^0].
+ 		 "If no previous co-processor candidate arg has already been pushed on the stack, then split the struct between registers and stack.
+ 		  Otherwise push entire struct on stack."
+ 		 calloutState currentArg = calloutState argVector
+ 			ifTrue: 
+ 		 		[stackPartSize := structSize - availableRegisterSpace.
+ 		 		self 
+ 					memcpy: (self cCoerceSimple: (self addressOf: (calloutState integerRegisters at: calloutState integerRegisterIndex)) to: 'void *') 
+ 					_: pointer 
+ 					_: availableRegisterSpace]
+ 			ifFalse:
+ 				[availableRegisterSpace := 0].
+ 		"Stage C, step C.11"
+ 		calloutState integerRegisterIndex: NumIntRegArgs].
- 	availableRegisterSpace > 0
- 		ifTrue: 
- 			[structSize <= availableRegisterSpace
- 				ifTrue:
- 					["all in registers"
- 					 stackPartSize := 0.
- 					 self 
- 						memcpy: (self cCoerceSimple: (self addressOf: (calloutState integerRegisters at: calloutState integerRegisterIndex)) to: 'void *') 
- 						_: pointer 
- 						_: structSize.
- 						"Round structSize up and divide by 8 ( NB: _not_ 4 !!)"
- 					 calloutState integerRegisterIndex: calloutState integerRegisterIndex + (structSize + 3 bitShift: -3) ]
- 				ifFalse:
- 					["If no previous co-processor candidate arg has already been pushed on the stack, then split the struct between registers and stack.
- 					  Otherwise push entire struct on stack."
- 					 calloutState currentArg = calloutState argVector
- 						ifTrue: 
- 					 		[stackPartSize := structSize - availableRegisterSpace.
- 					 		self 
- 								memcpy: (self cCoerceSimple: (self addressOf: (calloutState integerRegisters at: calloutState integerRegisterIndex)) to: 'void *') 
- 								_: pointer 
- 								_: availableRegisterSpace]
- 						ifFalse:
- 							[availableRegisterSpace := 0].
- 					calloutState integerRegisterIndex: NumIntRegArgs]].
  
+ 	stackPartSize > 0 ifTrue: 
+ 		[roundedSize := stackPartSize + 3 bitClear: 3.
+ 		 calloutState currentArg + roundedSize > calloutState limit ifTrue:
+ 			 [^FFIErrorCallFrameTooBig].
+ 		 self memcpy: calloutState currentArg _: (self addressOf: ((self cCoerceSimple: pointer to: #'char *') at: availableRegisterSpace)) _: stackPartSize.
+ 		 calloutState currentArg: calloutState currentArg + roundedSize].
- 	stackPartSize > 0
- 		ifTrue: 
- 			[roundedSize := stackPartSize + 3 bitClear: 3.
- 			 calloutState currentArg + roundedSize > calloutState limit ifTrue:
- 				 [^FFIErrorCallFrameTooBig].
- 			 self memcpy: calloutState currentArg _: (self addressOf: ((self cCoerceSimple: pointer to: 'char *') at: availableRegisterSpace)) _: stackPartSize.
- 			 calloutState currentArg: calloutState currentArg + roundedSize].
  	^0!

Item was changed:
  ----- Method: ThreadedARM64FFIPlugin>>ffiReturnStruct:ofType:in: (in category 'callout support') -----
  ffiReturnStruct: longLongRetPtr ofType: ffiRetType in: calloutState
  	<var: #longLongRetPtr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	"Create a structure return value from an external function call.  The value has been stored in
  	 alloca'ed space pointed to by the calloutState or in the integer registers."
  	| retOop retClass oop |
  	<inline: true>
  	retClass := interpreterProxy fetchPointer: 1 ofObject: ffiRetType.
  	retOop := interpreterProxy instantiateClass: retClass indexableSize: 0.
  	self remapOop: retOop
  		in: [oop := interpreterProxy 
  					instantiateClass: interpreterProxy classByteArray 
  					indexableSize: calloutState structReturnSize].
  	self memcpy: (interpreterProxy firstIndexableField: oop)
+ 		_: longLongRetPtr
- 		_: ((self returnStructInRegisters: calloutState)
- 				ifTrue: [self addressOf: calloutState integerRegisters]
- 				ifFalse: [calloutState limit])
  		 _: calloutState structReturnSize.
  	interpreterProxy storePointer: 0 ofObject: retOop withValue: oop.
  	^retOop!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin>>floats (in category 'translation support') -----
+ floats
+ 	"Hack for floatRet in ffiCalloutTo:SpecOnStack:in:"
+ 	<returnTypeC: #'float *'>
+ 	<inline: #always> "to make it disappear..."
+ 	^0!

Item was added:
+ ----- Method: ThreadedARM64FFIPlugin>>structIsHomogenousFloatArrayOfSize:typeSpec:ofLength: (in category 'marshalling') -----
+ structIsHomogenousFloatArrayOfSize: structSize typeSpec: argSpec ofLength: argSpecSize
+ 	<var: #argSpec type: #'unsigned int *'>
+ 	| firstField typeOfFirstField |
+ 	(structSize <= (4 * (self sizeof: #double))
+ 	 and: [argSpecSize <= 5]) "header plus up to four fields" ifFalse:
+ 		[^false].
+ 	typeOfFirstField := self atomicTypeOf: (firstField := argSpec at: 1).
+ 	(typeOfFirstField ~= FFITypeSingleFloat and: [typeOfFirstField ~= FFITypeDoubleFloat]) ifTrue:
+ 		[^false].
+ 	2 to: argSpecSize - 1 do:
+ 		[:idx|
+ 		firstField ~= (argSpec at: idx) ifTrue:
+ 			[^false]].
+ 	^true!

Item was changed:
  ----- Method: ThreadedFFICalloutState class>>instVarNamesAndTypesForTranslationDo: (in category 'translation') -----
  instVarNamesAndTypesForTranslationDo: aBinaryBlock
  	"enumerate aBinaryBlock with the names and C type strings for the inst vars to include in a ThreadedFFICalloutState struct."
  
  	self instVarNames do:
  		[:ivn|
  		aBinaryBlock
  			value: ivn
  			value: (ivn caseOf: {
  						['argVector']	-> [#'char *'].
  						['currentArg']	-> [#'char *'].
  						['limit']			-> [#'char *'].
+ 						['ffiArgSpec']	-> [#'unsigned int *']. "ffiArgSpecs are WordArrays"
- 						['ffiArgSpec']	-> [#'void *'].
  						['stringArgs']	-> [{#'char *'. '[MaxNumArgs]'}] }
  					otherwise:
  						[#sqInt])]!

Item was changed:
  ----- Method: ThreadedFFIPlugin class>>preambleCCode (in category 'translation') -----
  preambleCCode
  	"For a source of builtin defines grep for builtin_define in a gcc release config directory.
+ 	 See platforms/Cross/vm/sqCogStackAlignment.h for per-platform definitions for
+ 	 STACK_ALIGN_BYTES MUST_ALIGN_STACK, getsp, et al."
- 	 See See platforms/Cross/vm/sqCogStackAlignment.h for per-platform definitions for
- 	 STACK_ALIGN_BYTES MUST_ALIGN_STACK et al."
  	^'
  #include "sqAssert.h" /* for assert */
  #define ThreadedFFIPlugin 1 /* to filter-out unwanted declarations from sqFFI.h */
  #include "sqFFI.h" /* for logging and surface functions */
  #include "sqCogStackAlignment.h" /* for STACK_ALIGN_BYTES and getsp() */
  
  #ifdef _MSC_VER
  # define alloca _alloca
  #endif
+ #if !!defined(setsp) && defined(__GNUC__)
+ # if defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__)
+ #	define setsp(spval) asm volatile ("movl %0,%%esp" : : "m"(spval))
+ # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
+ #	define setsp(spval) asm volatile ("movq %0,%%rsp" : : "m"(spval))
- #if defined(__GNUC__) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
- # define setsp(sp) asm volatile ("movl %0,%%esp" : : "m"(sp))
- # elif defined(__GNUC__) && (defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64))
- # define setsp(sp) asm volatile ("movq %0,%%rsp" : : "m"(sp))
  # elif defined(__arm64__) || defined(__aarch64__) || defined(ARM64)
          /* https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm
           * http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/index.html
           */
+ #	define setsp(spval) asm volatile ("mov sp, %0"  : : "r"(spval))
+ # elif defined(__arm__)
+ #	define setsp(spval) asm volatile ("ldr %%sp, %0" : : "m"(spval))
+ # endif
- #  if __GNUC__
- #   define getfp() ({ usqIntptr_t fp;                                                           \
-                                           asm volatile ("mov x0, x29" : "=r"(x29) : );  \
-                                           fp; })
- #   define getsp() ({ usqIntptr_t sp;                                                           \
-                                           asm volatile ("mov x0, sp" : "=r"(sp) : );    \
-                                           sp; })
- # define setsp(sp) asm volatile ("ldr x16, %0 \n\t" "mov sp, x16"  : : "m"(sp) )
- #  endif
- # elif defined(__GNUC__) && (defined(__arm__))
- # define setsp(sp) asm volatile ("ldr %%sp, %0" : : "m"(sp))
  #endif
  #if !!defined(getsp)
  # define getsp() 0
  #endif 
  #if !!defined(setsp)
  # define setsp(ignored) 0
  #endif 
  
  #if !!defined(STACK_ALIGN_BYTES)
  #  define STACK_ALIGN_BYTES 0
  #endif /* !!defined(STACK_ALIGN_BYTES) */
  
  /* For ABI that require stack alignment greater than natural word size */
  #define MUST_ALIGN_STACK (STACK_ALIGN_BYTES > sizeof(void*))
  
  #if defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__)
  /* Both Mac OS X x86 and Win32 x86 return structs of a power of two in size
   * less than or equal to eight bytes in length in registers. Linux never does so.
   */
  # if __linux__
  #	define WIN32_X86_STRUCT_RETURN 0
  # else
  #	define WIN32_X86_STRUCT_RETURN 1
  # endif
  # if _WIN32
  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
  # endif
+ #elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
- # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
  # if _WIN32 | _WIN64
  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
  # endif
  #endif /* defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__) */
  
  #if !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL)
  # if defined(__MINGW32__) && !!defined(__clang__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
      /*
       * cygwin -mno-cygwin (MinGW) gcc 3.4.x''s alloca is a library routine that answers
       * %esp + xx, so the outgoing stack is offset by one or more word if uncorrected.
       * Grab the actual stack pointer to correct.
       */
  #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 1
  # else
  #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 0
  # endif
  #endif /* !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL) */
  
  #if !!defined(PLATFORM_API_USES_CALLEE_POPS_CONVENTION)
  # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 0
  #endif
  
- /* This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
-  * The assumption right now is that all processors support unaligned access.  That only
-  * holds true for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address
-  * it properly.
-  */
- #define unalignedShortAt(a) shortAt(a)
- #define unalignedShortAtput(a,v) shortAtput(a,v)
- #define unalignedLong32At(a) long32At(a)
- #define unalignedLong32Atput(a,v) long32Atput(a,v)
- #define unalignedLong64At(a) long64At(a)
- #define unalignedLong64Atput(a,v) long64Atput(a,v)
- 
  /* The dispatchOn:in:with:with: generates an unwanted call on error.  Just squash it. */
  #define error(foo) 0
- #ifndef SQUEAK_BUILTIN_PLUGIN
- /* but print assert failures. */
- void
- warning(char *s) { /* Print an error message but don''t exit. */
- 	printf("\n%s\n", s);
- }
- #endif
  
  /* sanitize */
  #ifdef SQUEAK_BUILTIN_PLUGIN
  # define EXTERN 
  #else
  # define EXTERN extern
  #endif
  '!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>alignmentOfStructSpec:OfLength:StartingAt: (in category 'marshalling-struct') -----
  alignmentOfStructSpec: specs OfLength: specSize StartingAt: indexPtr
  	"Answer with the alignment requirement for a structure/union.
  	Note that indexPtr is a pointer so as to be changed on return.
  	On input, the index points to the structure header (the one with FFIFlagStructure + structSize).
  	On output, the index points the the structure trailer (the FFIFlagStructure)."
  	| spec byteAlignment thisAlignment |
+ 	<var: #specs type: #'unsigned int *'>
+ 	<var: #indexPtr type: #'sqInt *'>
- 	<var: #specs type: #'unsigned int*'>
- 	<var: #indexPtr type: #'unsigned int*'>
  	<inline: false>
  	spec := specs at: (indexPtr at: 0).
  	self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure.
  	byteAlignment := 1.
  	[indexPtr at: 0 put: (indexPtr at: 0) + 1.
  	(indexPtr at: 0) < specSize]
  		whileTrue:
  			[spec := specs at: (indexPtr at: 0).
  			spec = FFIFlagStructure
  				ifTrue: [^byteAlignment].
  			thisAlignment := (spec anyMask: FFIFlagPointer)
  				ifTrue: [BytesPerWord]
  				ifFalse: [(spec anyMask: FFIFlagStructure)
  					ifTrue: [self alignmentOfStructSpec: specs OfLength: specSize StartingAt: indexPtr]
  					ifFalse: [spec bitAnd: FFIStructSizeMask]].
  			byteAlignment := byteAlignment max: thisAlignment].
  	self assert: false. "should not reach here - because only ever called for sub-struct"
  	^byteAlignment!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>checkAlignmentOfStructSpec:OfLength:StartingAt: (in category 'marshalling-struct') -----
  checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: startIndex
  	"Check the alignment of a structure and return true if correctly aligned.
  	If computed size = declared size, then the struct is assumed correctly aligned."
  	| index spec computedSize fieldAlignment fieldSize declaredSize maxAlignment |
+ 	<var: #specs type: #'unsigned int *'>
- 	<var: #specs type: #'unsigned int*'>
  	<inline: false>
  	index := startIndex.
  	spec := specs at: index.
  	self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure.
+ 	(self isUnionSpec: specs OfLength: specSize StartingAt: index) ifTrue:
+ 		[^self checkAlignmentOfUnionSpec: specs OfLength: specSize StartingAt: startIndex].
- 	(self isUnionSpec: specs OfLength: specSize StartingAt: index)
- 		ifTrue:
- 			[^self checkAlignmentOfUnionSpec: specs OfLength: specSize StartingAt: startIndex].
  	declaredSize := spec bitAnd: FFIStructSizeMask.
  	computedSize := 0.
  	maxAlignment := 1.
+ 	[(index := index + 1) < specSize] whileTrue:
+ 		[spec := specs at: index.
+ 		spec = FFIFlagStructure ifTrue:
+ 			[^(computedSize - 1 bitOr: maxAlignment - 1) + 1 = declaredSize].
+ 		(spec anyMask: FFIFlagPointer)
+ 			ifTrue:
+ 				[fieldSize := BytesPerWord.
+ 				fieldAlignment := fieldSize]
+ 			ifFalse:
+ 				[fieldSize := spec bitAnd: FFIStructSizeMask.
+ 				(spec anyMask: FFIFlagStructure)
+ 					ifTrue:
+ 						[(self checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: index) ifFalse:
+ 							[^false].
+ 						 fieldAlignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index put: [:v| index := v])]
+ 					ifFalse: [fieldAlignment := fieldSize]].
+ 		"round to fieldAlignment"
+ 		maxAlignment := maxAlignment max: fieldAlignment.
+ 		computedSize := (computedSize - 1 bitOr: fieldAlignment - 1) + 1.
+ 		computedSize := computedSize + fieldSize].
- 	[index := index + 1.
- 	index < specSize]
- 		whileTrue:
- 			[spec := specs at: index.
- 			spec = FFIFlagStructure
- 				ifTrue: [^(computedSize - 1 bitOr: maxAlignment - 1) + 1 = declaredSize].
- 			(spec anyMask: FFIFlagPointer)
- 				ifTrue:
- 					[fieldSize := BytesPerWord.
- 					fieldAlignment := fieldSize]
- 				ifFalse:
- 					[fieldSize := spec bitAnd: FFIStructSizeMask.
- 					(spec anyMask: FFIFlagStructure)
- 						ifTrue:
- 							[(self checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: index)
- 								ifFalse: [^false].
- 							 fieldAlignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index)]
- 						ifFalse: [fieldAlignment := fieldSize]].
- 			"round to fieldAlignment"
- 			maxAlignment := maxAlignment max: fieldAlignment.
- 			computedSize := (computedSize - 1 bitOr: fieldAlignment - 1) + 1.
- 			computedSize := computedSize + fieldSize].
  	^(computedSize - 1 bitOr: maxAlignment - 1) + 1 = declaredSize!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>externalFunctionHasStackSizeSlot (in category 'symbol loading') -----
  externalFunctionHasStackSizeSlot
+ 	<inline: #always>
- 	<inline: true>
  	^externalFunctionInstSize > ExternalFunctionStackSizeIndex!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiArgument:Spec:Class:in: (in category 'callout support') -----
  ffiArgument: oop Spec: argSpec Class: argClass in: calloutState
  	"Callout support. Prepare the given oop as argument.
  	argSpec defines the compiled spec for the argument.
  	argClass (if non-nil) defines the required (super)class for the argument."
  	<var: #calloutState type: #'CalloutState *'>
  	| valueOop oopClass isStruct nilOop |
  	<inline: false>
  	oopClass := interpreterProxy fetchClassOf: oop. "Prefetch class (we'll need it)"
  	nilOop :=  interpreterProxy nilObject.
  	"Do the necessary type checks"
+ 	argClass = nilOop ifFalse:
- 	argClass = nilOop ifFalse:[
  		"Type check 1: 
+ 		Is the required class of the argument a general instance of ExternalStructure?"
+ 		[(interpreterProxy
+ 			includesBehavior: argClass 
+ 			ThatOf: interpreterProxy classExternalStructure) ifFalse:
+ 				[^FFIErrorWrongType].
- 		Is the required class of the argument a subclass of ExternalStructure?"
- 		(interpreterProxy includesBehavior: argClass 
- 						ThatOf: interpreterProxy classExternalStructure)
- 			ifFalse:[^FFIErrorWrongType]. "Nope. Fail."
  		"Type check 2:
+ 		Is the class of the argument a general instance of the required class?"
+ 		(nilOop = oop or: [interpreterProxy includesBehavior: oopClass ThatOf: argClass]) ifFalse:
+ 			[^FFIErrorCoercionFailed]].
+ 	"Okay, we've passed the type check (so far)"
- 		Is the class of the argument a subclass of required class?"
- 		((nilOop = oop) or:[interpreterProxy includesBehavior: oopClass ThatOf: argClass])
- 				ifFalse:[^FFIErrorCoercionFailed]. "Nope. Fail."
- 		"Okay, we've passed the type check (so far)"
- 	].
  
+ 	"Check if oopClass is a general instance of ExternalStructure.
+ 	 If this is the case we'll work on its handle and not the actual oop."
- 	"Check if oopClass is a subclass of ExternalStructure.
- 	If this is the case we'll work on it's handle and not the actual oop."
  	isStruct := false.
  	(oop ~= nilOop
  	 and: [interpreterProxy isPointers: oop]) ifTrue: "#isPointers: will fail if oop is immediate so don't even attempt to use it"
  		[isStruct := interpreterProxy
  						includesBehavior: oopClass 
  						ThatOf: interpreterProxy classExternalStructure.
  		 (argClass = nilOop or: [isStruct]) ifFalse:
  			[^FFIErrorCoercionFailed]].
  	"note: the test for #isPointers: above should speed up execution since no pointer type
  	 ST objects are allowed in external calls and thus if #isPointers: is true then the arg must
  	 be ExternalStructure to work. If it isn't then the code fails anyways so speed isn't an issue."
  
  	"Determine valueOop (e.g., the actual oop to pass as argument)"
+ 	valueOop := isStruct
+ 					ifTrue: [interpreterProxy fetchPointer: 0 ofObject: oop]
+ 					ifFalse: [oop].
- 	isStruct
- 		ifTrue:[valueOop := interpreterProxy fetchPointer: 0 ofObject: oop]
- 		ifFalse:[valueOop := oop].
  
  	"Fetch and check the contents of the compiled spec"
+ 	(interpreterProxy isWords: argSpec) ifFalse:
+ 		[^FFIErrorWrongType].
+ 	calloutState ffiArgSpecSize: (interpreterProxy byteSizeOf: argSpec) / (self sizeof: #'unsigned int').
+ 	calloutState ffiArgSpecSize = 0 ifTrue:
+ 		[^FFIErrorWrongType].
+ 	calloutState ffiArgSpec: (self cCoerce: (interpreterProxy firstIndexableField: argSpec) to: #'unsigned int *').
+ 	calloutState ffiArgHeader: (calloutState ffiArgSpec at: 0).
- 	(interpreterProxy isWords: argSpec)
- 		ifFalse:[^FFIErrorWrongType].
- 	calloutState ffiArgSpecSize: (interpreterProxy slotSizeOf: argSpec).
- 	calloutState ffiArgSpecSize = 0 ifTrue:[^FFIErrorWrongType].
- 	calloutState ffiArgSpec: (interpreterProxy firstIndexableField: argSpec).
- 	calloutState ffiArgHeader: (interpreterProxy longAt: calloutState ffiArgSpec).
  
  	"Do the actual preparation of the argument"
  	"Note: Order is important since FFIFlagStructure + FFIFlagPointer is used to represent 'typedef void* VoidPointer' and VoidPointer really is *struct* not pointer."
  
+ 	(calloutState ffiArgHeader anyMask: FFIFlagStructure) ifTrue: "argument must be ExternalStructure"
+ 		[isStruct ifFalse:
+ 			[^FFIErrorCoercionFailed].
+ 		(calloutState ffiArgHeader anyMask: FFIFlagAtomic) ifTrue:
+ 			[^FFIErrorWrongType]. "bad combination"
- 	(calloutState ffiArgHeader anyMask: FFIFlagStructure) ifTrue:[
- 		"argument must be ExternalStructure"
- 		isStruct ifFalse:[^FFIErrorCoercionFailed].
- 		(calloutState ffiArgHeader anyMask: FFIFlagAtomic) 
- 			ifTrue:[^FFIErrorWrongType]. "bad combination"
  		^self ffiPushStructureContentsOf: valueOop in: calloutState].
  
+ 	(calloutState ffiArgHeader anyMask: FFIFlagPointer) ifTrue: "no integers (or characters) for pointers please"
+ 		[(interpreterProxy isImmediate: oop) ifTrue:
+ 			[^FFIErrorIntAsPointer].
- 	(calloutState ffiArgHeader anyMask: FFIFlagPointer) ifTrue:[
- 		"no integers (or characters) for pointers please"
- 		(interpreterProxy isImmediate: oop) 
- 			ifTrue:[^FFIErrorIntAsPointer].
  
  		"but allow passing nil pointer for any pointer type"
+ 		oop = nilOop ifTrue:
+ 			[^self ffiPushPointer: nil in: calloutState].
- 		oop = nilOop ifTrue:[^self ffiPushPointer: nil in: calloutState].
  
  		"argument is reference to either atomic or structure type"
+ 		(calloutState ffiArgHeader anyMask: FFIFlagAtomic) ifTrue:
+ 			[isStruct ifTrue:"e.g., ExternalData"
+ 				[^self ffiAtomicStructByReference: oop Class: oopClass in: calloutState].
+ 			^self ffiAtomicArgByReference: oop Class: oopClass in: calloutState].
- 		(calloutState ffiArgHeader anyMask: FFIFlagAtomic) ifTrue:[
- 			isStruct "e.g., ExternalData"
- 				ifTrue:[^self ffiAtomicStructByReference: oop Class: oopClass in: calloutState]
- 				ifFalse:[^self ffiAtomicArgByReference: oop Class: oopClass in: calloutState].
  			"********* NOTE: The above uses 'oop' not 'valueOop' (for ExternalData) ******"
- 		].
  
  		"Needs to be external structure here"
+ 		isStruct ifTrue:
+ 			[^self ffiPushPointerContentsOf: valueOop in: calloutState].
+ 		^FFIErrorCoercionFailed].
- 		isStruct ifFalse:[^FFIErrorCoercionFailed].
- 		^self ffiPushPointerContentsOf: valueOop in: calloutState].
  
+ 	(calloutState ffiArgHeader anyMask: FFIFlagAtomic) ifTrue: "argument is atomic value"
+ 		[^self ffiArgByValue: valueOop in: calloutState].
+ 
- 	(calloutState ffiArgHeader anyMask: FFIFlagAtomic) ifTrue:[
- 		"argument is atomic value"
- 		^self ffiArgByValue: valueOop in: calloutState].
  	"None of the above - bad spec"
  	^FFIErrorWrongType!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<arg: #calloutState type: #'CalloutState *'>
  	"Perform the callout, collect the result and and create the return value.
  	 If a threaded call, disown and own VM around the call.  If there are floating-point
  	 arguments that are passed in registers then call a dummy function to load them.
  	 This *must* be inlined because of the alloca of the outgoing stack frame in
  	 ffiCall:SpecOnStack:Flags:NumArgs:Args:AndTypes:"
+ 	<inline: #always>
- 	<inline: true>
  	self subclassResponsibility!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiCreateIntegralResultOop:ofAtomicType:in: (in category 'callout support') -----
  ffiCreateIntegralResultOop: retVal ofAtomicType: atomicType in: calloutState
  	<inline: true>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #retVal type: #usqLong>
  	"Callout support. Return the appropriate oop for the given atomic type"
  	| shift value mask byteSize |
  	<var: 'value' type: #usqLong>
  	<var: 'mask' type: #usqLong>
  	self assert: atomicType < FFITypeSingleFloat.
  
  	atomicType = FFITypeBool ifTrue:
  		["Make sure bool honors the byte size requested"
  		 byteSize := calloutState ffiRetHeader bitAnd: FFIStructSizeMask.
  		 value := byteSize = (self sizeof: retVal)
  					ifTrue:[retVal]
  					ifFalse:[retVal bitAnd: 1 asUnsignedLongLong << (byteSize * 8) - 1].
  		 ^value = 0
  			ifTrue:[interpreterProxy falseObject]
  			ifFalse:[interpreterProxy trueObject]].
  	atomicType <= FFITypeSignedInt ifTrue:
  		["these are all generall integer returns"
  		atomicType <= (BytesPerWord = 8 ifTrue: [FFITypeSignedInt] ifFalse: [FFITypeSignedShort]) ifTrue:
  			["byte/short. first extract partial word, then sign extend"
  			shift := (BytesPerWord = 8 and: [atomicType >= FFITypeUnsignedInt])
  						ifTrue: [32]
  						ifFalse: [(atomicType >> 1) * 8]. "# of significant bits"
  			value := retVal bitAnd: (1 asUnsignedLongLong << shift - 1). 
  			(atomicType anyMask: 1) ifTrue:
  				["make the guy signed"
  				mask := 1 asUnsignedLongLong << (shift-1).
  				value := (value bitAnd: mask-1) - (value bitAnd: mask)].
  			^interpreterProxy integerObjectOf: value].
  		"Word sized integer return"
  		^(atomicType anyMask: 1)
  			ifTrue:[interpreterProxy signedMachineIntegerFor: retVal] "signed return"
  			ifFalse:[interpreterProxy positiveMachineIntegerFor: retVal]]. "unsigned return"
  
  	"longlong, char"
+ 	(atomicType >> 1) = (FFITypeSignedLongLong >> 1) ifTrue:
+ 		[^(atomicType anyMask: 1)
+ 			ifTrue: [interpreterProxy signed64BitIntegerFor: retVal] "signed return"
+ 			ifFalse: [interpreterProxy positive64BitIntegerFor: retVal]].
+ 	self cppIf: #SPURVM
+ 		ifTrue: [^interpreterProxy characterObjectOf: (retVal bitAnd: 16r3FFFFFFF)]
+ 		ifFalse: [^interpreterProxy characterObjectOf: (retVal bitAnd: 16rFF)]!
- 	^(atomicType >> 1) = (FFITypeSignedLongLong >> 1) 
- 		ifTrue:
- 			[(atomicType anyMask: 1)
- 				ifTrue:[interpreterProxy signed64BitIntegerFor: retVal] "signed return"
- 				ifFalse:[interpreterProxy positive64BitIntegerFor: retVal]]
- 		ifFalse:
- 			[interpreterProxy characterObjectOf: (retVal bitAnd: 16rFF)]!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiPushPointerContentsOf:in: (in category 'marshalling') -----
  ffiPushPointerContentsOf: oop in: calloutState
  	<var: #calloutState type: #'CalloutState *'>
  	"Push the contents of the given external structure"
  	| ptrClass ptrAddress |
  	<inline: false>
  	ptrClass := interpreterProxy fetchClassOf: oop.
  	ptrClass = interpreterProxy classExternalAddress ifTrue:
  		[ptrAddress := (interpreterProxy fetchPointer: 0 ofObject: oop) asVoidPointer.
  		"Don't you dare to pass pointers into object memory"
+ 		(interpreterProxy isInMemory: ptrAddress asUnsignedIntegerPtr) ifTrue:
- 		(interpreterProxy isInMemory: ptrAddress) ifTrue:
  			[^FFIErrorInvalidPointer].
  		^self ffiPushPointer: ptrAddress in: calloutState].
  
  	ptrClass = interpreterProxy classByteArray ifTrue:
  		["Since this involves passing the address of the first indexable field we need to fail
  		  the call if it is threaded and the object is young, since it may move during the call."
  		 self cppIf: COGMTVM ifTrue:
  		 [((calloutState callFlags anyMask: FFICallFlagThreaded)
  		   and: [interpreterProxy isYoung: oop]) ifTrue:
  			[^PrimErrObjectMayMove negated]].
  
  		ptrAddress := interpreterProxy firstIndexableField: oop.
  		^self ffiPushPointer: ptrAddress in: calloutState].
  
  	(interpreterProxy includesBehavior: ptrClass ThatOf: interpreterProxy classAlien) ifTrue:
  		[self cppIf: COGMTVM ifTrue:
  		 [((calloutState callFlags anyMask: FFICallFlagThreaded)
  		   and: [(self isDirectAlien: oop)
  		   and: [interpreterProxy isYoung: oop]]) ifTrue:
  			[^PrimErrObjectMayMove negated]].
  
+ 		ptrAddress := self cCoerce: (self startOfData: oop) to: #'void *'.
- 		ptrAddress := self startOfData: oop.
  		^self ffiPushPointer: ptrAddress in: calloutState].
  
  	^FFIErrorBadArg!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiPushStructureContentsOf:in: (in category 'marshalling-struct') -----
  ffiPushStructureContentsOf: oop in: calloutState
  	<var: #calloutState type: #'CalloutState *'>
  	"Push the contents of the given external structure"
  	| ptrClass ptrAddress |
  	<inline: true>
  	ptrClass := interpreterProxy fetchClassOf: oop.
  	ptrClass = interpreterProxy classExternalAddress ifTrue: "ExternalAddress is bytes"
  		[ptrAddress := (interpreterProxy fetchPointer: 0 ofObject: oop) asVoidPointer.
  		"There is no way we can make sure the structure is valid.
  		But we can at least check for attempts to pass pointers to ST memory."
+ 		(interpreterProxy isInMemory: ptrAddress asUnsignedIntegerPtr) ifTrue:
- 		(interpreterProxy isInMemory: ptrAddress) ifTrue:
  			[^FFIErrorInvalidPointer].
  		^self ffiPushStructure: ptrAddress
  			ofSize: (calloutState ffiArgHeader bitAnd: FFIStructSizeMask)
  			typeSpec: calloutState ffiArgSpec
  			ofLength: calloutState ffiArgSpecSize
  			in: calloutState].
  	ptrClass = interpreterProxy classByteArray ifTrue:
  		["The following is a somewhat pessimistic test but I like being sure..."
  		(interpreterProxy byteSizeOf: oop) = (calloutState ffiArgHeader bitAnd: FFIStructSizeMask)
  			ifFalse:[^FFIErrorStructSize].
  		ptrAddress := interpreterProxy firstIndexableField: oop.
  		(calloutState ffiArgHeader anyMask: FFIFlagPointer) ifFalse:
  			"Since this involves passing the address of the first indexable field we need to fail
  			  the call if it is threaded and the object is young, since it may move during the call."
  			[self cppIf: COGMTVM ifTrue:
  			 [((calloutState callFlags anyMask: FFICallFlagThreaded)
  			 and: [interpreterProxy isYoung: oop]) ifTrue:
  				[^PrimErrObjectMayMove negated]].
  			^self ffiPushStructure: ptrAddress
  				ofSize: (calloutState ffiArgHeader bitAnd: FFIStructSizeMask)
  				typeSpec: calloutState ffiArgSpec
  				ofLength: calloutState ffiArgSpecSize
  				in: calloutState].
  		"If FFIFlagPointer + FFIFlagStructure is set use ffiPushPointer on the contents"
  		(calloutState ffiArgHeader bitAnd: FFIStructSizeMask) = BytesPerWord ifFalse:
  			[^FFIErrorStructSize].
  		ptrAddress := (interpreterProxy fetchPointer: 0 ofObject: oop) asVoidPointer.
+ 		(interpreterProxy isInMemory: ptrAddress asUnsignedIntegerPtr) ifTrue:
- 		(interpreterProxy isInMemory: ptrAddress) ifTrue:
  			[^FFIErrorInvalidPointer].
  		^self ffiPushPointer: ptrAddress in: calloutState].
  	^FFIErrorBadArg!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>isUnionSpec:OfLength:StartingAt: (in category 'marshalling-struct') -----
  isUnionSpec: specs OfLength: specSize StartingAt: startIndex
  	"We can't easily distinguish union from structures with available flags.
  	But we have a trick: a union should have one field size equal to its own size."
  	| index spec unionSize thisSize |
  	<var: #specs type: #'unsigned int*'>
  	<inline: false>
  	index := startIndex.
  	spec := specs at: index.
  	self assert: (spec bitAnd: FFIFlagPointer + FFIFlagAtomic + FFIFlagStructure) = FFIFlagStructure.
  	unionSize := spec bitAnd: FFIStructSizeMask.
  	[index := index + 1.
  	index < specSize]
  		whileTrue:
  			[spec := specs at: index.
  			spec = FFIFlagStructure
  				ifTrue: [^false].
  			thisSize := spec bitAnd: FFIStructSizeMask.
  			thisSize = unionSize ifTrue: [^true].
  			((spec bitAnd: FFIFlagPointer + FFIFlagStructure) = FFIFlagStructure)
  				ifTrue:
  					["Asking for alignment is a trick for skipping this sub structure/union"
+ 					self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index put: [:v| index := v])]].
- 					self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: index)]].
  	^false!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveFFIDoubleAt (in category 'primitives') -----
  primitiveFFIDoubleAt
  	"Return a (signed or unsigned) n byte integer from the given byte offset."
  	| byteOffset rcvr addr floatValue |
  	<export: true>
  	<inline: false>
  	<var: #floatValue type: #double>
  	byteOffset := interpreterProxy stackIntegerValue: 0.
  	rcvr := interpreterProxy stackObjectValue: 1.
  	interpreterProxy failed ifTrue:[^0].
  	addr := self ffiAddressOf: rcvr startingAt: byteOffset size: 8.
  	addr = 0 ifTrue: [^interpreterProxy primitiveFail].
  	self memcpy: (self addressOf: floatValue) _: addr _: (self sizeof: floatValue).
+ 	interpreterProxy methodReturnFloat: floatValue
- 	interpreterProxy pop: 2.
- 	^interpreterProxy pushFloat: floatValue
  !

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveFFIDoubleAtPut (in category 'primitives') -----
  primitiveFFIDoubleAtPut
  	"Return a (signed or unsigned) n byte integer from the given byte offset."
  	| byteOffset rcvr addr floatValue floatOop |
  	<export: true>
  	<inline: false>
  	<var: #floatValue type: #double>
  	floatOop := interpreterProxy stackValue: 0.
  	(interpreterProxy isIntegerObject: floatOop)
+ 		ifTrue:[floatValue := self cCoerce: (interpreterProxy integerValueOf: floatOop) to: #double]
+ 		ifFalse:[floatValue := self cCoerce: (interpreterProxy floatValueOf: floatOop) to: #double].
- 		ifTrue:[floatValue := self cCoerce: (interpreterProxy integerValueOf: floatOop) to:'double']
- 		ifFalse:[floatValue := self cCoerce: (interpreterProxy floatValueOf: floatOop) to:'double'].
  	byteOffset := interpreterProxy stackIntegerValue: 1.
  	rcvr := interpreterProxy stackObjectValue: 2.
  	interpreterProxy failed ifTrue:[^0].
  	addr := self ffiAddressOf: rcvr startingAt: byteOffset size: 8.
  	addr = 0 ifTrue: [^interpreterProxy primitiveFail].
  	self memcpy: addr _: (self addressOf: floatValue) _: (self sizeof: floatValue).
  	^interpreterProxy pop: 3 thenPush: floatOop!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveFFIFloatAt (in category 'primitives') -----
  primitiveFFIFloatAt
  	"Return a (signed or unsigned) n byte integer from the given byte offset."
  	| byteOffset rcvr addr floatValue |
  	<export: true>
  	<inline: false>
  	<var: #floatValue type: #float>
  	byteOffset := interpreterProxy stackIntegerValue: 0.
  	rcvr := interpreterProxy stackObjectValue: 1.
  	interpreterProxy failed ifTrue:[^0].
  	addr := self ffiAddressOf: rcvr startingAt: byteOffset size: 4.
  	addr = 0 ifTrue: [^interpreterProxy primitiveFail].
  	self memcpy: (self addressOf: floatValue) _: addr _: (self sizeof: floatValue).
+ 	interpreterProxy methodReturnFloat: floatValue!
- 	interpreterProxy pop: 2.
- 	^interpreterProxy pushFloat: floatValue!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveFFIFloatAtPut (in category 'primitives') -----
  primitiveFFIFloatAtPut
  	"Return a (signed or unsigned) n byte integer from the given byte offset."
  	| byteOffset rcvr addr floatValue floatOop |
  	<export: true>
  	<inline: false>
  	<var: #floatValue type: #float>
  	floatOop := interpreterProxy stackValue: 0.
  	(interpreterProxy isIntegerObject: floatOop)
+ 		ifTrue:[floatValue := self cCoerce: (interpreterProxy integerValueOf: floatOop) to: #float]
+ 		ifFalse:[floatValue := self cCoerce: (interpreterProxy floatValueOf: floatOop) to: #float].
- 		ifTrue:[floatValue := self cCoerce: (interpreterProxy integerValueOf: floatOop) to:'float']
- 		ifFalse:[floatValue := self cCoerce: (interpreterProxy floatValueOf: floatOop) to:'float'].
  	byteOffset := interpreterProxy stackIntegerValue: 1.
  	rcvr := interpreterProxy stackObjectValue: 2.
  	interpreterProxy failed ifTrue:[^0].
  	addr := self ffiAddressOf: rcvr startingAt: byteOffset size: 4.
  	addr = 0 ifTrue: [^interpreterProxy primitiveFail].
  	self memcpy: addr _: (self addressOf: floatValue) _: (self sizeof: floatValue).
  	^interpreterProxy pop: 3 thenPush: floatOop!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveFFIFree (in category 'primitives') -----
  primitiveFFIFree
  	"Primitive. Free the object pointed to on the external heap."
  	| addr oop ptr |
  	<export: true>
  	<inline: false>
  	<var: #ptr type: #'sqIntptr_t *'>
  	oop := interpreterProxy stackObjectValue: 0.
  	((interpreterProxy fetchClassOf: oop) = interpreterProxy classExternalAddress
  	 and: [(interpreterProxy byteSizeOf: oop) = (self sizeof: #'sqIntptr_t')]) ifFalse:
  		[^interpreterProxy primitiveFail].
  	ptr := interpreterProxy firstIndexableField: oop.
  	addr := ptr at: 0.
  	"Don't you dare to free Squeak's memory!!"
  	(addr = 0
  	 or: [(addr asUnsignedIntegerPtr bitAnd: (self sizeof: #'sqIntptr_t') - 1) ~= 0
+ 	 or: [interpreterProxy isInMemory: addr asUnsignedIntegerPtr]]) ifTrue:
- 	 or: [interpreterProxy isInMemory: addr]]) ifTrue:
  		[^interpreterProxy primitiveFail].
  	self ffiFree: addr.
  	^ptr at: 0 put: 0 "cleanup"!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>primitiveStructureElementAlignment (in category 'primitives') -----
  primitiveStructureElementAlignment
  	"Answer the alignment of an element of an atomic type, or a structure, within a structure on the current platform."
  	<export: true>
  	| typeCode alignment |
+ 	<var: 'alignment' type: #'void *'>
  	typeCode := interpreterProxy stackValue: 0.
  	((interpreterProxy isIntegerObject: typeCode)
  	 and: [((typeCode := interpreterProxy integerValueOf: typeCode) between: FFITypeUnsignedByte and: FFITypeDoubleFloat)
  		or: [typeCode = FFIFlagStructure]]) ifFalse:
  		[^self primitiveFailFor: PrimErrBadArgument].
  
  	alignment := typeCode
  					caseOf: {
  						[FFITypeUnsignedByte]			-> [self structOffsetOf: 'structByte *' atomicTypeCode: FFITypeUnsignedByte].
  						[FFITypeSignedByte]			-> [self structOffsetOf: 'structByte *' atomicTypeCode: FFITypeUnsignedByte].
  						[FFITypeUnsignedShort]		-> [self structOffsetOf: 'structShort *' atomicTypeCode: FFITypeUnsignedShort].
  						[FFITypeSignedShort]			-> [self structOffsetOf: 'structShort *' atomicTypeCode: FFITypeUnsignedShort].
  						[FFITypeUnsignedInt]			-> [self structOffsetOf: 'structInt *' atomicTypeCode: FFITypeUnsignedInt].
  						[FFITypeSignedInt]				-> [self structOffsetOf: 'structInt *' atomicTypeCode: FFITypeUnsignedInt].
  						[FFITypeUnsignedLongLong]	-> [self structOffsetOf: 'structLongLong *' atomicTypeCode: FFITypeUnsignedLongLong].
  						[FFITypeSignedLongLong]		-> [self structOffsetOf: 'structLongLong *' atomicTypeCode: FFITypeUnsignedLongLong].
  						[FFITypeSingleFloat]			-> [self structOffsetOf: 'structFloat *' atomicTypeCode: FFITypeSingleFloat].
  						[FFITypeDoubleFloat]			-> [self structOffsetOf: 'structDouble *' atomicTypeCode: FFITypeDoubleFloat].
  					}
  					otherwise: [self structOffsetOf: 'structStruct *' atomicTypeCode: FFIFlagStructure].
+ 	^interpreterProxy methodReturnInteger: alignment asUnsignedIntegerPtr!
- 	^interpreterProxy methodReturnInteger: alignment!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedLong32At: (in category 'primitive support') -----
+ unalignedLong32At: index
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index) long32At(index)'>
+ 	^interpreterProxy long32At: index!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedLong32At:put: (in category 'primitive support') -----
+ unalignedLong32At: index put: value
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index,value) long32Atput(index,value)'>
+ 	^interpreterProxy long32At: index put: value!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedLong64At: (in category 'primitive support') -----
+ unalignedLong64At: index
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index) long64At(index)'>
+ 	^interpreterProxy long64At: index!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedLong64At:put: (in category 'primitive support') -----
+ unalignedLong64At: index put: value
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index,value) long64Atput(index,value)'>
+ 	^interpreterProxy long64At: index put: value!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedShortAt: (in category 'primitive support') -----
+ unalignedShortAt: index
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index) shortAt(index)'>
+ 	^interpreterProxy shortAt: index!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>unalignedShortAt:put: (in category 'primitive support') -----
+ unalignedShortAt: index put: value
+ 	"This alignment stuff is a hack for integerAt:put:size:signed:/primitiveFFIIntegerAt[Put].
+ 	 The assumption right now is that all processors support unaligned access.  That only holds true
+ 	 for x86, x86-64 & ARMv6 & later.  But this keeps us going until we can address it properly."
+ 	<cmacro: '(index,value) shortAtput(index,value)'>
+ 	^interpreterProxy shortAt: index put: value!

Item was changed:
  ----- Method: ThreadedIA32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
+ 	<inline: #always>
- 	<inline: true>
  	myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState).
  
  	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
  		[floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()').
  
  		 "undo any callee argument pops because it may confuse stack management with the alloca."
  		 (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  			[self setsp: calloutState argVector].
  		 interpreterProxy ownVM: myThreadIndex.
  
  		 ^interpreterProxy floatObjectOf: floatRet].
  
  	intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()').
  
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  	interpreterProxy ownVM: myThreadIndex.
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
  			[^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^self ffiReturnStruct: (self addressOf: intRet) ofType: (self ffiReturnType: specOnStack) in: calloutState].
  	
  	^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet sddRet sdiRet sidRet siiRet returnStructByValue registerType sRetPtr |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #sqInt>
  	<var: #siiRet type: #SixteenByteReturnII>
  	<var: #sidRet type: #SixteenByteReturnID>
  	<var: #sdiRet type: #SixteenByteReturnDI>
  	<var: #sddRet type: #SixteenByteReturnDD>
  	<var: #sRetPtr type: #'void *'>
+ 	<inline: #always>
- 	<inline: true>
  	
  	returnStructByValue := (calloutState ffiRetHeader bitAnd: FFIFlagStructure + FFIFlagPointer + FFIFlagAtomic) = FFIFlagStructure.
  	
  	myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState).
  
  	calloutState floatRegisterIndex > 0 ifTrue:
  		[self 
  			load: (calloutState floatRegisters at: 0)
  			Flo: (calloutState floatRegisters at: 1)
  			a: (calloutState floatRegisters at: 2)
  			t: (calloutState floatRegisters at: 3)
  			R: (calloutState floatRegisters at: 4)
  			e: (calloutState floatRegisters at: 5)
  			g: (calloutState floatRegisters at: 6)
  			s: (calloutState floatRegisters at: 7)].
  
  	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
  		[atomicType = FFITypeSingleFloat
  			ifTrue:
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5)]
  			ifFalse: "atomicType = FFITypeDoubleFloat"
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5)].
  
  		 interpreterProxy ownVM: myThreadIndex.
  
  		 ^interpreterProxy floatObjectOf: floatRet].
  
  	returnStructByValue  ifFalse:
  		[intRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  		interpreterProxy ownVM: myThreadIndex.
  		(calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
  			[^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState].
  
  	registerType := calloutState structReturnType.
  	registerType
  		caseOf:
  			{[2r00] ->
  				[sddRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnDD (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: sddRet) asVoidPointer].
  			 [2r01] ->
  				[sidRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnID (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: sidRet) asVoidPointer].
  			 [2r10] ->
  				[sdiRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnDI (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: sdiRet) asVoidPointer].
  			 [2r11] ->
  				[siiRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturnII (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: siiRet) asVoidPointer].
  			 [2r100] ->
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: floatRet) asVoidPointer].
  			 [2r101] ->
  				[intRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := (self addressOf: intRet) asVoidPointer].
  			 [2r110] ->
  				["return a pointer to alloca'd memory"
  				intRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'sqInt (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)
  					with: (calloutState integerRegisters at: 4)
  					with: (calloutState integerRegisters at: 5).
  				sRetPtr := intRet asVoidPointer "address of struct is returned in RAX, which also is calloutState limit"]}
  		otherwise:
  			[interpreterProxy ownVM: myThreadIndex.
  			self ffiFail: FFIErrorWrongType. ^nil].
  
  	interpreterProxy ownVM: myThreadIndex.
  	^self ffiReturnStruct: sRetPtr ofType: (self ffiReturnType: specOnStack) in: calloutState!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>registerType:ForStructSpecs:OfLength:StartingAt:ByteOffset:EightbyteOffset: (in category 'marshalling') -----
  registerType: initialRegisterType ForStructSpecs: specs OfLength: specSize StartingAt: indexPtr ByteOffset: initialByteOffset EightbyteOffset: initialEightbyteOffset
  	"Answer with a number characterizing the register type for passing a struct of size <= 16 bytes.
  	On input, the index points to the structure header (the one with FFIFlagStructure + structSize)
  	On output, the index points to the structure trailer (the FFIFlagStructure)."
  
+ 	<var: #specs type: #'unsigned int *'>
+ 	<var: #indexPtr type: #'sqInt *'>
- 	<var: #specs type: #'unsigned int*'>
- 	<var: #indexPtr type: #'unsigned int*'>
- 	<var: #subIndex type: #'unsigned int'>
  	<inline: false>
  	| registerType eightbyteOffset byteOffset spec fieldSize alignment atomic subIndex isInt recurse subLevel |
  	registerType := initialRegisterType.
  	byteOffset := initialByteOffset.
  	eightbyteOffset := initialEightbyteOffset.
  	[indexPtr at: 0 put: (indexPtr at: 0) + 1.
  	subLevel := 0.
  	(indexPtr at: 0) < specSize]
  		whileTrue:
  			[spec := specs at: (indexPtr at: 0).
  			isInt := false.
  			recurse := false.
  			spec = FFIFlagStructure "this marks end of structure/union"
  				ifTrue:
  					[subLevel = 0 ifTrue: [^registerType].
  					subLevel := subLevel - 1]
  				ifFalse:
  					[(spec anyMask: FFIFlagPointer)
  						ifTrue:
  							[fieldSize := BytesPerWord.
  							alignment := fieldSize.
  							isInt := true]
  						ifFalse:
  							[(spec bitAnd: FFIFlagStructure + FFIFlagAtomic)
  								caseOf:
  									{[FFIFlagStructure] ->
  										[fieldSize := 0.
  										subIndex := indexPtr at: 0.
+ 										alignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: subIndex put: [:v| subIndex := v]).
- 										alignment := self alignmentOfStructSpec: specs OfLength: specSize StartingAt: (self addressOf: subIndex).
  										recurse := self isUnionSpec: specs OfLength: specSize StartingAt: (indexPtr at: 0).
  										recurse
  											ifTrue: [fieldSize := spec bitAnd: FFIStructSizeMask]
  											ifFalse: [subLevel := subLevel + 1]].
  									[FFIFlagAtomic] ->
  										[fieldSize := spec bitAnd: FFIStructSizeMask.
  										alignment := fieldSize.
  										atomic := self atomicTypeOf: spec.
  										isInt := (atomic >> 1) ~= (FFITypeSingleFloat >> 1)]}
  								otherwise: ["invalid spec" ^-1]].
  					(byteOffset bitAnd: alignment - 1) = 0
  						ifFalse:
  							["this field requires alignment"
  							byteOffset := (byteOffset bitClear: alignment - 1) + alignment].
  					byteOffset + fieldSize > 8
  						ifTrue:
  							["Not enough room on current Eightbyte for this field, skip to next one"
  							eightbyteOffset := eightbyteOffset + 1.
  							byteOffset := 0].
  					isInt
  						ifTrue:
  							["If this eightbyte contains an int field, then we must use an int register"
  							registerType := registerType bitOr: 1 << eightbyteOffset].
  					recurse ifTrue:
  						["union in structs require a recursive form, because we handle byteOffset/eightbyteOffset differently"
  						registerType := self
  								registerType: registerType
  								ForUnionSpecs: specs
  								OfLength: specSize
  								StartingAt: indexPtr
  								ByteOffset: byteOffset
  								EightbyteOffset: eightbyteOffset].
  					"where to put the next field?"
  					byteOffset := byteOffset + fieldSize.
  					byteOffset >= 8
  						ifTrue:
  							["This eightbyte is full, skip to next one"
  							eightbyteOffset := eightbyteOffset + 1.
  							byteOffset := 0]]].
  	self assert: subLevel = 0.
  	^registerType!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>registerType:ForUnionSpecs:OfLength:StartingAt:ByteOffset:EightbyteOffset: (in category 'marshalling') -----
  registerType: initialRegisterType ForUnionSpecs: specs OfLength: specSize StartingAt: indexPtr ByteOffset: byteOffset EightbyteOffset: eightbyteOffset
  	"Answer with a number characterizing the register type for passing a union of size <= 16 bytes.
  	On input, the index points to the structure header (the one with FFIFlagStructure + structSize)
  	On output, the index points to the structure trailer (the FFIFlagStructure)."
  
+ 	<var: #specs type: #'unsigned int *'>
+ 	<var: #indexPtr type: #'sqInt *'>
- 	<var: #specs type: #'unsigned int*'>
- 	<var: #indexPtr type: #'unsigned int*'>
  	<inline: false>
  	| registerType spec atomic isInt recurse subLevel |
  	registerType := initialRegisterType.
  	[indexPtr at: 0 put: (indexPtr at: 0) + 1.
  	subLevel := 0.
  	(indexPtr at: 0) < specSize]
  		whileTrue:
  			[spec := specs at: (indexPtr at: 0).
  			isInt := false.
  			recurse := false.
  			spec = FFIFlagStructure "this marks end of structure/union"
  				ifTrue:
  					[subLevel = 0 ifTrue: [^registerType].
  					subLevel := subLevel - 1]
  				ifFalse:
  					[(spec anyMask: FFIFlagPointer)
  						ifTrue:
  							[isInt := true]
  						ifFalse:
  							[(spec bitAnd: FFIFlagStructure + FFIFlagAtomic)
  								caseOf:
  									{[FFIFlagStructure] ->
  										[recurse := (self isUnionSpec: specs OfLength: specSize StartingAt: (indexPtr at: 0))not.
  										recurse ifFalse: [subLevel := subLevel + 1]].
  									[FFIFlagAtomic] ->
  										[atomic := self atomicTypeOf: spec.
  										isInt := (atomic >> 1) ~= (FFITypeSingleFloat >> 1)]}
  								otherwise: ["invalid spec" ^-1]].
  					isInt
  						ifTrue:
  							["If this eightbyte contains an int field, then we must use an int register"
  							registerType := registerType bitOr: 1 << eightbyteOffset].
  					recurse ifTrue:
  						["struct in union require a recursive form, because we handle byteOffset/eightbyteOffset differently"
  						registerType := self
  								registerType: registerType
  								ForStructSpecs: specs
  								OfLength: specSize
  								StartingAt: indexPtr
  								ByteOffset: byteOffset
  								EightbyteOffset: eightbyteOffset]]].
  	self assert: subLevel = 0.
  	^registerType!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>registerTypeForStructSpecs:OfLength: (in category 'marshalling') -----
  registerTypeForStructSpecs: specs OfLength: specSize
  	"Answer with a number characterizing the register type for passing a struct of size <= 16 bytes.
  	The bit at offset i of registerType is set to 1 if eightbyte at offset i is a int register (RAX ...)
  	The bit at offset 2 indicates if there is a single eightbyte (struct size <= 8)
  	* 2r00 for float float (XMM0 XMM1)
  	* 2r01 for int float (RAX XMM0)
  	* 2r10 for float int (XMM0 RAX)
  	* 2r11 for int int (RAX RDX)
  	* 2r100 for float (XMM0)
  	* 2r101 for int (RAX)
  	* 2r110 INVALID (not aligned)
  	Beware, the bits must be read from right to left for decoding register type.
  	Note: this method reconstructs the struct layout according to X64 alignment rules.
  	Therefore, it will not work for packed struct or other exotic alignment."
  
+ 	<var: #specs type: #'unsigned int *'>
- 	<var: #specs type: #'unsigned int*'>
  	<inline: false>
  	| index byteSize registerType |
  	index := 0.
  	byteSize := (specs at: index) bitAnd: FFIStructSizeMask.
  	byteSize > 16 ifTrue: [^2r110].
  	(self checkAlignmentOfStructSpec: specs OfLength: specSize StartingAt: index)
  		ifFalse: [^2r110].
  	registerType := byteSize <= 8 ifTrue: [2r100] ifFalse: [0].
  	^(self isUnionSpec: specs OfLength: specSize StartingAt: 0)
  		ifTrue: [ self 
  			registerType: registerType
  			ForUnionSpecs: specs
  			OfLength: specSize
+ 			StartingAt: (self addressOf: index put: [:v| index := v])
- 			StartingAt: (self addressOf: index)
  			ByteOffset: 0
  			EightbyteOffset: 0 ]
  		ifFalse: [ self 
  			registerType: registerType
  			ForStructSpecs: specs
  			OfLength: specSize
+ 			StartingAt: (self addressOf: index put: [:v| index := v])
- 			StartingAt: (self addressOf: index)
  			ByteOffset: 0
  			EightbyteOffset: 0 ]!

Item was changed:
  ----- Method: ThreadedX64Win64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
+ 	<inline: #always>
- 	<inline: true>
  	myThreadIndex := interpreterProxy disownVM: (self disownFlagsFor: calloutState).
  
  	calloutState floatRegisterSignature > 0 ifTrue:
  		[self 
  			load: (calloutState floatRegisters at: 0)
  			Flo: (calloutState floatRegisters at: 1)
  			atR: (calloutState floatRegisters at: 2)
  			egs: (calloutState floatRegisters at: 3)].
  
  	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1) ifTrue:
  		[atomicType = FFITypeSingleFloat
  			ifTrue:
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)]
  			ifFalse: "atomicType = FFITypeDoubleFloat"
  				[floatRet := self 
  					dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  					with: (calloutState integerRegisters at: 0)
  					with: (calloutState integerRegisters at: 1)
  					with: (calloutState integerRegisters at: 2)
  					with: (calloutState integerRegisters at: 3)].
  
  		 "undo any callee argument pops because it may confuse stack management with the alloca."
  		 (self isCalleePopsConvention: calloutState callFlags) ifTrue:
  			[self setsp: calloutState argVector].
  		 interpreterProxy ownVM: myThreadIndex.
  
  		 ^interpreterProxy floatObjectOf: floatRet].
  
  	intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3).
  
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  	interpreterProxy ownVM: myThreadIndex.
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer) ifTrue:
  			[^self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^self ffiReturnStruct: (self addressOf: intRet) ofType: (self ffiReturnType: specOnStack) in: calloutState].
  	
  	^self ffiCreateIntegralResultOop: intRet ofAtomicType: atomicType in: calloutState!

Item was changed:
  ----- Method: ThreadedX64Win64FFIPlugin>>ffiPushStructure:ofSize:typeSpec:ofLength:in: (in category 'marshalling') -----
  ffiPushStructure: pointer ofSize: structSize typeSpec: argSpec ofLength: argSpecSize in: calloutState
  	<var: #pointer type: #'void *'>
  	<var: #argSpec type: #'sqInt *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #arg type: #usqLong>
  	<inline: true>
  	structSize <= 0 ifTrue:
  		[^FFIErrorStructSize].
+ 	"See https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019"
  	(structSize <= WordSize
  	 and: [(structSize bitAnd: structSize - 1) = 0 "a.k.a. structSize isPowerOfTwo"]) ifTrue:
  		[| arg |
  		self memcpy: (self addressOf: arg) _: pointer _: structSize.
  		^self ffiPushUnsignedLongLong: arg in: calloutState].
  
+ 	"BUG!!!! This memory should be 16-byte aligned; Spur guarantees only 8-byte alignment."
+ 	self flag: #bug.
- 	"For now just push the pointer; we should copy the struct to the outgoing stack frame!!!!"
- 	self flag: 'quick hack'.
  	^self ffiPushPointer: pointer in: calloutState!

Item was changed:
  ----- Method: VMPluginCodeGenerator>>isStructSend: (in category 'utilities') -----
  isStructSend: aTSendNode
  	"Answer if the argument aTSendNode is a send of a structure accessor.
  	 This is tricky.  We want
  		foo bar => foo->bar
  		foo bar => foo.bar
  		foo bar: expr => foo->bar = expr
  		foo bar: expr => foo.bar = expr
  	 depending on whether foo is a struct or a pointer to a struct,
  	 but only if both foo is a struct type and bar is a field accessor.
  	 The tricky cases are self-sends within struct class methods.  Here we need to
  	 distinguish between self-sends of ordinary methods from self sends of accessors.
  
  	Override to avoid requiring that there be a struct accessor method for the selector."
  	^aTSendNode numArgs <= 1
+ 	   and: [(self isMacroSelector: aTSendNode selector) not
+ 	   and: [(aTSendNode receiver structTargetKindIn: self) notNil]]!
- 	   and: [(aTSendNode receiver structTargetKindIn: self) notNil]!

Item was added:
+ ----- Method: VMPluginCodeGenerator>>selectorReturnsPointerToStruct: (in category 'C code generator') -----
+ selectorReturnsPointerToStruct: selector "<Symbol>"
+ 	| tMethod |
+ 	^(tMethod := methods
+ 					at: selector
+ 					ifAbsent:
+ 						[apiMethods ifNotNil:
+ 							[apiMethods at: selector ifAbsent: []]]) notNil
+ 	  and: [(VMStructType isTypePointerToStruct: tMethod returnType)
+ 		or: [(pluginClass isStructType: tMethod returnType)
+ 		and: [tMethod returnType last == $*]]]!

Item was added:
+ ----- Method: VMPluginCodeGenerator>>selectorReturnsStruct: (in category 'C code generator') -----
+ selectorReturnsStruct: selector "<Symbol>"
+ 	| tMethod |
+ 	^(tMethod := methods
+ 					at: selector
+ 					ifAbsent:
+ 						[apiMethods ifNotNil:
+ 							[apiMethods at: selector ifAbsent: []]]) notNil
+ 	  and: [(VMStructType isTypeStruct: tMethod returnType)
+ 		or: [(pluginClass isStructType: tMethod returnType)
+ 		and: [tMethod returnType last ~~ $*]]]!

Item was changed:
  ----- Method: VMPluginCodeGenerator>>shouldGenerateAsInterpreterProxySend: (in category 'utilities') -----
  shouldGenerateAsInterpreterProxySend: aSendNode
  	"Answer if this send should be generated as interpreterProxy->foo or its moral equivalent (*).
  	 (*) since we now use function pointers declared in each external plugin we only indirect through
  	 interopreterProxy at plugin initialization.  But we still have to find the set of sends a plugin uses."
+ 	(aSendNode receiver isVariable and: ['interpreterProxy' = aSendNode receiver name]) ifFalse:
+ 		[^false].
+ 	(self isMacroSelector: aSendNode selector) ifTrue:
+ 		[^false].
- 	| selector |
- 	(aSendNode receiver isVariable and: ['interpreterProxy' = aSendNode receiver name]) ifFalse: [^false].
- 	selector := aSendNode selector.
- 	"baseHeaderSize, minSmallInteger et al are #defined in each VM's interp.h"
- 	(VMBasicConstants mostBasicConstantSelectors includes: selector) ifTrue: [^false].
  	"Only include genuine InterpreterProxy methods, excluding things not understood
  	 by InterpreterProxy and things in its initialize, private and simulation protocols."
+ 	^(#(initialize private #'simulation only') includes: (InterpreterProxy compiledMethodAt: aSendNode selector ifAbsent: [^false]) protocol) not!
- 	^(#(initialize private #'simulation only') includes: (InterpreterProxy compiledMethodAt: selector ifAbsent: [^false]) protocol) not!

Item was changed:
  ----- Method: VMPluginCodeGenerator>>structTargetKindForDeclaration: (in category 'C code generator') -----
  structTargetKindForDeclaration: decl "<String>"
  	^(super structTargetKindForDeclaration: decl) ifNil:
  		[pluginClass ifNotNil:
+ 			[| isPointer |
+ 			 (pluginClass isStructType: ((isPointer := decl last == $*)
+ 											ifTrue: [decl allButLast withBlanksTrimmed]
+ 											ifFalse: [decl])) ifTrue:
+ 				[isPointer
- 			[(pluginClass isStructType: (decl last = $*
- 											ifTrue: [decl allButLast]
- 											ifFalse: [decl]) withBlanksTrimmed) ifTrue:
- 				[(decl indexOf: $*) > 0
  					ifTrue: [#pointer]
  					ifFalse: [#struct]]]]!

Item was changed:
  ----- Method: VMStructType class>>isTypePointerToStruct: (in category 'translation') -----
  isTypePointerToStruct: type
  	| index |
+ 	^type isString
- 	^type notNil
  	  and: [(index := type indexOf: $*) > 0
  	  and: [self ensureStructTypeCache anySatisfy:
  			[:structType|
  			(type beginsWith: structType)
  			and: [index > structType size]]]]!



More information about the Vm-dev mailing list