[Vm-dev] VM Maker: VMMaker.oscog-nice.2038.mcz

commits at source.squeak.org commits at source.squeak.org
Tue Dec 13 22:13:18 UTC 2016


Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-nice.2038.mcz

==================== Summary ====================

Name: VMMaker.oscog-nice.2038
Author: nice
Time: 13 December 2016, 11:11:51.599692 pm
UUID: f98d3e8c-e81d-fb42-b790-8e514777bfff
Ancestors: VMMaker.oscog-eem.2037

Revise ALLOCA_LIES_SO_USE_GET_SP

The extra space allocated below the value returned by alloca() is used by mingw32 gcc
(it is used for marshalling args of sub-functions - those who marshall ffi call args)
So we cannot use it for marshalling FFI arguments.

Thus we must change the strategy:
rather than using getsp() for changing the value returned by alloca() before marshalling args (which will lead to SEGV)
we must rather use setsp() for changing the stack pointer before calling the target function.

Thus we rename the macro into ALLOCA_LIES_SO_SET_SP_BEFORE_CALL

Remind that __clang__'s alloca does not lie.
(clang compiler does define __GNUC__)

---

Revise FFI stack alignment:

1) use STACK_BYTES_ALIGNMENT and getsp() as provided by include file sqCogStackAlignment.h
2) use a new macro MUST_ALIGN_STACK that check STACK_BYTES_ALIGNMENT against sizeof(void *)
    indeed, some nice STACK_BYTES_ALIGNMENT might be > 0 now, which does not necessarily mean that we must realign the stack
    This macro is used both for aligning alloca'ed space before marshalling and for aligning SP before call.

It is still questionable whether we must do some further alignment over alloca(), because alloca() should be aware of ABI requirements, but well, if it ain't broken...

TO THINK ABOUT: maybe SP should be saved before call/reset after call. By now, this does not seem necessary.

---

nuke STACK_OFFSET_BYTES from the preamble since it is unused.

---

get rid of registerArgsSlop
It was intended for reserving stack space for shadowing register args as mandated by Win64 X64 ABI (and maybe PPC).
But this is completely unecessary as this stack space will be reserved by the callout itself (at least on WIN64).
For PPC, don't do anything, it's currently unsupported (unmaintained/untested/unnecessary/you name it)

---

Let cogit generate remove float args from stack only if more than 0 float args (by symmetry with int args)
This was forgotten in a previous commit
Note that this is only used by lowcode experiments and still wait to be ported to other architectures than IA32.

=============== Diff against VMMaker.oscog-eem.2037 ===============

Item was changed:
  ----- Method: CogIA32Compiler>>genRemoveNFloatArgsFromStack: (in category 'abi') -----
  genRemoveNFloatArgsFromStack: n 
+ 	n > 0 ifTrue: [cogit AddCq: n * 8 R: ESP].
- 	cogit AddCq: n * 8 R: ESP.
  	^0!

Item was changed:
  ----- Method: ThreadedARMFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
  	<inline: true>
  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[myThreadIndex := interpreterProxy disownVM: 0]].
  
- 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
- 		[self setsp: calloutState argVector].
- 
  	calloutState floatRegisterIndex > 0 ifTrue:
  		[self 
  			load: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 0)) to: 'double *') at: 0)
  			Flo: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 2)) to: 'double *') at: 0)
  			a: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 4)) to: 'double *') at: 0)
  			t: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 6)) to: 'double *') at: 0)
  			R: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 8)) to: 'double *') at: 0)
  			e: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 10)) to: 'double *') at: 0)
  			g: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 12)) to: 'double *') at: 0)
  			s: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 14)) to: 'double *') at: 0)].
  
+ 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
+ 		[self setsp: calloutState argVector].
+ 
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[atomicType = FFITypeSingleFloat
  				ifTrue:
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)]
  				ifFalse: "atomicType = FFITypeDoubleFloat"
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)]]
  		ifFalse:
  			[intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3)].
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[interpreterProxy ownVM: myThreadIndex]].
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
  			ifTrue:
  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
  			ifFalse:
  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^oop].
  	
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[oop := interpreterProxy floatObjectOf: floatRet]
  		ifFalse:
  			[oop := self ffiCreateIntegralResultOop: intRet
  						ofAtomicType: atomicType
  						in: calloutState].
  	^interpreterProxy methodReturnValue: oop!

Item was removed:
- ----- Method: ThreadedARMFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
- registerArgsSlop
- 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
- 	 being overwritten by any register arguments during calls during marshalling.  On ARM we
- 	 believe this is zero."
- 	^0!

Item was changed:
  ----- Method: ThreadedFFIPlugin class>>preambleCCode (in category 'translation') -----
  preambleCCode
  	"For a source of builtin defines grep for builtin_define in a gcc release config directory."
  	^'
  #include "sqAssert.h" /* for assert */
  #define ThreadedFFIPlugin 1 /* to filter-out unwanted declarations from sqFFI.h */
  #include "sqFFI.h" /* for logging and surface functions */
+ #include "sqCogStackAlignment.h" /* for STACK_ALIGN_BYTES and getsp() */
  
  #ifdef _MSC_VER
  # define alloca _alloca
  #endif
  #if defined(__GNUC__) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
  # define setsp(sp) asm volatile ("movl %0,%%esp" : : "m"(sp))
- # define getsp() ({ void *esp; asm volatile ("movl %%esp,%0" : "=r"(esp) : ); esp;})
  # elif defined(__GNUC__) && (defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64))
  # define setsp(sp) asm volatile ("movq %0,%%rsp" : : "m"(sp))
- # define getsp() ({ void *rsp; asm volatile ("movq %%rsp,%0" : "=r"(rsp) : ); rsp;})
  # elif defined(__GNUC__) && (defined(__arm__))
  # define setsp(sp) asm volatile ("ldr %%sp, %0" : : "m"(sp))
- # define getsp() ({ void *sp; asm volatile ("mov %0, %%sp" : "=r"(sp) : ); sp;})
  #endif
  #if !!defined(getsp)
  # define getsp() 0
  #endif 
  #if !!defined(setsp)
  # define setsp(ignored) 0
  #endif 
  
  #if !!defined(STACK_ALIGN_BYTES)
- # if __APPLE__ && __MACH__ && __i386__
- #  define STACK_ALIGN_BYTES 16
- # elif __linux__ && __i386__
- #  define STACK_ALIGN_BYTES 16
- # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
- #  define STACK_ALIGN_BYTES 16
- # elif defined(powerpc) || defined(__powerpc__) || defined(_POWER) || defined(__POWERPC__) || defined(__PPC__)
- #  define STACK_ALIGN_BYTES 16
- # elif defined(__sparc64__) || defined(__sparcv9__) || defined(__sparc_v9__) /* must precede 32-bit sparc defs */
- #  define STACK_ALIGN_BYTES 16
- # elif defined(sparc) || defined(__sparc__) || defined(__sparclite__)
- #  define STACK_ALIGN_BYTES 8
- # elif defined(__arm__) 
- #  define STACK_ALIGN_BYTES 8
- # else
  #  define STACK_ALIGN_BYTES 0
- # endif
  #endif /* !!defined(STACK_ALIGN_BYTES) */
  
+ /* For ABI that require stack alignment greater than natural word size */
+ #define MUST_ALIGN_STACK (STACK_ALIGN_BYTES > sizeof(void*))
- #if !!defined(STACK_OFFSET_BYTES)
- # define STACK_OFFSET_BYTES 0
- #endif
  
  #if defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__)
  /* Both Mac OS X x86 and Win32 x86 return structs of a power of two in size
   * less than or equal to eight bytes in length in registers. Linux never does so.
   */
  # if __linux__
  #	define WIN32_X86_STRUCT_RETURN 0
  # else
  #	define WIN32_X86_STRUCT_RETURN 1
  # endif
  # if WIN32
  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
  # endif
  # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
  # if WIN32 | WIN64
  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
  # endif
  #endif /* defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__) */
  
+ #if !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL)
+ # if defined(__MINGW32__) && !!defined(__clang__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
- #if !!defined(ALLOCA_LIES_SO_USE_GETSP)
- # if defined(__MINGW32__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
      /*
       * cygwin -mno-cygwin (MinGW) gcc 3.4.x''s alloca is a library routine that answers
+      * %esp + xx, so the outgoing stack is offset by one or more word if uncorrected.
-      * %esp + 4, so the outgoing stack is offset by one word if uncorrected.
       * Grab the actual stack pointer to correct.
       */
+ #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 1
- #	define ALLOCA_LIES_SO_USE_GETSP 1
  # else
+ #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 0
- #	define ALLOCA_LIES_SO_USE_GETSP 0
  # endif
+ #endif /* !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL) */
- #endif /* !!defined(ALLOCA_LIES_SO_USE_GETSP) */
  
  #if !!defined(PLATFORM_API_USES_CALLEE_POPS_CONVENTION)
  # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 0
  #endif
  
  /* The dispatchOn:in:with:with: generates an unwanted call on error.  Just squash it. */
  #define error(foo) 0
  #ifndef SQUEAK_BUILTIN_PLUGIN
  /* but print assert failures. */
  void
  warning(char *s) { /* Print an error message but don''t exit. */
  	printf("\n%s\n", s);
  }
  #endif
  
  /* sanitize */
  #ifdef SQUEAK_BUILTIN_PLUGIN
  # define EXTERN 
  #else
  # define EXTERN extern
  #endif
  '!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>allocaLiesSoSetSpBeforeCall (in category 'marshalling') -----
+ allocaLiesSoSetSpBeforeCall
+ 	"At least one alloca implementation does not answer the actual top of stack.
+ 	 If so we need to reset the actual stack pointer just before the call.
+ 	Answer whether this is necessary."
+ 	<cmacro: '() ALLOCA_LIES_SO_SETSP_BEFORE_CALL'>
+ 	^false!

Item was removed:
- ----- Method: ThreadedFFIPlugin>>allocaLiesSoUseGetsp (in category 'marshalling') -----
- allocaLiesSoUseGetsp
- 	"At least one alloca implementation does not answer the actual top of stack.
- 	 If so we need to get the actual stack pointer.  Answer whether this is necessary."
- 	<cmacro: '() ALLOCA_LIES_SO_USE_GETSP'>
- 	^false!

Item was changed:
  ----- Method: ThreadedFFIPlugin>>ffiCall:ArgArrayOrNil:NumArgs: (in category 'callout support') -----
  ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
  	"Generic callout. Does the actual work.  If argArrayOrNil is nil it takes args from the stack
  	 and the spec from the method.  If argArrayOrNil is not nil takes args from argArrayOrNil
  	 and the spec from the receiver."
  	| flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result |
  	<inline: true>
  	<var: #theCalloutState type: #'CalloutState'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #allocation type: #'char *'>
  
  	(interpreterProxy is: externalFunction KindOfClass: interpreterProxy classExternalFunction) ifFalse:
  		[^self ffiFail: FFIErrorNotFunction].
  	"Load and check the values in the externalFunction before we call out"
  	flags := interpreterProxy fetchInteger: ExternalFunctionFlagsIndex ofObject: externalFunction.
  	interpreterProxy failed ifTrue:
  		[^self ffiFail: FFIErrorBadArgs].
  
  	"This must come early for compatibility with the old FFIPlugin.  Image-level code
  	 may assume the function pointer is loaded eagerly.  Thanks to Nicolas Cellier."
  	address := self ffiLoadCalloutAddress: externalFunction.
  	interpreterProxy failed ifTrue:
  		[^0 "error code already set by ffiLoadCalloutAddress:"].
  	
  	argTypeArray := interpreterProxy fetchPointer: ExternalFunctionArgTypesIndex ofObject: externalFunction.
  	"must be array of arg types"
  	((interpreterProxy isArray: argTypeArray)
  	and: [(interpreterProxy slotSizeOf: argTypeArray) = (nArgs + 1)]) ifFalse:
  		[^self ffiFail: FFIErrorBadArgs].
  	"check if the calling convention is supported"
  	self cppIf: COGMTVM
  		ifTrue:
  			[(self ffiSupportsCallingConvention: (flags bitAnd: FFICallTypesMask)) ifFalse:
  				[^self ffiFail: FFIErrorCallType]]
  		ifFalse: "not masking causes threaded calls to fail, which is as they should if the plugin is not threaded."
  			[(self ffiSupportsCallingConvention: flags) ifFalse:
  				[^self ffiFail: FFIErrorCallType]].
  		
  	requiredStackSize := self externalFunctionHasStackSizeSlot
  							ifTrue: [interpreterProxy
  										fetchInteger: ExternalFunctionStackSizeIndex
  										ofObject: externalFunction]
  							ifFalse: [-1].
  	interpreterProxy failed ifTrue:
  		[^interpreterProxy primitiveFailFor: (argArrayOrNil isNil
  												ifTrue: [PrimErrBadMethod]
  												ifFalse: [PrimErrBadReceiver])].
  	stackSize := requiredStackSize < 0 ifTrue: [DefaultMaxStackSize] ifFalse: [requiredStackSize].
  	self cCode: [] inSmalltalk: [theCalloutState := self class calloutStateClass new].
  	calloutState := self addressOf: theCalloutState.
  	self cCode: [self me: calloutState ms: 0 et: (self sizeof: #CalloutState)].
  	calloutState callFlags: flags.
  	"Fetch return type and args"
  	argType := interpreterProxy fetchPointer: 0 ofObject: argTypeArray.
  	argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  	argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  	(err := self ffiCheckReturn: argSpec With: argClass in: calloutState) ~= 0 ifTrue:
  		[^self ffiFail: err]. "cannot return"
+ 	"alloca the outgoing stack frame, leaving room for marshalling args, and including space for the return struct, if any.
+ 	Additional space reserved for saving register args like mandated by Win64 X64 or PPC ABI, will be managed by the call itself"
+ 	allocation := self alloca: stackSize + calloutState structReturnSize + self cStackAlignment.
+ 	self mustAlignStack ifTrue:
+ 		[allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1)
- 	"alloca the outgoing stack frame, leaving room for register args while marshalling, and including space for the return struct, if any."
- 	allocation := self alloca: stackSize + calloutState structReturnSize + self registerArgsSlop + self cStackAlignment.
- 	self allocaLiesSoUseGetsp ifTrue:
- 		[allocation := self getsp].
- 	self cStackAlignment ~= 0 ifTrue:
- 		[allocation := self cCoerce: (allocation asUnsignedInteger bitClear: self cStackAlignment - 1)
  						to: #'char *'].
  	calloutState
  		argVector: allocation;
+ 		currentArg: allocation;
+ 		limit: allocation + stackSize.
- 		currentArg: allocation + self registerArgsSlop;
- 		limit: allocation + stackSize + self registerArgsSlop.
  	(calloutState structReturnSize > 0
  	 and: [self nonRegisterStructReturnIsViaImplicitFirstArgument
  	 and: [(self returnStructInRegisters: calloutState structReturnSize) not]]) ifTrue:
  		[err := self ffiPushPointer: calloutState limit in: calloutState.
  		 err ~= 0 ifTrue:
  			[self cleanupCalloutState: calloutState.
  			 self cppIf: COGMTVM ifTrue:
  			 [err = PrimErrObjectMayMove negated ifTrue:
  				[^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
  			 ^self ffiFail: err]].
  	1 to: nArgs do:
  		[:i|
  		argType := interpreterProxy fetchPointer: i ofObject: argTypeArray.
  		argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
  		argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
  		oop := argArrayOrNil isNil
  				ifTrue: [interpreterProxy stackValue: nArgs - i]
  				ifFalse: [interpreterProxy fetchPointer: i - 1 ofObject: argArrayOrNil].
  		err := self ffiArgument: oop Spec: argSpec Class: argClass in: calloutState.
  		err ~= 0 ifTrue:
  			[self cleanupCalloutState: calloutState.
  			 self cppIf: COGMTVM ifTrue:
  			 [err = PrimErrObjectMayMove negated ifTrue:
  				[^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
  			 ^self ffiFail: err]]. "coercion failed or out of stack space"
  	"Failures must be reported back from ffiArgument:Spec:Class:in:.
  	 Should not fail from here on in."
  	self assert: interpreterProxy failed not.
  	self ffiLogCallout: externalFunction.
  	(requiredStackSize < 0
  	 and: [self externalFunctionHasStackSizeSlot]) ifTrue:
  		[stackSize := calloutState currentArg - calloutState argVector.
  		 interpreterProxy storeInteger: ExternalFunctionStackSizeIndex ofObject: externalFunction withValue: stackSize].
  	"Go out and call this guy"
  	result := self ffiCalloutTo: address SpecOnStack: argArrayOrNil notNil in: calloutState.
  	self cleanupCalloutState: calloutState.
  	^result!

Item was added:
+ ----- Method: ThreadedFFIPlugin>>mustAlignStack (in category 'marshalling') -----
+ mustAlignStack
+ 	"Many ABIs mandate a particular stack alignment greater than the natural word size.
+ 	 If so, this macro will answer true.  See class-side preambleCCode."
+ 	<cmacro: '() MUST_ALIGN_STACK'>
+ 	^0!

Item was removed:
- ----- Method: ThreadedFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
- registerArgsSlop
- 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
- 	 being overwritten by any register arguments during calls during marshalling.  For example, on
- 	 PowerPC, which has 8 register arguments in the calling convention, register arguments are also
- 	 written to the stack.  So unless space is left for them, calls during marshalling prior to the actual
- 	 callout (e.g. to interpreterProxy object manipulation routines) can end up overwriting the
- 	 marshalling stack as register arguments are written to the stack during calls."
- 	self subclassResponsibility!

Item was changed:
  ----- Method: ThreadedIA32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet oop |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
  	<inline: true>
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[myThreadIndex := interpreterProxy disownVM: 0]].
  
+ 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
- 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
  		[self setsp: calloutState argVector].
  
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()')]
  		ifFalse:
  			[intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()')].
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[interpreterProxy ownVM: myThreadIndex]].
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
  			ifTrue:
  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
  			ifFalse:
  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^oop].
  	
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[oop := interpreterProxy floatObjectOf: floatRet]
  		ifFalse:
  			[oop := self ffiCreateIntegralResultOop: intRet
  						ofAtomicType: atomicType
  						in: calloutState].
  	^interpreterProxy methodReturnValue: oop!

Item was removed:
- ----- Method: ThreadedIA32FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
- registerArgsSlop
- 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
- 	 being overwritten by any register arguments during calls during marshalling.  On x86 this is 0"
- 	^0!

Item was removed:
- ----- Method: ThreadedPPCBEFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
- registerArgsSlop
- 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
- 	 being overwritten by any register arguments during calls during marshalling.  On PowerPC, which
- 	 has 8 register arguments in the calling convention, register arguments are also written to the stack.
- 	 So we must leave room for 8 * 4 bytes to avoid overwriting the marshalling stack as register
- 	 arguments are written to the stack during calls to interpreterProxy etc."
- 	^32!

Item was removed:
- ----- Method: ThreadedX64FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
- registerArgsSlop
- 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
- 	 being overwritten by any register arguments during calls during marshalling.  On ARM we
- 	 believe this is zero."
- 	^0!

Item was changed:
  ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  	<var: #floatRet type: #double>
  	<var: #intRet type: 'SixteenByteReturn'>
  	<inline: true>
  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[myThreadIndex := interpreterProxy disownVM: 0]].
  
- 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
- 		[self setsp: calloutState argVector].
- 
  	calloutState floatRegisterIndex > 0 ifTrue:
  		[self 
  			load: (calloutState floatRegisters at: 0)
  			Flo: (calloutState floatRegisters at: 1)
  			a: (calloutState floatRegisters at: 2)
  			t: (calloutState floatRegisters at: 3)
  			R: (calloutState floatRegisters at: 4)
  			e: (calloutState floatRegisters at: 5)
  			g: (calloutState floatRegisters at: 6)
  			s: (calloutState floatRegisters at: 7)].
  
+ 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
+ 		[self setsp: calloutState argVector].
+ 
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[atomicType = FFITypeSingleFloat
  				ifTrue:
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)
  						with: (calloutState integerRegisters at: 4)
  						with: (calloutState integerRegisters at: 5)]
  				ifFalse: "atomicType = FFITypeDoubleFloat"
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)
  						with: (calloutState integerRegisters at: 4)
  						with: (calloutState integerRegisters at: 5)]]
  		ifFalse:
  			[intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturn (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3)
  				with: (calloutState integerRegisters at: 4)
  				with: (calloutState integerRegisters at: 5)].
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[interpreterProxy ownVM: myThreadIndex]].
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
  			ifTrue:
  				[oop := self ffiReturnPointer: intRet a ofType: (self ffiReturnType: specOnStack) in: calloutState]
  			ifFalse:
  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^oop].
  	
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[oop := interpreterProxy floatObjectOf: floatRet]
  		ifFalse:
  			[oop := self ffiCreateIntegralResultOop: intRet a
  						ofAtomicType: atomicType
  						in: calloutState].
  	^interpreterProxy methodReturnValue: oop!

Item was changed:
  ----- Method: ThreadedX64Win64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
  	<var: #procAddr type: #'void *'>
  	<var: #calloutState type: #'CalloutState *'>
  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double)'>
  	"Go out, call this guy and create the return value.  This *must* be inlined because of
  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
  	<var: #floatRet type: #double>
  	<var: #intRet type: #usqLong>
  	<inline: true>
  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[myThreadIndex := interpreterProxy disownVM: 0]].
  
- 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
- 		[self setsp: calloutState argVector].
- 
  	calloutState floatRegisterSignature > 0 ifTrue:
  		[self 
  			load: (calloutState floatRegisters at: 0)
  			Flo: (calloutState floatRegisters at: 1)
  			at: (calloutState floatRegisters at: 2)
  			Re: (calloutState floatRegisters at: 3)
  			gs: (calloutState floatRegisters at: 4)].
  
+ 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
+ 		[self setsp: calloutState argVector].
+ 	
  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[atomicType = FFITypeSingleFloat
  				ifTrue:
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)]
  				ifFalse: "atomicType = FFITypeDoubleFloat"
  					[floatRet := self 
  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  						with: (calloutState integerRegisters at: 0)
  						with: (calloutState integerRegisters at: 1)
  						with: (calloutState integerRegisters at: 2)
  						with: (calloutState integerRegisters at: 3)]]
  		ifFalse:
  			[intRet := self 
  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
  				with: (calloutState integerRegisters at: 0)
  				with: (calloutState integerRegisters at: 1)
  				with: (calloutState integerRegisters at: 2)
  				with: (calloutState integerRegisters at: 3)].
  	"undo any callee argument pops because it may confuse stack management with the alloca."
  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
  		[self setsp: calloutState argVector].
  
  	self cppIf: COGMTVM ifTrue:
  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
  		[interpreterProxy ownVM: myThreadIndex]].
  
  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
  			ifTrue:
  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
  			ifFalse:
  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
  		 ^oop].
  	
  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
  		ifTrue:
  			[oop := interpreterProxy floatObjectOf: floatRet]
  		ifFalse:
  			[oop := self ffiCreateIntegralResultOop: intRet
  						ofAtomicType: atomicType
  						in: calloutState].
  	^interpreterProxy methodReturnValue: oop!



More information about the Vm-dev mailing list