[Vm-dev] VM Maker: VMMaker.oscog-nice.2038.mcz

Esteban Lorenzano estebanlm at gmail.com
Wed Dec 14 07:43:42 UTC 2016


Hi, 

Thanks Nicolas, this is very nice (no pun intended) :P
This means I can remove my workaround settings? (remember those STACK_ALIGN_BYTES=16 and ALLOCA_LIES_SO_USE_GETSP=0)

Thanks!
Esteban


> On 13 Dec 2016, at 23:13, commits at source.squeak.org wrote:
> 
> 
> Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
> http://source.squeak.org/VMMaker/VMMaker.oscog-nice.2038.mcz
> 
> ==================== Summary ====================
> 
> Name: VMMaker.oscog-nice.2038
> Author: nice
> Time: 13 December 2016, 11:11:51.599692 pm
> UUID: f98d3e8c-e81d-fb42-b790-8e514777bfff
> Ancestors: VMMaker.oscog-eem.2037
> 
> Revise ALLOCA_LIES_SO_USE_GET_SP
> 
> The extra space allocated below the value returned by alloca() is used by mingw32 gcc
> (it is used for marshalling args of sub-functions - those who marshall ffi call args)
> So we cannot use it for marshalling FFI arguments.
> 
> Thus we must change the strategy:
> rather than using getsp() for changing the value returned by alloca() before marshalling args (which will lead to SEGV)
> we must rather use setsp() for changing the stack pointer before calling the target function.
> 
> Thus we rename the macro into ALLOCA_LIES_SO_SET_SP_BEFORE_CALL
> 
> Remind that __clang__'s alloca does not lie.
> (clang compiler does define __GNUC__)
> 
> ---
> 
> Revise FFI stack alignment:
> 
> 1) use STACK_BYTES_ALIGNMENT and getsp() as provided by include file sqCogStackAlignment.h
> 2) use a new macro MUST_ALIGN_STACK that check STACK_BYTES_ALIGNMENT against sizeof(void *)
>    indeed, some nice STACK_BYTES_ALIGNMENT might be > 0 now, which does not necessarily mean that we must realign the stack
>    This macro is used both for aligning alloca'ed space before marshalling and for aligning SP before call.
> 
> It is still questionable whether we must do some further alignment over alloca(), because alloca() should be aware of ABI requirements, but well, if it ain't broken...
> 
> TO THINK ABOUT: maybe SP should be saved before call/reset after call. By now, this does not seem necessary.
> 
> ---
> 
> nuke STACK_OFFSET_BYTES from the preamble since it is unused.
> 
> ---
> 
> get rid of registerArgsSlop
> It was intended for reserving stack space for shadowing register args as mandated by Win64 X64 ABI (and maybe PPC).
> But this is completely unecessary as this stack space will be reserved by the callout itself (at least on WIN64).
> For PPC, don't do anything, it's currently unsupported (unmaintained/untested/unnecessary/you name it)
> 
> ---
> 
> Let cogit generate remove float args from stack only if more than 0 float args (by symmetry with int args)
> This was forgotten in a previous commit
> Note that this is only used by lowcode experiments and still wait to be ported to other architectures than IA32.
> 
> =============== Diff against VMMaker.oscog-eem.2037 ===============
> 
> Item was changed:
>  ----- Method: CogIA32Compiler>>genRemoveNFloatArgsFromStack: (in category 'abi') -----
>  genRemoveNFloatArgsFromStack: n 
> + 	n > 0 ifTrue: [cogit AddCq: n * 8 R: ESP].
> - 	cogit AddCq: n * 8 R: ESP.
>  	^0!
> 
> Item was changed:
>  ----- Method: ThreadedARMFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
>  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
>  	<var: #procAddr type: #'void *'>
>  	<var: #calloutState type: #'CalloutState *'>
>  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
>  	"Go out, call this guy and create the return value.  This *must* be inlined because of
>  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
>  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
>  	<var: #floatRet type: #double>
>  	<var: #intRet type: #usqLong>
>  	<inline: true>
>  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[myThreadIndex := interpreterProxy disownVM: 0]].
> 
> - 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - 		[self setsp: calloutState argVector].
> - 
>  	calloutState floatRegisterIndex > 0 ifTrue:
>  		[self 
>  			load: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 0)) to: 'double *') at: 0)
>  			Flo: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 2)) to: 'double *') at: 0)
>  			a: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 4)) to: 'double *') at: 0)
>  			t: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 6)) to: 'double *') at: 0)
>  			R: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 8)) to: 'double *') at: 0)
>  			e: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 10)) to: 'double *') at: 0)
>  			g: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 12)) to: 'double *') at: 0)
>  			s: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 14)) to: 'double *') at: 0)].
> 
> + 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + 		[self setsp: calloutState argVector].
> + 
>  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[atomicType = FFITypeSingleFloat
>  				ifTrue:
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)]
>  				ifFalse: "atomicType = FFITypeDoubleFloat"
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)]]
>  		ifFalse:
>  			[intRet := self 
>  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  				with: (calloutState integerRegisters at: 0)
>  				with: (calloutState integerRegisters at: 1)
>  				with: (calloutState integerRegisters at: 2)
>  				with: (calloutState integerRegisters at: 3)].
>  	"undo any callee argument pops because it may confuse stack management with the alloca."
>  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
>  		[self setsp: calloutState argVector].
> 
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[interpreterProxy ownVM: myThreadIndex]].
> 
>  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
>  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
>  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
>  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
>  			ifTrue:
>  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
>  			ifFalse:
>  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
>  		 ^oop].
>  	
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[oop := interpreterProxy floatObjectOf: floatRet]
>  		ifFalse:
>  			[oop := self ffiCreateIntegralResultOop: intRet
>  						ofAtomicType: atomicType
>  						in: calloutState].
>  	^interpreterProxy methodReturnValue: oop!
> 
> Item was removed:
> - ----- Method: ThreadedARMFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - 	 being overwritten by any register arguments during calls during marshalling.  On ARM we
> - 	 believe this is zero."
> - 	^0!
> 
> Item was changed:
>  ----- Method: ThreadedFFIPlugin class>>preambleCCode (in category 'translation') -----
>  preambleCCode
>  	"For a source of builtin defines grep for builtin_define in a gcc release config directory."
>  	^'
>  #include "sqAssert.h" /* for assert */
>  #define ThreadedFFIPlugin 1 /* to filter-out unwanted declarations from sqFFI.h */
>  #include "sqFFI.h" /* for logging and surface functions */
> + #include "sqCogStackAlignment.h" /* for STACK_ALIGN_BYTES and getsp() */
> 
>  #ifdef _MSC_VER
>  # define alloca _alloca
>  #endif
>  #if defined(__GNUC__) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
>  # define setsp(sp) asm volatile ("movl %0,%%esp" : : "m"(sp))
> - # define getsp() ({ void *esp; asm volatile ("movl %%esp,%0" : "=r"(esp) : ); esp;})
>  # elif defined(__GNUC__) && (defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64))
>  # define setsp(sp) asm volatile ("movq %0,%%rsp" : : "m"(sp))
> - # define getsp() ({ void *rsp; asm volatile ("movq %%rsp,%0" : "=r"(rsp) : ); rsp;})
>  # elif defined(__GNUC__) && (defined(__arm__))
>  # define setsp(sp) asm volatile ("ldr %%sp, %0" : : "m"(sp))
> - # define getsp() ({ void *sp; asm volatile ("mov %0, %%sp" : "=r"(sp) : ); sp;})
>  #endif
>  #if !!defined(getsp)
>  # define getsp() 0
>  #endif 
>  #if !!defined(setsp)
>  # define setsp(ignored) 0
>  #endif 
> 
>  #if !!defined(STACK_ALIGN_BYTES)
> - # if __APPLE__ && __MACH__ && __i386__
> - #  define STACK_ALIGN_BYTES 16
> - # elif __linux__ && __i386__
> - #  define STACK_ALIGN_BYTES 16
> - # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
> - #  define STACK_ALIGN_BYTES 16
> - # elif defined(powerpc) || defined(__powerpc__) || defined(_POWER) || defined(__POWERPC__) || defined(__PPC__)
> - #  define STACK_ALIGN_BYTES 16
> - # elif defined(__sparc64__) || defined(__sparcv9__) || defined(__sparc_v9__) /* must precede 32-bit sparc defs */
> - #  define STACK_ALIGN_BYTES 16
> - # elif defined(sparc) || defined(__sparc__) || defined(__sparclite__)
> - #  define STACK_ALIGN_BYTES 8
> - # elif defined(__arm__) 
> - #  define STACK_ALIGN_BYTES 8
> - # else
>  #  define STACK_ALIGN_BYTES 0
> - # endif
>  #endif /* !!defined(STACK_ALIGN_BYTES) */
> 
> + /* For ABI that require stack alignment greater than natural word size */
> + #define MUST_ALIGN_STACK (STACK_ALIGN_BYTES > sizeof(void*))
> - #if !!defined(STACK_OFFSET_BYTES)
> - # define STACK_OFFSET_BYTES 0
> - #endif
> 
>  #if defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__)
>  /* Both Mac OS X x86 and Win32 x86 return structs of a power of two in size
>   * less than or equal to eight bytes in length in registers. Linux never does so.
>   */
>  # if __linux__
>  #	define WIN32_X86_STRUCT_RETURN 0
>  # else
>  #	define WIN32_X86_STRUCT_RETURN 1
>  # endif
>  # if WIN32
>  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
>  # endif
>  # elif defined(__amd64__) || defined(__x86_64__) ||  defined(__amd64) || defined(__x86_64)
>  # if WIN32 | WIN64
>  #	define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
>  # endif
>  #endif /* defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__) */
> 
> + #if !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL)
> + # if defined(__MINGW32__) && !!defined(__clang__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
> - #if !!defined(ALLOCA_LIES_SO_USE_GETSP)
> - # if defined(__MINGW32__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
>      /*
>       * cygwin -mno-cygwin (MinGW) gcc 3.4.x''s alloca is a library routine that answers
> +      * %esp + xx, so the outgoing stack is offset by one or more word if uncorrected.
> -      * %esp + 4, so the outgoing stack is offset by one word if uncorrected.
>       * Grab the actual stack pointer to correct.
>       */
> + #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 1
> - #	define ALLOCA_LIES_SO_USE_GETSP 1
>  # else
> + #	define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 0
> - #	define ALLOCA_LIES_SO_USE_GETSP 0
>  # endif
> + #endif /* !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL) */
> - #endif /* !!defined(ALLOCA_LIES_SO_USE_GETSP) */
> 
>  #if !!defined(PLATFORM_API_USES_CALLEE_POPS_CONVENTION)
>  # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 0
>  #endif
> 
>  /* The dispatchOn:in:with:with: generates an unwanted call on error.  Just squash it. */
>  #define error(foo) 0
>  #ifndef SQUEAK_BUILTIN_PLUGIN
>  /* but print assert failures. */
>  void
>  warning(char *s) { /* Print an error message but don''t exit. */
>  	printf("\n%s\n", s);
>  }
>  #endif
> 
>  /* sanitize */
>  #ifdef SQUEAK_BUILTIN_PLUGIN
>  # define EXTERN 
>  #else
>  # define EXTERN extern
>  #endif
>  '!
> 
> Item was added:
> + ----- Method: ThreadedFFIPlugin>>allocaLiesSoSetSpBeforeCall (in category 'marshalling') -----
> + allocaLiesSoSetSpBeforeCall
> + 	"At least one alloca implementation does not answer the actual top of stack.
> + 	 If so we need to reset the actual stack pointer just before the call.
> + 	Answer whether this is necessary."
> + 	<cmacro: '() ALLOCA_LIES_SO_SETSP_BEFORE_CALL'>
> + 	^false!
> 
> Item was removed:
> - ----- Method: ThreadedFFIPlugin>>allocaLiesSoUseGetsp (in category 'marshalling') -----
> - allocaLiesSoUseGetsp
> - 	"At least one alloca implementation does not answer the actual top of stack.
> - 	 If so we need to get the actual stack pointer.  Answer whether this is necessary."
> - 	<cmacro: '() ALLOCA_LIES_SO_USE_GETSP'>
> - 	^false!
> 
> Item was changed:
>  ----- Method: ThreadedFFIPlugin>>ffiCall:ArgArrayOrNil:NumArgs: (in category 'callout support') -----
>  ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
>  	"Generic callout. Does the actual work.  If argArrayOrNil is nil it takes args from the stack
>  	 and the spec from the method.  If argArrayOrNil is not nil takes args from argArrayOrNil
>  	 and the spec from the receiver."
>  	| flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result |
>  	<inline: true>
>  	<var: #theCalloutState type: #'CalloutState'>
>  	<var: #calloutState type: #'CalloutState *'>
>  	<var: #allocation type: #'char *'>
> 
>  	(interpreterProxy is: externalFunction KindOfClass: interpreterProxy classExternalFunction) ifFalse:
>  		[^self ffiFail: FFIErrorNotFunction].
>  	"Load and check the values in the externalFunction before we call out"
>  	flags := interpreterProxy fetchInteger: ExternalFunctionFlagsIndex ofObject: externalFunction.
>  	interpreterProxy failed ifTrue:
>  		[^self ffiFail: FFIErrorBadArgs].
> 
>  	"This must come early for compatibility with the old FFIPlugin.  Image-level code
>  	 may assume the function pointer is loaded eagerly.  Thanks to Nicolas Cellier."
>  	address := self ffiLoadCalloutAddress: externalFunction.
>  	interpreterProxy failed ifTrue:
>  		[^0 "error code already set by ffiLoadCalloutAddress:"].
>  	
>  	argTypeArray := interpreterProxy fetchPointer: ExternalFunctionArgTypesIndex ofObject: externalFunction.
>  	"must be array of arg types"
>  	((interpreterProxy isArray: argTypeArray)
>  	and: [(interpreterProxy slotSizeOf: argTypeArray) = (nArgs + 1)]) ifFalse:
>  		[^self ffiFail: FFIErrorBadArgs].
>  	"check if the calling convention is supported"
>  	self cppIf: COGMTVM
>  		ifTrue:
>  			[(self ffiSupportsCallingConvention: (flags bitAnd: FFICallTypesMask)) ifFalse:
>  				[^self ffiFail: FFIErrorCallType]]
>  		ifFalse: "not masking causes threaded calls to fail, which is as they should if the plugin is not threaded."
>  			[(self ffiSupportsCallingConvention: flags) ifFalse:
>  				[^self ffiFail: FFIErrorCallType]].
>  		
>  	requiredStackSize := self externalFunctionHasStackSizeSlot
>  							ifTrue: [interpreterProxy
>  										fetchInteger: ExternalFunctionStackSizeIndex
>  										ofObject: externalFunction]
>  							ifFalse: [-1].
>  	interpreterProxy failed ifTrue:
>  		[^interpreterProxy primitiveFailFor: (argArrayOrNil isNil
>  												ifTrue: [PrimErrBadMethod]
>  												ifFalse: [PrimErrBadReceiver])].
>  	stackSize := requiredStackSize < 0 ifTrue: [DefaultMaxStackSize] ifFalse: [requiredStackSize].
>  	self cCode: [] inSmalltalk: [theCalloutState := self class calloutStateClass new].
>  	calloutState := self addressOf: theCalloutState.
>  	self cCode: [self me: calloutState ms: 0 et: (self sizeof: #CalloutState)].
>  	calloutState callFlags: flags.
>  	"Fetch return type and args"
>  	argType := interpreterProxy fetchPointer: 0 ofObject: argTypeArray.
>  	argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
>  	argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
>  	(err := self ffiCheckReturn: argSpec With: argClass in: calloutState) ~= 0 ifTrue:
>  		[^self ffiFail: err]. "cannot return"
> + 	"alloca the outgoing stack frame, leaving room for marshalling args, and including space for the return struct, if any.
> + 	Additional space reserved for saving register args like mandated by Win64 X64 or PPC ABI, will be managed by the call itself"
> + 	allocation := self alloca: stackSize + calloutState structReturnSize + self cStackAlignment.
> + 	self mustAlignStack ifTrue:
> + 		[allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1)
> - 	"alloca the outgoing stack frame, leaving room for register args while marshalling, and including space for the return struct, if any."
> - 	allocation := self alloca: stackSize + calloutState structReturnSize + self registerArgsSlop + self cStackAlignment.
> - 	self allocaLiesSoUseGetsp ifTrue:
> - 		[allocation := self getsp].
> - 	self cStackAlignment ~= 0 ifTrue:
> - 		[allocation := self cCoerce: (allocation asUnsignedInteger bitClear: self cStackAlignment - 1)
>  						to: #'char *'].
>  	calloutState
>  		argVector: allocation;
> + 		currentArg: allocation;
> + 		limit: allocation + stackSize.
> - 		currentArg: allocation + self registerArgsSlop;
> - 		limit: allocation + stackSize + self registerArgsSlop.
>  	(calloutState structReturnSize > 0
>  	 and: [self nonRegisterStructReturnIsViaImplicitFirstArgument
>  	 and: [(self returnStructInRegisters: calloutState structReturnSize) not]]) ifTrue:
>  		[err := self ffiPushPointer: calloutState limit in: calloutState.
>  		 err ~= 0 ifTrue:
>  			[self cleanupCalloutState: calloutState.
>  			 self cppIf: COGMTVM ifTrue:
>  			 [err = PrimErrObjectMayMove negated ifTrue:
>  				[^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
>  			 ^self ffiFail: err]].
>  	1 to: nArgs do:
>  		[:i|
>  		argType := interpreterProxy fetchPointer: i ofObject: argTypeArray.
>  		argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
>  		argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
>  		oop := argArrayOrNil isNil
>  				ifTrue: [interpreterProxy stackValue: nArgs - i]
>  				ifFalse: [interpreterProxy fetchPointer: i - 1 ofObject: argArrayOrNil].
>  		err := self ffiArgument: oop Spec: argSpec Class: argClass in: calloutState.
>  		err ~= 0 ifTrue:
>  			[self cleanupCalloutState: calloutState.
>  			 self cppIf: COGMTVM ifTrue:
>  			 [err = PrimErrObjectMayMove negated ifTrue:
>  				[^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
>  			 ^self ffiFail: err]]. "coercion failed or out of stack space"
>  	"Failures must be reported back from ffiArgument:Spec:Class:in:.
>  	 Should not fail from here on in."
>  	self assert: interpreterProxy failed not.
>  	self ffiLogCallout: externalFunction.
>  	(requiredStackSize < 0
>  	 and: [self externalFunctionHasStackSizeSlot]) ifTrue:
>  		[stackSize := calloutState currentArg - calloutState argVector.
>  		 interpreterProxy storeInteger: ExternalFunctionStackSizeIndex ofObject: externalFunction withValue: stackSize].
>  	"Go out and call this guy"
>  	result := self ffiCalloutTo: address SpecOnStack: argArrayOrNil notNil in: calloutState.
>  	self cleanupCalloutState: calloutState.
>  	^result!
> 
> Item was added:
> + ----- Method: ThreadedFFIPlugin>>mustAlignStack (in category 'marshalling') -----
> + mustAlignStack
> + 	"Many ABIs mandate a particular stack alignment greater than the natural word size.
> + 	 If so, this macro will answer true.  See class-side preambleCCode."
> + 	<cmacro: '() MUST_ALIGN_STACK'>
> + 	^0!
> 
> Item was removed:
> - ----- Method: ThreadedFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - 	 being overwritten by any register arguments during calls during marshalling.  For example, on
> - 	 PowerPC, which has 8 register arguments in the calling convention, register arguments are also
> - 	 written to the stack.  So unless space is left for them, calls during marshalling prior to the actual
> - 	 callout (e.g. to interpreterProxy object manipulation routines) can end up overwriting the
> - 	 marshalling stack as register arguments are written to the stack during calls."
> - 	self subclassResponsibility!
> 
> Item was changed:
>  ----- Method: ThreadedIA32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
>  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
>  	<var: #procAddr type: #'void *'>
>  	<var: #calloutState type: #'CalloutState *'>
>  	"Go out, call this guy and create the return value.  This *must* be inlined because of
>  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
>  	| myThreadIndex atomicType floatRet intRet oop |
>  	<var: #floatRet type: #double>
>  	<var: #intRet type: #usqLong>
>  	<inline: true>
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[myThreadIndex := interpreterProxy disownVM: 0]].
> 
> + 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> - 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
>  		[self setsp: calloutState argVector].
> 
>  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()')]
>  		ifFalse:
>  			[intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()')].
>  	"undo any callee argument pops because it may confuse stack management with the alloca."
>  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
>  		[self setsp: calloutState argVector].
> 
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[interpreterProxy ownVM: myThreadIndex]].
> 
>  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
>  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
>  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
>  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
>  			ifTrue:
>  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
>  			ifFalse:
>  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
>  		 ^oop].
>  	
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[oop := interpreterProxy floatObjectOf: floatRet]
>  		ifFalse:
>  			[oop := self ffiCreateIntegralResultOop: intRet
>  						ofAtomicType: atomicType
>  						in: calloutState].
>  	^interpreterProxy methodReturnValue: oop!
> 
> Item was removed:
> - ----- Method: ThreadedIA32FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - 	 being overwritten by any register arguments during calls during marshalling.  On x86 this is 0"
> - 	^0!
> 
> Item was removed:
> - ----- Method: ThreadedPPCBEFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - 	 being overwritten by any register arguments during calls during marshalling.  On PowerPC, which
> - 	 has 8 register arguments in the calling convention, register arguments are also written to the stack.
> - 	 So we must leave room for 8 * 4 bytes to avoid overwriting the marshalling stack as register
> - 	 arguments are written to the stack during calls to interpreterProxy etc."
> - 	^32!
> 
> Item was removed:
> - ----- Method: ThreadedX64FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - 	"Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - 	 being overwritten by any register arguments during calls during marshalling.  On ARM we
> - 	 believe this is zero."
> - 	^0!
> 
> Item was changed:
>  ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
>  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
>  	<var: #procAddr type: #'void *'>
>  	<var: #calloutState type: #'CalloutState *'>
>  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
>  	"Go out, call this guy and create the return value.  This *must* be inlined because of
>  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
>  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
>  	<var: #floatRet type: #double>
>  	<var: #intRet type: 'SixteenByteReturn'>
>  	<inline: true>
>  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[myThreadIndex := interpreterProxy disownVM: 0]].
> 
> - 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - 		[self setsp: calloutState argVector].
> - 
>  	calloutState floatRegisterIndex > 0 ifTrue:
>  		[self 
>  			load: (calloutState floatRegisters at: 0)
>  			Flo: (calloutState floatRegisters at: 1)
>  			a: (calloutState floatRegisters at: 2)
>  			t: (calloutState floatRegisters at: 3)
>  			R: (calloutState floatRegisters at: 4)
>  			e: (calloutState floatRegisters at: 5)
>  			g: (calloutState floatRegisters at: 6)
>  			s: (calloutState floatRegisters at: 7)].
> 
> + 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + 		[self setsp: calloutState argVector].
> + 
>  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[atomicType = FFITypeSingleFloat
>  				ifTrue:
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)
>  						with: (calloutState integerRegisters at: 4)
>  						with: (calloutState integerRegisters at: 5)]
>  				ifFalse: "atomicType = FFITypeDoubleFloat"
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)
>  						with: (calloutState integerRegisters at: 4)
>  						with: (calloutState integerRegisters at: 5)]]
>  		ifFalse:
>  			[intRet := self 
>  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturn (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  				with: (calloutState integerRegisters at: 0)
>  				with: (calloutState integerRegisters at: 1)
>  				with: (calloutState integerRegisters at: 2)
>  				with: (calloutState integerRegisters at: 3)
>  				with: (calloutState integerRegisters at: 4)
>  				with: (calloutState integerRegisters at: 5)].
>  	"undo any callee argument pops because it may confuse stack management with the alloca."
>  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
>  		[self setsp: calloutState argVector].
> 
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[interpreterProxy ownVM: myThreadIndex]].
> 
>  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
>  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
>  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
>  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
>  			ifTrue:
>  				[oop := self ffiReturnPointer: intRet a ofType: (self ffiReturnType: specOnStack) in: calloutState]
>  			ifFalse:
>  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
>  		 ^oop].
>  	
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[oop := interpreterProxy floatObjectOf: floatRet]
>  		ifFalse:
>  			[oop := self ffiCreateIntegralResultOop: intRet a
>  						ofAtomicType: atomicType
>  						in: calloutState].
>  	^interpreterProxy methodReturnValue: oop!
> 
> Item was changed:
>  ----- Method: ThreadedX64Win64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
>  ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
>  	<var: #procAddr type: #'void *'>
>  	<var: #calloutState type: #'CalloutState *'>
>  	<var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double)'>
>  	"Go out, call this guy and create the return value.  This *must* be inlined because of
>  	 the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
>  	| myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
>  	<var: #floatRet type: #double>
>  	<var: #intRet type: #usqLong>
>  	<inline: true>
>  	self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[myThreadIndex := interpreterProxy disownVM: 0]].
> 
> - 	self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - 		[self setsp: calloutState argVector].
> - 
>  	calloutState floatRegisterSignature > 0 ifTrue:
>  		[self 
>  			load: (calloutState floatRegisters at: 0)
>  			Flo: (calloutState floatRegisters at: 1)
>  			at: (calloutState floatRegisters at: 2)
>  			Re: (calloutState floatRegisters at: 3)
>  			gs: (calloutState floatRegisters at: 4)].
> 
> + 	(self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + 		[self setsp: calloutState argVector].
> + 	
>  	atomicType := self atomicTypeOf: calloutState ffiRetHeader.
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[atomicType = FFITypeSingleFloat
>  				ifTrue:
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)]
>  				ifFalse: "atomicType = FFITypeDoubleFloat"
>  					[floatRet := self 
>  						dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  						with: (calloutState integerRegisters at: 0)
>  						with: (calloutState integerRegisters at: 1)
>  						with: (calloutState integerRegisters at: 2)
>  						with: (calloutState integerRegisters at: 3)]]
>  		ifFalse:
>  			[intRet := self 
>  				dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)') 
>  				with: (calloutState integerRegisters at: 0)
>  				with: (calloutState integerRegisters at: 1)
>  				with: (calloutState integerRegisters at: 2)
>  				with: (calloutState integerRegisters at: 3)].
>  	"undo any callee argument pops because it may confuse stack management with the alloca."
>  	(self isCalleePopsConvention: calloutState callFlags) ifTrue:
>  		[self setsp: calloutState argVector].
> 
>  	self cppIf: COGMTVM ifTrue:
>  	[(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
>  		[interpreterProxy ownVM: myThreadIndex]].
> 
>  	(calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
>  		["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
>  		 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
>  		 (calloutState ffiRetHeader anyMask: FFIFlagPointer)
>  			ifTrue:
>  				[oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
>  			ifFalse:
>  				[oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
>  		 ^oop].
>  	
>  	(atomicType >> 1) = (FFITypeSingleFloat >> 1)
>  		ifTrue:
>  			[oop := interpreterProxy floatObjectOf: floatRet]
>  		ifFalse:
>  			[oop := self ffiCreateIntegralResultOop: intRet
>  						ofAtomicType: atomicType
>  						in: calloutState].
>  	^interpreterProxy methodReturnValue: oop!
> 



More information about the Vm-dev mailing list