[Vm-dev] VM Maker: VMMaker.oscog-nice.2038.mcz
Esteban Lorenzano
estebanlm at gmail.com
Wed Dec 14 07:43:42 UTC 2016
Hi,
Thanks Nicolas, this is very nice (no pun intended) :P
This means I can remove my workaround settings? (remember those STACK_ALIGN_BYTES=16 and ALLOCA_LIES_SO_USE_GETSP=0)
Thanks!
Esteban
> On 13 Dec 2016, at 23:13, commits at source.squeak.org wrote:
>
>
> Nicolas Cellier uploaded a new version of VMMaker to project VM Maker:
> http://source.squeak.org/VMMaker/VMMaker.oscog-nice.2038.mcz
>
> ==================== Summary ====================
>
> Name: VMMaker.oscog-nice.2038
> Author: nice
> Time: 13 December 2016, 11:11:51.599692 pm
> UUID: f98d3e8c-e81d-fb42-b790-8e514777bfff
> Ancestors: VMMaker.oscog-eem.2037
>
> Revise ALLOCA_LIES_SO_USE_GET_SP
>
> The extra space allocated below the value returned by alloca() is used by mingw32 gcc
> (it is used for marshalling args of sub-functions - those who marshall ffi call args)
> So we cannot use it for marshalling FFI arguments.
>
> Thus we must change the strategy:
> rather than using getsp() for changing the value returned by alloca() before marshalling args (which will lead to SEGV)
> we must rather use setsp() for changing the stack pointer before calling the target function.
>
> Thus we rename the macro into ALLOCA_LIES_SO_SET_SP_BEFORE_CALL
>
> Remind that __clang__'s alloca does not lie.
> (clang compiler does define __GNUC__)
>
> ---
>
> Revise FFI stack alignment:
>
> 1) use STACK_BYTES_ALIGNMENT and getsp() as provided by include file sqCogStackAlignment.h
> 2) use a new macro MUST_ALIGN_STACK that check STACK_BYTES_ALIGNMENT against sizeof(void *)
> indeed, some nice STACK_BYTES_ALIGNMENT might be > 0 now, which does not necessarily mean that we must realign the stack
> This macro is used both for aligning alloca'ed space before marshalling and for aligning SP before call.
>
> It is still questionable whether we must do some further alignment over alloca(), because alloca() should be aware of ABI requirements, but well, if it ain't broken...
>
> TO THINK ABOUT: maybe SP should be saved before call/reset after call. By now, this does not seem necessary.
>
> ---
>
> nuke STACK_OFFSET_BYTES from the preamble since it is unused.
>
> ---
>
> get rid of registerArgsSlop
> It was intended for reserving stack space for shadowing register args as mandated by Win64 X64 ABI (and maybe PPC).
> But this is completely unecessary as this stack space will be reserved by the callout itself (at least on WIN64).
> For PPC, don't do anything, it's currently unsupported (unmaintained/untested/unnecessary/you name it)
>
> ---
>
> Let cogit generate remove float args from stack only if more than 0 float args (by symmetry with int args)
> This was forgotten in a previous commit
> Note that this is only used by lowcode experiments and still wait to be ported to other architectures than IA32.
>
> =============== Diff against VMMaker.oscog-eem.2037 ===============
>
> Item was changed:
> ----- Method: CogIA32Compiler>>genRemoveNFloatArgsFromStack: (in category 'abi') -----
> genRemoveNFloatArgsFromStack: n
> + n > 0 ifTrue: [cogit AddCq: n * 8 R: ESP].
> - cogit AddCq: n * 8 R: ESP.
> ^0!
>
> Item was changed:
> ----- Method: ThreadedARMFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
> ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
> <var: #procAddr type: #'void *'>
> <var: #calloutState type: #'CalloutState *'>
> <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
> "Go out, call this guy and create the return value. This *must* be inlined because of
> the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
> | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
> <var: #floatRet type: #double>
> <var: #intRet type: #usqLong>
> <inline: true>
> self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [myThreadIndex := interpreterProxy disownVM: 0]].
>
> - self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - [self setsp: calloutState argVector].
> -
> calloutState floatRegisterIndex > 0 ifTrue:
> [self
> load: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 0)) to: 'double *') at: 0)
> Flo: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 2)) to: 'double *') at: 0)
> a: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 4)) to: 'double *') at: 0)
> t: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 6)) to: 'double *') at: 0)
> R: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 8)) to: 'double *') at: 0)
> e: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 10)) to: 'double *') at: 0)
> g: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 12)) to: 'double *') at: 0)
> s: ((self cCoerceSimple: (self addressOf: (calloutState floatRegisters at: 14)) to: 'double *') at: 0)].
>
> + (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + [self setsp: calloutState argVector].
> +
> atomicType := self atomicTypeOf: calloutState ffiRetHeader.
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [atomicType = FFITypeSingleFloat
> ifTrue:
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)]
> ifFalse: "atomicType = FFITypeDoubleFloat"
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)]]
> ifFalse:
> [intRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)].
> "undo any callee argument pops because it may confuse stack management with the alloca."
> (self isCalleePopsConvention: calloutState callFlags) ifTrue:
> [self setsp: calloutState argVector].
>
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [interpreterProxy ownVM: myThreadIndex]].
>
> (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
> ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
> 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
> (calloutState ffiRetHeader anyMask: FFIFlagPointer)
> ifTrue:
> [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
> ifFalse:
> [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
> ^oop].
>
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [oop := interpreterProxy floatObjectOf: floatRet]
> ifFalse:
> [oop := self ffiCreateIntegralResultOop: intRet
> ofAtomicType: atomicType
> in: calloutState].
> ^interpreterProxy methodReturnValue: oop!
>
> Item was removed:
> - ----- Method: ThreadedARMFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - "Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - being overwritten by any register arguments during calls during marshalling. On ARM we
> - believe this is zero."
> - ^0!
>
> Item was changed:
> ----- Method: ThreadedFFIPlugin class>>preambleCCode (in category 'translation') -----
> preambleCCode
> "For a source of builtin defines grep for builtin_define in a gcc release config directory."
> ^'
> #include "sqAssert.h" /* for assert */
> #define ThreadedFFIPlugin 1 /* to filter-out unwanted declarations from sqFFI.h */
> #include "sqFFI.h" /* for logging and surface functions */
> + #include "sqCogStackAlignment.h" /* for STACK_ALIGN_BYTES and getsp() */
>
> #ifdef _MSC_VER
> # define alloca _alloca
> #endif
> #if defined(__GNUC__) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
> # define setsp(sp) asm volatile ("movl %0,%%esp" : : "m"(sp))
> - # define getsp() ({ void *esp; asm volatile ("movl %%esp,%0" : "=r"(esp) : ); esp;})
> # elif defined(__GNUC__) && (defined(__amd64__) || defined(__x86_64__) || defined(__amd64) || defined(__x86_64))
> # define setsp(sp) asm volatile ("movq %0,%%rsp" : : "m"(sp))
> - # define getsp() ({ void *rsp; asm volatile ("movq %%rsp,%0" : "=r"(rsp) : ); rsp;})
> # elif defined(__GNUC__) && (defined(__arm__))
> # define setsp(sp) asm volatile ("ldr %%sp, %0" : : "m"(sp))
> - # define getsp() ({ void *sp; asm volatile ("mov %0, %%sp" : "=r"(sp) : ); sp;})
> #endif
> #if !!defined(getsp)
> # define getsp() 0
> #endif
> #if !!defined(setsp)
> # define setsp(ignored) 0
> #endif
>
> #if !!defined(STACK_ALIGN_BYTES)
> - # if __APPLE__ && __MACH__ && __i386__
> - # define STACK_ALIGN_BYTES 16
> - # elif __linux__ && __i386__
> - # define STACK_ALIGN_BYTES 16
> - # elif defined(__amd64__) || defined(__x86_64__) || defined(__amd64) || defined(__x86_64)
> - # define STACK_ALIGN_BYTES 16
> - # elif defined(powerpc) || defined(__powerpc__) || defined(_POWER) || defined(__POWERPC__) || defined(__PPC__)
> - # define STACK_ALIGN_BYTES 16
> - # elif defined(__sparc64__) || defined(__sparcv9__) || defined(__sparc_v9__) /* must precede 32-bit sparc defs */
> - # define STACK_ALIGN_BYTES 16
> - # elif defined(sparc) || defined(__sparc__) || defined(__sparclite__)
> - # define STACK_ALIGN_BYTES 8
> - # elif defined(__arm__)
> - # define STACK_ALIGN_BYTES 8
> - # else
> # define STACK_ALIGN_BYTES 0
> - # endif
> #endif /* !!defined(STACK_ALIGN_BYTES) */
>
> + /* For ABI that require stack alignment greater than natural word size */
> + #define MUST_ALIGN_STACK (STACK_ALIGN_BYTES > sizeof(void*))
> - #if !!defined(STACK_OFFSET_BYTES)
> - # define STACK_OFFSET_BYTES 0
> - #endif
>
> #if defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__)
> /* Both Mac OS X x86 and Win32 x86 return structs of a power of two in size
> * less than or equal to eight bytes in length in registers. Linux never does so.
> */
> # if __linux__
> # define WIN32_X86_STRUCT_RETURN 0
> # else
> # define WIN32_X86_STRUCT_RETURN 1
> # endif
> # if WIN32
> # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
> # endif
> # elif defined(__amd64__) || defined(__x86_64__) || defined(__amd64) || defined(__x86_64)
> # if WIN32 | WIN64
> # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 1
> # endif
> #endif /* defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__) */
>
> + #if !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL)
> + # if defined(__MINGW32__) && !!defined(__clang__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
> - #if !!defined(ALLOCA_LIES_SO_USE_GETSP)
> - # if defined(__MINGW32__) && (__GNUC__ >= 3) && (defined(_X86_) || defined(i386) || defined(__i386) || defined(__i386__))
> /*
> * cygwin -mno-cygwin (MinGW) gcc 3.4.x''s alloca is a library routine that answers
> + * %esp + xx, so the outgoing stack is offset by one or more word if uncorrected.
> - * %esp + 4, so the outgoing stack is offset by one word if uncorrected.
> * Grab the actual stack pointer to correct.
> */
> + # define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 1
> - # define ALLOCA_LIES_SO_USE_GETSP 1
> # else
> + # define ALLOCA_LIES_SO_SETSP_BEFORE_CALL 0
> - # define ALLOCA_LIES_SO_USE_GETSP 0
> # endif
> + #endif /* !!defined(ALLOCA_LIES_SO_SETSP_BEFORE_CALL) */
> - #endif /* !!defined(ALLOCA_LIES_SO_USE_GETSP) */
>
> #if !!defined(PLATFORM_API_USES_CALLEE_POPS_CONVENTION)
> # define PLATFORM_API_USES_CALLEE_POPS_CONVENTION 0
> #endif
>
> /* The dispatchOn:in:with:with: generates an unwanted call on error. Just squash it. */
> #define error(foo) 0
> #ifndef SQUEAK_BUILTIN_PLUGIN
> /* but print assert failures. */
> void
> warning(char *s) { /* Print an error message but don''t exit. */
> printf("\n%s\n", s);
> }
> #endif
>
> /* sanitize */
> #ifdef SQUEAK_BUILTIN_PLUGIN
> # define EXTERN
> #else
> # define EXTERN extern
> #endif
> '!
>
> Item was added:
> + ----- Method: ThreadedFFIPlugin>>allocaLiesSoSetSpBeforeCall (in category 'marshalling') -----
> + allocaLiesSoSetSpBeforeCall
> + "At least one alloca implementation does not answer the actual top of stack.
> + If so we need to reset the actual stack pointer just before the call.
> + Answer whether this is necessary."
> + <cmacro: '() ALLOCA_LIES_SO_SETSP_BEFORE_CALL'>
> + ^false!
>
> Item was removed:
> - ----- Method: ThreadedFFIPlugin>>allocaLiesSoUseGetsp (in category 'marshalling') -----
> - allocaLiesSoUseGetsp
> - "At least one alloca implementation does not answer the actual top of stack.
> - If so we need to get the actual stack pointer. Answer whether this is necessary."
> - <cmacro: '() ALLOCA_LIES_SO_USE_GETSP'>
> - ^false!
>
> Item was changed:
> ----- Method: ThreadedFFIPlugin>>ffiCall:ArgArrayOrNil:NumArgs: (in category 'callout support') -----
> ffiCall: externalFunction ArgArrayOrNil: argArrayOrNil NumArgs: nArgs
> "Generic callout. Does the actual work. If argArrayOrNil is nil it takes args from the stack
> and the spec from the method. If argArrayOrNil is not nil takes args from argArrayOrNil
> and the spec from the receiver."
> | flags argTypeArray address argType oop argSpec argClass err theCalloutState calloutState requiredStackSize stackSize allocation result |
> <inline: true>
> <var: #theCalloutState type: #'CalloutState'>
> <var: #calloutState type: #'CalloutState *'>
> <var: #allocation type: #'char *'>
>
> (interpreterProxy is: externalFunction KindOfClass: interpreterProxy classExternalFunction) ifFalse:
> [^self ffiFail: FFIErrorNotFunction].
> "Load and check the values in the externalFunction before we call out"
> flags := interpreterProxy fetchInteger: ExternalFunctionFlagsIndex ofObject: externalFunction.
> interpreterProxy failed ifTrue:
> [^self ffiFail: FFIErrorBadArgs].
>
> "This must come early for compatibility with the old FFIPlugin. Image-level code
> may assume the function pointer is loaded eagerly. Thanks to Nicolas Cellier."
> address := self ffiLoadCalloutAddress: externalFunction.
> interpreterProxy failed ifTrue:
> [^0 "error code already set by ffiLoadCalloutAddress:"].
>
> argTypeArray := interpreterProxy fetchPointer: ExternalFunctionArgTypesIndex ofObject: externalFunction.
> "must be array of arg types"
> ((interpreterProxy isArray: argTypeArray)
> and: [(interpreterProxy slotSizeOf: argTypeArray) = (nArgs + 1)]) ifFalse:
> [^self ffiFail: FFIErrorBadArgs].
> "check if the calling convention is supported"
> self cppIf: COGMTVM
> ifTrue:
> [(self ffiSupportsCallingConvention: (flags bitAnd: FFICallTypesMask)) ifFalse:
> [^self ffiFail: FFIErrorCallType]]
> ifFalse: "not masking causes threaded calls to fail, which is as they should if the plugin is not threaded."
> [(self ffiSupportsCallingConvention: flags) ifFalse:
> [^self ffiFail: FFIErrorCallType]].
>
> requiredStackSize := self externalFunctionHasStackSizeSlot
> ifTrue: [interpreterProxy
> fetchInteger: ExternalFunctionStackSizeIndex
> ofObject: externalFunction]
> ifFalse: [-1].
> interpreterProxy failed ifTrue:
> [^interpreterProxy primitiveFailFor: (argArrayOrNil isNil
> ifTrue: [PrimErrBadMethod]
> ifFalse: [PrimErrBadReceiver])].
> stackSize := requiredStackSize < 0 ifTrue: [DefaultMaxStackSize] ifFalse: [requiredStackSize].
> self cCode: [] inSmalltalk: [theCalloutState := self class calloutStateClass new].
> calloutState := self addressOf: theCalloutState.
> self cCode: [self me: calloutState ms: 0 et: (self sizeof: #CalloutState)].
> calloutState callFlags: flags.
> "Fetch return type and args"
> argType := interpreterProxy fetchPointer: 0 ofObject: argTypeArray.
> argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
> argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
> (err := self ffiCheckReturn: argSpec With: argClass in: calloutState) ~= 0 ifTrue:
> [^self ffiFail: err]. "cannot return"
> + "alloca the outgoing stack frame, leaving room for marshalling args, and including space for the return struct, if any.
> + Additional space reserved for saving register args like mandated by Win64 X64 or PPC ABI, will be managed by the call itself"
> + allocation := self alloca: stackSize + calloutState structReturnSize + self cStackAlignment.
> + self mustAlignStack ifTrue:
> + [allocation := self cCoerce: (allocation asUnsignedIntegerPtr bitClear: self cStackAlignment - 1)
> - "alloca the outgoing stack frame, leaving room for register args while marshalling, and including space for the return struct, if any."
> - allocation := self alloca: stackSize + calloutState structReturnSize + self registerArgsSlop + self cStackAlignment.
> - self allocaLiesSoUseGetsp ifTrue:
> - [allocation := self getsp].
> - self cStackAlignment ~= 0 ifTrue:
> - [allocation := self cCoerce: (allocation asUnsignedInteger bitClear: self cStackAlignment - 1)
> to: #'char *'].
> calloutState
> argVector: allocation;
> + currentArg: allocation;
> + limit: allocation + stackSize.
> - currentArg: allocation + self registerArgsSlop;
> - limit: allocation + stackSize + self registerArgsSlop.
> (calloutState structReturnSize > 0
> and: [self nonRegisterStructReturnIsViaImplicitFirstArgument
> and: [(self returnStructInRegisters: calloutState structReturnSize) not]]) ifTrue:
> [err := self ffiPushPointer: calloutState limit in: calloutState.
> err ~= 0 ifTrue:
> [self cleanupCalloutState: calloutState.
> self cppIf: COGMTVM ifTrue:
> [err = PrimErrObjectMayMove negated ifTrue:
> [^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
> ^self ffiFail: err]].
> 1 to: nArgs do:
> [:i|
> argType := interpreterProxy fetchPointer: i ofObject: argTypeArray.
> argSpec := interpreterProxy fetchPointer: 0 ofObject: argType.
> argClass := interpreterProxy fetchPointer: 1 ofObject: argType.
> oop := argArrayOrNil isNil
> ifTrue: [interpreterProxy stackValue: nArgs - i]
> ifFalse: [interpreterProxy fetchPointer: i - 1 ofObject: argArrayOrNil].
> err := self ffiArgument: oop Spec: argSpec Class: argClass in: calloutState.
> err ~= 0 ifTrue:
> [self cleanupCalloutState: calloutState.
> self cppIf: COGMTVM ifTrue:
> [err = PrimErrObjectMayMove negated ifTrue:
> [^PrimErrObjectMayMove]]. "N.B. Do not fail if object may move because caller will GC and retry."
> ^self ffiFail: err]]. "coercion failed or out of stack space"
> "Failures must be reported back from ffiArgument:Spec:Class:in:.
> Should not fail from here on in."
> self assert: interpreterProxy failed not.
> self ffiLogCallout: externalFunction.
> (requiredStackSize < 0
> and: [self externalFunctionHasStackSizeSlot]) ifTrue:
> [stackSize := calloutState currentArg - calloutState argVector.
> interpreterProxy storeInteger: ExternalFunctionStackSizeIndex ofObject: externalFunction withValue: stackSize].
> "Go out and call this guy"
> result := self ffiCalloutTo: address SpecOnStack: argArrayOrNil notNil in: calloutState.
> self cleanupCalloutState: calloutState.
> ^result!
>
> Item was added:
> + ----- Method: ThreadedFFIPlugin>>mustAlignStack (in category 'marshalling') -----
> + mustAlignStack
> + "Many ABIs mandate a particular stack alignment greater than the natural word size.
> + If so, this macro will answer true. See class-side preambleCCode."
> + <cmacro: '() MUST_ALIGN_STACK'>
> + ^0!
>
> Item was removed:
> - ----- Method: ThreadedFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - "Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - being overwritten by any register arguments during calls during marshalling. For example, on
> - PowerPC, which has 8 register arguments in the calling convention, register arguments are also
> - written to the stack. So unless space is left for them, calls during marshalling prior to the actual
> - callout (e.g. to interpreterProxy object manipulation routines) can end up overwriting the
> - marshalling stack as register arguments are written to the stack during calls."
> - self subclassResponsibility!
>
> Item was changed:
> ----- Method: ThreadedIA32FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
> ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
> <var: #procAddr type: #'void *'>
> <var: #calloutState type: #'CalloutState *'>
> "Go out, call this guy and create the return value. This *must* be inlined because of
> the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
> | myThreadIndex atomicType floatRet intRet oop |
> <var: #floatRet type: #double>
> <var: #intRet type: #usqLong>
> <inline: true>
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [myThreadIndex := interpreterProxy disownVM: 0]].
>
> + (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> - self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> [self setsp: calloutState argVector].
>
> atomicType := self atomicTypeOf: calloutState ffiRetHeader.
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [floatRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)()')]
> ifFalse:
> [intRet := self dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqLong (*)()')].
> "undo any callee argument pops because it may confuse stack management with the alloca."
> (self isCalleePopsConvention: calloutState callFlags) ifTrue:
> [self setsp: calloutState argVector].
>
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [interpreterProxy ownVM: myThreadIndex]].
>
> (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
> ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
> 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
> (calloutState ffiRetHeader anyMask: FFIFlagPointer)
> ifTrue:
> [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
> ifFalse:
> [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
> ^oop].
>
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [oop := interpreterProxy floatObjectOf: floatRet]
> ifFalse:
> [oop := self ffiCreateIntegralResultOop: intRet
> ofAtomicType: atomicType
> in: calloutState].
> ^interpreterProxy methodReturnValue: oop!
>
> Item was removed:
> - ----- Method: ThreadedIA32FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - "Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - being overwritten by any register arguments during calls during marshalling. On x86 this is 0"
> - ^0!
>
> Item was removed:
> - ----- Method: ThreadedPPCBEFFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - "Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - being overwritten by any register arguments during calls during marshalling. On PowerPC, which
> - has 8 register arguments in the calling convention, register arguments are also written to the stack.
> - So we must leave room for 8 * 4 bytes to avoid overwriting the marshalling stack as register
> - arguments are written to the stack during calls to interpreterProxy etc."
> - ^32!
>
> Item was removed:
> - ----- Method: ThreadedX64FFIPlugin>>registerArgsSlop (in category 'marshalling') -----
> - registerArgsSlop
> - "Answer any space needed to prevent the alloca'ed outgoing arguments marshalling area from
> - being overwritten by any register arguments during calls during marshalling. On ARM we
> - believe this is zero."
> - ^0!
>
> Item was changed:
> ----- Method: ThreadedX64SysVFFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
> ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
> <var: #procAddr type: #'void *'>
> <var: #calloutState type: #'CalloutState *'>
> <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double, double, double, double, double)'>
> "Go out, call this guy and create the return value. This *must* be inlined because of
> the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
> | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
> <var: #floatRet type: #double>
> <var: #intRet type: 'SixteenByteReturn'>
> <inline: true>
> self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [myThreadIndex := interpreterProxy disownVM: 0]].
>
> - self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - [self setsp: calloutState argVector].
> -
> calloutState floatRegisterIndex > 0 ifTrue:
> [self
> load: (calloutState floatRegisters at: 0)
> Flo: (calloutState floatRegisters at: 1)
> a: (calloutState floatRegisters at: 2)
> t: (calloutState floatRegisters at: 3)
> R: (calloutState floatRegisters at: 4)
> e: (calloutState floatRegisters at: 5)
> g: (calloutState floatRegisters at: 6)
> s: (calloutState floatRegisters at: 7)].
>
> + (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + [self setsp: calloutState argVector].
> +
> atomicType := self atomicTypeOf: calloutState ffiRetHeader.
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [atomicType = FFITypeSingleFloat
> ifTrue:
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)
> with: (calloutState integerRegisters at: 4)
> with: (calloutState integerRegisters at: 5)]
> ifFalse: "atomicType = FFITypeDoubleFloat"
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)
> with: (calloutState integerRegisters at: 4)
> with: (calloutState integerRegisters at: 5)]]
> ifFalse:
> [intRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'SixteenByteReturn (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)
> with: (calloutState integerRegisters at: 4)
> with: (calloutState integerRegisters at: 5)].
> "undo any callee argument pops because it may confuse stack management with the alloca."
> (self isCalleePopsConvention: calloutState callFlags) ifTrue:
> [self setsp: calloutState argVector].
>
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [interpreterProxy ownVM: myThreadIndex]].
>
> (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
> ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
> 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
> (calloutState ffiRetHeader anyMask: FFIFlagPointer)
> ifTrue:
> [oop := self ffiReturnPointer: intRet a ofType: (self ffiReturnType: specOnStack) in: calloutState]
> ifFalse:
> [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
> ^oop].
>
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [oop := interpreterProxy floatObjectOf: floatRet]
> ifFalse:
> [oop := self ffiCreateIntegralResultOop: intRet a
> ofAtomicType: atomicType
> in: calloutState].
> ^interpreterProxy methodReturnValue: oop!
>
> Item was changed:
> ----- Method: ThreadedX64Win64FFIPlugin>>ffiCalloutTo:SpecOnStack:in: (in category 'callout support') -----
> ffiCalloutTo: procAddr SpecOnStack: specOnStack in: calloutState
> <var: #procAddr type: #'void *'>
> <var: #calloutState type: #'CalloutState *'>
> <var: #loadFloatRegs declareC: 'extern void loadFloatRegs(double, double, double, double)'>
> "Go out, call this guy and create the return value. This *must* be inlined because of
> the alloca of the outgoing stack frame in ffiCall:WithFlags:NumArgs:Args:AndTypes:"
> | myThreadIndex atomicType floatRet intRet loadFloatRegs oop |
> <var: #floatRet type: #double>
> <var: #intRet type: #usqLong>
> <inline: true>
> self cCode: '' inSmalltalk: [loadFloatRegs := #used. loadFloatRegs class].
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [myThreadIndex := interpreterProxy disownVM: 0]].
>
> - self registerArgsSlop + self cStackAlignment > 0 ifTrue:
> - [self setsp: calloutState argVector].
> -
> calloutState floatRegisterSignature > 0 ifTrue:
> [self
> load: (calloutState floatRegisters at: 0)
> Flo: (calloutState floatRegisters at: 1)
> at: (calloutState floatRegisters at: 2)
> Re: (calloutState floatRegisters at: 3)
> gs: (calloutState floatRegisters at: 4)].
>
> + (self allocaLiesSoSetSpBeforeCall or: [self mustAlignStack]) ifTrue:
> + [self setsp: calloutState argVector].
> +
> atomicType := self atomicTypeOf: calloutState ffiRetHeader.
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [atomicType = FFITypeSingleFloat
> ifTrue:
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'float (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)]
> ifFalse: "atomicType = FFITypeDoubleFloat"
> [floatRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'double (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)]]
> ifFalse:
> [intRet := self
> dispatchFunctionPointer: (self cCoerceSimple: procAddr to: 'usqIntptr_t (*)(sqIntptr_t, sqIntptr_t, sqIntptr_t, sqIntptr_t)')
> with: (calloutState integerRegisters at: 0)
> with: (calloutState integerRegisters at: 1)
> with: (calloutState integerRegisters at: 2)
> with: (calloutState integerRegisters at: 3)].
> "undo any callee argument pops because it may confuse stack management with the alloca."
> (self isCalleePopsConvention: calloutState callFlags) ifTrue:
> [self setsp: calloutState argVector].
>
> self cppIf: COGMTVM ifTrue:
> [(calloutState callFlags anyMask: FFICallFlagThreaded) ifTrue:
> [interpreterProxy ownVM: myThreadIndex]].
>
> (calloutState ffiRetHeader anyMask: FFIFlagPointer+FFIFlagStructure) ifTrue:
> ["Note: Order is important here since FFIFlagPointer + FFIFlagStructure is used to represent
> 'typedef void* VoidPointer' and VoidPointer must be returned as pointer *not* as struct."
> (calloutState ffiRetHeader anyMask: FFIFlagPointer)
> ifTrue:
> [oop := self ffiReturnPointer: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState]
> ifFalse:
> [oop := self ffiReturnStruct: intRet ofType: (self ffiReturnType: specOnStack) in: calloutState].
> ^oop].
>
> (atomicType >> 1) = (FFITypeSingleFloat >> 1)
> ifTrue:
> [oop := interpreterProxy floatObjectOf: floatRet]
> ifFalse:
> [oop := self ffiCreateIntegralResultOop: intRet
> ofAtomicType: atomicType
> in: calloutState].
> ^interpreterProxy methodReturnValue: oop!
>
More information about the Vm-dev
mailing list