[Vm-dev] latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)
Esteban Lorenzano
estebanlm at gmail.com
Wed Nov 30 09:17:33 UTC 2016
> On 30 Nov 2016, at 01:23, Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:
>
> Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to make FFI work with gcc.
> That does not mean that alloca does not lie, just that there is another problem with stack management…
so, this workaround might be incorrect…
>
> 2016-11-29 21:22 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> Thanks Ronie and Esteban.
> This seems to be an alignment problem indeed.
> What I see is that alignment is defined at least in 3 different places:
> - platforms/Cross/vm/sqCogStackAlignment.h
> - platforms/Cross/plugins/IA32ABI/ia32abicc.c
> - src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
> That's just too many different opinions!!! We have to unify that rather than adding a 4th opinion in a Makefile.
well, yes :)
>
> However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is NOT the case of mingw."
> Last time I used gdb, it WAS still the case, alloca was STILL lying.
> See http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html
>
> BUT:
> -----
> forcing 16-byte alignment supersedes the alloca hack, making it not strictly necessary anymore
> see below in generated src/plugins/IA32FFIPlugin.c:
>
> allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
> if (allocaLiesSoUseGetsp()) {
>     allocation = getsp();
> }
> if ((cStackAlignment()) != 0) {
>     allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
> }
> (calloutState->argVector = allocation);
>
> but we further do:
>
> if ((0 + (cStackAlignment())) > 0) {
>     setsp((calloutState->argVector));
> }
>
> So even if the stack pointer is greater than the value alloca returned, and we have removed the ALLOCA_LIES hack,
> the stack pointer is then set back to the value alloca returned, avoiding the stack pointer offset problem.
> It would be worth writing a unit test case, and asking on the gcc mailing list why alloca lies, to be sure...
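
A minimal sketch of the kind of unit check mentioned above, limited to the rounding arithmetic in the generated code. The names align_down and align_down_mask are hypothetical and do not come from the VM sources; the sketch assumes the alignment is a nonzero power of two and says nothing about what alloca or getsp actually return.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the VM sources): rounds addr down to a
   multiple of align using the same OR-then-subtract form as the generated
   plugin code. align must be a nonzero power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Setting the low bits to ones, then subtracting them, clears them. */
    return (addr | (align - 1)) - (align - 1);
}

/* Equivalent mask form, shown only for comparison. */
uintptr_t align_down_mask(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}
```

With a 16-byte alignment, an address that is already a multiple of 16 is left unchanged and any other address moves down to the previous multiple of 16, so the result is never above the input.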
>
> cheers
>
> 2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <estebanlm at gmail.com>:
>
> hah!
> you know what the sad part of this is? I wrote that message… it was for the future me, but I forgot to check our flags :P
> I lost 2.5 days then + 2 days now.
>
> this fixes the problem with Windows crashes (yay!) but not the problem with callbacks (booo!)… any idea in that area?
>
> cheers,
> Esteban
>
>> On 29 Nov 2016, at 17:30, Ronie Salgado <roniesalg at gmail.com> wrote:
>>
>> Last week I was having this exact same crash in the MinimalisticHeadless branch, with both MinGW and Visual Studio. I managed to get the VM working with MinGW (not yet with MSVC) by using the following defines, which I copied from the old Pharo CMake scripts:
>>
>> -DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0
>>
>> In the pharo-vm, the CogFamilyWindowsConfig >> #commonCompilerFlags method starts with the following comment:
>> commonCompilerFlags
>> "omit -ggdb2 to prevent generating debug info"
>> "Some flags explanation:
>>
>> STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I suppose on other modules too).
>> DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the stack address+4 on alloca function,
>> then FFI module needs to adjust that. It is NOT the case of mingw.
>> For more information see this thread: http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
>> "
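
A rough sketch of how two -D flags like these can be consumed on the C side. The fallback values and the accessor names c_stack_alignment and alloca_lies_so_use_getsp are assumptions made for illustration (the names only echo the calls seen in the generated plugin code); the VM's real headers may define and default these differently.

```c
#include <assert.h>

/* Fallbacks used only when the build passes no -D flag.
   These defaults are assumptions for this sketch, not the VM's own. */
#ifndef STACK_ALIGN_BYTES
#define STACK_ALIGN_BYTES 16
#endif
#ifndef ALLOCA_LIES_SO_USE_GETSP
#define ALLOCA_LIES_SO_USE_GETSP 0
#endif

/* Compile-time guard: accept only zero or a power of two. */
static_assert((STACK_ALIGN_BYTES & (STACK_ALIGN_BYTES - 1)) == 0,
              "STACK_ALIGN_BYTES must be zero or a power of two");

/* Hypothetical accessors exposing the compile-time values. */
int c_stack_alignment(void) { return STACK_ALIGN_BYTES; }
int alloca_lies_so_use_getsp(void) { return ALLOCA_LIES_SO_USE_GETSP; }
```

Keeping both values behind a single pair of definitions is one way to avoid several files each holding their own opinion about alignment.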
>>
>>
>> 2016-11-29 9:32 GMT-03:00 Esteban Lorenzano <estebanlm at gmail.com>:
>>
>>
>>> On 29 Nov 2016, at 13:04, Clément Bera <bera.clement at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Can you confirm this bug happens only on Windows?
>>
>> yes, the crash is just on Windows.
>> the callback problem is general (note that FFICallbackTests works fine, but I think this is related to the fact that it never enters the 2nd condition with the qsort function).
>>
>>>
>>> Do you have the version number (both VMMaker and git commit) of the last version you had that was working?
>>
>> sadly, no… I tried to find the latest working version, but with the mess I have getting the VM to build with opensmalltalk-vm, I couldn’t track it down.
>> I suspect it is related to the work on 64 bits for Windows, but I have no proof of that :P
>>
>> Esteban
>>
>>>
>>> Thanks.
>>>
>>>
>>> On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano <estebanlm at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> So, I’m building the PharoVM along with all its dependencies. For me, this is a major step because I can finally drop the old build process.
>>> Now, I’m having serious problems with FFI (that were not present before):
>>>
>>>
>>> 1. CRASH IN WINDOWS (32bits):
>>>
>>> In Win32, it crashes automatically when trying to access this function:
>>>
>>> getEnvSize: nameString
>>>     ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32
>>>
>>> (this works perfectly fine in older versions)
>>>
>>> 2. CALLBACKS FAILING:
>>>
>>> Callbacks have problems. The examples pass, but they are very simple… as soon as I try to do something complicated (like the unqlite or libgit2 bindings, which use callbacks intensively), callbacks stop working.
>>> I traced the problem up to this method:
>>>
>>> StackInterpreter>>#returnAs:ThroughCallback:Context:
>>>
>>> returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
>>>     "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
>>>      Its sender is the VM's state prior to the callback. Reestablish that state (via longjmp),
>>>      and mark callbackMethodContext as dead."
>>>     <export: true>
>>>     <var: #vmCallbackContext type: #'VMCallbackContext *'>
>>>     | calloutMethodContext theFP thePage |
>>>     <var: #theFP type: #'char *'>
>>>     <var: #thePage type: #'StackPage *'>
>>>     ((self isIntegerObject: returnTypeOop)
>>>      and: [self isLiveContext: callbackMethodContext]) ifFalse:
>>>         [^false].
>>>     calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
>>>     (self isLiveContext: calloutMethodContext) ifFalse:
>>>         [^false].
>>>     "We're about to leave this stack page; must save the current frame's instructionPointer."
>>>     self push: instructionPointer.
>>>     self externalWriteBackHeadFramePointers.
>>>     "Mark callbackMethodContext as dead; the common case is that it is the current frame.
>>>      We go the extra mile for the debugger."
>>>     (self isSingleContext: callbackMethodContext)
>>>         ifTrue: [self markContextAsDead: callbackMethodContext]
>>>         ifFalse:
>>>             [theFP := self frameOfMarriedContext: callbackMethodContext.
>>>              framePointer = theFP "common case"
>>>                 ifTrue:
>>>                     [(self isBaseFrame: theFP)
>>>                         ifTrue: [stackPages freeStackPage: stackPage]
>>>                         ifFalse: "calloutMethodContext is immediately below on the same page. Make it current."
>>>                             [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
>>>                              stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
>>>                              framePointer := self frameCallerFP: framePointer.
>>>                              self setMethod: (self frameMethodObject: framePointer).
>>>                              self restoreCStackStateForCallbackContext: vmCallbackContext.
>>>                              "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
>>>                               This matches the use of _setjmp in ia32abicc.c."
>>>                              self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
>>>                              ^true]]
>>>                 ifFalse:
>>>                     [self externalDivorceFrame: theFP andContext: callbackMethodContext.
>>>                      self markContextAsDead: callbackMethodContext]].
>>>     "Make the calloutMethodContext the active frame. The case where calloutMethodContext
>>>      is immediately below callbackMethodContext on the same page is handled above."
>>>     (self isStillMarriedContext: calloutMethodContext)
>>>         ifTrue:
>>>             [theFP := self frameOfMarriedContext: calloutMethodContext.
>>>              thePage := stackPages stackPageFor: theFP.
>>>              "findSPOf:on: points to the word beneath the instructionPointer, but
>>>               there is no instructionPointer on the top frame of the current page."
>>>              self assert: thePage ~= stackPage.
>>>              stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
>>>              framePointer := theFP]
>>>         ifFalse:
>>>             [thePage := self makeBaseFrameFor: calloutMethodContext.
>>>              framePointer := thePage headFP.
>>>              stackPointer := thePage headSP].
>>>     instructionPointer := self popStack.
>>>     self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
>>>     self setStackPageAndLimit: thePage.
>>>     self restoreCStackStateForCallbackContext: vmCallbackContext.
>>>     "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
>>>      This matches the use of _setjmp in ia32abicc.c."
>>>     self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
>>>     "NOTREACHED"
>>>     ^true
>>>
>>> with the first siglongjmp callbacks are passing fine.
>>> with the last (it would be if framePointer = theFP AND !(isBaseFrame: theFP) ) it doesn’t.
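
For readers less familiar with the mechanism, here is a minimal sketch of returning through a callback with a jump buffer. It uses plain setjmp/longjmp rather than the _setjmp/siglongjmp pair named in the method comment, and CallbackContext, the RETURN_* codes, return_through_callback and run_callback are all hypothetical stand-ins, not the VM's types or functions. It does not reproduce the failure, it only shows the control flow both siglongjmp sites rely on.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical return-kind codes; the VM's real values are not shown here. */
enum { RETURN_WORD = 1, RETURN_DOUBLE = 2 };

/* Hypothetical stand-in for VMCallbackContext: only the jump buffer is modelled. */
typedef struct {
    jmp_buf trampoline;
} CallbackContext;

/* Callback return path: never returns normally, it unwinds to the setjmp site. */
void return_through_callback(CallbackContext *ctx, int return_kind)
{
    /* longjmp(buf, 0) makes setjmp return 1, so a zero code would be ambiguous. */
    assert(return_kind != 0);
    longjmp(ctx->trampoline, return_kind);
}

/* Callout side: arms the trampoline, runs the callback, reports what arrived. */
int run_callback(int return_kind)
{
    CallbackContext ctx;
    switch (setjmp(ctx.trampoline)) {
    case 0:
        /* First pass: the buffer is armed, now enter the callback. */
        return_through_callback(&ctx, return_kind);
        return -1; /* not reached */
    case RETURN_WORD:
        return RETURN_WORD;
    case RETURN_DOUBLE:
        return RETURN_DOUBLE;
    default:
        return -2; /* a code this sketch does not know about */
    }
}
```

The jump only works while the frame that called setjmp is still live, which is why the interpreter state and the C stack state have to be restored consistently before jumping.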
>>>
>>> So… from here I’m a bit lost… I need some help :)
>>>
>>> thanks,
>>> Esteban