On 30 Nov 2016, at 01:23, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:

Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to make FFI work with gcc.
That does not mean that alloca does not lie, just that there is another problem with stack management…

so, this workaround might be incorrect…  


2016-11-29 21:22 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com>:
Thanks Ronie and Esteban.
This seems to be an alignment problem indeed.
What I see is that alignment is defined in at least 3 different places:
- platforms/Cross/vm/sqCogStackAlignment.h
- platforms/Cross/plugins/IA32ABI/ia32abicc.c
- src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
That's just too many different opinions!!! We have to unify that rather than adding a 4th opinion in a Makefile.

well, yes :)


However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is NOT the case of mingw."
Last time I used gdb, it WAS still the case, alloca was STILL lying.
See http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html

BUT:
-----
forcing 16-byte alignment supersedes the alloca hack, making it no longer strictly necessary;
see below in the generated src/plugins/IA32FFIPlugin.c:

        allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
        if (allocaLiesSoUseGetsp()) {
                allocation = getsp();
        }
        if ((cStackAlignment()) != 0) {
                allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
        }
        (calloutState->argVector = allocation);

but we further do:

        if ((0 + (cStackAlignment())) > 0) {
                setsp((calloutState->argVector));
        }

So if the stack pointer is ever greater than the value returned by alloca once we have removed the ALLOCA_LIES hack,
the stack pointer is then set back to the value alloca returned, avoiding the stack pointer offset problem.
It would be worth writing a unit test case, and asking on the gcc mailing list why alloca lies, to be sure...
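
For reference, the alignment expression in the generated code above rounds an address down to a multiple of cStackAlignment(). A minimal sketch of that arithmetic in isolation, assuming the alignment is a power of two (align_down and align_down_mask are hypothetical names used only for illustration, they are not in the plugin):

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align, using the same arithmetic as the
   generated plugin code: (addr | (align - 1)) - (align - 1).
   Assumption: align is a non-zero power of two. */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr | (align - 1)) - (align - 1);
}

/* The more common spelling of the same operation, using a mask. */
uintptr_t align_down_mask(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}
```

With a 16-byte alignment, both functions map 0x1004 to 0x1000 and leave 0x1000 unchanged, so the result is never above the input address.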

cheers

2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <estebanlm@gmail.com>:
 
hah! 
you know what is the sad part of this? I wrote that message… it was for the future me, but I forgot to check our flags :P
I lost 2.5 days then + 2 days now. 

this fixes the problem with Windows crashes (yay!) but not the problem with callbacks (booo!)… any idea in that area?

cheers, 
Esteban

On 29 Nov 2016, at 17:30, Ronie Salgado <roniesalg@gmail.com> wrote:

Last week I was having this exact same crash in the MinimalisticHeadless branch, with both MinGW and Visual Studio. I managed to get the VM working with MinGW (not yet with MSVC) by using the following defines, which I copied from the old Pharo CMake scripts:

-DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0

In the pharo-vm, the CogFamilyWindowsConfig >> #commonCompilerFlags method starts with the following comment:
commonCompilerFlags
    "omit -ggdb2 to prevent generating debug info"
    "Some flags explanation:
   
    STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I suppose on other modules too).
    DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the stack address+4 on alloca function,
    then FFI module needs to adjust that. It is NOT the case of mingw.
    For more information see this thread: http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
    "


2016-11-29 9:32 GMT-03:00 Esteban Lorenzano <estebanlm@gmail.com>:
 

On 29 Nov 2016, at 13:04, Clément Bera <bera.clement@gmail.com> wrote:

Hi,

Can you confirm this bug happens only on Windows?

yes, the crash is just in windows.
the callback problem is general (note that FFICallbackTests works fine, but I think this is related to the fact that it never enters the 2nd condition with the qsort function).


Do you have version number (both VMMaker and git commit) of the last version you have that was working ?

sadly, not… I tried to get the latest working version, but with the mess I have to get the VM to build with opensmalltalk-vm, I couldn’t track it.
I suspect it is related to the work on 64 bits for Windows, but I have no proof of that :P

Esteban


Thanks.


On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano <estebanlm@gmail.com> wrote:

Hi,

So, I’m building the PharoVM along with all its dependencies. For me, this is a major step because I can finally drop the old build process.
Now, I’m having serious problems with FFI (which were not present before):


1. CRASH IN WINDOWS (32bits):

In Win32, it crashes immediately when trying to call this function:

getEnvSize: nameString
        ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32

 (this works perfectly fine in older versions)

2. CALLBACKS FAILING:

Callbacks have problems. The examples pass, but they are very simple… as soon as I try to do something complicated (like the unqlite bindings or libgit2 bindings, which use callbacks intensively), callbacks stop working.
I traced the problem up to this method:

StackInterpreter>>#returnAs:ThroughCallback:Context:

returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
        "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
         Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
         and mark callbackMethodContext as dead."
        <export: true>
        <var: #vmCallbackContext type: #'VMCallbackContext *'>
        | calloutMethodContext theFP thePage |
        <var: #theFP type: #'char *'>
        <var: #thePage type: #'StackPage *'>
        ((self isIntegerObject: returnTypeOop)
         and: [self isLiveContext: callbackMethodContext]) ifFalse:
                [^false].
        calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
        (self isLiveContext: calloutMethodContext) ifFalse:
                [^false].
        "We're about to leave this stack page; must save the current frame's instructionPointer."
        self push: instructionPointer.
        self externalWriteBackHeadFramePointers.
        "Mark callbackMethodContext as dead; the common case is that it is the current frame.
         We go the extra mile for the debugger."
        (self isSingleContext: callbackMethodContext)
                ifTrue: [self markContextAsDead: callbackMethodContext]
                ifFalse:
                        [theFP := self frameOfMarriedContext: callbackMethodContext.
                         framePointer = theFP "common case"
                                ifTrue:
                                        [(self isBaseFrame: theFP)
                                                ifTrue: [stackPages freeStackPage: stackPage]
                                                ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
                                                        [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
                                                         stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
                                                         framePointer := self frameCallerFP: framePointer.
                                                         self setMethod: (self frameMethodObject: framePointer).
                                                         self restoreCStackStateForCallbackContext: vmCallbackContext.
                                                         "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
                                                          This matches the use of _setjmp in ia32abicc.c."
                                                         self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
                                                         ^true]]
                                ifFalse:
                                        [self externalDivorceFrame: theFP andContext: callbackMethodContext.
                                         self markContextAsDead: callbackMethodContext]].
        "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
         is immediately below callbackMethodContext on the same page is handled above."
        (self isStillMarriedContext: calloutMethodContext)
                ifTrue:
                        [theFP := self frameOfMarriedContext: calloutMethodContext.
                         thePage := stackPages stackPageFor: theFP.
                         "findSPOf:on: points to the word beneath the instructionPointer, but
                          there is no instructionPointer on the top frame of the current page."
                         self assert: thePage ~= stackPage.
                         stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
                         framePointer := theFP]
                ifFalse:
                        [thePage := self makeBaseFrameFor: calloutMethodContext.
                         framePointer := thePage headFP.
                         stackPointer := thePage headSP].
        instructionPointer := self popStack.
        self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
        self setStackPageAndLimit: thePage.
        self restoreCStackStateForCallbackContext: vmCallbackContext.
         "N.B. siglongjmp is defines as _longjmp on non-win32 platforms.
          This matches the use of _setjmp in ia32abicc.c."
        self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
        "NOTREACHED"
        ^true

With the first siglongjmp, callbacks pass fine.
With the last one (it would be if framePointer = theFP AND !(isBaseFrame: theFP)), they don’t.
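
For readers not familiar with the mechanism, both exits of the method above return to the C side by jumping to a buffer saved before the callback was entered. A minimal sketch of that pattern in plain C, not the VM's actual code: the names trampoline, returned_kind, return_through_callback and run_callback are hypothetical, and the sketch carries the return kind in a variable, whereas the VM passes it as the second argument of the jump.

```c
#include <setjmp.h>

/* Hypothetical stand-in for the jump buffer kept in the callback context. */
static jmp_buf trampoline;

/* Carries the "return type" back across the jump. */
static volatile int returned_kind;

/* Plays the role of the return-through-callback primitive: record the
   kind of return, then jump back to where setjmp was called.
   Control never returns to the caller of this function. */
void return_through_callback(int kind)
{
    returned_kind = kind;
    longjmp(trampoline, 1);
}

/* Plays the role of the C-side callback entry point: save the C state,
   enter the callback, and resume here when it jumps back. */
int run_callback(int kind)
{
    if (setjmp(trampoline) == 0) {
        /* First pass: setjmp returned 0, so enter the callback. */
        return_through_callback(kind);
    }
    /* Second pass: reached via longjmp, setjmp returned non-zero. */
    return returned_kind;
}
```

The point of the sketch is that the jump only works if the C stack and the saved buffer still describe a live frame when longjmp runs, which is why the method restores the C stack state just before jumping.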

So… from here I’m a bit lost… I need some help :)

thanks,
Esteban