[Vm-dev] latest changes (no idea from when it started) are making FFI win32 crash (and FFI Callbacks are not reliable anymore either)

David T. Lewis lewis at mail.msen.com
Thu Dec 1 00:17:58 UTC 2016


Checking a couple of man pages, on Linux we are warned that one should not
expect too much in the way of standards and specifications:


CONFORMING TO
       This function is not in POSIX.1-2001.

       There is evidence that the alloca() function appeared in 32V, PWB, PWB.2, 3BSD, and 4BSD.  There is a man page for it in 4.3BSD.  Linux
       uses the GNU version.


Noting from the above that alloca() seems to have originated with BSD, check
the man page on FreeBSD:


BUGS
     The alloca() function is machine and compiler dependent; its use is dis-
     couraged.

     The alloca() function is slightly unsafe because it cannot ensure that
     the pointer returned points to a valid and usable block of memory.  The
     allocation made may exceed the bounds of the stack, or even go further
     into other objects in memory, and alloca() cannot determine such an
     error.  Avoid alloca() with large unbounded allocations.

FreeBSD 6.2                    September 5, 2006                   FreeBSD 6.2


So it is not expected to be portable or well specified or well behaved, and
we should not be too surprised if implementation details vary on different
platforms and compilers.

Dave


On Thu, Dec 01, 2016 at 12:27:03AM +0100, Nicolas Cellier wrote:
>  
> Hi Andres
> "It would be worth writing  a unit test case" did mean exactly that, write
> a few lines of C.
> 
> I don't know what you call specification here, probably our expectation?
> The behavior we expect, though it sounds reasonable, is not specified by
> any standard I know of.
> It's probably unspecified and at best implementation defined.
> That's why I suggest inquiring about the gcc implementation.
> 
> Both i686-w64-mingw32-gcc and x86_64-w64-mingw32-gcc do reserve bytes on
> the stack below the alloca'ed block.
> My finding is that this space depends on the maximum number of parameters
> of the functions called.
> For example, calling fprintf(stdout,"\n") after alloca would reserve 8
> additional bytes on i686 and 16 on x86_64.
> Calling fprintf(stdout,"%d\n",x); would reserve 12 and 24 bytes respectively.
> 
> This does not happen with clang.
> 
> 2016-11-30 3:07 GMT+01:00 Andres Valloud <avalloud at smalltalk.comcastbiz.net>:
> 
> >
> > To prove alloca() *lies*, one needs to show e.g. a 5-10 line C program,
> > independent of anything else, exemplifying a clear specification violation.
> > Otherwise, how do you know the LIARS_LIARS_PANTS_ON_FIRE macros are not
> > compensating for undefined behavior elsewhere?
> >
> > On 11/29/16 16:23 , Nicolas Cellier wrote:
> >
> >>
> >>
> >>
> >>
> >> Though, it's necessary to define ALLOCA_LIES_SO_USE_GETSP to zero to
> >> make FFI work with gcc.
> >> That does not mean that alloca does not lie, just that there is another
> >> problem with stack management...
> >>
> >> 2016-11-29 21:22 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> >>
> >>
> >>     Thanks Ronie and Esteban.
> >>     This seems to be an alignment problem indeed.
> >>     What I see is that alignment is defined at least in 3 different
> >> places:
> >>     - platforms/Cross/vm/sqCogStackAlignment.h
> >>     - platforms/Cross/plugins/IA32ABI/ia32abicc.c
> >>     - src/plugins/SqueakFFIPrims/IA32FFIPlugin.c and friends...
> >>     That's just too many different opinions!!! We have to unify that
> >>     rather than adding a 4th opinion in a Makefile.
> >>
> >>     However, about ALLOCA_LIES_SO_USE_GETSP, I'm not so sure that "It is
> >>     NOT the case of mingw."
> >>     Last time I used gdb, it WAS still the case, alloca was STILL lying.
> >>     See
> >>     http://lists.squeakfoundation.org/pipermail/vm-dev/2016-August/022985.html
> >>
> >>     BUT:
> >>     -----
> >>     forcing 16 bytes alignment supersedes the alloca hack, making it not
> >>     strictly necessary anymore
> >>     see below in generated src/plugins/IA32FFIPlugin.c:
> >>
> >>             allocation = alloca(((stackSize + ((calloutState->structReturnSize)))) + (cStackAlignment()));
> >>             if (allocaLiesSoUseGetsp()) {
> >>                     allocation = getsp();
> >>             }
> >>             if ((cStackAlignment()) != 0) {
> >>                     allocation = ((char *) ((((((usqInt)allocation)) | ((cStackAlignment()) - 1)) - ((cStackAlignment()) - 1))));
> >>             }
> >>             (calloutState->argVector = allocation);
> >>
> >>     but we further do:
> >>
> >>             if ((0 + (cStackAlignment())) > 0) {
> >>                     setsp((calloutState->argVector));
> >>             }
> >>
> >>     So even if the stack pointer is greater than the value returned by
> >>     alloca and we have removed the ALLOCA_LIES hack, the stack pointer
> >>     is then set back to the (aligned) value returned by alloca, which
> >>     avoids the stack pointer offset problem.
> >>     It would be worth writing a unit test case and asking on the gcc
> >>     mailing list why it lies, to be sure...
> >>
> >>     cheers
> >>
> >>     2016-11-29 18:14 GMT+01:00 Esteban Lorenzano <estebanlm at gmail.com>:
> >>
> >>
> >>         hah!
> >>         you know what is the sad part of this? I wrote that message... it
> >>         was for the future me, but I forgot to check our flags :P
> >>         I lost 2.5 days then + 2 days now.
> >>
> >>         this fixes the problem with Windows crashes (yay!) but not the
> >>         problem with callbacks (booo!)... any idea in that area?
> >>
> >>         cheers,
> >>         Esteban
> >>
> >>         On 29 Nov 2016, at 17:30, Ronie Salgado <roniesalg at gmail.com> wrote:
> >>>
> >>>         The last week I was having this exactly same crash in the
> >>>         MinimalisticHeadless branch, with both MinGW and with Visual
> >>>         Studio. I managed to get the VM working with MinGW (not yet
> >>>         with MSVC) by using the following defines, which I copied from
> >>>         the old Pharo CMake scripts:
> >>>
> >>>         -DSTACK_ALIGN_BYTES=16 -DALLOCA_LIES_SO_USE_GETSP=0
> >>>
> >>>         In the pharo-vm, the CogFamilyWindowsConfig >>
> >>>         #commonCompilerFlags method starts with the following comment:
> >>>         commonCompilerFlags
> >>>             "omit -ggdb2 to prevent generating debug info"
> >>>             "Some flags explanation:
> >>>
> >>>             STACK_ALIGN_BYTES=16 is needed in mingw and FFI (and I
> >>>         suppose on other modules too).
> >>>             DALLOCA_LIES_SO_USE_GETSP=0 Some compilers return the
> >>>         stack address+4 on alloca function,
> >>>             then FFI module needs to adjust that. It is NOT the case
> >>>         of mingw.
> >>>             For more information see this thread:
> >>>         http://forum.world.st/There-are-something-fishy-with-FFI-plugin-td4584226.html
> >>>             "
> >>>
> >>>
> >>>         2016-11-29 9:32 GMT-03:00 Esteban Lorenzano <estebanlm at gmail.com>:
> >>>
> >>>
> >>>
> >>>             On 29 Nov 2016, at 13:04, Clément Bera <bera.clement at gmail.com> wrote:
> >>>>
> >>>>             Hi,
> >>>>
> >>>>             Can you confirm this bug happens only on Windows?
> >>>>
> >>>
> >>>             yes, the crash is just in windows.
> >>>             the callback problem is general (note that
> >>>             FFICallbackTests works fine, but I think this is related
> >>>             to the fact that it never enters the 2nd condition with
> >>>             the qsort function) .
> >>>
> >>>
> >>>>             Do you have version number (both VMMaker and git commit)
> >>>>             of the last version you have that was working ?
> >>>>
> >>>
> >>>             sadly, not... I tried to get the latest working version, but
> >>>             with the mess I have to get the VM to build with
> >>>             opensmalltalk-vm, I couldn't track it.
> >>>             I suspect it is related to the work on 64bits for windows,
> >>>             but I have no proof of that :P
> >>>
> >>>             Esteban
> >>>
> >>>
> >>>>             Thanks.
> >>>>
> >>>>
> >>>>             On Tue, Nov 29, 2016 at 11:54 AM, Esteban Lorenzano <estebanlm at gmail.com> wrote:
> >>>>
> >>>>
> >>>>                 Hi,
> >>>>
> >>>>                 So, I'm building the PharoVM along with all its
> >>>>                 dependencies. For me, this is a major step because I
> >>>>                 can finally drop the old build process.
> >>>>                 Now, I'm having serious problems with FFI (that
> >>>>                 were not present before):
> >>>>
> >>>>
> >>>>                 1. CRASH IN WINDOWS (32bits):
> >>>>
> >>>>                 In Win32, it crashes automatically when trying to
> >>>>                 access this function:
> >>>>
> >>>>                 getEnvSize: nameString
> >>>>                     ^ self ffiCall: #( int GetEnvironmentVariableA ( String nameString, nil, 0 ) ) module: #Kernel32
> >>>>
> >>>>                  (this works perfectly fine in older versions)
> >>>>
> >>>>                 2. CALLBACKS FAILING:
> >>>>
> >>>>                 Callbacks have problems. The examples pass, but they
> >>>>                 are very simple... as soon as I try to do something
> >>>>                 complicated (like the unqlite or libgit2 bindings,
> >>>>                 which use callbacks intensively), callbacks
> >>>>                 stop working.
> >>>>                 I traced the problem up to this method:
> >>>>
> >>>>                 StackInterpreter>>#returnAs:ThroughCallback:Context:
> >>>>
> >>>>                 returnAs: returnTypeOop ThroughCallback: vmCallbackContext Context: callbackMethodContext
> >>>>                     "callbackMethodContext is an activation of invokeCallback:[stack:registers:jmpbuf:].
> >>>>                      Its sender is the VM's state prior to the callback.  Reestablish that state (via longjmp),
> >>>>                      and mark callbackMethodContext as dead."
> >>>>                     <export: true>
> >>>>                     <var: #vmCallbackContext type: #'VMCallbackContext *'>
> >>>>                     | calloutMethodContext theFP thePage |
> >>>>                     <var: #theFP type: #'char *'>
> >>>>                     <var: #thePage type: #'StackPage *'>
> >>>>                     ((self isIntegerObject: returnTypeOop)
> >>>>                      and: [self isLiveContext: callbackMethodContext]) ifFalse:
> >>>>                         [^false].
> >>>>                     calloutMethodContext := self externalInstVar: SenderIndex ofContext: callbackMethodContext.
> >>>>                     (self isLiveContext: calloutMethodContext) ifFalse:
> >>>>                         [^false].
> >>>>                     "We're about to leave this stack page; must save the current frame's instructionPointer."
> >>>>                     self push: instructionPointer.
> >>>>                     self externalWriteBackHeadFramePointers.
> >>>>                     "Mark callbackMethodContext as dead; the common case is that it is the current frame.
> >>>>                      We go the extra mile for the debugger."
> >>>>                     (self isSingleContext: callbackMethodContext)
> >>>>                         ifTrue: [self markContextAsDead: callbackMethodContext]
> >>>>                         ifFalse:
> >>>>                             [theFP := self frameOfMarriedContext: callbackMethodContext.
> >>>>                              framePointer = theFP "common case"
> >>>>                                 ifTrue:
> >>>>                                     [(self isBaseFrame: theFP)
> >>>>                                         ifTrue: [stackPages freeStackPage: stackPage]
> >>>>                                         ifFalse: "calloutMethodContext is immediately below on the same page.  Make it current."
> >>>>                                             [instructionPointer := (self frameCallerSavedIP: framePointer) asUnsignedInteger.
> >>>>                                              stackPointer := framePointer + (self frameStackedReceiverOffset: framePointer) + objectMemory wordSize.
> >>>>                                              framePointer := self frameCallerFP: framePointer.
> >>>>                                              self setMethod: (self frameMethodObject: framePointer).
> >>>>                                              self restoreCStackStateForCallbackContext: vmCallbackContext.
> >>>>                                              "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
> >>>>                                               This matches the use of _setjmp in ia32abicc.c."
> >>>>                                              self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
> >>>>                                              ^true]]
> >>>>                                 ifFalse:
> >>>>                                     [self externalDivorceFrame: theFP andContext: callbackMethodContext.
> >>>>                                      self markContextAsDead: callbackMethodContext]].
> >>>>                     "Make the calloutMethodContext the active frame.  The case where calloutMethodContext
> >>>>                      is immediately below callbackMethodContext on the same page is handled above."
> >>>>                     (self isStillMarriedContext: calloutMethodContext)
> >>>>                         ifTrue:
> >>>>                             [theFP := self frameOfMarriedContext: calloutMethodContext.
> >>>>                              thePage := stackPages stackPageFor: theFP.
> >>>>                              "findSPOf:on: points to the word beneath the instructionPointer, but
> >>>>                               there is no instructionPointer on the top frame of the current page."
> >>>>                              self assert: thePage ~= stackPage.
> >>>>                              stackPointer := (self findSPOf: theFP on: thePage) - objectMemory wordSize.
> >>>>                              framePointer := theFP]
> >>>>                         ifFalse:
> >>>>                             [thePage := self makeBaseFrameFor: calloutMethodContext.
> >>>>                              framePointer := thePage headFP.
> >>>>                              stackPointer := thePage headSP].
> >>>>                     instructionPointer := self popStack.
> >>>>                     self setMethod: (objectMemory fetchPointer: MethodIndex ofObject: calloutMethodContext).
> >>>>                     self setStackPageAndLimit: thePage.
> >>>>                     self restoreCStackStateForCallbackContext: vmCallbackContext.
> >>>>                     "N.B. siglongjmp is defined as _longjmp on non-win32 platforms.
> >>>>                      This matches the use of _setjmp in ia32abicc.c."
> >>>>                     self siglong: vmCallbackContext trampoline jmp: (self integerValueOf: returnTypeOop).
> >>>>                     "NOTREACHED"
> >>>>                     ^true
> >>>>
> >>>>                 with the first siglongjmp callbacks are passing fine.
> >>>>                 with the last (it would be if framePointer = theFP
> >>>>                 AND !(isBaseFrame: theFP) ) it doesn't.
> >>>>
> >>>>                 So... from here I'm a bit lost... I need some help :)
> >>>>
> >>>>                 thanks,
> >>>>                 Esteban
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>


