[Vm-dev] 64bits Pharo VM for windows

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed Mar 22 00:14:00 UTC 2017


Now the failure is:

gdb: unknown target exception 0xc0000028 at 0x76f38078

Program received signal ?, Unknown signal.
0x0000000076f38078 in ntdll!RtlRaiseStatus () from
/cygdrive/c/Windows/SYSTEM32/ntdll.dll

(gdb) where
#0  0x0000000076f38078 in ntdll!RtlRaiseStatus () from
/cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1  0x0000000076ed7eb6 in ntdll!TpAlpcRegisterCompletionList () from
/cygdrive/c/Windows/SYSTEM32/ntdll.dll
#2  0x000007fefeb7e5a3 in msvcrt!longjmp () from
/cygdrive/c/Windows/system32/msvcrt.dll
#3  0x00000000004314f9 in returnToExecutivepostContextSwitch
(inInterpreter=0, switchedContext=1)
    at ../../spur64src/vm/gcc3x-cointerp.c:22130
#4  0x000000000043a110 in activateNewMethod () at
../../spur64src/vm/gcc3x-cointerp.c:15045
#5  0x000000000043c6e2 in interpretMethodFromMachineCode () at
../../spur64src/vm/gcc3x-cointerp.c:19204
#6  0x0000000000442e19 in ceSendsupertonumArgs (selector=204444792,
superNormalBar=0, rcvr=206540520, numArgs=0)
    at ../../spur64src/vm/gcc3x-cointerp.c:17228
#7  0x000000000b7000ba in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

(gdb) up
#1  0x0000000076ed7eb6 in ntdll!TpAlpcRegisterCompletionList () from
/cygdrive/c/Windows/SYSTEM32/ntdll.dll
(gdb) up
#2  0x000007fefeb7e5a3 in msvcrt!longjmp () from
/cygdrive/c/Windows/system32/msvcrt.dll
(gdb) up
#3  0x00000000004314f9 in returnToExecutivepostContextSwitch
(inInterpreter=0, switchedContext=1)
    at ../../spur64src/vm/gcc3x-cointerp.c:22130
22130                   siglongjmp(reenterInterpreter, ReturnToInterpreter);

(gdb) call printCallStack()
... snip...
          0xf0b138 I Set class(HashedCollection class)>new 0xc4f8ee8: a(n)
Set class
          0xf0b168 M FFICallbackThunk class>startUp: 0xdb6b718: a(n)
FFICallbackThunk class
          0xf0b1c0 M [] in SmalltalkImage>send:toClassesNamedIn:with:
0xc553d18: a(n) SmalltalkImage
          0xf0b210 I OrderedCollection>do: 0xc8a81d8: a(n) OrderedCollection
          0xf0b260 I SmalltalkImage>send:toClassesNamedIn:with: 0xc553d18:
a(n) SmalltalkImage
          0xf0b2b8 I SmalltalkImage>processStartUpList: 0xc553d18: a(n)
SmalltalkImage
          0xf0b310 I SmalltalkImage>snapshot:andQuit:withExitCode:embedded:
0xc553d18: a(n) SmalltalkImage
         0xd1187b0 s SmalltalkImage>snapshot:andQuit:embedded:
         0xc79ee20 s SmalltalkImage>snapshot:andQuit:
         0xdc6c6b8 s TheWorldMenu>saveAndQuit
         0xdc6eb20 s TheWorldMenu>doMenuItem:with:
         0xdc6ef98 s [] in MenuItemMorph>invokeWithEvent:
         0xdc6f200 s BlockClosure>ensure:
         0xdc6f2b8 s CursorWithMask(Cursor)>showWhile:
         0xdc6f370 s MenuItemMorph>invokeWithEvent:
...snip...

(gdb) call longPrintOop(aMethodObj)

0x0d781f10:  70/112 d1/209 22/34  e0/224 7c/124 ba/186 31/49  86/134
0x0d781f18:  fe/254
         0xd781ed8: a(n) CompiledMethod (0x468=>0xc4f8168) format 0x1f
nbytes 57 hdr8 ..... hash 0x0
 0               0x29 5(0x5) nLits 5 nArgs 0 nTemps 0
 1          0xc32c900 #initialize:
 2          0xc2fa9a8 #basicNew
 3               0x29 5(0x5)
 4          0xc2f9478 #new
 5          0xd759cf0 a(n) Association

This does indeed look like HashedCollection class>>new:
49 <70> self
50 <D1> send: basicNew
51 <22> pushConstant: 5
52 <E0> send: initialize:
53 <7C> returnTop
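
For reference, a minimal sketch of how the five bytes in the dump map to the decoding above. It assumes the classic Squeak V3 bytecode set (which the decoding appears to follow) and 64-bit Spur SmallInteger tagging (three tag bits, tag 1). The names describe_bytecode, small_integer_value and demo_literals are made up for this illustration and are not VM functions; the literal names are copied from the longPrintOop output.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Literals of the method, in the order shown by longPrintOop
 * (slot 0 of the object is the header, so literal 0 is slot 1). */
static const char *const demo_literals[] = {
    "initialize:", "basicNew", "5", "new", "an Association"
};

/* 64-bit Spur immediates: assumes 3 tag bits, SmallInteger tag = 1.
 * 0x29 >> 3 == 5, matching "0x29 5(0x5)" in the dump. */
static int64_t small_integer_value(uint64_t oop)
{
    return (int64_t)oop >> 3;
}

/* Describe one bytecode; covers only the ranges needed for this method.
 * Returns a pointer to a static buffer, or NULL if not covered. */
static const char *describe_bytecode(uint8_t bc,
                                     const char *const *literals,
                                     size_t nlits)
{
    static char buf[64];
    size_t index;

    if (bc == 0x70) return "self";
    if (bc == 0x7C) return "returnTop";

    if (bc >= 0x20 && bc <= 0x3F) {          /* push literal constant */
        index = bc & 0x1F;
        if (index >= nlits) return NULL;
        snprintf(buf, sizeof buf, "pushConstant: %s", literals[index]);
        return buf;
    }
    if (bc >= 0xD0 && bc <= 0xEF) {          /* send literal selector, 0 or 1 args */
        index = bc & 0x0F;
        if (index >= nlits) return NULL;
        snprintf(buf, sizeof buf, "send: %s", literals[index]);
        return buf;
    }
    return NULL;
}
```

Feeding it 0x70, 0xD1, 0x22, 0xE0, 0x7C with demo_literals gives the five lines listed above, which is why the method looks intact rather than corrupted.
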

So that's the status of win64 cog currently...
I'll stop here for tonight.
Eliot, any clue what to look at?

2017-03-22 0:47 GMT+01:00 Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com>:

> Gah, stupid me, I didn't realize that the VM was compiled for SysV...
> I need to define -DWIN64ABI somewhere...
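
To illustrate why the SysV/Win64 mix-up matters, here is a small sketch of the integer argument registers under each x86-64 calling convention. The register lists are the standard ones for the two ABIs; the names Abi, int_arg_register and DEMO_ABI are hypothetical, and how the VM sources actually consume WIN64ABI is not shown in this thread, so the #if below is only an assumption about what such a flag could select.

```c
#include <stddef.h>

typedef enum { ABI_SYSV, ABI_WIN64 } Abi;

/* Name of the register carrying integer argument `index` (0-based),
 * or NULL when that argument is passed on the stack. */
static const char *int_arg_register(Abi abi, size_t index)
{
    static const char *const sysv[]  = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static const char *const win64[] = { "rcx", "rdx", "r8", "r9" };

    if (abi == ABI_SYSV)
        return index < sizeof sysv / sizeof sysv[0] ? sysv[index] : NULL;
    return index < sizeof win64 / sizeof win64[0] ? win64[index] : NULL;
}

/* Assumption: a build flag such as -DWIN64ABI picks the convention. */
#if defined(WIN64ABI)
# define DEMO_ABI ABI_WIN64
#else
# define DEMO_ABI ABI_SYSV
#endif
```

Machine code generated for SysV puts the first argument in rdi, while a C function compiled for Win64 reads it from rcx, so a trampoline call across the two conventions hands over garbage arguments.
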
>
> 2017-03-21 22:18 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>
>>
>>
>> 2017-03-21 7:48 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>
>>> Hi Eliot,
>>> I'll try the assert if I can find a time slot today, otherwise this
>>> evening.
>>> I could also have printed the jmp_buf address in gdb, but it was too
>>> late yesterday ;)
>>>
>>>
>>> 2017-03-21 2:19 GMT+01:00 Eliot Miranda <eliot.miranda at gmail.com>:
>>>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 20, 2017 at 6:01 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>>
>>>>> Hi Nicolas,
>>>>>
>>>>> On Mon, Mar 20, 2017 at 5:56 PM, Eliot Miranda <
>>>>> eliot.miranda at gmail.com> wrote:
>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> On Mon, Mar 20, 2017 at 3:45 PM, Nicolas Cellier <
>>>>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Thanks Eliot for pushing WIN64 ABI further!
>>>>>>>
>>>>>>> So the failure is this one:
>>>>>>>
>>>>>>> gdb: unknown target exception 0xc0000028 at 0x774c8078
>>>>>>>
>>>>>>> Program received signal ?, Unknown signal.
>>>>>>> 0x00000000774c8078 in ntdll!RtlRaiseStatus () from
>>>>>>> /cygdrive/c/Windows/SYSTEM32/ntdll.dll
>>>>>>> (gdb) where
>>>>>>> #0  0x00000000774c8078 in ntdll!RtlRaiseStatus () from
>>>>>>> /cygdrive/c/Windows/SYSTEM32/ntdll.dll
>>>>>>> #1  0x0000000077467eb6 in ntdll!TpAlpcRegisterCompletionList ()
>>>>>>> from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
>>>>>>> #2  0x000007fefe08e5a3 in msvcrt!longjmp () from
>>>>>>> /cygdrive/c/Windows/system32/msvcrt.dll
>>>>>>> #3  0x0000000000419502 in ceReturnToInterpreter (anOop=176164968) at
>>>>>>> ../../spur64src/vm/gcc3x-cointerp.c:16504
>>>>>>> #4  0x000000000a801086 in ?? ()
>>>>>>> Backtrace stopped: previous frame inner to this frame (corrupt
>>>>>>> stack?)
>>>>>>>
>>>>>>> I now suspect the jmp_buf alignment problem that I had to work around
>>>>>>> in the jpeg plugin:
>>>>>>> it must be aligned on a 16-byte boundary in Win64, but sometimes the
>>>>>>> compiler fails to honour this requirement.
>>>>>>> See https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/120
>>>>>>>
>>>>>>
>>>>>> Hmph.  So the stack /should/ be aligned on a 16-byte boundary and
>>>>>> hence the compiler /should/ be able to maintain the invariant (see
>>>>>> platforms/Cross/vm/sqCogStackAlignment.h; in fact on Mac OS X the
>>>>>> alignment is 32 bytes).
>>>>>>
>>>>>> Let me suggest that you add the following to the preamble:
>>>>>>
>>>>>> #if WIN64
>>>>>> # define sigsetjmp(jb,ssmf) (assert(((int)jb & 15) == 0), setjmp(jb))
>>>>>> # define siglongjmp(jb,v) (assert(((int)jb & 15) == 0), longjmp(jb,v))
>>>>>> #elif WIN32
>>>>>> ...
>>>>>>
>>>>>> and make sure there's a self assertCStackWellAligned send in
>>>>>> ceReturnToInterpreter.
>>>>>>
>>>>>
>>>>> Hmmm.  I expect we need code in the ceReturnToInterpreterTrampoline
>>>>> that establishes the stack alignment requirement.  ceReturnToInterpreter:
>>>>> would be called from machine code where there is only 8 byte alignment on
>>>>> x64 (& 4 byte alignment on 32-bit VMs).  If you like I can try and
>>>>> implement this tomorrow.
>>>>>
>>>>
>>>>  Looking at the code, that's not necessary.  The trampoline still
>>>> switches to the C stack, which should be correctly aligned.  So yes,
>>>> definitely add the "self assertCStackWellAligned" to both
>>>> ceReturnToInterpreter: and ceBaseFrameReturn:.  It looks like the issue is
>>>> whether the returnToInterpreter jmpbuf is correctly aligned.
>>>>
>>>>
>>
>> STACK_ALIGN_BYTES is 16, in agreement with the WIN64 ABI, and
>> assertCStackWellAligned() does succeed in ceReturnToInterpreter...
>>
>> So, I added the assertion on setjmp and longjmp, but they do not fail.
>> The jmp_buf reenterInterpreter is correctly aligned on a 16-byte boundary.
>> There is a declaration like this in setjmp.h:
>>  typedef _CRT_ALIGN(16) struct _SETJMP_FLOAT128
>> So local and global jmp_buf variables are always well aligned by the C
>> compiler(s).
>>
>> The jpeg problem was caused by putting the jmp_buf into a structure.
>> Apparently, gcc and clang fail to correctly handle that case
>> (they should first align the whole struct on 16 bytes then align the
>> offset of the jmp_buf member).
>>
>> There are other cases that will be problematic in Win64 exactly like the
>> jpeg case, for example I see:
>>   siglongjmp((vmCallbackContext->trampoline)... in returnAsThroughCallbackContext
>>   longjmp(GIV(jmpBuf)[GIV(jmpDepth)]... in callbackLeave (only #if SQ_USE_GLOBAL_STRUCT)
>>
>> We should memcpy these to a local jmp_buf for WIN64 compatibility.
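
A rough sketch of the "memcpy to a local jmp_buf" idea, under stated assumptions: DemoContext is a hypothetical stand-in for a struct that embeds a jmp_buf (it is not the real VMCallbackContext), and is_aligned, demo_leave and demo_roundtrip are illustration-only names. Copying a jmp_buf with memcpy is not guaranteed by the C standard; it is shown here only because it is the workaround being proposed.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context struct embedding a jmp_buf as a member. */
typedef struct {
    void   *owner;       /* unrelated field placed before the jmp_buf */
    jmp_buf trampoline;
} DemoContext;

/* Non-zero if p lies on a `boundary`-byte boundary (boundary is a power of 2). */
static int is_aligned(const void *p, size_t boundary)
{
    return ((uintptr_t)p & (boundary - 1)) == 0;
}

/* Jump through a local copy instead of through the struct member, so the
 * buffer handed to longjmp gets whatever alignment the compiler gives
 * local jmp_buf variables. */
static void demo_leave(const DemoContext *ctx, int value)
{
    jmp_buf local;
    memcpy(local, ctx->trampoline, sizeof(jmp_buf));
    longjmp(local, value);
}

/* Returns 1 if control came back through the copied jmp_buf. */
static int demo_roundtrip(void)
{
    static DemoContext ctx;
    jmp_buf local;

    if (setjmp(local) != 0)
        return 1;                    /* reached via longjmp in demo_leave */
    memcpy(ctx.trampoline, local, sizeof(jmp_buf));
    demo_leave(&ctx, 1);
    return 0;                        /* not reached */
}
```

An assertion such as is_aligned(ctx->trampoline, 16) before the jump would tell whether a given struct member is actually misaligned before paying for the copy.
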
>>
>> But the VM failed before ever reaching one of these callbacks (we should
>> have none in Squeak by default)
>>
>> So, a wrong guess on my side for the moment; seeing longjmp on the call
>> stack raised that false alarm.
>>
>>
>>
>>>
>>>>>>> 2017-03-19 21:03 GMT+01:00 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>>>>>>
>>>>>>>> And currently
>>>>>>>> https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/spur64src/vm/cogitX64.c
>>>>>>>> is generated for SysV only.
>>>>>>>> It's necessary to hack the CogX64Compiler SysV class var
>>>>>>>> initialization and generate a win64 specific cogitX64.c.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2017-03-19 20:34 GMT+01:00 Nicolas Cellier <
>>>>>>>> nicolas.cellier.aka.nice at gmail.com>:
>>>>>>>>
>>>>>>>>> Hi Clement,
>>>>>>>>> it's been a while since I last tested, but in a few words:
>>>>>>>>> - win64 uses its own ABI
>>>>>>>>> - we have to assign the registers differently than SysV
>>>>>>>>> - the experiments I did resulted in the VM crashing early (before
>>>>>>>>> opening a window)
>>>>>>>>>
>>>>>>>>> Nicolas
>>>>>>>>>
>>>>>>>>> 2017-03-19 20:29 GMT+01:00 Clément Bera <bera.clement at gmail.com>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you very much for doing this, Nicolas. It is very important
>>>>>>>>>> for many Pharo users to use Pharo 64 bits on Windows.
>>>>>>>>>>
>>>>>>>>>> What problems do you have when building the VM with the JIT that
>>>>>>>>>> you don't have when building the stack VM? Is it about the API to
>>>>>>>>>> make memory executable, or about calling conventions?
>>>>>>>>>>
>>>>>>>>>> On Sun, Mar 19, 2017 at 12:14 PM, Nicolas Cellier <
>>>>>>>>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And the appveyor builds are green
>>>>>>>>>>> https://ci.appveyor.com/project/OpenSmalltalk/vm/build/1.0.579
>>>>>>>>>>>
>>>>>>>>>>> 2017-03-19 17:31 GMT+01:00 Nicolas Cellier <
>>>>>>>>>>> nicolas.cellier.aka.nice at gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> I've built a 64bits pharo.stack.spur VM for windows on my
>>>>>>>>>>>> machine,
>>>>>>>>>>>> and I'm uploading the changes to opensmalltalk-vm in branch
>>>>>>>>>>>> build_pharo_win32_with_cygwin
>>>>>>>>>>>>
>>>>>>>>>>>> If the appveyor job succeeds, I will open a pull
>>>>>>>>>>>> request.
>>>>>>>>>>>>
>>>>>>>>>>>> The VM does not have the SqueakSSL plugin yet.
>>>>>>>>>>>>
>>>>>>>>>>>> The 64bits squeak/pharo.cog.spur JIT for windows is still to
>>>>>>>>>>>> come,
>>>>>>>>>>>> but I did not work on it for a few months...
>>>>>>>>>>>> One thing at a time.
>>>>>>>>>>>>
>>>>>>>>>>>> Let's cross fingers
>>>>>>>>>>>>
>>>>>>>>>>>> Nicolas
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> _,,,^..^,,,_
>>>>>> best, Eliot