[Vm-dev] Reproducible VM crash on Win32 with callbacks

Guillermo Polito guillermopolito at gmail.com
Wed Jan 16 08:59:06 UTC 2019


Hi Eliot!

Thanks for the quick answer :)

On Tue, Jan 15, 2019 at 6:04 PM Eliot Miranda <eliot.miranda at gmail.com>
wrote:

>
> Hi Guille, Hi Pablo,
>
>     I misspoke...
>
> On Tue, Jan 15, 2019 at 9:00 AM Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
>
>> Hi Guille,
>>
>> On Tue, Jan 15, 2019 at 8:45 AM Guillermo Polito <
>> guillermopolito at gmail.com> wrote:
>>
>>>
>>> Hi all,
>>>
>>> Pablo and I have been tracking a bug on win32 that produces a
>>> segmentation fault on callback return. We can reproduce it every time
>>> when running the Alien qsort example in both the latest Pharo and Squeak
>>> versions.
>>>
>>> After some debugging, it would seem that the thunkEntry function is
>>> over-optimized in 32 bits, corrupting the (C) stack. This was particularly
>>> boring because compiling the VM in debug mode was taking the bug away
>>> :-). We have cornered the bug and checked that callbacks do work ok if we
>>> disable optimizations just for the thunkEntry function like this:
>>>
>>> long
>>> __attribute__((optimize("O0"))) thunkEntry(void *thunkp, sqIntptr_t
>>> *stackp)
>>>
>>
>>> The thing is that the latest mingw, which we use for compiling the Windows
>>> VM even on Travis, now comes with gcc 7.4.0, which has a lot more
>>> optimizations than before. Just having O1 also produces the same error.
>>>
>>> We have tried disabling some particular optimizations, e.g. with
>>> -fno-combine-stack-adjustments, but with no result so far.
>>>
>>> The strange thing is that other callbacks like the ones coming from
>>> libgit work ok.
>>>
>>> Has somebody taken a look into this too?
>>> How would you suggest that we move on with this?
>>>
>>
>> Before adding the pragma to the source also look at whether using the
>> volatile keyword on variables in thunkProcess fixes the issue; for example
>>
>>     volatile VMCallbackContext vmcc;
>>     volatile VMCallbackContext *previousCallbackContext;
>>     volatile int flags, returnType;
>>
>
Ok I'm trying this now. At least I can launch compilation and do something
else [https://xkcd.com/303/] while it compiles :)
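
For anyone following along, here is a minimal sketch of what the volatile
suggestion guards against. It assumes the callback returns through
setjmp/longjmp, as the Stack Overflow answer Eliot links discusses. The names
run_callback_demo and leave_callback are hypothetical and are not taken from
the VM sources:

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical stand-in for a callback that returns via longjmp. */
static void leave_callback(void)
{
    longjmp(env, 1);
}

/* Returns the value of a local that is modified between setjmp and longjmp. */
int run_callback_demo(void)
{
    /* volatile forces the store to memory, so the value read after
       longjmp is well defined. Without volatile, the C standard leaves
       the value of a local modified after setjmp indeterminate. */
    volatile int flags = 0;

    if (setjmp(env) == 0) {
        flags = 42;        /* modified after setjmp */
        leave_callback();  /* does not return normally */
    }
    /* Reached through longjmp: setjmp returned non-zero. */
    return flags;
}
```

With the volatile qualifier the function returns 42 at any optimization
level. Without it, an optimizing compiler may keep flags in a register that
longjmp restores to its old contents.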


>
>> The other thing to do is to generate the machine code for thunkProcess
>> with gcc 7.x and an older version that does not crash and compare to try
>> and find out what specific optimization is causing the crash.
>>
>
> I meant generate the assembly with -S, e.g. -O1 -S.
>

Sure, this is what we understood ^^.


> You can also compare -O0 -S with -O1 -S for gcc 7.x, and generate -O1 -S
> with and without the volatile keyword and see what differences that makes.
>

I'll do this this afternoon if I have some time...


>> Finally, if you do find you have to use the pragma, please write the fix as
>>
>> long __attribute__((optimize("O0")))
>> thunkEntry(void *thunkp, sqIntptr_t *stackp)
>>
>> to keep the definition starting on a new line, which helps when using
>> command-line tools to look for definitions outside of an IDE.
>>
>
Sure! no problem!
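
If we do end up needing the attribute, one way to limit it to the win32 case
could look roughly like the sketch below. It keeps the function name at the
start of a line, as requested above. The macro name THUNK_NO_OPT and the
function demo_entry are hypothetical placeholders, and intptr_t stands in for
sqIntptr_t so that the snippet compiles on its own:

```c
#include <stdint.h>

/* Hypothetical macro: expands to the attribute only for GCC on 32-bit
   Windows. Clang is excluded because it does not support the optimize
   attribute. */
#if defined(__GNUC__) && !defined(__clang__) && defined(_WIN32) && !defined(_WIN64)
# define THUNK_NO_OPT __attribute__((optimize("O0")))
#else
# define THUNK_NO_OPT
#endif

/* Hypothetical stand-in with the same shape as thunkEntry. The return type
   and attribute share one line; the name starts the next line. */
long THUNK_NO_OPT
demo_entry(void *thunkp, intptr_t *stackp)
{
    (void)thunkp;            /* unused in this sketch */
    return (long)stackp[0];  /* return the first stack slot */
}
```

On every other platform the macro expands to nothing, so the function is
compiled with the normal optimization flags.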

Tx Again!


>
>>
>>> From our side, we think that using a pragma to disable optimizations for
>>> thunkEntry in the case of win32 looks okeyish at least to make the bug go
>>> away.
>>>
>>
>> Yes, but I expect it is actually that the volatile keyword has not been
>> used (a mistake of mine).  Here's a relevant stack overflow answer:
>>
>> https://stackoverflow.com/questions/7996825/why-volatile-works-for-setjmp-longjmp
>>
>> And if volatile does fix the issue, please apply it to the other
>> thunkEntry implementations.
>>
>> Cheers,
>>> Guille & Pablo
>>>
>>
>> Cheers!
>>
>
> _,,,^..^,,,_
> best, Eliot
>


-- 



Guille Polito

Research Engineer

Centre de Recherche en Informatique, Signal et Automatique de Lille

CRIStAL - UMR 9189

French National Center for Scientific Research - http://www.cnrs.fr


Web: http://guillep.github.io

Phone: +33 06 52 70 66 13