[Vm-dev] Robust FFI with Memory Protection Keys

Ben Coman btc at openinworld.com
Mon Aug 6 01:34:49 UTC 2018


On 5 August 2018 at 23:10, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
> Hi Ben,
>
>
>> On Aug 4, 2018, at 8:40 AM, Ben Coman <btc at openinworld.com> wrote:
>>
>>
>> A problem with FFI is that if a callout segfaults, all of memory
>> including that of the Image is suspect, and execution of the Image terminates.
>>
>> Occasionally I hunt around hoping to find technology to mitigate that problem.
>> Maybe this time I found something... Memory Protection Keys [1]
>> Perhaps these could keep Image memory safe when an FFI callout segfaults.
>>
>> IIUC the main problem with protecting Image memory on every FFI callout
>> is the time it would take to update the flags on every page of Image memory.
>> Would being able to change the protection of a massive number of pages
>> with one syscall make it feasible to wrap them around FFI callouts?
>>
>> This may be useful at least where the FFI use is more about reuse of
>> existing functionality than about performance.
>> Or at least useful while someone is learning/experimenting with FFI for
>> the first time or while becoming familiar with some external library.
>> Further info at [2] & [3].
>
> I think there’s a much simpler improvement that doesn’t go this far.  I implemented it in VisualWorks and it’s been in production for more than a decade.  It should be easy to add to Cog.
>
> The idea is simply to add a flag that tracks if the VM is in an FFI call or not and to test this flag in the VM’s exception handlers for SIGBUS, SIGILL, SIGSEGV and their equivalents on Windows.  The exception handlers then respond when in an FFI call by failing the FFI call primitive, answering a primitive fail code that includes the exception information.  Recently we extended Cog’s failure codes to allow a structured object (I don’t have the details handy; I’ll check soon).  In this case we need a pc and/or address and an exception code.
>
> Would this approach satisfy you?

That sounds good.  Although the argument I've seen is that a memory
access error means you "can't recover because you don't know what may
have been corrupted", I think it's worthwhile to be optimistic that the
Image may last a bit longer, long enough to get more information about
which call from the Image triggered the FFI failure.
And if you've been notified (e.g. via Growl message) you can still
take steps to move to a new Image if the current one is suspect.

I guess you'd want to be able to turn it off for native level debugging,
and for critical production applications where it's judged better to
crash than continue.
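
For concreteness, here is a rough C sketch of the flag idea as described
above, including the off switch. It assumes POSIX signals and uses
sigsetjmp/siglongjmp to get back out of the failed callout, which is only
one possible way to "fail the primitive". Every name in it (in_ffi_call,
guarded_callout, ffi_fault and so on) is made up for illustration; none of
it is Cog or VisualWorks code, and the Windows equivalents are not covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not Cog or VisualWorks code. */
typedef struct {
    int   signo;  /* SIGSEGV, SIGBUS or SIGILL */
    void *addr;   /* faulting address as reported in siginfo */
} ffi_fault;

static volatile sig_atomic_t in_ffi_call = 0;      /* 1 only while a callout runs */
static volatile sig_atomic_t ffi_trap_enabled = 1; /* 0 = crash as before */
static volatile sig_atomic_t last_signo = 0;
static void *volatile last_addr = NULL;
static sigjmp_buf ffi_recovery;

static void on_fault(int signo, siginfo_t *info, void *context) {
    (void)context;
    if (in_ffi_call && ffi_trap_enabled) {
        last_signo = signo;
        last_addr = info->si_addr;
        siglongjmp(ffi_recovery, 1);  /* unwind back into guarded_callout */
    }
    /* Fault outside a callout, or trapping disabled: die the usual way. */
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Install the shared handler for the three signals named in the thread. */
int install_fault_handlers(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0) return -1;
    if (sigaction(SIGILL, &sa, NULL) != 0) return -1;
    return 0;
}

/* The off switch, e.g. for native-level debugging. */
void set_ffi_trap_enabled(int enabled) { ffi_trap_enabled = enabled ? 1 : 0; }

/* Returns 0 if fn returned normally, -1 if it faulted (details in *fault). */
int guarded_callout(void (*fn)(void *), void *arg, ffi_fault *fault) {
    assert(fn != NULL && fault != NULL);
    if (sigsetjmp(ffi_recovery, 1) != 0) {  /* 1: restore the signal mask too */
        in_ffi_call = 0;
        fault->signo = last_signo;
        fault->addr = last_addr;
        return -1;
    }
    in_ffi_call = 1;
    fn(arg);
    in_ffi_call = 0;
    return 0;
}

/* Stand-ins for external library functions, for trying the guard out. */
void demo_ok_callout(void *arg) { *(int *)arg = 42; }
void demo_faulting_callout(void *arg) { (void)arg; raise(SIGSEGV); }
```

After install_fault_handlers(), guarded_callout(demo_faulting_callout, NULL,
&fault) returns -1 with fault.signo set to SIGSEGV instead of taking the
process down, and a later callout still works. The single global jump buffer
means this sketch only handles one callout at a time on one thread.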

Also, the approach you suggest would be a prerequisite for what I
suggested anyway, and would make it easier to experiment with MPKs later.
Let me know what I can do to help (probably more capable on the testing side).
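
As a starting point for such an experiment, a minimal sketch of the MPK side
using the glibc wrappers documented in [2] (pkey_alloc, pkey_mprotect,
pkey_set, pkey_free). As I read [2], once a region is tagged with a key,
changing its access rights is a per-thread register update via pkey_set
rather than a per-page operation. The helper names below are hypothetical,
the region must be page-aligned, and the code simply reports -1 where the
kernel or CPU has no pkey support. A callout that legitimately writes into
Image memory would fault under this scheme, so this is an assumption-laden
sketch, not a design.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tag a page-aligned region with a fresh protection key.
 * Returns the key, or -1 if pkeys are unavailable or the tagging fails. */
int tag_region_with_key(void *base, size_t len) {
    int pkey = pkey_alloc(0, 0);  /* no flags, no initial restrictions */
    if (pkey < 0)
        return -1;
    if (pkey_mprotect(base, len, PROT_READ | PROT_WRITE, pkey) != 0) {
        pkey_free(pkey);
        return -1;
    }
    return pkey;
}

/* Hypothetical helper: run fn with writes to the tagged region disabled for
 * the calling thread, then re-enable them. Returns 0 on success. */
int callout_with_region_read_only(int pkey, void (*fn)(void *), void *arg) {
    assert(fn != NULL);
    if (pkey_set(pkey, PKEY_DISABLE_WRITE) != 0)
        return -1;
    fn(arg);                  /* stray writes into the region now fault */
    return pkey_set(pkey, 0); /* 0 = no restrictions again */
}

/* Stand-in for an external function that only reads the protected memory. */
void demo_reading_callout(void *arg) { (void)*(volatile char *)arg; }
```

Combined with a fault handler that fails the primitive, a stray write from
the callout into the tagged region would then show up as a failed FFI call
rather than as silent corruption.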

cheers -ben

>> [1] https://lwn.net/Articles/643797/
>> [2] http://man7.org/linux/man-pages/man7/pkeys.7.html
>> [3] https://lwn.net/Articles/689395/

