SafeFFI concept


Ben Coman

31 Mar 2018

3:25 p.m.

This idea is not fully formed. I've been nibbling away at composing this post for a month and thought I'd just send it out rather than let it drift on further. It's an idea that keeps resurfacing, but I've not been in a position to follow it up, so I'm just sharing the rough outline.

One of the great features of programming at the Image level is protection from memory access violations. We get to continue from errors after debugging them. However, all bets are off when we use FFI. The bane of FFI is memory violations in the C callout. They are harder than usual to diagnose since we lose our usual debugging environment, and it's hard to recover from one since the C callout has full access to the VM's heap and thus everything is suspect.

So the idea is for FFI callouts to execute in a separate child process. That child process has no access to the VM's memory, so a memory violation in the C callout could not crash the VM.
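A minimal sketch of the isolation idea, assuming a POSIX system and plain fork() rather than anything the VM provides. The names run_isolated, good_callout, and buggy_callout are hypothetical. The child gets a private copy of the parent's memory, so the faulting write kills only the child, and the parent learns about it through waitpid().

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*callout_fn)(void);

/* A well-behaved stand-in for a library function. */
int good_callout(void)
{
    return 7;
}

/* A deliberately buggy stand-in: writes through a null pointer. */
int buggy_callout(void)
{
    volatile int *p = NULL;
    *p = 42;
    return 0;
}

/*
 * Run fn in a forked child.
 * Returns the child's exit status (0..255) if it exited normally,
 * the negated signal number if a signal killed it,
 * or -1000 if the child could not be created or reaped.
 */
int run_isolated(callout_fn fn)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;
    if (pid == 0)
        _exit(fn() & 0xff);      /* child: run the callout, report via exit status */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1000;
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return WEXITSTATUS(status);
}
```

Calling run_isolated(buggy_callout) from the parent should report the negated SIGSEGV number while the parent keeps running. An exit status is far too narrow a channel for real return values, which is one of the details the post sets aside.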

Obviously there will be some performance penalty, but the question is to what degree. There are two reasons to use an external library via FFI:

1. Speed
2. Functionality

Where it's more about functionality than speed (e.g. git, libusb, libsodium, pdfium), application developers newly programming against an unfamiliar C library may be willing to trade speed for safety. Perhaps it's used part-time, like the Assert-VM during development, while production uses the standard higher-performance FFI.

The idea of executing FFI callouts in a child process arose while reading about Linux's clone() function, which lets the parent process allocate the memory for the child process's stack.
https://stackoverflow.com/questions/1083172/how-to-mmap-the-stack-for-the-clone-system-call-on-linux
https://nullprogram.com/blog/2015/05/15/
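A minimal, Linux-only sketch of what the linked pages describe: the parent maps a region with mmap() and hands its top to glibc's clone() wrapper as the child's stack. The names child_entry, run_on_parent_stack, and CHILD_STACK_SIZE are hypothetical, and CLONE_VM is deliberately left out so the child runs on a private copy of the address space, as the proposal wants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* needed for clone() in <sched.h> */
#endif
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Entry point of the child; it runs on the stack the parent mapped. */
int child_entry(void *arg)
{
    int value = *(int *)arg;
    return value + 1;            /* becomes the child's exit status */
}

/*
 * Map a stack, start a child on it with clone(), and return the child's
 * exit status, or -1 on any failure.
 */
int run_on_parent_stack(int input)
{
    char *stack = mmap(NULL, CHILD_STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* The stack grows downwards on common architectures, so pass the top. */
    pid_t pid = clone(child_entry, stack + CHILD_STACK_SIZE, SIGCHLD, &input);

    int status = 0;
    int result = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    munmap(stack, CHILD_STACK_SIZE);
    return result;
}
```

Because the parent owns the mapping, it knows exactly where the child's stack lives, which is what makes the stack-arranging idea in the post conceivable at all. Without CLONE_VM, though, the child works on its own copy of those pages, so the parent would still need a shared mapping or similar to alter the child's stack afterwards.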

The child process might be a simple event loop waiting on a semaphore. My understanding of the FFI callout mechanism is that a stack frame is constructed in the form expected by the function being invoked. With SafeFFI, while the "fficallout" semaphore is being waited on, the child stack is static, so maybe the VM parent process could arrange the stack in the child process such that sem_wait() returns not to line 005 but instead executes the required FFI callout function. The "fficallout" semaphore is signalled from the Image once the stack frame has been constructed.

001 main()
002 { expose_child_function_addresses_to_parent_process();
003   while(true)
004   { sem_wait(&fficallout); // Smalltalk image reconstructs stack frame here
005     printf("Dummy statement. Never gets here");
006   }
007 }
008
009 demo_redirect()
010 { printf("SafeFFI demo success");
011 }

So how feasible would something like that be?

cheers -ben

P.S. For initial simplicity of the presentation I've avoided discussing return values and callbacks.


Eliot Miranda

31 Mar 2018

8:10 p.m.

Hi Ben,

I think it's a fun idea (my Spur memory debugging scheme uses the clone idea too) but for the FFI it isn't useful. IMO so much state is associated with a specific process that only a fraction of library and system calls would work, and debugging those that didn't would be very strange. Just take a system call that opens a file for example. On return the file handle would be present only in the child. Any use of the file descriptor from the parent would fail. There are simpler alternatives:

a) modify the already installed low-level exception handlers in the VM to fail an FFI call, reporting exception location and code, when a low-level exception occurs during an FFI call.

b) allow write-protecting the Smalltalk heap during an FFI call

I like a). b) doesn't play nicely with the threaded FFI
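The file-handle objection in this message can be reproduced in a few lines. This is a sketch assuming a POSIX system with /dev/null; the function name fd_from_child_is_unusable is hypothetical. A forked child opens a file and reports the descriptor number through a pipe, and the parent then checks whether that number means anything in its own descriptor table.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the descriptor opened in the child is invalid in the parent
 * (the expected outcome), 0 if the parent can use it, -1 on setup failure.
 */
int fd_from_child_is_unusable(void)
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: stands in for an FFI callout that opens a file. */
        int fd = open("/dev/null", O_RDONLY);
        ssize_t n = write(pipefd[1], &fd, sizeof fd);
        _exit(fd >= 0 && n == (ssize_t)sizeof fd ? 0 : 1);
    }

    /* Parent: receive the number the child was given. */
    int child_fd = -1;
    ssize_t n = read(pipefd[0], &child_fd, sizeof child_fd);
    waitpid(pid, NULL, 0);
    close(pipefd[0]);
    close(pipefd[1]);
    if (n != (ssize_t)sizeof child_fd || child_fd < 0)
        return -1;

    /* The number was free in the parent at fork time and still is. */
    errno = 0;
    return (fcntl(child_fd, F_GETFD) == -1 && errno == EBADF) ? 1 : 0;
}
```

The same applies to any handle-like state a library keeps per process, which is why only a fraction of calls would survive being moved into a child.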


Ben Coman

1 Apr 2018

1:55 a.m.

On 1 April 2018 at 02:15, Todd Blanchard tblanchard@mac.com wrote:


Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.

On 1 April 2018 at 02:10, Eliot Miranda eliot.miranda@gmail.com wrote:


Hi Ben,

I think it's a fun idea (my Spur memory debugging scheme uses the clone idea too) but for the FFI it isn't useful. IMO so much state is associated with a specific process that only a fraction of library and system calls would work, and debugging those that didn't would be very strange. Just take a system call that opens a file for example. On return the file handle would be present only in the child. Any use of the file descriptor from the parent would fail. There are simpler alternatives:

a) modify the already installed low-level exception handlers in the VM to fail an FFI call, reporting exception location and code, when a low-level exception occurs during an FFI call.

b) allow write-protecting the Smalltalk heap during an FFI call

I like a). b) doesn't play nicely with the threaded FFI

Thanks for your consideration. Helps me put the idea aside. cheers -ben


Todd Blanchard

31 Mar 2018

8:15 p.m.

Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.


Eliot Miranda

8:20 p.m.

On Sat, Mar 31, 2018 at 11:15 AM, Todd Blanchard tblanchard@mac.com wrote:


Problem with that is when you want to do something like integrate with Cocoa on a Mac or iOS. The thing you want to talk to is in your process already.


-- _,,,^..^,,,_ best, Eliot


vm-dev@lists.squeakfoundation.org
