FFI callbacks


Bert Freudenberg

24 Nov 2014

11:49 p.m.

How do they actually work? I want to know what information the thunk needs, and what happens when it is activated. I think SqueakJS would benefit from callback support.

I imagine that somehow the VM's state needs to be saved, then the context that defined the block is activated, and then the VM would run the block until it returns (or until the return prim is called?), then the VM's state would need to be restored to what it was, and the result is passed back by the thunk returning.

Am I close? Since the callback can happen at any time, and it could do anything, saving the whole VM state seems daunting.

- Bert -


Eliot Miranda

25 Nov 2014

3:19 a.m.

Hi Bert,

On Mon, Nov 24, 2014 at 2:49 PM, Bert Freudenberg bert@freudenbergs.de wrote:

...

How do they actually work? I want to know what information the thunk needs, and what happens when it is activated. I think SqueakJS would benefit from callback support.

I'll try and write this up properly in a blog post tomorrow. But...

A Callback is a wrapper around a piece of executable code, a block, and a "signature" (see below). The executable code is wrapped by an FFICallbackThunk, an Alien, which is just a pointer to this code. The code is allocated from a special pool of memory marked as executable. The address of the code is unique and is hence used as a key to identify the Callback and so locate the block from the thunk. The FFI marshalling code should pass the address of the thunk when passing the Callback; right now that has to be done explicitly. The thunk, when called, invokes thunkEntry, which is defined in e.g. platforms/Cross/plugins/IA32ABI/ia32abicc.c (it needs a different definition for each ABI). On x86 thunkEntry needs only to be invoked with the thunk and the stack pointer, since on x86 all arguments are accessible from the stack. On other platforms thunkEntry will need to be invoked with the register arguments.
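To make the "address as key" idea concrete, here is a minimal C sketch of a table keyed by thunk address. It is only an illustration: in the system described above the lookup from thunk to Callback happens in the image, and `callback_entry`, `register_callback`, `lookup_callback` and `compare_ints` are hypothetical names that do not come from the plugin.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a Smalltalk block taking two pointer arguments. */
typedef int (*block_fn)(const void *, const void *);

typedef struct {
    const void *thunk;  /* address of the executable thunk, used as the key */
    block_fn    block;  /* the code to run when that thunk is called */
} callback_entry;

#define MAX_CALLBACKS 16
static callback_entry registry[MAX_CALLBACKS];
static size_t registry_count = 0;

/* Returns 0 on success, -1 if the thunk is already registered or the table is full. */
static int register_callback(const void *thunk, block_fn block) {
    assert(thunk != NULL && block != NULL);
    for (size_t i = 0; i < registry_count; i++)
        if (registry[i].thunk == thunk) return -1;
    if (registry_count == MAX_CALLBACKS) return -1;
    registry[registry_count].thunk = thunk;
    registry[registry_count].block = block;
    registry_count++;
    return 0;
}

/* Returns the block registered for this thunk address, or NULL if there is none. */
static block_fn lookup_callback(const void *thunk) {
    for (size_t i = 0; i < registry_count; i++)
        if (registry[i].thunk == thunk) return registry[i].block;
    return NULL;
}

/* Example block: three-way comparison of two ints, qsort style. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Because each thunk has its own address, no extra identifier has to travel with the call; the address that C code jumps to is enough to find the block again.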

thunkEntry's job is to store the state necessary to return from the callback in an instance of either VMCallbackContext32 or VMCallbackContext64. These are aliens for the following structure:

    typedef struct {
        void *thunkp;
        char *stackptr;
        long *intRegArgs;
        double *floatRegArgs;
        void *savedCStackPointer;
        void *savedCFramePointer;
        union {
            long vallong;
            struct { int low, high; } valleint64;
            struct { int high, low; } valbeint64;
            double valflt64;
            struct { void *addr; long size; } valstruct;
        } rvs;
        jmp_buf trampoline;
    } VMCallbackContext;
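As a rough cross-check on this layout: with 4-byte pointers, the six fields before rvs occupy 24 bytes, so the result union starts at zero-based byte offset 24, which is the 1-based index 25 used by the VMCallbackContext32 accessor quoted later in this message. The sketch below models that with fixed-width fields so the numbers are the same on any host. `vm_callback_context32_model` and `word_result` are hypothetical names, and returning 0 for an out-of-range value is an assumption of this sketch, not the behaviour of the Smalltalk method.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit layout modelled with uint32_t in place of 4-byte pointers. */
typedef struct {
    uint32_t thunkp, stackptr, intRegArgs, floatRegArgs;
    uint32_t savedCStackPointer, savedCFramePointer;
    union { uint32_t vallong; } rvs;  /* only the word-sized variant is modelled */
} vm_callback_context32_model;

/* Zero-based byte offset of the result slot in the modelled layout. */
enum { RESULT_OFFSET = offsetof(vm_callback_context32_model, rvs) };

/* Store a word result if it lies in the -2^31 .. 2^32-1 range.
   Returns 1 on success, 0 if the value is out of range (assumption). */
static int word_result(vm_callback_context32_model *ctx, int64_t value) {
    assert(ctx != NULL);
    if (value < -(INT64_C(1) << 31) || value > (INT64_C(1) << 32) - 1)
        return 0;
    ctx->rvs.vallong = (uint32_t)value;  /* negative values wrap modulo 2^32 */
    return 1;
}
```

Accepting both signed and unsigned 32-bit values means -1 and 4294967295 end up as the same bit pattern in the slot; the C caller decides how to interpret it.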

The structure lives in the stack frame of thunkEntry, hence one per callback.

It does three things:

1. it identifies all the input arguments. These are accessible, depending on platform, through stackptr, intRegArgs and floatRegArgs.
2. it maintains the jmp_buf which can be used to longjmp back to the callback to return from it.
3. it has slots to hold the result to be returned from the callback (the union rvs). The argument to the longjmp (the value returned from the setjmp in thunkEntry) tells thunkEntry how to return the result, i.e. as a 32-bit value, a 64-bit value, a double or a struct.

Once thunkEntry has packed up the state in its local VMCallbackContext, it uses

    if ((flags = interpreterProxy->ownVM(0)) < 0) {
        fprintf(stderr,"Warning; callback failed to own the VM\n");
        return -1;
    }

to "own" the VM. If the VM is single-threaded then this can check that the callback is being made on the same thread as the VM, and fail otherwise. If the VM is multi-threaded ownVM can block until the thread can enter the VM.

It then calls back into the VM using

interpreterProxy->sendInvokeCallbackContext(&vmcc);

which sends the message (found in the specialObjectsArray) #invokeCallbackContext: which is understood by Alien (see Alien class>>#invokeCallbackContext:). This method wraps up the input argument (the raw address of thunkEntry's VMCallbackContext) in the relevant Alien and invokes Callback's entry point:

    invokeCallbackContext: vmCallbackContextAddress "<Integer>" "^<FFICallbackReturnValue>"
        "The low-level entry-point for callbacks sent from the VM/IA32ABI plugin.
         Return via primReturnFromContext:through:. thisContext's sender is the
         call-out context."
        | callbackAlien type |
        callbackAlien := (Smalltalk wordSize = 4
                            ifTrue: [VMCallbackContext32]
                            ifFalse: [VMCallbackContext64])
                                atAddress: vmCallbackContextAddress.
        [type := Callback evaluateCallbackForContext: callbackAlien]
            ifCurtailed: [self error: 'attempt to non-local return across a callback'].
        type ifNil:
            [type := 1. callbackAlien wordResult: -1].
        callbackAlien primReturnAs: type fromContext: thisContext

The sendInvokeCallbackContext machinery just constructs an activation of invokeCallbackContext: on top of the current process's stack, which is usually the process that called out through the FFI. But it doesn't have to be. Threaded callbacks are too detailed for this brief message.

Callback then locates the relevant marshalling method for the callback's signature (actually it does this when the Callback is created, but the effect is the same). Callback uses methods that identify themselves via pragmas to choose the relevant marshaller. e.g. for a qsort sort function callback the marshaller on x86 is

    Callback methods for signatures
    voidstarvoidstarRetint: callbackContext sp: spAlien
        <signature: #(int (*)(const void *, const void *)) abi: 'IA32'>
        ^callbackContext wordResult:
            (block
                value: (Alien forPointer: (spAlien unsignedLongAt: 1))
                value: (Alien forPointer: (spAlien unsignedLongAt: 5)))

So it fetches the arguments from the stack using Alien accessors, evaluates the block with them and then assigns the result via wordResult, and answers the type code back to invokeCallbackContext:, e.g.

    VMCallbackContext32 methods for accessing
    wordResult: anInteger
        "Accept any value in the -2^31 to 2^32-1 range."
        anInteger >= 0
            ifTrue: [self unsignedLongAt: 25 put: anInteger]
            ifFalse: [self signedLongAt: 25 put: anInteger].
        ^1
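In C terms, the qsort marshaller and wordResult: together do roughly the following: read two pointer arguments from the raw stack area, call the block with them, store the answer in the word result slot, and answer type code 1. The sketch assumes pointer-sized argument slots on the host (on IA32 these are the 4-byte slots at 1-based offsets 1 and 5); `marshal_context`, `arg_pointer_at`, `marshal_compare` and `compare_ints` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int (*compare_fn)(const void *, const void *);

typedef struct {
    const unsigned char *stackptr;  /* raw argument area captured at entry */
    long word_result;               /* stand-in for rvs.vallong */
} marshal_context;

enum { TYPE_WORD = 1 };  /* type code answered by wordResult: in the message */

/* Read the n-th pointer-sized argument slot (0-based) from the stack image. */
static const void *arg_pointer_at(const unsigned char *sp, size_t n) {
    uintptr_t word;
    memcpy(&word, sp + n * sizeof word, sizeof word);
    return (const void *)word;
}

/* Fetch both arguments, evaluate the block, store the result, answer the type code. */
static int marshal_compare(marshal_context *ctx, compare_fn block) {
    assert(ctx != NULL && ctx->stackptr != NULL && block != NULL);
    ctx->word_result = block(arg_pointer_at(ctx->stackptr, 0),
                             arg_pointer_at(ctx->stackptr, 1));
    return TYPE_WORD;
}

/* Example block: three-way comparison of two ints. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Keeping the result in the context and returning only a type code mirrors the split in the Smalltalk code: the marshaller knows the signature, while the entry function only needs to know how wide the answer is.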

Then invokeCallbackContext: invokes the primitive to longjmp back to thunkEntry, supplying the return code that will allow thunkEntry to return the result correctly:

    VMCallbackContext32 methods for primitives
    primReturnAs: typeCode "<SmallInteger>" fromContext: context "<MethodContext>"
        <primitive: 'primReturnAsFromContextThrough' module: 'IA32ABI' error: ec>
        ^self primitiveFailed

Then thunkEntry switches on the return code and returns to the caller.

Note that sendInvokeCallbackContext and primReturnAsFromContextThrough conspire to save, set and restore the VM's notion of what the C stack is in the VMCallbackContext's savedCStackPointer & savedCFramePointer, growing the stack on callback, and cutting it back on return. There's a variable in the VM, previousCallbackContext, that primReturnAsFromContextThrough uses to make sure returns are LIFO.
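The LIFO rule can be pictured as a linked stack of callback contexts where only the innermost one may return. The following is a loose model, assuming each context remembers the one that was active when it was entered; `callback_frame`, `innermost_callback`, `enter_callback` and `return_from_callback` are hypothetical names, and the saving and restoring of the C stack and frame pointers is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct callback_frame {
    struct callback_frame *previous;  /* context active when this one was entered */
} callback_frame;

/* Plays the role the message gives to previousCallbackContext. */
static callback_frame *innermost_callback = NULL;

/* Called on the way into a callback: push the new context. */
static void enter_callback(callback_frame *frame) {
    assert(frame != NULL);
    frame->previous = innermost_callback;
    innermost_callback = frame;
}

/* Called on the way out: succeeds (0) only for the innermost context,
   otherwise refuses (-1) and leaves the stack untouched. */
static int return_from_callback(callback_frame *frame) {
    if (frame != innermost_callback) return -1;
    innermost_callback = frame->previous;
    return 0;
}
```

Refusing out-of-order returns matters because a longjmp to an outer context would discard the C frames of every callback nested inside it.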

I imagine that somehow the VM's state needs to be saved, then the context that defined the block is activated, and then the VM would run the block until it returns (or until the return prim is called?), then the VM's state would need to be restored to what it was, and the result is passed back by the thunk returning.

That's right.

Am I close? Since the callback can happen at any time, and it could do anything, saving the whole VM state seems daunting.

Provided that the callback occurs from the context of a callout there are no problems. Threaded callbacks take some more doing. Basically the VM needs to be sharable between threads. This is the threaded VM prototype. If you absolutely need threaded callbacks we should have a serious talk. This is not trivial to productise.

HTH

-- best, Eliot

Bert Freudenberg

1:18 p.m.

On 25.11.2014, at 03:19, Eliot Miranda eliot.miranda@gmail.com wrote:

...

Hi Bert,

On Mon, Nov 24, 2014 at 2:49 PM, Bert Freudenberg bert@freudenbergs.de wrote:

How do they actually work? I want to know what information the thunk needs, and what happens when it is activated. I think SqueakJS would benefit from callback support.

I'll try and write this up properly in a blog post tomorrow. But...

Callback is a […]

Cool, very interesting! Thanks :)

...

Provided that the callback occurs from the context of a callout there are no problems. Threaded callbacks take some more doing. Basically the VM needs to be sharable between threads. This is the threaded VM prototype. If you absolutely need threaded callbacks we should have a serious talk. This is not trivial to productise.

I absolutely need callbacks that happen after the method returned. JavaScript uses callbacks for everything, because it is single-threaded. There is no preemption, and not even a yield: you always have to return completely back to the browser. You don’t call the browser, it calls you. There is no main(). Also, there is no longjmp.

The whole SqueakJS VM is implemented as a callback. When it is called, it executes bytecodes for a couple of milliseconds, arranges to be called back again soonish, and returns. This is pretty easy to do with a plain interpreter, but makes it harder to do a proper JIT. But the JIT will have to support this model one way or another (mine does).

Good news is that the VM is always in a consistent state, it can only be observed between bytecodes, not while it is in the middle of something.

So inside an FFI callback the thunk automatically owns the VM. It can just execute bytecodes until the return-from-callback primitive is invoked (which would set a flag). This is how e.g. my clipboard code works: the copy callback puts a cmd-c keyboard event in the image’s queue, then runs the VM until the image has invoked the clipboard primitive (which stores the clipboard data and sets a flag), then returns the clipboard data.

The question is, how do I switch to the FFI callback’s context? I think this should work like a process switch, as if the current process was pre-empted by a higher-priority process. I could make it work in the image I guess - just launch a high-priority process executing the callback block in an endless loop, and have it wait on an external semaphore. The thunk would signal the semaphore and execute bytecodes until the return prim was called, or it’s blocked on the semaphore again, then return.

Maybe this would be the cleanest solution. What I don’t like about it is that I would have to manually manage Squeak processes. There is no way to know how often the callback will be executed, so I fear that gazillions of callback processes would clutter up the image. If instead the thunk would create the process when needed, it would automatically go away when the callback is no longer used. OTOH this would make the block and process not be part of the object memory anymore … not sure yet if this would be good or bad. Also, creating processes is normally done in the image, not in the VM. Might not be a good idea.

Overall these design constraints are pretty different from a regular C VM. OTOH the Android VM works in a similar way, being callback-based and single-threaded, right? Not sure about iOS - does the ObjectiveC bridge support callbacks?

Ideas welcome :)

- Bert -

John McIntosh

6:22 p.m.

Look at the Squeak proxy logic I put into the Objective-C work:

https://drive.google.com/file/d/0BzM4orHA3iGxMjU0ZDU4MWEtOTQ2Zi00OGVlLTg2ODU...

But what if you were to reverse that, so that the VM assembles the request and pops back to the JS environment, which executes the request and then returns to the VM?


-- =========================================================================== John M. McIntosh johnmci@smalltalkconsulting.com https://www.linkedin.com/in/smalltalk ===========================================================================

tim Rowledge

26 Nov 2014

7:47 p.m.

On 25-11-2014, at 4:18 AM, Bert Freudenberg bert@freudenbergs.de wrote:

...

I absolutely need callbacks that happen after the method returned. JavaScript uses callbacks for everything, because it is single-threaded. There is no preemption, and not even a yield: you always have to return completely back to the browser. You don’t call the browser, it calls you. There is no main(). Also, there is no longjmp.

Ah, so it’s back to the old days of Windows 3.1, MacOS System 8 & RISC OS. What fun!

tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Planetary axial tilt: the actual 'reason for the season'.

vm-dev@lists.squeakfoundation.org
