[Vm-dev] Re: Problem with alien callbacks

Eliot Miranda eliot.miranda at gmail.com
Tue Apr 5 17:03:07 UTC 2011


On Sun, Apr 3, 2011 at 8:52 PM, Javier Pimás <elpochodelagente at gmail.com> wrote:

> We have been researching this problem for some days now. We also met with
> Richie yesterday and he explained to us a bit of what we were seeing. If we
> are right, the problem is that the interpreter is not fully reentrant, so for
> the callback mechanism to work you first have to put the interpreter in a
> state where it waits for the callback to come in. That way, when the callback
> arrives, the interpreter is ready to handle it, and when it finishes handling
> it, the state is correctly restored. Is that right?
>
> But in the multithreaded stack VM, you, Eliot, solved this another way,
> right? I think you said that you set the stack as if it were full, so that on
> the next method activation (well, here I'm just guessing) or the next
> heartbeat you detect it and make space for the callback to come in. I may
> have said nonsense; sorry if so.
>

Sort of.  The threaded VM does arrange to save and restore all relevant VM
state across thread switches.  But what you're seeing is, I think, still an
issue, which is that the VM assumes that a call-back can occur only in the
context of a call-out (plugin primitive or FFI), not at an arbitrary point.
It looks to me like the callback machinery needs to save and restore more
state, at least the messageSelector, and things like lastMethodCacheWrite
(or whatever it's called).  The problem is of course that there's a lot of
this state, that some of it is specific to optimizations that may evolve
over time, etc.  So this is messy.  You have a most demanding application of
callbacks for the VM since you're using them to service things like page
faults in the VM.  I'm not sure yet how to deal with this.  It may make
sense to provide a more heavy-weight callback entry-point (e.g. identified
by some flag in the callback trampoline) that causes the VM to save and
restore more state than that needed for normal callbacks.  But in any case
this needs some thought.
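
As a rough sketch of that heavy-weight entry-point idea (this is not actual VM
code): the flag name, the VMExtraState struct and every function below are
hypothetical, and the globals are plain stand-ins for the interpreter variables
named in this thread.  It only illustrates saving and restoring the extra state
when the trampoline asks for it.

```c
#include <assert.h>
#include <stddef.h>

typedef long sqInt;

/* Stand-ins for interpreter globals discussed in this thread. */
static sqInt messageSelector;
static sqInt lkupClass;
static sqInt lastMethodCacheProbeWrite;

/* Hypothetical flag a callback trampoline could carry. */
enum { CALLBACK_LIGHT = 0, CALLBACK_HEAVY = 1 };

/* Hypothetical snapshot of the extra state a heavy-weight callback keeps. */
typedef struct {
    sqInt messageSelector;
    sqInt lkupClass;
    sqInt lastMethodCacheProbeWrite;
} VMExtraState;

static void saveExtraState(VMExtraState *s)
{
    assert(s != NULL);
    s->messageSelector = messageSelector;
    s->lkupClass = lkupClass;
    s->lastMethodCacheProbeWrite = lastMethodCacheProbeWrite;
}

static void restoreExtraState(const VMExtraState *s)
{
    assert(s != NULL);
    messageSelector = s->messageSelector;
    lkupClass = s->lkupClass;
    lastMethodCacheProbeWrite = s->lastMethodCacheProbeWrite;
}

/* Simulates the interpreter running the callback: any send overwrites
   the lookup globals. */
static void runCallbackBody(void)
{
    messageSelector = 9001;
    lkupClass = 9002;
    lastMethodCacheProbeWrite = 9003;
}

/* Hypothetical entry point: only heavy-weight callbacks pay for the
   extra save and restore. */
void enterCallback(int flags)
{
    VMExtraState saved = {0, 0, 0};
    if (flags & CALLBACK_HEAVY)
        saveExtraState(&saved);
    runCallbackBody();
    if (flags & CALLBACK_HEAVY)
        restoreExtraState(&saved);
}

/* Returns 1 if the interrupted primitive still sees its own values
   after the callback returns, 0 otherwise. */
int stateSurvivesCallback(int flags)
{
    messageSelector = 11;
    lkupClass = 22;
    lastMethodCacheProbeWrite = 33;
    enterCallback(flags);
    return messageSelector == 11
        && lkupClass == 22
        && lastMethodCacheProbeWrite == 33;
}
```

With these stand-ins, stateSurvivesCallback(CALLBACK_HEAVY) reports that the
values survive, while stateSurvivesCallback(CALLBACK_LIGHT) reports that they
were clobbered.  Which variables would really belong in the snapshot is exactly
the open question.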


>
> The thing is that for our paging application we need to service the
> callback immediately, no matter what the interpreter is doing at the moment.
> So if I was correct about the StackVM, then we couldn't use that either. In
> that case, what we'd need for this special type of callback is to be able to
> save the whole context of the interpreter. We don't need a perfect
> solution, just a good enough one (we can improve it after moving to
> Cog, but we must finish this step first, one at a time). Could you tell us
> which variables of the interpreter must be saved and which needn't be?
>
> Igor, maybe you have some experience with this from HydraVM, right? Also,
> how do the NativeBoost callbacks work? They might be just what we are
> looking for.
>
> Regards,
>           Javier.
>
> On Thu, Mar 31, 2011 at 2:27 PM, Javier Pimás <elpochodelagente at gmail.com> wrote:
>
>> Hi! The callback is coming in right at:
>>
>> "Clean up session id and external primitive index"
>> self storePointerUnchecked: 2 ofObject: lit withValue: ConstZero. <- here
>> self storePointerUnchecked: 3 ofObject: lit withValue: ConstZero.
>>
>> I know, because I'm debugging with gdb, that writing to that place causes
>> a page fault (the target object's page is marked as read-only), and the page
>> fault handling mechanism issues the callback to handle it. After all that,
>> the original primExternalCall continues execution, and uses the wrong values
>> of messageSelector and lkupClass (even if it found the primitive it would
>> write in the wrong place of the cache, I think).
>>
>> I know that the VM has a lot of state and of course you don't want to save
>> everything, but the callback could come in at any place, not just
>> primExternalCall, so any variable could be in use. I was actually surprised
>> that just saving the active context and creating a new one was supposed to
>> be enough to save all the state of the VM. Working out what is enough will
>> not be easy. I tried manually saving and then restoring messageSelector and
>> lkupClass before and after the callback, which solved the problem for some
>> iterations of interpreting, but seemed to corrupt the image, which crashed
>> after a few moments. Is there anything else you'd recommend saving to work
>> around this for now?
>>
>>
>> On Thu, Mar 31, 2011 at 12:53 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Mar 31, 2011 at 6:11 AM, Javier Pimás <
>>> elpochodelagente at gmail.com> wrote:
>>>
>>>> Hi, we are having a problem with callbacks in Alien and we would like to
>>>> see if we are doing something wrong or if it is a bug in the implementation
>>>> (for the standard old VM).
>>>>
>>>> We are receiving the callback right in the middle of a
>>>> primitiveExternalCall (actually to a function that will fail because the
>>>> plugin is not present, but I don't think that's important). We pinned it
>>>> down to always occur on the same line, which is
>>>>
>>>> longAtput((lit + (BASE_HEADER_SIZE)) + (2 << (SHIFT_FOR_WORD)),
>>>> ConstZero);
>>>>
>>>> of primitiveExternalCall. When the callback occurs, the thunkEntry is
>>>> called, which, if we understand correctly, saves the active context and runs
>>>> the interpreter by calling sendInvokeCallbackStackRegistersJmpbuf. The
>>>> problem is that things like messageSelector and lkupClass, which are global
>>>> variables, are not saved while saving the context, and when the callback
>>>> returns, the last line of primitiveExternalCall,
>>>>
>>>> rewriteMethodCacheSelclassprimIndex(messageSelector, lkupClass, 0);
>>>>
>>>> puts a 0 in the wrong place. Also, probably because the last message sent
>>>> was primReturnFromContext:through: (because we just returned from the
>>>> context), we get a primitiveFailed, not for the originally called function
>>>> but for primReturnFromContext:through:.
>>>>
>>>> What do you think? are we missing something?
>>>>
>>>
>>> Hmmm, looking at it I think you must be taking a callback before the
>>> external call occurs.  Here's how the code reads in Cog:
>>>
>>>
>>> ...
>>>
>>> addr := self ioLoadExternalFunction: functionName + BaseHeaderSize
>>>             OfLength: functionLength
>>>             FromModule: moduleName + BaseHeaderSize
>>>             OfLength: moduleLength.
>>> addr = 0
>>>     ifTrue: [index := -1]
>>>     ifFalse: ["add the function to the external primitive table"
>>>         index := self addToExternalPrimitiveTable: addr].
>>>
>>> "Store the index (or -1 if failure) back in the literal"
>>> objectMemory storePointerUnchecked: 3 ofObject: lit withValue:
>>>     (objectMemory integerObjectOf: index).
>>>
>>> "If the function has been successfully loaded cache and call it"
>>> index >= 0
>>>     ifTrue:
>>>         [self rewriteMethodCacheEntryForExternalPrimitiveToFunction:
>>>             (self cCode: [addr] inSmalltalk: [1000 + index]).
>>>          self callExternalPrimitive: addr]
>>>     ifFalse: ["Otherwise void the primitive function and fail"
>>>         self rewriteMethodCacheEntryForExternalPrimitiveToFunction: 0.
>>>         ^self primitiveFailFor: PrimErrNotFound]
>>>
>>> So the rewrite to zero (self
>>> rewriteMethodCacheEntryForExternalPrimitiveToFunction: 0) isn't done if no
>>> callout is made.  Where is your callback coming from?  Looks like it's
>>> coming from the internals of things like ioLoadExternalFunction...
>>>
>>> It is hard to save and restore all the VM state around a callback.
>>> There's too much of it in the current VM design.  Take a look
>>> at rewriteMethodCacheEntryForExternalPrimitiveToFunction:.  It is written to
>>> be fast, using lastMethodCacheProbeWrite to avoid work in rewriting the
>>> cache entry if the module and/or function load fails.  That's state one
>>> doesn't want to have to save and restore around callbacks along with
>>> lkupClass, messageSelector.  primitiveFunctionPointer, newMethod,
>>> framePointer, instructionPointer and stackPointer are already a lot.  This
>>> needs more thought.
>>>
>>>
>>>> Regards,
>>>>             Javier.
>>>>
>>>>
>>>> --
>>>> Javier Pimás
>>>> Ciudad de Buenos Aires
>>>>