[Vm-dev] Re: primitive retry across suspension for OwnedLock waitAcquire (was: Interpreter versus StackInterpreter hierarchy)

Ben Coman btc at openinworld.com
Fri Jun 3 10:33:00 UTC 2016


On Fri, Jun 3, 2016 at 3:08 PM, Ben Coman <btc at openinworld.com> wrote:
> On Fri, Jun 3, 2016 at 12:52 PM, Ben Coman <btc at openinworld.com> wrote:
>> On Sun, May 22, 2016 at 2:15 AM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>
>>> On Fri, May 20, 2016 at 7:52 PM, Ben Coman <btc at openinworld.com> wrote:
>>>>
>>>> On Sat, May 21, 2016 at 4:36 AM, Clément Bera <bera.clement at gmail.com> wrote:
>>>> >
>>>> > On Fri, May 20, 2016 at 7:51 PM, Ben Coman <btc at openinworld.com> wrote:
>>>> >>
>>>> >> On Fri, May 20, 2016 at 9:25 PM, Clément Bera <bera.clement at gmail.com> wrote:
>>>> >> > On Thu, May 19, 2016 at 3:42 PM, Ben Coman <btc at openinworld.com> wrote:
>>>>
>>> There is a better way of solving this, and that is to use a pragma to identify a method that contains such a suspension point, and have the process terminate code look for the pragma and act accordingly.  For example, the pragma could have a terminate action, sent to the receiver with the context as argument, e.g.
>>>
>>> Mutex>>critical: mutuallyExcludedBlock
>>>     <onTerminate: #ensureMutexUnlockedInCritical:>
>>>     ^lock waitAcquire
>>>         ifNil: mutuallyExcludedBlock
>>>         ifNotNil:[ mutuallyExcludedBlock ensure: [lock release] ]
>>>
>>> (and here I'm guessing...)
>>>
>>> Mutex>> ensureMutexUnlockedInCritical: aContext
>>>     "long-winded comment explaining the corner case, referencing tests, etc, etc and how it is solved on terminate by this method"
>>>     (aContext pc = aContext initialPC
>>>      and: [self inTheCorner]) ifTrue:
>>>         [self doTheRightThingTM]
>>>
>>> So on terminate the stack is walked (it is anyway) looking for unwinds or onTerminate: markers.  Any onTerminate: markers are evaluated, and the corner case is solved.  The pragma approach also allows for visibility in the code.
>>
>> I think this general < onTerminate: > pragma might be useful, but I'd
>> like to keep it in the back pocket for the moment while I explore
>> another idea for primitive retry after a process resumes.
>>
>>
>> I still have a concern that #primitiveOwnedLockWaitAcquire sleeps at
>> the bottom of the primitive, so if the sleeping process is resumed,
>> it continues into the critical section without having gained the
>> mutex, which seems a bit fragile.  Retrying
>> #primitiveOwnedLockWaitAcquire immediately after waking would
>> effectively have the process sleep at the top of the primitive, and
>> *not* proceed until it *really* holds the lock.
>>
>> So I'm thinking out loud here to formulate my thoughts, and in case
>> there is some major impediment you can help me fail fast...
>>
>> One possibility is putting "self maybeRetryFailureAfterWaking"
>> in #slowPrimitiveResponse, similar to maybeRetryFailureDueToForwarding
>> and (guessing) maybeRetryFailureDueToLowMemory.  The difficulty seems
>> to be that a process's primFailCode isn't preserved across process
>> suspension(??).
>>
>> As an aside, it seems fragile that, IIUC, primFailCode can be set by
>> one process and, if that process is then suspended, carry over to
>> fail the new active process.  Perhaps somewhere like
>> externalSetStackPageAndPointersForSuspendedContextOfProcess: should
>> zero primFailCode.
>>
>> Anyway... I thought one way to retain primFailCode across process
>> suspension might be to push it to the stack in #transferTo: and pop
>> it from the stack in
>> externalSetStackPageAndPointersForSuspendedContextOfProcess:
>> except that I see that method is called from a few places, so messing
>> with the stack there is probably a bad idea.
>>
>> Another way might be for Process to get an additional instance
>> variable 'suspendedPrimitiveFailCode' which again could be set in
>> #transferTo:  (or even more specifically only in
>> #primitiveOwnedLockWaitAcquire).  That is,
>>
>> #transferTo:  might have...
>>
>>   oldProc := objectMemory fetchPointer: ActiveProcessIndex ofObject: sched.
>>   objectMemory
>>       storePointer: SuspendedPrimitiveFailCodeIndex
>>       ofObject: oldProc
>>       withValue: primFailCode
>>   ...
>>     primFailCode := objectMemory
>>         fetchPointer: SuspendedPrimitiveFailCodeIndex
>>         ofObject: newProc.
>>    self externalSetStackPageAndPointersForSuspendedContextOfProcess: newProc.
>>
>>
>
> Whoops, for a start, that needed to be (*1*)...
>     objectMemory
>         storeInteger: SuspendedPrimitiveFailCodeIndex
>         ofObject: oldProc
>         withValue: primFailCode.
>     primFailCode := objectMemory
>        fetchInteger: SuspendedPrimitiveFailCodeIndex
>        ofObject: newProc.
>
>
>> The additional advantage here might be that the saving and restoring
>> of primFailCode is localised to one method. I'm hoping that change
>> (plus similar in Cog) might be sufficient to facilitate behaviour like
>> this...
>>
>> process 1
>> 1.    invokes slowPrimitiveResponse
>> 2.     dispatches to primitiveOwnedLockWaitAcquire
>> 3.         calls primitiveFailFor: PrimErrRetryAfterWaking
>> 4.         primFailCode saved into Process object
>> 5.         process goes to sleep
>>
>> 6. later after process 1 woken
>> 7.   primFailCode restored from Process object
>> 8.   returns to slowPrimitiveResponse
>> 9.   if primFailCode = PrimErrRetryAfterWaking
>> 10.     dispatch to primitiveOwnedLockWaitAcquire (goto step 2)
>>
>> Maybe there would need to be a step 9a to checkForInterrupts to avoid
>> too tight a loop locking the image, but maybe this is already done
>> somewhere in the suspend/resume process.
>>
>> So I'm now going to try coding the second way in the StackVM, and then
>> look at how it might be done in Cog.  All feedback appreciated.
>>
>> cheers -ben
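
Steps 8-10 above can be sketched roughly in Slang as below. This is an
untested sketch, not existing VMMaker code: the selector
maybeRetryFailureAfterWaking and the error code PrimErrRetryAfterWaking
are only the names proposed above, and the dispatch line stands in for
however slowPrimitiveResponse actually invokes the primitive. It also
assumes that primitiveFunctionPointer still refers to
primitiveOwnedLockWaitAcquire when the woken process gets back here,
which is unverified across a process switch.

    maybeRetryFailureAfterWaking
        "Hypothetical helper, intended to be called from
         slowPrimitiveResponse after the primitive returns.
         While the primitive asks for a retry, clear the failure
         code and dispatch the same primitive again (steps 9-10).
         A checkForInterrupts-style call (step 9a) could go
         inside the loop if it turns out to be needed."
        [primFailCode = PrimErrRetryAfterWaking] whileTrue:
            [primFailCode := 0.
             self dispatchFunctionPointer: primitiveFunctionPointer]

The loop exits as soon as the primitive either succeeds or fails with
some other code, so ordinary failures still fall through to the
method's fallback code.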

My experiment to better understand this is to run the simulator thus...
    | cos |
    cos := StackInterpreterSimulator newWithOptions:
        #(ObjectMemory Spur32BitMemoryManager).
    cos desiredNumStackPages: 8.
    cos openOn: 'ownedlock-reader.image'.
    cos openAsMorph; run

and at the simulator's REPL input box running...
    o := OwnedLock new.
    o experiment1.   !

where  OwnedLock>>experiment1   is...
    | result |
    result := OrderedCollection new: 20.
    result myAdd: 0.

    [result myAdd: 11.
        [result myAdd: 12.
         self experimentSuccessAndSleep] on: Error do: [result myAdd: 13].
    result myAdd: 14] forkAt: 72.

    [result myAdd: 21.
        [result myAdd: 22.
         self experimentFailAndWakeOther] on: Error do: [result myAdd: 23].
    result myAdd: 24] forkAt: 71.

    result myAdd: 8.
    self experimentSuccessAndWakeOther.
    result myAdd: 9.
    ^result.

The current VM produces a result of...
     OrderedCollection(0 11 12 21 22 13 14 24 8 9)

but I'm hoping to get something like...
     OrderedCollection(0 11 12 21 22 14 23 24 8 9)


Where the primitives are...

primitiveExperimentSuccessAndSleep
    | ownedLock activeProc |
    ownedLock := self stackTop.
    activeProc := self activeProcess.
    self addLastLink: activeProc toList: ownedLock.
    self transferTo: self wakeHighestPriority.

primitiveExperimentFailAndWakeOther
    | ownedLock waitingProcess |
    ownedLock := self stackTop.
    self primitiveFailFor: 42.
    (self isEmptyList: ownedLock)
        ifFalse:
            [waitingProcess := self removeFirstLinkOfList: ownedLock.
            self resume: waitingProcess preemptedYieldingIf: preemptionYields]

primitiveExperimentSuccessAndWakeOther
    | ownedLock waitingProcess |
    ownedLock := self stackTop.  "rcvr"
    (self isEmptyList: ownedLock)
        ifFalse:
            [waitingProcess := self removeFirstLinkOfList: ownedLock.
            self resume: waitingProcess preemptedYieldingIf: preemptionYields]
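
For the "on: Error do:" handlers in experiment1 to fire, the image-side
methods presumably signal an Error from their fallback code when the
primitive fails. A minimal sketch of one such stub follows; the
primitive number is a placeholder, since the actual binding used in the
experiment is not shown here.

    OwnedLock>>experimentFailAndWakeOther
        "Hypothetical image-side stub. 999 is a placeholder
         primitive number, not the one used in the experiment.
         If the primitive fails, the fallback code below runs
         and signals an Error."
        <primitive: 999>
        ^self primitiveFailed

Read that way, the 13 in the current result would mean the failure code
42 set by the priority-71 process was seen by the priority-72 process
when it woke, while the hoped-for result has the failure reported as 23
in the process that actually called primitiveFailFor:.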

