[Vm-dev] Re: Process>>terminate woes
andreas.raab at gmx.de
Wed Dec 5 00:40:07 UTC 2007
[cc: vm-dev which I accidentally left out earlier]
Attached my proposed fixes for myList manipulation. The first CS
(PrimSuspend-ar) changes primitiveSuspend to enable atomic removal from
myList for the non-active process. The second (SuspendFixes-ar) has the
main modifications for the in-image part (this may not work for all
Squeak versions - I used a Croquet image as the basis, YMMV).
Feedback is welcome, in particular from the usual suspects on vm-dev.
Andreas Raab wrote:
> Hi -
> I had an eventful (which is euphemistic for @!^# up) morning caused by
> Process>>terminate. In our last round of delay and semaphore discussions
> I had noticed that there is a possibility of having a race condition in
> Process>>terminate but dismissed it as being of an application problem
> (e.g., if you send #terminate make sure you have only one place where
> you send it).
> This morning proved conclusively that this is a race condition which can
> affect *every* user of the system. It is caused by Process>>terminate
> which says:
> myList remove: self ifAbsent: .
> The reason this is so problematic is that the modification of myList is
> not atomic and that because of the non-atomic modification there is a
> possibility of the VM manipulating the very same list concurrently due
> to an external event (like a network interrupt). When this happens in
> "just the right way" the effect is that any number of processes at the
> same priority will "fall off" of the scheduled list. In the image that I
> was looking at earlier we had the following situation:
> * ~40 processes were not running
> * The processes had their myList be an empty linked list
> * The processes were internally linked (via nextLink)
> * The processes were all at the same priority
> Given that most of the processes were unrelated other than having the
> same priority I think the evidence is pretty clear.
> The question is now: How can we fix it? My proposal would be to simply
> change primitiveSuspend such that for a non-active process it will
> primitively take the process off its suspendingList. This makes suspend
> a little more general and (by returning the previous suspendingList) it
> will also guard us against any following cleanup (like the Semaphore
> situations earlier).
> Unfortunately, this *will* require VM changes but I don't think it can
> be helped at this point since the VM will be manipulating these lists
> atomically anyway. The good news though is that we can have reasonable
> fallback code which does just exactly what we do today as a fallback to
> Any comments? Alternatives? Suggestions?
> - Andreas
-------------- next part --------------
'From Croquet1.0beta of 11 April 2006 [latest update: #1] on 4 December 2007 at 4:32:26 pm'!
!Interpreter methodsFor: 'processes' stamp: 'ar 12/4/2007 16:21'!
removeProcess: aProcess fromList: aList
"Remove a given process from a linked list. May fail if aProcess is not on the list."
| firstLink lastLink nextLink tempLink |
firstLink := self fetchPointer: FirstLinkIndex ofObject: aList.
lastLink := self fetchPointer: LastLinkIndex ofObject: aList.
aProcess == firstLink ifTrue:[
nextLink := self fetchPointer: NextLinkIndex ofObject: aProcess .
self storePointer: FirstLinkIndex ofObject: aList withValue: nextLink.
aProcess == lastLink ifTrue:[
self storePointer: LastLinkIndex ofObject: aList withValue: self nilObject.
tempLink := firstLink.
[tempLink == self nilObject ifTrue:[^self success: false]. "fail"
nextLink := self fetchPointer: NextLinkIndex ofObject: tempLink.
nextLink == aProcess] whileFalse:[
tempLink := self fetchPointer: NextLinkIndex ofObject: tempLink.
nextLink := self fetchPointer: NextLinkIndex ofObject: aProcess.
self storePointer: NextLinkIndex ofObject: tempLink withValue: nextLink.
aProcess == lastLink ifTrue:[
self storePointer: LastLinkIndex ofObject: aList withValue: tempLink.
self storePointer: NextLinkIndex ofObject: aProcess withValue: self nilObject.
!Interpreter methodsFor: 'process primitives' stamp: 'ar 12/4/2007 16:31'!
"Primitive. Suspend the receiver, aProcess such that it can be executed again
by sending #resume. If the given process is not currently running, take it off
its corresponding list. The primitive returns the list the receiver was previously on."
| process activeProc myList |
process := self stackTop.
activeProc := self fetchPointer: ActiveProcessIndex
ofObject: self schedulerPointer.
process == activeProc ifTrue:[
self pop: 1.
self push: nilObj.
self transferTo: self wakeHighestPriority.
myList := self fetchPointer: MyListIndex ofObject: process.
"XXXX Fixme. We should really check whether myList is a kind of LinkedList or not
but we can't easily so just do a quick check for nil which is the most common case."
myList == self nilObject ifTrue:[^self primitiveFail].
self removeProcess: process fromList: myList.
self storePointer: MyListIndex ofObject: process withValue: self nilObject.
self pop: 1.
self push: myList.
-------------- next part --------------
'From Croquet1.0beta of 11 April 2006 [latest update: #1] on 4 December 2007 at 4:33:43 pm'!
!ControlManager methodsFor: 'scheduling' stamp: 'ar 12/4/2007 15:44'!
"Create a Notifier on the active scheduling process with the given label. Make the Notifier the active controller."
| newActiveController |
activeController ~~ nil ifTrue: [
"Carefully de-emphasis the current window."
activeController view topView deEmphasizeForDebugger].
label: labelString) controller.
self activeController: newActiveController.
!Process methodsFor: 'changing process state' stamp: 'ar 12/4/2007 15:49'!
"Primitive. Stop the process that self represents in such a way
that it can be restarted at a later time (by sending #resume).
If the receiver is not the active process it will take it off its suspendingList.
Essential. See Object documentation whatIsAPrimitive."
| oldList |
"This is fallback code for VMs which only support the old primitiveSuspend which
would not accept processes that are waiting to be run."
myList ifNil:[self error: 'Process already suspended'].
oldList := myList.
myList := nil.
oldList remove: self ifAbsent:.
!Process methodsFor: 'changing process state' stamp: 'ar 12/4/2007 15:30'!
"Stop the process that the receiver represents in such a way
that it can be restarted at a later time (by sending the receiver the
message resume). If the receiver represents the activeProcess, suspend it.
Otherwise remove the receiver from the list of waiting processes."
!Process methodsFor: 'changing process state' stamp: 'ar 12/4/2007 15:55'!
"Stop the process that the receiver represents forever. Unwind to execute pending ensure:/ifCurtailed: blocks before terminating."
| ctxt unwindBlock oldList |
self isActiveProcess ifTrue: [
ctxt := thisContext.
[ ctxt := ctxt findNextUnwindContextUpTo: nil.
] whileFalse: [
unwindBlock := ctxt tempAt: 1.
unwindBlock ifNotNil: [
ctxt tempAt: 1 put: nil.
thisContext terminateTo: ctxt.
thisContext terminateTo: nil.
] ifFalse: [
self isSuspended ifFalse:[oldList := self primitiveSuspend].
suspendedContext ifNotNil: [
"Figure out if we are terminating the process while waiting in Semaphore>>critical:
In this case, pop the suspendedContext so that we leave the ensure: block inside
Semaphore>>critical: without signaling the semaphore."
(oldList class == Semaphore and:[
suspendedContext method == (Semaphore compiledMethodAt: #critical:)]) ifTrue:[
suspendedContext := suspendedContext home.
ctxt := self popTo: suspendedContext bottomContext.
ctxt == suspendedContext bottomContext ifFalse: [
self debug: ctxt title: 'Unwind error during termination']].
!Project class methodsFor: 'utilities' stamp: 'ar 12/4/2007 15:47'!
"Create a Notifier on the active scheduling process with the given label."
| projectProcess |
Smalltalk isMorphic ifFalse:
[^ ScheduledControllers interruptName: labelString].
ActiveHand ifNotNil:[ActiveHand interrupted].
ActiveWorld := World. "reinstall active globals"
ActiveHand := World primaryHand.
ActiveHand interrupted. "make sure this one's interrupted too"
ActiveEvent := nil.
projectProcess := self uiProcess. "we still need the accessor for a while"
ToolSet interrupt: projectProcess label: labelString! !
Process removeSelector: #offList!
More information about the Vm-dev