[Vm-dev] Image freeze because handleTimerEvent and Seaside process gone?!

Andreas Raab andreas.raab at gmx.de
Tue Jan 4 15:27:58 UTC 2011


On 1/3/2011 10:09 AM, Adrian Lienhard wrote:
> I'll try that...
>
> But I still don't understand why the change in Pharo that makes LinkedList>>remove:ifAbsent: non-thread-safe can cause the problem, since this code is executed by the timerEvent process, which runs at the highest priority. This process should never be suspended during the execution of remove:ifAbsent:. What am I missing?

The problem isn't thread-safety, at least not in the classical sense.
What happens is that if you remove processes by using
LinkedList>>remove:, you are subject to a race condition where the
semaphore gets signaled *while* you are removing the process.
Obviously, hilarity ensues at this point, which is why I made primitive
suspend do the Right Thing (i.e., remove the process primitively). Two
factors affect how likely you are to see the effect:
One is the number of suspension points (real sends) in the method.
The more you have, the more likely you are to be affected. The second is
whether the method can tolerate having the process removed "underneath
its feet". Both are far worse in Pharo.
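
To make the race concrete, here is a small Python model of it (a sketch only, not the Squeak or Pharo source). The names WaitList, Link and the at_suspension_point hook are hypothetical; the hook stands in for a real send at which the semaphore could be signaled between the lookup and the unlink.

```python
class Link:
    """Toy stand-in for a Process waiting on a semaphore."""

    def __init__(self, name):
        self.name = name
        self.next_link = None


class WaitList:
    """Toy stand-in for a Semaphore's linked list of waiting processes."""

    def __init__(self):
        self.first_link = None
        self.last_link = None

    def add(self, link):
        if self.first_link is None:
            self.first_link = link
        else:
            self.last_link.next_link = link
        self.last_link = link

    def contains(self, link):
        cur = self.first_link
        while cur is not None:
            if cur is link:
                return True
            cur = cur.next_link
        return False

    def signal(self):
        """Model of a signal: unlink and return the first waiter, if any."""
        link = self.first_link
        if link is None:
            return None
        self.first_link = link.next_link
        if self.first_link is None:
            self.last_link = None
        link.next_link = None
        return link

    def unlink(self, link):
        if link is self.first_link:
            self.first_link = link.next_link
            if link is self.last_link:
                self.last_link = None
        else:
            prev = self.first_link
            while prev is not None and prev.next_link is not link:
                prev = prev.next_link
            if prev is None:
                raise LookupError(f"{link.name} is no longer in the list")
            prev.next_link = link.next_link
            if link is self.last_link:
                self.last_link = prev
        link.next_link = None

    def remove(self, link, at_suspension_point=lambda: None):
        """Non-atomic removal: look the link up, then unlink it.

        at_suspension_point is a hypothetical hook that runs between the
        two steps, modelling a signal arriving in the middle of removal.
        """
        if not self.contains(link):
            raise LookupError(f"{link.name} is not in the list")
        at_suspension_point()
        self.unlink(link)


def demo():
    waiters = WaitList()
    proc = Link("seaside")
    waiters.add(proc)
    try:
        # The signal removes the process after the lookup has succeeded.
        waiters.remove(proc, at_suspension_point=waiters.signal)
        return "removed"
    except LookupError as err:
        return f"error: {err}"


if __name__ == "__main__":
    print(demo())
```

Without the hook the removal succeeds. With the signal interleaved, the lookup passes but the unlink no longer finds the link. A removal done in a single primitive step has no such gap.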

Cheers,
   - Andreas

>
> Cheers,
> Adrian
>
> On Dec 31, 2010, at 10:08 , Andreas Raab wrote:
>
>> Revert LinkedList>>remove:ifAbsent: back to the version in Squeak and your problems will go away.
>>
>> Cheers,
>>   - Andreas
>>
>> On 12/30/2010 11:50 PM, Adrian Lienhard wrote:
>>>
>>> Thanks Andreas and David for the responses!
>>>
>>> In the meantime I've gathered more information. From the mail of Andreas I assumed that the most likely reason for the freeze is that the timer event loop throws an unhandled exception and therefore gets suspended.
>>>
>>> So I added a guard to catch any error in handleTimerEvent, restart the loop, and then pass the exception to open a debugger:
>>>
>>> runTimerEventLoop
>>> 	[RunTimerEventLoop] whileTrue: [
>>> 		[ self handleTimerEvent ]
>>> 			on: Error
>>> 			do: [ :e |
>>> 				self startTimerEventLoop.
>>> 				...write a warning to stdout...
>>> 				e pass ] ]
>>>
>>> And voila, after 10 days or so I got the stack trace below.
>>>
>>> I haven't had time to dive into it, but from the stack it seems like a concurrency issue in linked list (although I wonder whether that's possible since the timer event loop runs at the highest priority...).
>>>
>>> Maybe something catches somebody's eye.
>>>
>>> Cheers,
>>> Adrian
>>>
>>>
>>> THERE_BE_DRAGONS_HERE
>>> Error: no such method!
>>> 30 December 2010 10:32:28 pm
>>>
>>> VM: unix - i686 - linux - Squeak3.10.2 of '5 June 2008' [latest update: #7179]
>>> Image: Pharo1.1 [Latest update: #11410]
>>>
>>> Semaphore(Object)>>error:
>>> 	Receiver: a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Arguments and temporary variables:
>>> 		t1: 	'no such method!'
>>> 	Receiver's instance variables:
>>> 		firstLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		lastLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		excessSignals: 	0
>>>
>>>
>>> [] in Semaphore(LinkedList)>>removeLink:
>>> 	Receiver: a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		firstLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		lastLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		excessSignals: 	0
>>>
>>>
>>> Semaphore(LinkedList)>>removeLink:ifAbsent:
>>> 	Receiver: a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Arguments and temporary variables:
>>> 		aLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		aBlock: 	[self error: 'no such method!']
>>> 		tempLink: 	nil
>>> 	Receiver's instance variables:
>>> 		firstLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		lastLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		excessSignals: 	0
>>>
>>>
>>> Semaphore(LinkedList)>>removeLink:
>>> 	Receiver: a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Arguments and temporary variables:
>>> 		aLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 	Receiver's instance variables:
>>> 		firstLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		lastLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		excessSignals: 	0
>>>
>>>
>>> Semaphore(LinkedList)>>remove:ifAbsent:
>>> 	Receiver: a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Arguments and temporary variables:
>>> 		aLinkOrObject: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		aBlock: 	[]
>>> 		link: 	a Process in [] in DelayWaitTimeout>>wait
>>> 	Receiver's instance variables:
>>> 		firstLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		lastLink: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		excessSignals: 	0
>>>
>>>
>>> Process>>suspend
>>> 	Receiver: a Process in [] in DelayWaitTimeout>>wait
>>> 	Arguments and temporary variables:
>>> 		t1: 	a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 	Receiver's instance variables:
>>> 		nextLink: 	nil
>>> 		suspendedContext: 	[] in DelayWaitTimeout>>wait
>>> 		priority: 	30
>>> 		myList: 	a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 		errorHandler: 	nil
>>> 		name: 	'seaside'
>>> 		env: 	nil
>>>
>>>
>>> DelayWaitTimeout>>signalWaitingProcess
>>> 	Receiver: a DelayWaitTimeout(10000 msecs)
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		delayDuration: 	10000
>>> 		resumptionTime: 	217048389
>>> 		delaySemaphore: 	a Semaphore(a Process in [] in DelayWaitTimeout>>wait)
>>> 		beingWaitedOn: 	false
>>> 		process: 	a Process in [] in DelayWaitTimeout>>wait
>>> 		expired: 	true
>>>
>>>
>>> Delay class>>handleTimerEvent
>>> 	Receiver: Delay
>>> 	Arguments and temporary variables:
>>> 		t1: 	217128602
>>> 		t2: 	nil
>>> 	Receiver's instance variables:
>>> 		superclass: 	Object
>>> 		methodDict: 	a MethodDictionary(#adjustResumptionTimeOldBase:newBase:->(Delay>>#...etc...
>>> 		format: 	138
>>> 		instanceVariables: 	#('delayDuration' 'resumptionTime' 'delaySemaphore' 'beingWa...etc...
>>> 		organization: 	('as yet unclassified' adjustResumptionTimeOldBase:newBase: being...etc...
>>> 		subclasses: 	{MonitorDelay. DelayWaitTimeout}
>>> 		name: 	#Delay
>>> 		classPool: 	a Dictionary(#AccessProtect->a Semaphore() #ActiveDelay->a Delay(10 ...etc...
>>> 		sharedPools: 	nil
>>> 		environment: 	a SystemDictionary(lots of globals)
>>> 		category: 	#'Kernel-Processes'
>>> 		traitComposition: 	{}
>>> 		localSelectors: 	nil
>>>
>>>
>>> [] in Delay class>>runTimerEventLoop
>>> 	Receiver: Delay
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		superclass: 	Object
>>> 		methodDict: 	a MethodDictionary(#adjustResumptionTimeOldBase:newBase:->(Delay>>#...etc...
>>> 		format: 	138
>>> 		instanceVariables: 	#('delayDuration' 'resumptionTime' 'delaySemaphore' 'beingWa...etc...
>>> 		organization: 	('as yet unclassified' adjustResumptionTimeOldBase:newBase: being...etc...
>>> 		subclasses: 	{MonitorDelay. DelayWaitTimeout}
>>> 		name: 	#Delay
>>> 		classPool: 	a Dictionary(#AccessProtect->a Semaphore() #ActiveDelay->a Delay(10 ...etc...
>>> 		sharedPools: 	nil
>>> 		environment: 	a SystemDictionary(lots of globals)
>>> 		category: 	#'Kernel-Processes'
>>> 		traitComposition: 	{}
>>> 		localSelectors: 	nil
>>>
>>>
>>> BlockClosure>>on:do:
>>> 	Receiver: [self handleTimerEvent]
>>> 	Arguments and temporary variables:
>>> 		exception: 	Error
>>> 		handlerAction: 	[:e |
>>> self startTimerEventLoop.
>>> 	FileStream
>>> 		fileNamed: '/dev/...etc...
>>> 		handlerActive: 	false
>>> 	Receiver's instance variables:
>>> 		outerContext: 	Delay class>>runTimerEventLoop
>>> 		startpc: 	108
>>> 		numArgs: 	0
>>>
>>>
>>> Delay class>>runTimerEventLoop
>>> 	Receiver: Delay
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		superclass: 	Object
>>> 		methodDict: 	a MethodDictionary(#adjustResumptionTimeOldBase:newBase:->(Delay>>#...etc...
>>> 		format: 	138
>>> 		instanceVariables: 	#('delayDuration' 'resumptionTime' 'delaySemaphore' 'beingWa...etc...
>>> 		organization: 	('as yet unclassified' adjustResumptionTimeOldBase:newBase: being...etc...
>>> 		subclasses: 	{MonitorDelay. DelayWaitTimeout}
>>> 		name: 	#Delay
>>> 		classPool: 	a Dictionary(#AccessProtect->a Semaphore() #ActiveDelay->a Delay(10 ...etc...
>>> 		sharedPools: 	nil
>>> 		environment: 	a SystemDictionary(lots of globals)
>>> 		category: 	#'Kernel-Processes'
>>> 		traitComposition: 	{}
>>> 		localSelectors: 	nil
>>>
>>>
>>> [] in Delay class>>startTimerEventLoop
>>> 	Receiver: Delay
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		superclass: 	Object
>>> 		methodDict: 	a MethodDictionary(#adjustResumptionTimeOldBase:newBase:->(Delay>>#...etc...
>>> 		format: 	138
>>> 		instanceVariables: 	#('delayDuration' 'resumptionTime' 'delaySemaphore' 'beingWa...etc...
>>> 		organization: 	('as yet unclassified' adjustResumptionTimeOldBase:newBase: being...etc...
>>> 		subclasses: 	{MonitorDelay. DelayWaitTimeout}
>>> 		name: 	#Delay
>>> 		classPool: 	a Dictionary(#AccessProtect->a Semaphore() #ActiveDelay->a Delay(10 ...etc...
>>> 		sharedPools: 	nil
>>> 		environment: 	a SystemDictionary(lots of globals)
>>> 		category: 	#'Kernel-Processes'
>>> 		traitComposition: 	{}
>>> 		localSelectors: 	nil
>>>
>>>
>>> [] in BlockClosure>>newProcess
>>> 	Receiver: [self runTimerEventLoop]
>>> 	Arguments and temporary variables:
>>>
>>> 	Receiver's instance variables:
>>> 		outerContext: 	Delay class>>startTimerEventLoop
>>> 		startpc: 	144
>>> 		numArgs: 	0
>>>
>>>
>>>
>>> --- The full stack ---
>>> Semaphore(Object)>>error:
>>> [] in Semaphore(LinkedList)>>removeLink:
>>> Semaphore(LinkedList)>>removeLink:ifAbsent:
>>> Semaphore(LinkedList)>>removeLink:
>>> Semaphore(LinkedList)>>remove:ifAbsent:
>>> Process>>suspend
>>> DelayWaitTimeout>>signalWaitingProcess
>>> Delay class>>handleTimerEvent
>>> [] in Delay class>>runTimerEventLoop
>>> BlockClosure>>on:do:
>>> Delay class>>runTimerEventLoop
>>> [] in Delay class>>startTimerEventLoop
>>> [] in BlockClosure>>newProcess
>>> ------------------------------------------------------------
>
>

