[BUG] Mysterious Delay lockups

J J azreal1977 at hotmail.com
Sat May 5 09:13:44 UTC 2007


Did anything ever happen with this? I didn't see anything else posted 
here, or anything when I googled with site: pointed at the Squeak archives.

>From: Andreas Raab <andreas.raab at gmx.de>
>Reply-To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
>To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>,
>    Squeak Virtual Machine Development Discussion <vm-dev at lists.squeakfoundation.org>
>Subject: [BUG] Mysterious Delay lockups
>Date: Tue, 17 Apr 2007 19:20:28 -0700
>
>Hi Folks -
>
>Some of you (mostly those who run heavy servers) may have noticed that at 
>times Squeak locks up in mysterious and unforeseen ways. One of those 
>lockups involves Delay's AccessProtect in an unsignaled state and 
>consequently the entire image locking up since Delay access is required in 
>many, many places.
>
>Today, David presented me with an image that was locked up in such a state. 
>By sheer luck he had managed to save it right before the lockup happened, 
>which allowed me to investigate the situation. The result can best be 
>explained by the little test case shown here:
>
>   "Create mutex unsignaled so we can manually signal it"
>   mutex := Semaphore new.
>   "Create a process which will wait inside the mutex"
>   p := [mutex critical:[]] forkAt: Processor userBackgroundPriority.
>   "Wait until process has entered mutex"
>   [p suspendingList == mutex]
>       whileFalse:[(Delay forMilliseconds: 10) wait].
>   "Signal mutex"
>   mutex signal.
>   "Kill process"
>   p terminate.
>   "and check to see if the mutex is signaled"
>   mutex isSignaled ifFalse:[self error: 'Mutex not signaled'].
>
>Note that, despite the somewhat complex setup, the basic idea is that a low 
>priority process waiting in a critical section receives a signal on the 
>semaphore it is waiting on but gets terminated by a higher priority process 
>in between receiving the signal and the process itself actually executing.
>
>This situation (triggered manually above to make it more easily 
>repeatable) can arise whenever processes get terminated "from the 
>outside". It would cause particular grief for the timing semaphore, 
>because that one gets served by the highest priority process, which 
>makes the unfortunate course of events much more likely.
>
>All Squeak versions that I have access to expose this behavior. Looking at 
>Semaphore>>critical: which says
>
>Semaphore>>critical: aBlock
>   | blockValue |
>   self wait.
>   [blockValue := aBlock value] ensure: [self signal].
>   ^blockValue
>
>makes it seem as if moving the wait into the ensured block is the correct 
>answer, but that ain't necessarily so. When we move the wait into the block 
>we risk that the entering process is terminated after entering the block 
>but before entering the wait which would leave the semaphore signaled 
>twice, which is just as bad as not signaled at all.
>
>Methinks a solution would involve Process>>terminate but I'm running out of 
>steam after trying to understand the problem in all its implications. Any 
>ideas would be greatly welcome.
>
>Cheers,
>   - Andreas
>
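
For anyone trying to follow the two failure modes described in the quoted message, here is a toy model of them. It is written in Python, not Squeak, and it does not reproduce Squeak's scheduler or the real Semaphore and Process classes. ToySemaphore, ToyProcess, terminate and the must_signal_on_unwind flag are all hypothetical names made up for this sketch. The only assumptions are the ones the message itself states: a signal is handed to the first waiting process, which then has to be scheduled before it can run, and terminating a process runs any ensure: handler it has already entered.

```python
from collections import deque


class ToyProcess:
    """Hypothetical stand-in for a Squeak Process; tracks only what the model needs."""

    def __init__(self):
        self.state = "new"
        self.must_signal_on_unwind = False  # True once inside the ensure: block


class ToySemaphore:
    """Hypothetical model of a counting semaphore with a queue of waiters."""

    def __init__(self):
        self.excess_signals = 0
        self.waiting = deque()

    def wait(self, process):
        if self.excess_signals > 0:
            self.excess_signals -= 1
            process.state = "runnable"
        else:
            process.state = "waiting"
            self.waiting.append(process)

    def signal(self):
        if self.waiting:
            # The signal is handed to the first waiter, which becomes runnable
            # but does not run until the scheduler picks it.
            self.waiting.popleft().state = "runnable"
        else:
            self.excess_signals += 1


def terminate(process, semaphore):
    """Terminate a process, running its pending ensure: handler if it has one."""
    if process.state == "waiting":
        semaphore.waiting.remove(process)
    if process.must_signal_on_unwind:
        semaphore.signal()
    process.state = "terminated"


def wait_outside_ensure():
    """critical: as quoted: wait first, then enter the ensure: block."""
    mutex, p = ToySemaphore(), ToyProcess()
    mutex.wait(p)        # p blocks; the ensure: block is not yet entered
    mutex.signal()       # the signal is consumed by p, which is now runnable
    terminate(p, mutex)  # killed before it runs, so nothing signals again
    return mutex.excess_signals


def wait_inside_ensure():
    """Variant with the wait moved inside the ensure: block."""
    mutex, p = ToySemaphore(), ToyProcess()
    mutex.signal()                  # mutex starts out free (one excess signal)
    p.must_signal_on_unwind = True  # p entered the block but has not waited yet
    terminate(p, mutex)             # unwinding signals without a matching wait
    return mutex.excess_signals


print(wait_outside_ensure())  # 0: the mutex is left unsignaled
print(wait_inside_ensure())   # 2: the mutex is left signaled twice
```

In this model the first scenario ends with the semaphore holding no signal at all, and the second ends with one signal too many, which matches the two outcomes the message calls equally bad.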
