Simple lock up with delay + semaphore - not fixed with 0006576

Andreas Raab andreas.raab at gmx.de
Sat Jan 5 14:05:00 UTC 2008


Georg Köster wrote:
> Hey I prefer Tom's version! It shouldn't be racy if I read the 
> resumeProcess and resume code correctly. The resumeProcess message has 
> no effect on non-waiting processes!

On the contrary. This code *introduces* a race condition when 
manipulating the Semaphore's list of processes without protecting it 
against concurrent modifications. Worse, the code *cannot be protected* 
against concurrent modification since the VM manipulates that list on 
its own. It may actually explain why some people have reported issues on 
3.9 that do not seem to appear on 3.8 or 3.10 variants.

> 
>  >               | waitingProcess wakeupProcess |
>  >               waitingProcess _ Processor activeProcess.
>  >               wakeupProcess _
>  >                       [(Delay forMilliseconds: (anInteger max: 0)) wait.
>  >                       self resumeProcess: waitingProcess] fork.
>  >
>  >               self wait.
>     "preempting here and getting the resumeProcess message sent would 
> have no effect - therefore no race!"

What are you talking about? With the original code there was no race 
condition whatsoever. If you think there is a race condition somewhere, 
please explain in detail where you think that race condition is.
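
For reference, here is roughly what I mean by "the original code". This 
is written from memory, so treat the selector name and the exact wording 
as assumptions and check them against a 3.8 or 3.10 image:

```smalltalk
"Sketch of the 3.8/3.10-style method on Semaphore (from memory, unverified)."
waitTimeoutMSecs: anInteger
	"Wait for a signal or for the timeout, whichever comes first.
	 The sender has to work out which of the two actually happened."
	| d |
	"Schedule a Delay that signals the receiver itself when it expires."
	d := Delay timeoutSemaphore: self afterMSecs: (anInteger max: 0).
	self wait.
	"If the real event woke us up, cancel the pending timeout signal."
	d unschedule.
```

Note that nothing here touches the semaphore's process list from 
Smalltalk code; only #wait and #signal are used.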

>     "in comparison having a semaphore getting signaled here would cause 
> an excess signal on the sem: bad"

First, the original code was guarded with a call to #unschedule which 
would avoid a second signal if the "real event" occurs long enough 
before the "timeout event". On the other hand, neither implementation 
prevents a double-signal if the real event occurs within an epsilon of 
the timeout event. And due to the asynchronous nature of some semaphore 
signals it is impossible to have a light-weight implementation like the 
one provided in Semaphore handle this correctly. It's possible to 
handle this, but that would require separating the semaphore from the 
delay signal and using a critical section to decide which one came first 
and how to deal with the other. That, on the other hand, seems overkill 
for the 99% of the cases in which the simplistic implementation in 
3.8/3.10 is all that's needed.
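
To make the heavier variant a bit more concrete, here is a minimal 
workspace sketch of the "decide which one came first" part. It is an 
illustration only, not code from any image: the names event, wakeup, 
guard and winner are made up, and the 500 ms timeout is arbitrary.

```smalltalk
| event wakeup guard winner |
event := Semaphore new.                 "signaled by the real event source"
wakeup := Semaphore new.                "private; signaled at most once"
guard := Semaphore forMutualExclusion.  "protects winner"
winner := nil.

"Forward the real event. Only the first arrival may set winner."
[event wait.
 guard critical: [
	winner isNil ifTrue: [winner := #event. wakeup signal]]] fork.

"Forward the timeout through the same critical section."
[(Delay forMilliseconds: 500) wait.
 guard critical: [
	winner isNil ifTrue: [winner := #timeout. wakeup signal]]] fork.

"The waiter blocks on the private semaphore, never on event itself."
wakeup wait.
winner
```

Because both sides go through the same critical section, wakeup is 
signaled exactly once and winner says which side got there first. What 
the sketch leaves open is "how to deal with the other": after a timeout 
the forwarding process is still blocked on event and will swallow the 
next signal, so a real implementation has to decide whether to pass that 
late signal on or drop it.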

Cheers,
   - Andreas





