Simple lock up with delay + semaphore - not fixed with 0006576

Georg Köster georg.koester at gmail.com
Sat Jan 5 15:04:05 UTC 2008


Hi Andreas, all,

First of all: sorry, it also works in my fresh Croquet image. The backup I
always used is apparently not as fresh as I thought :-( Darn. And I didn't
see that email :-( Double sorry.

Newbie question: Why are images restartable? Why not label them differently
(for example as 'core dump') if they don't work?

On the race discussion, I think we have established one point:
 * double signaling is unavoidable if a critical section is to be avoided

Therefore there is a race in the first implementation.

The second implementation clearly has other problems (I had not noticed that
it just uses a LinkedList, which is apparently not thread-safe), but as I
explained before, I believe it at least deals with the double signaling.

I would recommend at least documenting this problem in the wait* methods:
if one of the timed waits is used, the contract of the plain wait (without
timeout) is violated, in that it might return even though no user process
sent a signal.
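
To illustrate the contract violation, here is a minimal workspace sketch. It
is not code from either implementation; the two signals are sent by hand, and
the second one only stands in for a timeout signal that arrives within the
same epsilon as the real event.

```smalltalk
"Workspace sketch: an excess signal makes a later plain wait return at once."
| sem |
sem := Semaphore new.
sem signal.    "the real event arrives"
sem signal.    "stands in for the timeout signal arriving at almost the same moment"
sem wait.      "the timed wait returns and consumes one of the two signals"
sem wait.      "a later plain wait returns immediately; nobody signalled again"
```

The last wait does not block, which is exactly the surprise a caller of the
plain wait would not expect from its documentation.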

Best regards and thanks for considering my comments in the first place!
Georg

On Jan 5, 2008 3:05 PM, Andreas Raab <andreas.raab at gmx.de> wrote:

> Georg Köster wrote:
> > Hey I prefer Tom's version! It shouldn't be racy if I read the
> > resumeProcess and resume code correctly. The resumeProcess message has
> > no effect on non-waiting processes!
>
> To the contrary. This code *introduces* a race condition when
> manipulating the Semaphore's list of processes without protecting it
> against concurrent modifications. Worse, the code *cannot be protected*
> against concurrent modification since the VM manipulates that list on
> its own. It may actually explain why some people have reported issues on
> 3.9 that do not seem to appear on 3.8 or 3.10 variants.
>
> >
> >  >               | waitingProcess wakeupProcess |
> >  >               waitingProcess _ Processor activeProcess.
> >  >               wakeupProcess _
> >  >                       [(Delay forMilliseconds: (anInteger max: 0)) wait.
> >  >                       self resumeProcess: waitingProcess] fork.
> >  >
> >  >               self wait.
> >     "preempting here and getting the resumeProcess message sent would
> > have no effect - therefore no race!"
>
> What are you talking about? With the original code there was no race
> condition whatsoever. If you think there is a race condition somewhere,
> please explain in detail where you think that race condition is.
>
> >     "in comparison having a semaphore getting signaled here would cause
> > an excess signal on the sem: bad"
>
> First, the original code was guarded with a call to #unschedule which
> would avoid a second signal if the "real event" occurs long enough
> before the "timeout event". On the other hand, neither implementation
> prevents a double-signal if the real event occurs within an epsilon of
> the timeout event. And due to the asynchronous nature of some semaphore
> signals it is impossible to have a light-weight implementation like the
> one provided in Semaphore handle this correctly. It's possible to
> handle this, but that would require separating the semaphore and the
> delay signal and using a critical section to decide which one came first
> and how to deal with the other. That, on the other hand, seems overkill
> for the 99% of the cases in which the simplistic implementation in
> 3.8/3.10 is all that's needed.
>
> Cheers,
>   - Andreas
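
For reference, the heavier approach described above (keep the event and the
timeout apart, and let a critical section decide which one came first) might
look roughly like the following workspace sketch. This is only an
illustration under assumptions, not the Semaphore implementation of any Squeak
version: the names guard, wakeup and outcome are made up, and the real event
is modelled as an ordinary Smalltalk process. A semaphore signalled directly
by the VM cannot run a critical section itself, so it would presumably need a
small forwarding process in front of it.

```smalltalk
"Hypothetical sketch: exactly one of the two sources gets to wake the waiter."
| guard wakeup outcome |
guard := Semaphore forMutualExclusion.   "protects the decision"
wakeup := Semaphore new.                 "private to the waiter, signalled at most once"
outcome := nil.

"Timeout source: only wins if nothing was decided before it fires."
[(Delay forMilliseconds: 100) wait.
 guard critical: [
	outcome isNil ifTrue: [outcome := #timedOut. wakeup signal]]] fork.

"Real event source, modelled here as a plain process."
[guard critical: [
	outcome isNil ifTrue: [outcome := #signaled. wakeup signal]]] fork.

wakeup wait.
outcome    "either #signaled or #timedOut; the loser never signals wakeup"
```

Whichever source loses the decision finds outcome already set and does
nothing, so wakeup is never left with an excess signal. The price is an extra
semaphore and a critical section per timed wait, which is the overhead the
simple implementation avoids.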