http://forum.world.st/Socket-s-readSemaphore-is-losing-signals-with-Cog-on-Linux-td3741174.html

On 8/13/2011 13:42, Levente Uzonyi wrote:

> Socket's readSemaphore is losing signals with CogVMs on Linux. We 
> found several cases (RFB, PostgreSQL) where processes are stuck in the 
> following method: 
> Socket >> waitForDataIfClosed: closedBlock 
>     "Wait indefinitely for data to arrive.  This method will block until 
>     data is available or the socket is closed." 
>     [ 
>         (self primSocketReceiveDataAvailable: socketHandle) 
>             ifTrue: [^self]. 
>         self isConnected 
>             ifFalse: [^closedBlock value]. 
>         self readSemaphore wait ] repeat 
> When we inspect the contexts, the process is waiting on the 
> readSemaphore, but evaluating (self primSocketReceiveDataAvailable: 
> socketHandle) yields true. Signaling the readSemaphore makes the 
> process run again. As a workaround we replaced #wait with 
> #waitTimeoutMSecs: and all our problems disappeared. 
> The interpreter VM doesn't seem to have this bug, so I guess the bug 
> was introduced with the changes to aio.c.

Oh, interesting. We know this problem fairly well and have always worked 
around it by changing the wait in the above to "waitTimeoutMSecs: 500", 
which turns it into a soft busy loop. It would be interesting to see if 
there's a bug in Cog which causes this. FWIW, here is the relevant portion: 

             "Soft 500ms busy loop - to protect against AIO probs; 
             occasionally, VM-level AIO fails to trip the semaphore" 
             self readSemaphore waitTimeoutMSecs: 500. 
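
Put together with the method quoted above, the workaround would look roughly like this. This is only a sketch: it assumes the rest of the method is unchanged from the quoted version and that only the wait line differs; the 500 ms value is the one from the snippet above, not something the original report specifies.

```smalltalk
Socket >> waitForDataIfClosed: closedBlock
    "Wait for data to arrive. Sketch of the workaround: instead of
    blocking indefinitely on the readSemaphore, wake up at least every
    500 ms and poll again, so a lost signal costs a short delay rather
    than a hung process."
    [
        (self primSocketReceiveDataAvailable: socketHandle)
            ifTrue: [^self].
        self isConnected
            ifFalse: [^closedBlock value].
        "Soft 500ms busy loop - to protect against AIO probs;
        occasionally, VM-level AIO fails to trip the semaphore"
        self readSemaphore waitTimeoutMSecs: 500 ] repeat
```

The loop re-checks primSocketReceiveDataAvailable: after every wakeup, whether the wait ended because of a signal or because of the timeout, so a missed signal is recovered on the next pass.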

Cheers, 
   - Andreas 
_,,,^..^,,,_
best, Eliot