[Vm-dev] Socket's readSemaphore is losing signals with Cog on Linux
Andreas Raab
andreas.raab at gmx.de
Sun Aug 14 17:58:06 UTC 2011
On 8/13/2011 13:42, Levente Uzonyi wrote:
> Socket's readSemaphore is losing signals with CogVMs on Linux. We
> found several cases (RFB, PostgreSQL) where processes are stuck in the
> following method:
>
> Socket >> waitForDataIfClosed: closedBlock
>     "Wait indefinitely for data to arrive. This method will block until
>     data is available or the socket is closed."
>
>     [
>         (self primSocketReceiveDataAvailable: socketHandle)
>             ifTrue: [^self].
>         self isConnected
>             ifFalse: [^closedBlock value].
>         self readSemaphore wait ] repeat
>
> When we inspect the contexts, the process is waiting for the
> readSemaphore, but evaluating (self primSocketReceiveDataAvailable:
> socketHandle) yields true. Signaling the readSemaphore makes the
> process run again. As a workaround we replaced #wait with
> #waitTimeoutMSecs: and all our problems disappeared.
>
> The interpreter VM doesn't seem to have this bug, so I guess the bug
> was introduced with the changes to aio.c.
Oh, interesting. We know this problem fairly well and have always worked
around it by changing the wait in the method above to "waitTimeoutMSecs: 500",
which turns it into a soft busy loop. It would be interesting to see if
there's a bug in Cog which causes this. FWIW, here is the relevant portion:
"Soft 500ms busy loop - to protect against AIO probs;
occasionally, VM-level AIO fails to trip the semaphore"
self readSemaphore waitTimeoutMSecs: 500.
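
For context, here is a minimal sketch of how the quoted method might look with that line substituted for the plain #wait. It only combines the two snippets from this thread and is not the exact code shipped in any image. The 500 ms value comes from the snippet above; everything else is the original method. The assumption is that a timed-out wait simply falls through to the next loop iteration, so a missed signal costs at most one timeout period.

```smalltalk
Socket >> waitForDataIfClosed: closedBlock
	"Sketch only: the original loop with a bounded wait instead of an
	unbounded one. Blocks until data is available or the socket is closed."

	[
		"Ask the VM directly; this does not depend on the semaphore."
		(self primSocketReceiveDataAvailable: socketHandle)
			ifTrue: [^self].
		self isConnected
			ifFalse: [^closedBlock value].
		"Soft 500ms busy loop: if the VM fails to signal readSemaphore,
		the timeout expires and the two checks above run again."
		self readSemaphore waitTimeoutMSecs: 500 ] repeat
```

The trade-off is a wake-up every 500 ms per idle socket in exchange for never blocking forever on a lost signal.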
Cheers,
- Andreas