[Vm-dev] Socket>>#sendData:count: does no error checking and hence locks up the system

Eliot Miranda eliot.miranda at gmail.com
Tue Mar 1 03:23:40 UTC 2016


Hi Levente, Hi All,

    I'm trying to investigate the socket issues in aio.c but have found a
much more basic issue.  With my recent changes to Network, which check for
errors more carefully, the SocketTest>>testSocketReuse test appears to lock
up.  In fact, the VM is fine, happily doing what it's being told by
Socket>>#sendData:count:

Socket>>sendData: buffer count: n
    "Send the amount of data from the given buffer"
    | sent |
    sent := 0.
    [sent < n] whileTrue:[
        sent := sent + (self sendSomeData: buffer startIndex: sent+1 count: (n-sent))].

The VM keeps trying to send data on a socket that is being reused, gets an
error from sendto, and answers 0 as the number of bytes sent, as required,
but Socket>>#sendData:count: pays no heed and spins hard.  Here are the
traces:

The test is SocketTest>>testSocketReuse which spawns two processes, one to
send and one to receive data.  Here are the processes:

Process  0x48641f8 priority 40
0xbfec0498 M Socket>sendSomeData:startIndex:count:for: 0x4864d18: a(n) Socket
0xbfec04c0 M Socket>sendSomeData:startIndex:count: 0x4864d18: a(n) Socket
0xbfec04ec M Socket>sendData:count: 0x4864d18: a(n) Socket
0xbfec0520 I [] in SocketTest>testSocketReuse 0x4864dd0: a(n) SocketTest
0xbfec0540 I [] in BlockClosure>newProcess 0x4864df0: a(n) BlockClosure

Process  0x6543178 priority 40
0xbfec22c8 I [] in Delay>wait 0x4864ea0: a(n) Delay
0xbfec22f0 I BlockClosure>ifCurtailed: 0x4864eb8: a(n) BlockClosure
0xbfec2314 I Delay>wait 0x4864ea0: a(n) Delay
0xbfec2340 I [] in SocketTest>testSocketReuse 0x4864dd0: a(n) SocketTest
0xbfec2360 M BlockClosure>ensure: 0x4864fa8: a(n) BlockClosure
0xbfec2390 I SocketTest>testSocketReuse 0x4864dd0: a(n) SocketTest

Process  0x4864168 priority 40
0xbfec3438 I [] in DelayWaitTimeout>wait 0x48652f8: a(n) DelayWaitTimeout
0xbfec3458 M BlockClosure>ensure: 0x4865378: a(n) BlockClosure
0xbfec347c I DelayWaitTimeout>wait 0x48652f8: a(n) DelayWaitTimeout
0xbfec34a0 I Semaphore>waitTimeoutMSecs: 0x48652e0: a(n) Semaphore
0xbfec34c4 I Socket>waitForDataIfClosed: 0x4865408: a(n) Socket
0xbfec34f0 I Socket>receiveDataInto:startingAt: 0x4865408: a(n) Socket
0xbfec3520 I [] in SocketTest>testSocketReuse 0x4864dd0: a(n) SocketTest
0xbfec3540 I [] in BlockClosure>newProcess 0x48654c0: a(n) BlockClosure

And here's the VM spinning:
   15726    0 sqUnixSocket.c:1128 UDP sendData(11, 16)
   15726    0 sqUnixSocket.c:1134 UDP send failed 56 Socket is already connected
   15726    0 sqUnixSocket.c:1128 UDP sendData(11, 16)
   15726    0 sqUnixSocket.c:1134 UDP send failed 56 Socket is already connected
   15726    0 sqUnixSocket.c:1128 UDP sendData(11, 16)
   15726    0 sqUnixSocket.c:1134 UDP send failed 56 Socket is already connected
   ...etc...

Ah!! Of course.  Because I have changed the default scheduling semantics in
Squeak 5 to make preemption not a yield point, Socket>>#sendData:count:
never yields to the other processes.  Previously when the Delay process
woke up this would implicitly yield the process spinning in
Socket>>#sendData:count:.

So Socket>>#sendData:count: needs to do a yield if no data is sent.
However, shouldn't it also check for errors if no data is sent and do
something like return an error if it discovers, via
Socket>>primSocketError:, that the socket is not happy?
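
A minimal sketch of what that could look like, not a tested fix.  It assumes
that Socket>>primSocketError: answers 0 when the socket has no pending
error, that the handle lives in an instance variable named socketHandle,
and that NetworkError is a suitable exception class to signal; all three
would need checking against the actual image.

```smalltalk
Socket>>sendData: buffer count: n
    "Send n bytes from the given buffer.  Sketch only: when a send makes
     no progress, check the socket's error code and fail, or else yield
     so that other processes at the same priority get to run."
    | sent count |
    sent := 0.
    [sent < n] whileTrue:[
        count := self sendSomeData: buffer startIndex: sent+1 count: (n-sent).
        count = 0
            ifTrue:[
                "Assumption: 0 means no error is pending on the socket."
                (self primSocketError: socketHandle) ~= 0
                    ifTrue:[^NetworkError signal: 'send failed'].  "assumed exception class"
                Processor yield]
            ifFalse:[sent := sent + count]].
```

The yield alone would stop the lock-up in testSocketReuse, but the sender
would still retry forever; signalling when primSocketError: reports a
problem lets the caller see the failure instead.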

_,,,^..^,,,_
best, Eliot