Obscure hang
Bert Freudenberg
bert at impara.de
Mon Jan 9 17:43:33 UTC 2006
Hi folks,
I've had some rather obscure hangs on my squeak-source server lately.
The funny thing is, as soon as I connect to the Xvnc server and move
the mouse, things start rolling again. Otherwise, it takes a couple
of minutes until serving continues. I have no idea why.
Here are the file descriptors for reference. #3 is the X server
(Xvnc) connection:
squeaksou 30574 source 0r CHR 1,3 31583 /dev/
null
squeaksou 30574 source 1w REG 3,5 32895 16482319 .../
some.log
squeaksou 30574 source 2w REG 3,5 32895 16482319 .../
some.log
squeaksou 30574 source 3u unix 0xf2119ac0 3367464 socket
squeaksou 30574 source 4r REG 3,5 23206828 12402706 .../
some.image
squeaksou 30574 source 5r DIR 3,5 4096 12402698 .../
somedir
squeaksou 30574 source 6r REG 3,5 14542313 17317896 .../
SqueakV3.sources
squeaksou 30574 source 7u REG 3,5 20230368 12402705 .../
some.changes
squeaksou 30574 source 8u IPv4 4290750 TCP
*:someport (LISTEN)
Unfortunately I took this only after the hang. Socket #9 must be a
file that was interrupted while writing, 10 and 11 is an HTTP
connection pair.
There were already 3581255 bytes of 4457385 bytes written to file #9
at this point.
=== According to strace the VM was chugging along just fine:
gettimeofday({1136823729, 265939}, NULL) = 0
gettimeofday({1136823729, 265985}, NULL) = 0
gettimeofday({1136823729, 266050}, NULL) = 0
gettimeofday({1136823729, 266092}, NULL) = 0
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
gettimeofday({1136823729, 277789}, NULL) = 0
gettimeofday({1136823729, 277819}, NULL) = 0
gettimeofday({1136823729, 277856}, NULL) = 0
gettimeofday({1136823729, 277877}, NULL) = 0
gettimeofday({1136823729, 277910}, NULL) = 0
gettimeofday({1136823729, 277939}, NULL) = 0
ioctl(3, FIONREAD, [0]) = 0
gettimeofday({1136823729, 278015}, NULL) = 0
select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout)
gettimeofday({1136823729, 278212}, NULL) = 0
gettimeofday({1136823729, 289711}, NULL) = 0
gettimeofday({1136823729, 289751}, NULL) = 0
gettimeofday({1136823729, 289813}, NULL) = 0
gettimeofday({1136823729, 289854}, NULL) = 0
...
=== but nothing was served to sockets 10/11 nor written to file #9.
Now some X data arrives on socket #3
...
gettimeofday({1136823748, 170924}, NULL) = 0
gettimeofday({1136823748, 170956}, NULL) = 0
gettimeofday({1136823748, 170984}, NULL) = 0
ioctl(3, FIONREAD, [32]) = 0
read(3, "\7\1W\204A\306\371\257%\0\0\0\1\0\200\0\2\0\200\0\217
\003"..., 32) = 32
ioctl(3, FIONREAD, [32]) = 0
read(3, "\6\0W\204B\306\371\257%\0\0\0\2\0\200\0\0\0\0\0\211\003"...,
32) = 32
gettimeofday({1136823748, 171177}, NULL) = 0
ioctl(3, FIONREAD, [0]) = 0
gettimeofday({1136823748, 171221}, NULL) = 0
select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout)
...
==== and 5 seconds later it answers the (pending?) request on sockets
10/11
ioctl(3, FIONREAD, [0]) = 0
gettimeofday({1136823753, 233667}, NULL) = 0
select(9, [3], [], [3 8], {0, 0}) = 0 (Timeout)
gettimeofday({1136823753, 234002}, NULL) = 0
gettimeofday({1136823753, 234042}, NULL) = 0
gettimeofday({1136823753, 234070}, NULL) = 0
gettimeofday({1136823753, 234110}, NULL) = 0
ioctl(10, FIONBIO, [1]) = 0
gettimeofday({1136823753, 234306}, NULL) = 0
gettimeofday({1136823753, 234330}, NULL) = 0
gettimeofday({1136823753, 234352}, NULL) = 0
gettimeofday({1136823753, 234377}, NULL) = 0
gettimeofday({1136823753, 234410}, NULL) = 0
gettimeofday({1136823753, 234431}, NULL) = 0
gettimeofday({1136823753, 234456}, NULL) = 0
getpeername(10, {sa_family=AF_INET, sin_port=htons(44431),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
time(NULL) = 1136823753
gettimeofday({1136823753, 234716}, NULL) = 0
gettimeofday({1136823753, 234738}, NULL) = 0
recv(10, "G", 1, MSG_PEEK) = 1
read(10, "GET /3d/?C=M;O=D HTTP/1.1\r\nHost:"..., 2000) = 266
time(NULL) = 1136823753
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
gettimeofday({1136823753, 239059}, NULL) = 0
gettimeofday({1136823753, 239600}, NULL) = 0
gettimeofday({1136823753, 239624}, NULL) = 0
time(NULL) = 1136823753
gettimeofday({1136823753, 243498}, NULL) = 0
select(11, NULL, [10], NULL, {0, 0}) = 1 (out [10], left {0, 0})
write(10, "HTTP/1.1 200 OK\r\nKeep-Alive: tim"..., 190) = 190
gettimeofday({1136823753, 243785}, NULL) = 0
select(11, NULL, [10], NULL, {0, 0}) = 1 (out [10], left {0, 0})
write(10, "<?xml version=\"1.0\" encoding=\"is"..., 3811) = -1 EPIPE
(Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---
=== immediately followed by writing to file #9
gettimeofday({1136823753, 244020}, NULL) = 0
getpeername(10, 0xbffff9d0, [16]) = -1 ENOTCONN (Transport
endpoint is not connected)
time(NULL) = 1136823753
setsockopt(10, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, 8) = 0
close(10) = 0
write(9, "\10", 1) = 1
write(9, "\0", 1) = 1
write(9, "\0", 1) = 1
write(9, "\0", 1) = 1
write(9, "\1", 1) = 1
write(9, "\t", 1) = 1
_llseek(9, 3579904, [3579904], SEEK_SET) = 0
read(9, "ubclasses.\20\3\37\312\20\5\37\325\4\0\0\0\0\n\0\34\315"...,
1356) = 1356
read(9, "\t", 4096) = 1
_llseek(9, 3581261, [3581261], SEEK_SET) = 0
_llseek(9, 3581261, [3581261], SEEK_SET) = 0
_llseek(9, -1, [3581260], SEEK_CUR) = 0
write(9, "\20", 1) = 1
write(9, "\t", 1) = 1
write(9, "\37", 1) = 1
write(9, "\3", 1) = 1
=== 11 seconds later the saving was done (yes storing is that slow):
close(9) = 0
munmap(0x40188000, 4096) = 0
gettimeofday({1136823764, 800773}, NULL) = 0
=== ... and everything is back to normal ...
All this is on Linux, 2.4 kernel, Ian's 3.7.7 VM.
Did anybody see something similar? Any ideas?
- Bert -
More information about the Vm-dev
mailing list