[Vm-dev] Image freeze because handleTimerEvent and Seaside process gone?!

David T. Lewis lewis at mail.msen.com
Mon Dec 6 11:54:30 UTC 2010


Have a look at /proc/<vmpid>/fd/* for a VM process that has been
running for a while, and check whether open file handles are piling
up. If sockets or files are not being closed, the open handles
accumulate and eventually hit whatever per-process limit is in place
(typically 1024 per process). I could imagine this leading to a
condition where the process that is trying to open new session
requests would fail.
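
A minimal sketch of that check, assuming Linux with /proc mounted. The
helper names count_fds and fd_limit are made up for illustration; pass
the VM's pid as the first argument (it defaults to the current shell).

```shell
#!/bin/sh
# Compare a process's open file descriptor count with its soft limit.
# Assumes Linux /proc; count_fds and fd_limit are hypothetical helpers.

count_fds() {
    # Each entry in /proc/<pid>/fd is one open descriptor.
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

fd_limit() {
    # The soft limit is the 4th field of the "Max open files" line.
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

pid=${1:-$$}
echo "pid $pid: $(count_fds "$pid") open fds, soft limit $(fd_limit "$pid")"
```

Running this every few hours against the VM and logging the result
should show whether the count creeps toward the limit.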

If this turns out to be the case, check what you are doing with
OSProcess, as it's quite easy to e.g. use a PipeableOSProcess and
forget to close the output pipes when done.
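
To see whether leaked pipes (rather than sockets) are the culprit, the
descriptors can be grouped by type. This is only a sketch, assuming
Linux /proc and the readlink utility; fd_types is a made-up helper name.

```shell
#!/bin/sh
# Tally a process's open descriptors by kind: socket, pipe, or other.
# Assumes Linux /proc; fd_types is a hypothetical helper name.

fd_types() {
    for fd in "/proc/$1/fd"/*; do
        # Each fd entry is a symlink such as "socket:[123]" or "pipe:[456]".
        target=$(readlink "$fd") || continue
        case $target in
            socket:*) echo socket ;;
            pipe:*)   echo pipe ;;
            *)        echo other ;;
        esac
    done | sort | uniq -c
}

fd_types "${1:-$$}"
```

A steadily growing "pipe" count would point at pipes that are never
closed, while a growing "socket" count would point at the HTTP side.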

Dave


On Mon, Dec 06, 2010 at 11:55:55AM +0100, Adrian Lienhard wrote:
> 
> Hi all,
> 
> We've been experiencing an "interesting" problem: the image freezes and does not respond to HTTP requests anymore after it has been running for days. 
> 
> Here is some basic information about our setup:
> 
> Squeak VM 4.0.3-2202 compiled with gcc 4.3.2
> PharoCore 1.1
> OS Debian Lenny amd64 (CPUs are 4 Intel Xeon E5530 2.40GHz)
> 
> - We have never seen the problem with the Squeak VM 3.9-9 and Squeak 3.9 on the identical machine and with the same application source (modulo some adaptations to make it run on Pharo).
> - We run the VM with -mmap 512m -vm-sound-null -vm-display-null, and the UI process is suspended (Project uiProcess suspend)
> - VM does not hog the CPU and memory usage is normal
> - The mean time between failures is several weeks and we haven't managed to reproduce the problem
> - The application mainly serves HTTP requests. When the image does not receive requests for some time, we send it a STOP signal; when a request comes in, it is sent a CONT signal.
> - lsof shows
> 	TCP *:9093 (LISTEN)
> 	TCP server:9093->server:46930 (CLOSE_WAIT)
> 
> Below is a GDB backtrace and the Smalltalk stacks from an image that was frozen (the VM had been running for almost 100 hours):
> 
> =============================================================
> (gdb) bt
> #0  0x08072020 in ?? ()
> #1  <signal handler called>
> #2  0xb766f5e0 in malloc () from /lib/libc.so.6
> #3  <function called from gdb>
> #4  0xb76c50c8 in select () from /lib/libc.so.6
> #5  0x08071063 in aioPoll ()
> #6  0xb778bb8d in ?? () from /usr/lib/squeak/4.0.3-2202//so.vm-display-null
> #7  0x000003e8 in ?? ()
> #8  0x997b5a34 in ?? ()
> #9  0xbfe7cb28 in ?? ()
> #10 0x08074575 in ioRelinquishProcessorForMicroseconds ()
> Backtrace stopped: frame did not save the PC
> 
> (gdb) call printCallStack() 
> -1719969228 >idleProcess
> -1719969320 >startUp
> -1740134028 BlockClosure>newProcess
> $3 = -1755344892
> 
> (gdb) call (int) printAllStacks()
> Process
> -1719969228 >idleProcess
> -1719969320 >startUp
> -1740134028 BlockClosure>newProcess
> 
> Process
> -1740113860 >finalizationProcess
> -1740113952 >restartFinalizationProcess
> -1740113532 BlockClosure>newProcess
> 
> Process
> -1740134424 SmalltalkImage>lowSpaceWatcher
> -1740134516 SmalltalkImage>installLowSpaceWatcher
> -1740134300 BlockClosure>newProcess
> 
> Process
> -1719451488 Delay>wait
> -1719451580 BlockClosure>ifCurtailed:
> -1719451704 Delay>wait
> -1719451796 InputEventPollingFetcher>waitForInput
> -1740126940 InputEventFetcher>eventLoop
> -1740127032 InputEventFetcher>installEventLoop
> -1740126816 BlockClosure>newProcess
> 
> Process
> -1719557780 UnixOSProcessAccessor>grimReaperProcess
> -1740113624 BlockClosure>repeat
> -1740113716 UnixOSProcessAccessor>grimReaperProcess
> -1740117340 BlockClosure>newProcess
> 
> [omitted many newlines between output above]
> =============================================================
> 
> What is striking about the above process listing is that two processes are missing: the handleTimerEvent process and the Seaside process (that is, the TCP listener loop). How come these processes vanished?
> 
> This may be related to Pharo or to the Squeak VM.
> 
> Has anybody else seen this problem? Any idea how to debug/fix this issue is very much appreciated!
> 
> Cheers,
> Adrian
> 
> 
> CCed to pharo-dev since this may be related to Pharo; please respond on the squeak-vm list
> 

