Ah, very good.
We seem to have had the same problem two weeks in a row. I would not be surprised to see it come back next week at the same time. If so, we can look for excess Mantis processes.
Dave
On Mon, Jun 09, 2014 at 07:18:42PM +0200, Levente Uzonyi wrote:
I just checked it again, and I'm pretty sure it's Mantis that leaks the child processes. When the server runs out of them, it will not serve anything anymore.
On Mon, 9 Jun 2014, Levente Uzonyi wrote:
I checked it last time, and apache ran out of child processes (200). If it happens again, we should check what leaks them.
Levente
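For the next occurrence, a minimal Python sketch of how the child-process count could be checked against the limit of 200 mentioned above. The process name `apache2`, the 90% warning margin, and the helper names are assumptions for illustration, not something already in use on box2.

```python
import subprocess
from collections import Counter


def count_by_command(ps_output: str) -> Counter:
    """Count processes per command name, given the output of `ps -eo comm=`."""
    return Counter(line.strip() for line in ps_output.splitlines() if line.strip())


def near_limit(ps_output: str, command: str = "apache2",
               limit: int = 200, margin: float = 0.9):
    """Return (count, warn), where warn is True once count reaches margin * limit."""
    count = count_by_command(ps_output)[command]
    return count, count >= limit * margin


def snapshot() -> str:
    """Take a live process listing: one command name per line, no header."""
    result = subprocess.run(["ps", "-eo", "comm="],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `near_limit(snapshot())` from a cron job every few minutes would show whether the count climbs steadily toward the limit before the sites stop responding.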
On Mon, 9 Jun 2014, David T. Lewis wrote:
On Mon, Jun 09, 2014 at 07:29:20AM -0400, David T. Lewis wrote:
The problem seems to be back. The symptoms that I see are:
- http://source.squeak.org and http://bugs.squeak.org are not responding
- http://squeak.org is working normally
squeak.org is served from box4.
- http://source.squeak.org:9090 brings up the source.squeak.org page
Processes are running normally on box2.squeak.org.
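The symptom check above could be repeated with a small script rather than by hand. This is only a sketch: the URL list is taken from this message, while the 10-second timeout and the function names are assumptions.

```python
import urllib.error
import urllib.request

# URLs taken from the symptom list in this thread.
URLS = [
    "http://source.squeak.org",
    "http://bugs.squeak.org",
    "http://squeak.org",
    "http://source.squeak.org:9090",
]


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return a short status string for url: the HTTP code, or the failure reason."""
    try:
        with opener(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, refused connections, and timeouts.
        return f"no response ({exc})"


def check_all(urls=URLS, **kwargs):
    """Probe every URL and return a {url: status} mapping."""
    return {url: probe(url, **kwargs) for url in urls}
```

If the port 9090 URL answers while the plain `source.squeak.org` URL does not, that points at apache rather than at the image behind it, which matches what was seen here.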
The last time this happened, restarting apache2 twice seemed to resolve the problem. I will try doing that now.
I have restarted apache a total of three times (sudo /etc/init.d/apache2 restart), and the problem remains.
I think the best way is to stop it first, then check if it has really stopped, and then restart.
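The stop, verify, start sequence suggested above might look roughly like this in Python. It is a sketch under assumptions: the init script path is the one quoted in this thread, `pgrep -x apache2` is assumed to match the server processes, and the retry count and delay are arbitrary.

```python
import subprocess
import time


def _run(cmd):
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def clean_restart(run=_run, sleep=time.sleep, attempts=10, delay=1.0):
    """Stop apache2, wait until no apache2 process is left, then start it.

    Returns True if the start command succeeded, False if apache2 was
    still running after all attempts (in which case it is not started).
    """
    run(["sudo", "/etc/init.d/apache2", "stop"])
    for _ in range(attempts):
        # pgrep exits with status 1 when no process matches the name.
        if run(["pgrep", "-x", "apache2"]) == 1:
            break
        sleep(delay)
    else:
        return False
    return run(["sudo", "/etc/init.d/apache2", "start"]) == 0
```

The point of the separate check is that a plain restart can return while old children are still alive, so the new instance may fail to come up.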
The last time this happened was one week ago, and at the time we noticed an rsync backup job eating a lot of system resources. That is not the case right now, but it's possible there is some issue associated with backup jobs (run from cron) that somehow affects the apache addressing.
I don't think that the local rsync has anything to do with this. It will eat lots of resources, because it's reading and writing the same hard drive at the same time. Because of the high IO usage, the server will show high load.
I will probably not be able to take any further action on this today. My suggestion is that someone (anyone) on box-admins restart the apache2 service about an hour from now, and keep trying every hour until it decides to start working.
I think it's a crawler which triggers the problem, so getting rid of that might resolve this problem for a while.
Levente
Dave
On Mon, Jun 02, 2014 at 02:35:01PM +0200, Tobias Pape wrote:
Hi,
On 02.06.2014, at 13:18, David T. Lewis <lewis@mail.msen.com> wrote:
[...]
> Thanks Tobias,
>
> I will be away with no easy access to box2 for the next day, so I
> don't want to touch anything that might cause a bigger problem that
> I can't fix. That said ... I don't think the runaway rsnapshot process
> is the cause of the problem, although it might be a symptom of something
> else. I can connect to source.squeak.org on 9090 and it works fine, so
> the CPU usage is not directly causing a problem.
>
> I will try restarting apache (sudo /etc/init.d/apache2 restart) and see
> if it helps.
No, that did not fix it. www.squeak.org remains active, but source.squeak.org and bugs.squeak.org are not available.
As before, connecting to http://source.squeak.org:9090 brings up the web page normally.
It turns out that apache did not successfully restart. Why remains a mystery, but I have now started it and the pages should be fine again.
Best -Tobias