[Box-Admins] source.squeak.org is down

David T. Lewis lewis at mail.msen.com
Mon Jun 9 22:42:00 UTC 2014


Ah, very good.

We seem to have had the same problem two weeks in a row. I would not be
surprised to see it come back next week at the same time. If so, we can
look for excess Mantis processes.
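
One rough way to look for excess processes, sketched in Python. This is only
an illustration: it assumes a `ps` that accepts `-eo comm=` (procps on Linux
does), and the helper names and the sample process names (`apache2`,
`php5-cgi`) are hypothetical, not taken from box2.

```python
import subprocess
from collections import Counter


def count_by_command(ps_output):
    """Count processes per command name, given `ps -eo comm=` style output."""
    names = (line.strip() for line in ps_output.splitlines())
    # Skip blank lines so they are not counted as a command.
    return Counter(name for name in names if name)


def top_commands(ps_output, limit=5):
    """Return the `limit` most common command names with their counts."""
    return count_by_command(ps_output).most_common(limit)


def live_ps_output():
    """Run ps on the local machine (assumption: ps supports -e and -o comm=)."""
    result = subprocess.run(
        ["ps", "-eo", "comm="], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `top_commands(live_ps_output())` on the server when the problem is
present would show which command names account for the most processes.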

Dave

On Mon, Jun 09, 2014 at 07:18:42PM +0200, Levente Uzonyi wrote:
> I just checked it again, and I'm pretty sure it's Mantis that leaks the
> child processes. When the server runs out of them, it will not serve 
> anything anymore.
> 
> On Mon, 9 Jun 2014, Levente Uzonyi wrote:
> 
> >I checked it last time, and apache ran out of child processes (200). If it 
> >happens again, we should check what leaks them.
> >
> >
> >Levente
> >
> >On Mon, 9 Jun 2014, David T. Lewis wrote:
> >
> >>On Mon, Jun 09, 2014 at 07:29:20AM -0400, David T. Lewis wrote:
> >>>The problem seems to be back. The symptoms that I see are:
> >>>
> >>>- http://source.squeak.org and http://bugs.squeak.org are not responding
> >>>- http://squeak.org is working normally
> 
> squeak.org is served from box4.
> 
> >>>- http://source.squeak.org:9090 brings up the source.squeak.org page
> >>>
> >>>Processes are running normally on box2.squeak.org.
> >>>
> >>>The last time this happened, restarting apache2 two times seemed to
> >>>resolve the problem. I will try doing that now.
> >>>
> >>
> >>I have restarted apache a total of three times
> >>(sudo /etc/init.d/apache2 restart), and the problem remains.
> 
> I think the best way is to stop it first, then check if it has really 
> stopped, and then restart.
> 
> >>
> >>The last time this happened was one week ago, and at the time we noticed
> >>an rsync backup job eating a lot of system resources. That is not the case
> >>right now, but it's possible there is some issue associated with backup
> >>jobs (run from cron) that somehow affects the apache addressing.
> 
> I don't think that the local rsync has anything to do with this. It will 
> eat lots of resources, because it's reading and writing the same hard 
> drive at the same time. Because of the high IO usage, the server will show 
> high load.
> 
> >>
> >>I will probably not be able to take any further action on this today. My
> >>suggestion is that someone (anyone) on box-admins restart the apache2
> >>service about an hour from now, and keep trying every hour until it
> >>decides to start working.
> 
> I think it's a crawler which triggers the problem, so getting rid of that 
> might resolve this problem for a while.
> 
> 
> Levente
> 
> >>
> >>Dave
> >>
> >>
> >>>On Mon, Jun 02, 2014 at 02:35:01PM +0200, Tobias Pape wrote:
> >>>>Hi,
> >>>>
> >>>>On 02.06.2014, at 13:18, David T. Lewis <lewis at mail.msen.com> wrote:
> >>>>
> >>>>[...]
> >>>>>>
> >>>>>>Thanks Tobias,
> >>>>>>
> >>>>>>I will be away with no easy access to box2 for the next day, so I
> >>>>>>don't want to touch anything that might cause a bigger problem that
> >>>>>>I can't fix. That said ... I don't think the runaway rsnapshot process
> >>>>>>is the cause of the problem, although it might be a symptom of
> >>>>>>something else. I can connect to source.squeak.org on 9090 and it
> >>>>>>works fine, so the CPU usage is not directly causing a problem.
> >>>>>>
> >>>>>>I will try restarting apache (sudo /etc/init.d/apache2 restart) and
> >>>>>>see if it helps.
> >>>>>>
> >>>>>
> >>>>>No, that did not fix it. www.squeak.org remains active, but
> >>>>>source.squeak.org and bugs.squeak.org are not available.
> >>>>>
> >>>>>As before, connecting to http://source.squeak.org:9090 brings up the
> >>>>>web page normally.
> >>>>
> >>>>It turns out that apache did not successfully restart.
> >>>>Why remains a mystery. I have now started it, and
> >>>>the pages should be fine again.
> >>>>
> >>>>Best
> >>>>	-Tobias
> >>>>
> >>>
> >>
> >

