http://www.downforeveryoneorjustme.com/source.squeak.org
I tried simply killing the process on the server, and daemontools seemed to restart it, but it does not seem to be coming back up..
On Sun, Jun 01, 2014 at 08:21:13PM -0500, Chris Muller wrote:
http://www.downforeveryoneorjustme.com/source.squeak.org
I tried simply killing the process on the server, and daemontools seemed to restart it, but it does not seem to be coming back up..
I'm not sure what the problem is, but it's something related to DNS or the Apache configuration. The squeaksource image itself is fine, and if you open http://source.squeak.org:9090 you see the expected user interface. So it must be something to do with mapping source.squeak.org requests to port 9090 on the server.
Has anyone changed anything related to Apache configurations on the box2 server?
Dave
Hello everyone.
On 02.06.2014, at 03:55, David T. Lewis lewis@mail.msen.com wrote:
I'm not sure what the problem is, but it's something related to DNS or the Apache configuration. The squeaksource image itself is fine, and if you open http://source.squeak.org:9090 you see the expected user interface. So it must be something to do with mapping source.squeak.org requests to port 9090 on the server.
Has anyone changed anything related to Apache configurations on the box2 server?
I've been on the server. The problem is that rsnapshot's copy process has been eating all the CPU since this morning, and disk space is low as well!
box2:/var/cache/rsnapshot/daily.0/localhost# du -sh *
3.2G etc
28G home
261M root
39M usr
38G var
box2:/var/cache/rsnapshot/daily.0/localhost# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hda2 146G 130G 8.6G 94% /
tmpfs 475M 4.0K 475M 1% /dev/shm
CPU!!
28122 root 18 0 2052 904 1660 S 0.0 0.0 0:00.00 | `- /USR/SBIN/CRON
28123 root 17 0 5296 3820 2708 S 0.0 0.1 0:00.09 | `- /usr/bin/perl -w /usr/bin/rsnapshot daily
28352 root 25 0 199M 187M 1420 R 90.1 6.4 5h22:27 | `- /bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
Please do something or allow me to kill the cp and see if that fixes apache :)
Also, we should care about the disk space!
Best -Tobias
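A minimal sketch of a disk-space check that could run from cron, so that a nearly full root filesystem is noticed before the backup starts. The 90% threshold and the function name use_percent are assumptions made for illustration; nothing like this is known to be installed on box2.

```shell
#!/bin/sh
# ASSUMPTION: warn once the root filesystem is at least 90% full.
THRESHOLD=90

# Read df output on stdin and print the Use% value (without the "%" sign)
# of the filesystem mounted on $1.
use_percent() {
    awk -v mnt="$1" 'NR > 1 && $6 == mnt { sub(/%/, "", $5); print $5 }'
}

# -P asks df for the one-line-per-filesystem POSIX layout.
pct=$(df -P / | use_percent /)

if [ -n "$pct" ] && [ "$pct" -ge "$THRESHOLD" ]; then
    echo "WARNING: / is ${pct}% full (threshold ${THRESHOLD}%)"
fi
```

Fed the df output quoted above, use_percent / yields 94, which is over the assumed threshold.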
On Mon, Jun 02, 2014 at 11:28:57AM +0200, Tobias Pape wrote:
Hello everyone.
I've been on the server. The problem is that rsnapshot's copy process has been eating all the CPU since this morning, and disk space is low as well!
box2:/var/cache/rsnapshot/daily.0/localhost# du -sh *
3.2G etc
28G home
261M root
39M usr
38G var
box2:/var/cache/rsnapshot/daily.0/localhost# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hda2 146G 130G 8.6G 94% /
tmpfs 475M 4.0K 475M 1% /dev/shm
CPU!!
28122 root 18 0 2052 904 1660 S 0.0 0.0 0:00.00 | `- /USR/SBIN/CRON
28123 root 17 0 5296 3820 2708 S 0.0 0.1 0:00.09 | `- /usr/bin/perl -w /usr/bin/rsnapshot daily
28352 root 25 0 199M 187M 1420 R 90.1 6.4 5h22:27 | `- /bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
Please do something or allow me to kill the cp and see if that fixes apache :)
Also, we should care about the disk space!
Thanks Tobias,
I will be away with no easy access to box2 for the next day, so I don't want to touch anything that might cause a bigger problem that I can't fix. That said ... I don't think the runaway rsnapshot process is the cause of the problem, although it might be a symptom of something else. I can connect to source.squeak.org on 9090 and it works fine, so the CPU usage is not directly causing a problem.
I will try restarting apache (sudo /etc/init.d/apache2 restart) and see if it helps.
Dave
On Mon, Jun 02, 2014 at 07:12:41AM -0400, David T. Lewis wrote:
Thanks Tobias,
I will be away with no easy access to box2 for the next day, so I don't want to touch anything that might cause a bigger problem that I can't fix. That said ... I don't think the runaway rsnapshot process is the cause of the problem, although it might be a symptom of something else. I can connect to source.squeak.org on 9090 and it works fine, so the CPU usage is not directly causing a problem.
I will try restarting apache (sudo /etc/init.d/apache2 restart) and see if it helps.
No, that did not fix it. www.squeak.org remains active, but source.squeak.org and bugs.squeak.org are not available.
As before, connecting to http://source.squeak.org:9090 brings up the web page normally.
Dave
Hi,
On 02.06.2014, at 13:18, David T. Lewis lewis@mail.msen.com wrote:
[…]
Thanks Tobias,
I will be away with no easy access to box2 for the next day, so I don't want to touch anything that might cause a bigger problem that I can't fix. That said ... I don't think the runaway rsnapshot process is the cause of the problem, although it might be a symptom of something else. I can connect to source.squeak.org on 9090 and it works fine, so the CPU usage is not directly causing a problem.
I will try restarting apache (sudo /etc/init.d/apache2 restart) and see if it helps.
No, that did not fix it. www.squeak.org remains active, but source.squeak.org and bugs.squeak.org are not available.
As before, connecting to http://source.squeak.org:9090 brings up the web page normally.
It turns out that apache did not successfully restart. Why remains a mystery, but I have now started it and the pages should be fine again.
Best -Tobias
PS: Debian Sarge for box2 is pretty old, no?
On Mon, Jun 02, 2014 at 02:35:01PM +0200, Tobias Pape wrote:
Hi,
It turns out that apache did not successfully restart. Why remains a mystery, but I have now started it and the pages should be fine again.
Tobias,
Thank you for handling this!
PS: Debian Sarge for box2 is pretty old, no?
Yes. We need to migrate the existing services from the old box2 onto the new box4, then shut down box2.
Dave
The problem seems to be back. The symptoms that I see are:
- http://source.squeak.org and http://bugs.squeak.org are not responding
- http://squeak.org is working normally
- http://source.squeak.org:9090 brings up the source.squeak.org page
Processes are running normally on box2.squeak.org.
The last time this happened, restarting apache2 two times seems to have resolved the problem. I will try doing that now.
Dave
On Mon, Jun 09, 2014 at 07:29:20AM -0400, David T. Lewis wrote:
The problem seems to be back. The symptoms that I see are:
- http://source.squeak.org and http://bugs.squeak.org are not responding
- http://squeak.org is working normally
- http://source.squeak.org:9090 brings up the source.squeak.org page
Processes are running normally on box2.squeak.org.
The last time this happened, restarting apache2 two times seems to have resolved the problem. I will try doing that now.
I have restarted apache a total of three times (sudo /etc/init.d/apache2 restart), and the problem remains.
The last time this happened was one week ago, and at the time we noticed an rsync backup job eating a lot of system resource. That is not the case right now, but it's possible there is some issue associated with backup jobs (run from cron) that somehow affects the apache addressing.
I will probably not be able to take any further action on this today. My suggestion is that someone (anyone) on box-admins restart the apache2 service about an hour from now, and keep trying every hour until it decides to start working.
Dave
I checked it last time, and apache ran out of child processes (200). If it happens again, we should check what leaks them.
Levente
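A minimal sketch of how the worker count could be checked the next time the sites stop responding. The limit of 200 is the figure mentioned above; the process name apache2 and the helper names count_procs and at_limit are assumptions for illustration.

```shell
#!/bin/sh
# ASSUMPTION: workers show up under the command name "apache2" and the
# configured limit is 200; adjust both to match the real configuration.
MAX_WORKERS=200

# Print the number of running processes whose command name is exactly $1.
count_procs() {
    ps -e -o comm= | grep -c -x "$1"
}

# Succeed (exit status 0) when count $1 has reached limit $2.
at_limit() {
    [ "$1" -ge "$2" ]
}

n=$(count_procs apache2)
if at_limit "$n" "$MAX_WORKERS"; then
    echo "apache2: $n processes, limit of $MAX_WORKERS reached"
else
    echo "apache2: $n processes, below the limit of $MAX_WORKERS"
fi
```

Run while the sites are unreachable, this only shows whether the pool is exhausted; it does not say which site or client is holding the workers.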
I just checked it again, and I'm pretty sure it's Mantis that leaks the child processes. When the server runs out of them, it will not serve anything anymore.
On Mon, 9 Jun 2014, Levente Uzonyi wrote:
I checked it last time, and apache ran out of child processes (200). If it happens again, we should check what leaks them.
Levente
On Mon, 9 Jun 2014, David T. Lewis wrote:
On Mon, Jun 09, 2014 at 07:29:20AM -0400, David T. Lewis wrote:
The problem seems to be back. The symptoms that I see are:
- http://source.squeak.org and http://bugs.squeak.org are not responding
- http://squeak.org is working normally
squeak.org is served from box4.
- http://source.squeak.org:9090 brings up the source.squeak.org page
Processes are running normally on box2.squeak.org.
The last time this happened, restarting apache2 two times seems to have resolved the problem. I will try doing that now.
I have restarted apache a total of three times (sudo /etc/init.d/apache2 restart), and the problem remains.
I think the best way is to stop it first, then check if it has really stopped, and then restart.
The last time this happened was one week ago, and at the time we noticed an rsync backup job eating a lot of system resource. That is not the case right now, but it's possible there is some issue associated with backup jobs (run from cron) that somehow affects the apache addressing.
I don't think that the local rsync has anything to do with this. It will eat lots of resources, because it's reading and writing the same hard drive at the same time. Because of the high IO usage, the server will show high load.
I will probably not be able to take any further action on this today. My suggestion is that someone (anyone) on box-admins restart the apache2 service about an hour from now, and keep trying every hour until it decides to start working.
I think it's a crawler which triggers the problem, so getting rid of that might resolve this problem for a while.
Levente
Dave
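The stop, verify, start sequence suggested above could be sketched roughly as follows. The init script path is the one used in this thread; the process name apache2, the 30-second wait and the function names are assumptions for illustration.

```shell
#!/bin/sh
# ASSUMPTIONS: init script path as used in this thread, process name
# "apache2", and a 30-second grace period for the old processes to exit.
INIT=/etc/init.d/apache2

# Succeed while at least one process named $1 is still running.
still_running() {
    ps -e -o comm= | grep -q -x "$1"
}

# Poll once per second until no process named $1 remains.
# Fails if processes are still present after $2 seconds.
wait_for_exit() {
    name=$1
    remaining=$2
    while still_running "$name"; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Stop Apache, confirm it is really gone, and only then start it again.
clean_restart() {
    sudo "$INIT" stop
    if wait_for_exit apache2 30; then
        sudo "$INIT" start
    else
        echo "apache2 still running after 30s; not starting it again" >&2
        return 1
    fi
}

# Nothing is restarted when this file is sourced; call clean_restart explicitly.
```

Separating stop from start makes it visible when old processes never exit, which a plain restart can hide.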
Ah, very good.
We seem to have had the same problem two weeks in a row. I would not be surprised to see it come back next week at the same time. If so, we can look for excess Mantis processes.
Dave
On Mon, Jun 09, 2014 at 07:18:42PM +0200, Levente Uzonyi wrote:
I just checked it again, and I'm pretty sure it's Mantis that leaks the child processes. When the server runs out of them, it will not serve anything anymore.
box-admins@lists.squeakfoundation.org