Hi,
source.squeak.org became unacceptably slow. Both the web interface, and the MC API are very slow. Can someone take a look?
Cheers, Levente
P.S.: It's suspicious that some packages seem to be uploaded multiple times: http://lists.squeakfoundation.org/pipermail/vm-dev/2013-November/date.html
P.P.S: I've lost the private key to the server, that's why I can't check it myself.
Clearly, something is wrong.
top shows the box quite busy. One of the tasks is:
/bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
owned by root. Is someone doing some sort of big copy?
The squeaksource process supporting source.squeak.org is also railing on the CPU. I even see some qmail processes, anyone know about this?
qmailr 28180 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org ademkin@earthlink.net
qmailr 28181 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org charleshixsn@earthlink.net
qmailr 28182 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org shaping@earthlink.net
qmailr 28183 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org agantoniii@earthlink.net
On Sat, Nov 2, 2013 at 7:35 AM, Levente Uzonyi leves@elte.hu wrote:
Hi,
source.squeak.org became unacceptably slow. Both the web interface, and the MC API are very slow. Can someone take a look?
Cheers, Levente
P.S.: It's suspicious that some packages seem to be uploaded multiple times: http://lists.squeakfoundation.org/pipermail/vm-dev/2013-November/date.html
P.P.S: I've lost the private key to the server, that's why I can't check it myself.
On Sat, 2 Nov 2013, Chris Muller wrote:
Clearly, something is wrong.
top shows the box quite busy. One of the tasks is:
/bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
owned by root. Is someone doing some sort of big copy?
That's probably a script doing the daily backups.
The squeaksource process supporting source.squeak.org is also railing
That probably means that full GCs are happening too often, which can be a sign of running low on external semaphores. It's hard to investigate it without having access to the box, but if I were you, I would simply restart it.
on the CPU. I even see some qmail processes, anyone know about this?
Probably the mailing list. Hard to tell if it's normal or not, but it probably is.
qmailr 28180 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org ademkin@earthlink.net
qmailr 28181 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org charleshixsn@earthlink.net
qmailr 28182 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org shaping@earthlink.net
qmailr 28183 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org agantoniii@earthlink.net
Pasting emails is not nice :).
Levente
On Sat, Nov 2, 2013 at 3:41 PM, Levente Uzonyi leves@elte.hu wrote:
On Sat, 2 Nov 2013, Chris Muller wrote:
Clearly, something is wrong.
top shows the box quite busy. One of the tasks is:
/bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
owned by root. Is someone doing some sort of big copy?
That's probably a script doing the daily backups.
Hm, still running.
The squeaksource process supporting source.squeak.org is also railing
That probably means that full GCs are happening too often, which can be a sign of running low on external semaphores. It's hard to investigate it without having access to the box, but if I were you, I would simply restart it.
Is there anything I can do to help you get access? Wasn't your public-key ever copied into the .ssh directory? I can't remember exactly which subdir it goes into, but I don't have access to /root/.ssh anyway.
I don't know how to restart it. I tried killing the process by pid but it said "Operation not permitted". Who has sudo access on this box?
on the CPU. I even see some qmail processes, anyone know about this?
Probably the mailing list. Hard to tell if it's normal or not, but it probably is.
qmailr 28180 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org ademkin@earthlink.net
qmailr 28181 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org charleshixsn@earthlink.net
qmailr 28182 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org shaping@earthlink.net
qmailr 28183 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org agantoniii@earthlink.net
Pasting emails is not nice :).
This is a relatively private list and I just want to get things fixed.
Now there are 8 emails "backed up". The 4 above plus 4 more.
On 11/02/2013 04:02 PM, Chris Muller wrote:
I don't know how to restart it. I tried killing the process by pid but it said "Operation not permitted". Who has sudo access on this box?
User_Alias ADMINS=kencausey, colinputney, gorankrampe, bertfreudenberg,\
    chrismuller, craig, leventeuzonyi, ceesdegroot, randalschwartz,\
    chriscunnington
As you will note that includes you. Perhaps the problem is your password?
This system is the first one I set up sudo on, and I just expediently started listing users, not really considering that there might be other options. On the other servers I instead enabled it for all users in the sudo group and simply add users to that group. I've had little reason to re-evaluate this on box2 and haven't bothered to change it.
I was considering restarting source.squeak.org myself but I saw you were logged in so I didn't want to interfere with anything you might be doing. I'll assume that I'm free to restart it; however, I first need to check that the image on the filesystem does not appear to be corrupted. I would also like to look things over, so it may be a bit before I restart it.
Ken
I don't care for the look of this:
$ ls -l /home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 206639644 Nov 4 19:04 /home/squeaksource/Squeak3.11-8824-SS.image
and from recent backups:
$ ls -l /var/cache/rsnapshot/daily.*/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 2 squeaksource squeaksource 126410176 Nov 3 20:53 /var/cache/rsnapshot/daily.0/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 2 squeaksource squeaksource 126410176 Nov 3 20:53 /var/cache/rsnapshot/daily.1/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 107722376 Nov 3 01:45 /var/cache/rsnapshot/daily.2/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 81522832 Nov 2 02:20 /var/cache/rsnapshot/daily.3/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 36027872 Oct 31 15:11 /var/cache/rsnapshot/daily.4/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 36022052 Oct 30 15:09 /var/cache/rsnapshot/daily.5/localhost/home/squeaksource/Squeak3.11-8824-SS.image
-rw-r--r-- 1 squeaksource squeaksource 36016332 Oct 29 15:06 /var/cache/rsnapshot/daily.6/localhost/home/squeaksource/Squeak3.11-8824-SS.image
Does anyone with some Squeaksource expertise care to take a look at this?
I apologize for my rustiness on Squeaksource issues. What are we in danger of losing if we replace the image with one from say the 29th?
Ken
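The image growth visible in the listings above could be checked routinely with a small helper. This is only a sketch: the function name `snapshot_sizes` is made up for illustration, and it assumes the rsnapshot layout shown in the `ls` output (a `daily.N` directory per snapshot under one root).

```shell
#!/bin/sh
# Print "<bytes> <snapshot>" for one file in every rsnapshot daily snapshot.
# $1: snapshot root, $2: path of the file relative to each daily.N directory.
# (snapshot_sizes is a hypothetical helper name, not an rsnapshot command.)
snapshot_sizes() {
    root=$1
    relpath=$2
    for dir in "$root"/daily.*; do
        file=$dir/$relpath
        # Skip snapshots that lack the file (this also covers an unmatched glob).
        [ -f "$file" ] || continue
        # wc -c prints the byte count; tr strips padding some platforms add.
        size=$(wc -c < "$file" | tr -d ' ')
        printf '%s %s\n' "$size" "${dir##*/}"
    done
}

# Example call, using the paths from the listing above:
# snapshot_sizes /var/cache/rsnapshot localhost/home/squeaksource/Squeak3.11-8824-SS.image
```

A jump like the one between daily.4 and daily.3 would stand out immediately in the first column.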
Ken, thank you so much. The cause of the slowness we've been seeing the last few days is a runaway Seaside client Process. It was busy rendering HTML on the difference between two different versions of 45Deprecated-fbs.8 (with two different UUIDs) but, somehow, had gotten itself caught in an endless loop that was sucking resources and causing the image to grow.
I downloaded the image to my local machine, terminated the rogue process from the Process Browser, reuploaded it, renamed the old Squeak3.11-8824-SS.image to .bad, then killed the SS process.
At first, it looked like daemontools was restarting it, but something looked weird when I grepped for the process; there were two of them.
chrismuller@box2:/home/squeaksource$ ps -ef | grep Squeak3.11-8824-SS.image
root 2218 2109 0 Sep13 ? 00:00:01 readproctitle service errors: ...in BlockContext>newProcess?+ exec /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image?+ exec /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image?
squeaks 30167 2224 0 21:34 ? 00:00:00 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
And yet, I could not access source.squeak.org.
So, I killed both of those and waited for daemontools to start it again. I thought it would be immediate, but repeated ps checks did not show anything. Maybe I didn't wait long enough, but I went ahead and started it manually.
So source.squeak.org is up but now there are two processes, each owned by root, instead of squeaksource. Hmph.
root 5753 26447 0 21:45 pts/0 00:00:00 sudo /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
root 5754 5753 15 21:45 pts/0 00:00:12 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
I don't know why daemontools didn't start it automatically, but as I was sitting there, waiting for it, I realized how much I dislike not being in control of whether the process is running or not...
Any advice?
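A quick way to spot the situation described above, where the image ends up running under the wrong account, is to filter `ps` output by owner. The following is a sketch under assumptions: `wrong_owner` is a made-up helper name, and it compares numeric uids because `ps` may shorten long user names (the listing above shows `squeaks` rather than `squeaksource`).

```shell
#!/bin/sh
# Read "uid pid args..." lines on stdin and print "uid pid" for every process
# whose command line contains $2 but whose uid is not $1.
# (wrong_owner is a hypothetical helper name.)
wrong_owner() {
    # Pass values through the environment, not awk arguments, so the awk
    # process itself never carries the pattern on its command line.
    EXPECTED_UID=$1 PATTERN=$2 awk '
        index($0, ENVIRON["PATTERN"]) && $1 != ENVIRON["EXPECTED_UID"] {
            print $1, $2
        }
    '
}

# Example call (separate -o options keep every column header empty):
# ps -A -o uid= -o pid= -o args= | wrong_owner "$(id -u squeaksource)" Squeak3.11-8824-SS.image
```

Note that a wrapper such as `sudo`, or a `readproctitle` line that happens to quote the image path, would be reported as well, so the output still needs a human look.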
On Mon, Nov 4, 2013 at 1:49 PM, Ken Causey ken@kencausey.com wrote:
On 11/02/2013 04:02 PM, Chris Muller wrote:
I don't know how to restart it. I tried killing the process by pid
but it said "Operation not permitted". Who has sudo access on this box?
User_Alias ADMINS=kencausey, colinputney, gorankrampe, bertfreudenberg,\
    chrismuller, craig, leventeuzonyi, ceesdegroot, randalschwartz,\
    chriscunnington
As you will note that includes you. Perhaps the problem is your password?
This system is the first one I set up sudo on, and I just expediently started listing users, not really considering that there might be other options. On the other servers I instead enabled it for all users in the sudo group and simply add users to that group. I've had little reason to re-evaluate this on box2 and haven't bothered to change it.
I was considering restarting source.squeak.org myself but I saw you were logged in so I didn't want to interfere with anything you might be doing. I'll assume that I'm free to restart it; however, I first need to check that the image on the filesystem does not appear to be corrupted. I would also like to look things over, so it may be a bit before I restart it.
Ken
On 4Nov, 2013, at 16:58, Chris Muller ma.chris.m@gmail.com wrote:
I don't know why daemontools didn't start it automatically, but as I was sitting there, waiting for it, I realized how much I dislike not being in control of whether the process is running or not...
Heh. Risking that one of my few postings here is going to be classified as off-topic, you remind me of Management. Starting with the bit of being in control…
Daemontools is your minion. It will do menial work, and is quite good at it. However, as any manager can tell you, underlings are dumb and need to be told very precisely what to do.
What most underlings can tell you, is that when the manager gets impatient, bypasses the underling, and just executes some random actions, it usually ends up in a mess. As any underling can tell you, managers are dumb.
;-)
Anyway, the two processes, the “sudo” and the “squeak”, are what you expect when manually starting squeak source through sudo. In the meantime, daemontools is trying to start a second copy, failing all the time. We should be logging this; emitting a “Restarting…” from the run script is much easier.
I don’t want to jump in and mess around, but the two processes need to be killed and then proper analysis on why daemontools cannot restart the image needs to be done. More than happy to help but with a Skype chat or similar to sync stuff :)
Hth,
Cees
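Emitting a “Restarting…” line from the run script, as suggested above, might look roughly like the following. This is a sketch, not the script actually installed on box2 (which is not shown in this thread): the message text, the use of `setuidgid` to drop privileges, and the wrapper function `write_run_script` are assumptions; the squeakvm command line is copied from the `ps` output earlier in the thread.

```shell
#!/bin/sh
# Write a hypothetical daemontools run script to the file named by $1.
# (write_run_script is an illustrative helper, not part of daemontools.)
write_run_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
# Send stderr to the same place as stdout so errors are captured too.
exec 2>&1
# One line per start attempt makes a restart loop visible in the log.
echo "Restarting source.squeak.org"
# Drop root privileges; setuidgid ships with daemontools.
exec setuidgid squeaksource \
    /usr/local/lib/squeak/3.11.3-2135/squeakvm \
    -pathenc UTF-8 -encoding UTF-8 \
    -plugins /usr/local/lib/squeak/3.11.3-2135 \
    -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
EOF
    chmod +x "$1"
}

# Example call; the service directory name is an assumption:
# write_run_script /service/squeaksource/run
```

Where the echoed line ends up depends on how the service is wired: with a log/ subdirectory it goes to that logger, otherwise it typically surfaces in the readproctitle line seen in `ps`.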
On 11/04/2013 04:10 PM, Cees de Groot wrote:
On 4Nov, 2013, at 16:58, Chris Muller ma.chris.m@gmail.com wrote:
I don't know why daemontools didn't start it automatically, but as I was sitting there, waiting for it, I realized how much I dislike not being in control of whether the process is running or not...
Heh. Risking that one of my few postings here is going to be classified as off-topic, you remind me of Management. Starting with the bit of being in control…
Daemontools is your minion. It will do menial work, and is quite good at it. However, as any manager can tell you, underlings are dumb and need to be told very precisely what to do.
What most underlings can tell you, is that when the manager gets impatient, bypasses the underling, and just executes some random actions, it usually ends up in a mess. As any underling can tell you, managers are dumb.
;-)
Anyway, the two processes, the “sudo” and the “squeak”, are what you expect when manually starting squeak source through sudo. In the meantime, daemontools is trying to start a second copy, failing all the time. We should be logging this; emitting a “Restarting…” from the run script is much easier.
I don’t want to jump in and mess around, but the two processes need to be killed and then proper analysis on why daemontools cannot restart the image needs to be done. More than happy to help but with a Skype chat or similar to sync stuff :)
Hth,
Cees
We didn't have much luck determining why supervise was not able to restart the source.squeak.org process. We finally decided it was more important to get it working, so we decided to reboot the server on the theory that something more low-level was involved.
The server is restarting as I am emailing this.
Ken
The server has been restarted and at the moment it appears that all services, including source.squeak.org, are working. It had only been 29 days since the last reboot, but maybe this was the first symptom indicating it was time to reboot. Alternatively, it might have been sufficient to restart daemontools; we considered that but decided that rebooting the server as a whole was the more expedient choice.
Ken
Thanks for your help Ken! The source.squeak.org process is no longer saturating CPU, and the slowness seems not as bad as it was.
On Mon, Nov 4, 2013 at 6:21 PM, Ken Causey ken@kencausey.com wrote:
The server has been restarted and at the moment it appears that all services, including source.squeak.org, are working. It had only been 29 days since the last reboot, but maybe this was the first symptom indicating it was time to reboot. Alternatively, it might have been sufficient to restart daemontools; we considered that but decided that rebooting the server as a whole was the more expedient choice.
Ken
On 11/04/2013 03:58 PM, Chris Muller wrote:
Ken, thank you so much. The cause of the slowness we've been seeing the last few days is a runaway Seaside client Process. It was busy rendering HTML on the difference between two different versions of 45Deprecated-fbs.8 (with two different UUIDs) but, somehow, had gotten itself caught in an endless loop that was sucking resources and causing the image to grow.
I downloaded the image to my local machine, terminated the rogue process from the Process Browser, reuploaded it, renamed the old Squeak3.11-8824-SS.image to .bad, then killed the SS process.
At first, it looked like daemontools was restarting it, but something looked weird when I grepped for the process; there were two of them.
chrismuller@box2:/home/squeaksource$ ps -ef | grep Squeak3.11-8824-SS.image
root 2218 2109 0 Sep13 ? 00:00:01 readproctitle service errors: ...in BlockContext>newProcess?+ exec /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image?+ exec /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image?
squeaks 30167 2224 0 21:34 ? 00:00:00 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
And yet, I could not access source.squeak.org.
So, I killed both of those and waited for daemontools to start it again. I thought it would be immediate, but repeated ps checks did not show anything. Maybe I didn't wait long enough, but I went ahead and started it manually.
So source.squeak.org is up but now there are two processes, each owned by root, instead of squeaksource. Hmph.
root 5753 26447 0 21:45 pts/0 00:00:00 sudo /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
root 5754 5753 15 21:45 pts/0 00:00:12 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
I don't know why daemontools didn't start it automatically, but as I was sitting there, waiting for it, I realized how much I dislike not being in control of whether the process is running or not...
Any advice?
I'm looking into it and trying to interface with Chris on Google Chat.
Ken
On 11/02/2013 03:02 PM, Chris Muller wrote:
Clearly, something is wrong.
top shows the box quite busy. One of the tasks is:
/bin/cp -al /var/cache/rsnapshot/daily.0/ /var/cache/rsnapshot/daily.1/
owned by root. Is someone doing some sort of big copy?
Yes, it is the local backup; it takes many hours. You can see some details in /var/log/rsnapshot.log
The squeaksource process supporting source.squeak.org is also railing on the CPU. I even see some qmail processes, anyone know about this?
What does 'railing' mean?
qmailr 28180 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org ademkin@earthlink.net
qmailr 28181 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org charleshixsn@earthlink.net
qmailr 28182 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org shaping@earthlink.net
qmailr 28183 2063 0 19:57 ? 00:00:00 qmail-remote earthlink.net squeak-dev-bounces@lists.squeakfoundation.org agantoniii@earthlink.net
This sort of thing is not unusual.
Ken
box-admins@lists.squeakfoundation.org