Technically I could give you access but that image is owned and created by the webteam and is their responsibility. As such I suggest you contact them and volunteer your assistance within that context.
This may seem a little silly, but a lack of communication has been a problem around here in the past.
Ken
-------- Original Message --------
Subject: Re: [Box-Admins] RE: [FWD: ** PROBLEM Service Alert: squeak box2/Squeak website is CRITICAL **]
From: Casey Ransberger casey.obrien.r@gmail.com
Date: Sat, February 26, 2011 11:43 am
To: Squeak Hosting Support box-admins@lists.squeakfoundation.org
Drives me a little nuts that we don't know what's causing that. I'd like to get a copy of the image we're using and spend some time with it. Is this something I currently have access to? Not at my machine now, but I think I'm an updates user.
On Feb 26, 2011, at 2:46 AM, "Ken Causey" ken@kencausey.com wrote:
Well it continued to grow quickly so I did indeed kill it. I have restarted it under vnc at port :1. It appears the image was being saved regularly and the site is back up and fine as far as I'm aware.
Ken
-------- Original Message --------
Subject: [FWD: ** PROBLEM Service Alert: squeak box2/Squeak website is CRITICAL **]
From: "Ken Causey" ken@kencausey.com
Date: Sat, February 26, 2011 4:39 am
To: webteam@lists.squeakfoundation.org
I happen to be awake and online during this event, still on-going: appearance of the process in top:
  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10856 website  25   0 1027m 753m 3324 R 94.2 79.5 82:28.27  /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 /home/website/w ...
as you can see it is pegging the CPU and using a lot of memory (in no more than a minute it has gone from about 700MB to over 750MB). I'll watch it for a few minutes but I am almost certainly going to have to kill it and restart. I hope you have been saving....
Ken
-------- Original Message --------
Subject: ** PROBLEM Service Alert: squeak box2/Squeak website is CRITICAL **
From: nagios@mivsek.eranova.si (User for Nagios)
Date: Sat, February 26, 2011 4:24 am
To: ken@kencausey.com
***** Nagios *****
Notification Type: PROBLEM
Service: Squeak website
Host: squeak box2
Address: 85.10.195.197
State: CRITICAL
Date/Time: Sat Feb 26 11:24:15 CET 2011
Additional Info:
CRITICAL - Socket timeout after 10 seconds
Hi guys,
This fast-growing image problem could be caused by a DoS attack. So Sean, go look there and see whether there is an enormous number of requests to our site and, especially, whether they are coming from the same IP. Knowing that IP, we can narrow down the culprit.
The past two image crashes were caused by the image not snapshotting every hour. We switched snapshotting off a while ago and forgot to switch it back on. OK, now it is on again.
Best regards Janko
Oops, Casey, not Sean! Casey, sorry for misnaming you :)
That's interesting. So you think the image is growing like that because it's getting flooded with requests? Guessing the growth in that scenario would be due to allocation of session objects?
I note that I do not see a robots.txt. Could the problem actually be a spider?
-- Janko Mivšek Svetovalec za informatiko Eranova d.o.o. Ljubljana, Slovenija www.eranova.si tel: 01 514 22 55 faks: 01 514 22 56 gsm: 031 674 565
On 26. 02. 2011 19:29, Casey Ransberger wrote:
> That's interesting. So you think the image is growing like that because it's getting flooded with requests? Guessing the growth in that scenario would be due to allocation of session objects?
First let me note that we have had such a crash very rarely, once or twice in the few years this image has been running. So if this doesn't repeat much, I simply wouldn't spend much time on it. If it starts to repeat, then of course we need to react.
In Aida, when a request comes in, a new session is opened and a cookie is requested to be set in the response. On the next request from a normal web browser, the same session is used.
But if the client (like the wget command) sends requests repeatedly without the cookie, a session is created for every request, and this can grow memory fast.
> I note that I do not see a robots.txt. Could the problem actually be a spider?
We have spider visits every day, so no, spiders are not the cause.
Janko
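To make the session-growth mechanism described above concrete, here is a minimal toy model in Python. It is not Aida's code: the SessionStore class, its handle method and the simulate helper are hypothetical names invented for illustration. The only assumption is the behaviour described in the message, that a request without a known cookie opens a new session.

```python
import itertools


class SessionStore:
    """Toy server-side session table (hypothetical, not Aida's implementation)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.sessions = {}

    def handle(self, cookie=None):
        """Process one request and return the session id to set as a cookie."""
        if cookie in self.sessions:
            # Known client: the existing session is reused.
            return cookie
        # No cookie, or an unknown one: a new session is opened.
        session_id = next(self._ids)
        self.sessions[session_id] = {"state": {}}
        return session_id


def simulate(requests, keeps_cookie):
    """Send `requests` requests and return how many sessions exist afterwards."""
    store = SessionStore()
    cookie = None
    for _ in range(requests):
        session_id = store.handle(cookie)
        if keeps_cookie:
            # A normal browser sends the cookie back on the next request.
            cookie = session_id
    return len(store.sessions)


if __name__ == "__main__":
    print("browser-like client:", simulate(1000, keeps_cookie=True), "session(s)")
    print("cookie-less client: ", simulate(1000, keeps_cookie=False), "session(s)")
```

In this model the browser-like client ends up with a single session, while the cookie-less client leaves one session per request behind, which is the kind of growth a flood of wget-style requests could produce until the sessions are released.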
inline, irrelevant part of conversation snipped
On Feb 26, 2011, at 10:38 AM, Janko Mivšek janko.mivsek@eranova.si wrote:
> First let me note that we have had such a crash very rarely, once or twice in the few years this image has been running. So if this doesn't repeat much, I simply wouldn't spend much time on it. If it starts to repeat, then of course we need to react.
Am I conflating two different issues that we've seen with the site? For some reason I thought this happened more often than that.
> But if the client (like the wget command) sends requests repeatedly without the cookie, a session is created for every request, and this can grow memory fast.
Ah, now I see why you're thinking DoS. How long does an orphaned session live?
If we got hit really hard by a lot of requests all at once, would we be able to tell by examining the Apache logs?
What evidence might Aida leave in the image, if any?
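On the Apache log question, one possible way to look for a flood from a single address is to count requests per client. The sketch below is in Python and assumes the access log uses Apache's common or combined log format, where the first whitespace-separated field on each line is the client address. The function names are hypothetical, and the sample lines are invented for illustration (they use documentation-only IP ranges and are not taken from the real server).

```python
from collections import Counter


def requests_per_client(log_lines):
    """Count requests per client, taking the first field of each line as the client address."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def top_clients(log_lines, n=5):
    """Return the n busiest clients as (address, request_count) pairs."""
    return requests_per_client(log_lines).most_common(n)


# Invented sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [26/Feb/2011:11:20:01 +0100] "GET / HTTP/1.1" 200 5120',
    '203.0.113.7 - - [26/Feb/2011:11:20:01 +0100] "GET / HTTP/1.1" 200 5120',
    '198.51.100.23 - - [26/Feb/2011:11:20:02 +0100] "GET /about HTTP/1.1" 200 2048',
    '203.0.113.7 - - [26/Feb/2011:11:20:02 +0100] "GET / HTTP/1.1" 200 5120',
]

if __name__ == "__main__":
    for address, count in top_clients(sample):
        print(address, count)
```

To run it against a real log, the lines of the access log file can be passed in place of sample, for example by iterating over an open file object. A single address that dominates the counts around the time of the alert would support the DoS theory. An even spread would point elsewhere.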