VM dump? I'm not sure what you mean. The VM does not crash, the image is simply locked up. It's probable that I could have interrupted it with alt-. but that doesn't seem to work over VNC.
No website process active? You are mistaken and have created a second one now:
box2:~# ps auwx | grep website
website 8850 0.0 0.5 6800 5020 ? S 22:30 0:00 Xtightvnc :1 -desktop X -auth /home/website/.Xauthority -geometry 1024x768 -depth 24 -rfbwait 120000 -rfbauth /home/website/.vnc/passwd -rfbport 5901 -fp /usr/X11R6/lib/X11/fonts/Type1/,/usr/X11R6/lib/X11/fonts/Speedo/,/usr/X11R6/lib/X11/fonts/misc/,/usr/X11R6/lib/X11/fonts/75dpi/,/usr/X11R6/lib/X11/fonts/100dpi/ -co /usr/X11R6/lib/X11/rgb
website 8855 2.6 8.1 1052604 79196 ? S 22:30 1:07 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 /home/website/website/squeaksite.image
website 9044 0.0 0.6 8044 5920 ? S 22:32 0:00 Xtightvnc :4 -desktop X -auth /home/website/.Xauthority -geometry 1024x768 -depth 24 -rfbwait 120000 -rfbauth /home/website/.vnc/passwd -rfbport 5904 -fp /usr/X11R6/lib/X11/fonts/Type1/,/usr/X11R6/lib/X11/fonts/Speedo/,/usr/X11R6/lib/X11/fonts/misc/,/usr/X11R6/lib/X11/fonts/75dpi/,/usr/X11R6/lib/X11/fonts/100dpi/ -co /usr/X11R6/lib/X11/rgb
website 9049 1.6 7.7 1052616 74836 ? S 22:32 0:39 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 /home/website/website/squeaksite.image
That's two VNC servers running now and two website processes active. Did you think I didn't test the website after I restarted it?
Ken
-------- Original Message --------
Subject: Re: [Box-Admins] [FWD: ** PROBLEM Service Alert: squeak box2/Squeak website is CRITICAL **]
From: Janko Mivšek janko.mivsek@eranova.si
Date: Mon, February 28, 2011 4:52 pm
To: Squeak Hosting Support box-admins@lists.squeakfoundation.org
Hi Ken,
I checked too and see no website process active, so I restarted it and now I'm connected with VNC to the image.
It seems the image crashed today, but its last snapshot, at 9pm GMT, was saved correctly.
Can we see a VM dump somewhere?
Best regards
Janko
On 28. 02. 2011 23:47, Ken Causey wrote:
After I received this notice I checked and the website process had the CPU pegged with normal memory usage. I tried to connect with VNC and got connected but the image was locked up. There was a debugger open and I took a screenshot which can be found at
http://users.squeak.org/~kencausey/website_locked.png
I chatted in the IRC channel as I was fiddling with it:
2011-02-28 16:24:21 kencausey JankoMivsek: website process is flipping out again
2011-02-28 16:26:14 kencausey the memory usage is normal this time, it just has the CPU pegged
2011-02-28 16:26:41 kencausey looking at the logs, the last successful hit was the nagios check oddly enough, 2 hits before google hit the stats page again
2011-02-28 16:26:56 kencausey I don't see anything suspicious like the last time
2011-02-28 16:29:41 kencausey there is a debugger open on a send of #bottomContext to UndefinedObject
2011-02-28 16:29:50 kencausey I can't interact with it
2011-02-28 16:30:54 kencausey it's in a call to Process>>terminate
2011-02-28 16:32:38 kencausey restarting it now
2011-02-28 16:34:09 kencausey website is back up
From the apache logs:
this is when it went down:
80.81.242.100 - - [28/Feb/2011:21:41:03 +0000] "GET /stats.html?view=main&year=1684&month=8 HTTP/1.1" 200 24288 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
173.255.225.4 - - [28/Feb/2011:21:42:13 +0000] "GET /favicon.ico HTTP/1.1" 200 1406 "-" "Safari/6533.19.4 CFNetwork/454.11.5 Darwin/10.5.0 (i386) (MacBook3%2C1)"
89.212.16.244 - - [28/Feb/2011:21:42:18 +0000] "GET /ping.html HTTP/1.1" 200 - "-" "check_http/v1.4.14 (nagios-plugins 1.4.14)"
38.99.97.225 - - [28/Feb/2011:21:42:49 +0000] "GET /Smalltalk/ HTTP/1.1" 502 399 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
67.195.112.235 - - [28/Feb/2011:21:43:15 +0000] "GET /Merchandise/?version=3 HTTP/1.0" 502 403 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
We have a Googlebot hit to the stats page (relevant?), an irrelevant favicon request, and a successful Nagios ping, which I assume is Janko's and not relevant; then hits start failing with 502. Before that I see nothing suspicious and no flood of requests.
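For anyone who wants to repeat this check on a longer log, here is a minimal Python sketch that finds the first 5xx response and the last successful request before it. It assumes the Apache "combined" log format seen above; the regular expression, the `first_failure` helper and the `SAMPLE` list are illustrative names of mine, not part of any existing tool on box2.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def first_failure(lines):
    """Return (last_ok, first_bad).

    first_bad is the first entry with a 5xx status, last_ok is the entry
    immediately preceding it with a non-5xx status (or None).
    If no 5xx entry is found, first_bad is None.
    """
    last_ok = None
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        entry = match.groupdict()
        if entry["status"].startswith("5"):
            return last_ok, entry
        last_ok = entry
    return last_ok, None


# Three entries copied from the excerpt above.
SAMPLE = [
    '173.255.225.4 - - [28/Feb/2011:21:42:13 +0000] "GET /favicon.ico HTTP/1.1" 200 1406 "-" "Safari/6533.19.4 CFNetwork/454.11.5 Darwin/10.5.0 (i386) (MacBook3%2C1)"',
    '89.212.16.244 - - [28/Feb/2011:21:42:18 +0000] "GET /ping.html HTTP/1.1" 200 - "-" "check_http/v1.4.14 (nagios-plugins 1.4.14)"',
    '38.99.97.225 - - [28/Feb/2011:21:42:49 +0000] "GET /Smalltalk/ HTTP/1.1" 502 399 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"',
]

last_ok, first_bad = first_failure(SAMPLE)
print("last OK:      ", last_ok["time"], last_ok["request"])
print("first failure:", first_bad["time"], first_bad["request"], first_bad["status"])
```

On a real log you would pass the open file object instead of `SAMPLE`. The script only narrows the time window; it says nothing about why the image locked up.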
Ken
-------- Original Message --------
Subject: ** PROBLEM Service Alert: squeak box2/Squeak website is CRITICAL **
From: nagios@mivsek.eranova.si (User for Nagios)
Date: Mon, February 28, 2011 3:49 pm
To: ken@kencausey.com
***** Nagios *****
Notification Type: PROBLEM
Service: Squeak website
Host: squeak box2
Address: 85.10.195.197
State: CRITICAL
Date/Time: Mon Feb 28 22:49:47 CET 2011
Additional Info:
CRITICAL - Socket timeout after 10 seconds
--
Janko Mivšek
Svetovalec za informatiko
Eranova d.o.o.
Ljubljana, Slovenija
www.eranova.si
tel: 01 514 22 55
faks: 01 514 22 56
gsm: 031 674 565