On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
Hi
On 18.09.2013, at 20:01, David T. Lewis lewis@mail.msen.com wrote:
Yes, this is how the image is configured. It is a copy of the actual squeaksource.com image, with only a few changes that I made to correct problems with repository loading, image saving, etc.
Ok, then I know this image.
I installed it in /home/ssdotcom/SqueakSource/ and the file and directory structure below this is an exact copy of the files at squeaksource.com.
Ok.
Can anyone tell me how to write a rule that would cause http://box3.squeak.org to be mapped to the squeaksource service running on port 8888?
First, arrange that squeaksource is the default seaside app in the image so that it responds to requests on /.
I believe that this is the current configuration (although I am away and cannot check anything in the image right now). The image is currently running on box3.squeak.org:8888 and squeaksource is the default application.
good.
Then, just use the first (with On) and the last Rewrite-statement.
I am attaching a copy of the /etc/apache2/sites-available/squeaksource.com file. This is a copy of the one that SCG provided to us, and I am trying to edit it so that it will work on box3.squeak.org. Can you please take a look at this file and tell me if it looks right to you?
It looks exactly like the SCG one, with your obvious changes :)
In the future, the public URL will be squeaksource.com, but of course for now that URL is in use for the real squeaksource.com. I want to set up the Apache configuration so that it will work when we switch the real URL, but in advance of that I want to test it to make sure it is actually going to work.
Well, the ServerName and ServerAlias really have to match the public DNS name when we use <VirtualHost *:80>. Apache then checks the host header and matches against that ServerName/Alias. What are the logs? (squeaksource-error.log)
Ah, I think I see now. Thanks.
I'm not sure what those logs are, but it looks like they will be just the normal apache logging (whatever that might happen to be) renamed so you can see that they came from the squeaksource.com virtual host.
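A minimal sketch of how the name-based virtual host could be tested before the DNS switch, assuming curl is available on the machine doing the test. It sends an explicit Host header to the server, so Apache matches the request against ServerName/ServerAlias without any DNS change. The names CURL and probe_vhost are made up for this illustration and are not part of the thread.

```shell
#!/bin/sh
# Sketch only: request a specific name-based virtual host by sending an
# explicit Host header, without touching DNS.
# CURL and probe_vhost are illustrative names (assumptions).
CURL=${CURL:-curl}

# usage: probe_vhost <host-header> <server-to-contact>
# Prints the HTTP status code returned for "/" on that virtual host.
probe_vhost() {
    "$CURL" -s -o /dev/null -w '%{http_code}' -H "Host: $1" "http://$2/"
}

# Example (not run here):
#   probe_vhost squeaksource.com box3.squeak.org
#   probe_vhost build.squeak.org box3.squeak.org
```

If both probes return a success status and the pages differ as expected, the two virtual hosts are being selected by name on the same port.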
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
Best -Tobias
Thanks a lot for your help.
Dave
On 20.09.2013, at 02:40, "David T. Lewis" lewis@mail.msen.com wrote:
On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
[…]
Well, the ServerName and ServerAlias really have to match the public DNS name when we use <VirtualHost *:80>. Apache then checks the host header and matches against that ServerName/Alias. What are the logs? (squeaksource-error.log)
Ah, I think I see now. Thanks.
I'm not sure what those logs are, but it looks like they will be just the normal apache logging (whatever that might happen to be) renamed so you can see that they came from the squeaksource.com virtual host.
No, I meant: is there anything (interesting) in those logs? :)
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
:) I am on box-admins, too ;)
Best -Tobias
On Fri, Sep 20, 2013 at 09:14:01AM +0200, Tobias Pape wrote:
On 20.09.2013, at 02:40, "David T. Lewis" lewis@mail.msen.com wrote:
On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
[…]
Well, the ServerName and ServerAlias really have to match the public DNS name when we use <VirtualHost *:80>. Apache then checks the host header and matches against that ServerName/Alias. What are the logs? (squeaksource-error.log)
Ah, I think I see now. Thanks.
I'm not sure what those logs are, but it looks like they will be just the normal apache logging (whatever that might happen to be) renamed so you can see that they came from the squeaksource.com virtual host.
No, I meant: is there anything (interesting) in those logs? :)
The two log files exist but are empty so far.
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
:) I am on box-admins, too ;)
Best -Tobias
If I get you an account on box3, could you look at the apache configurations (what user ID do you prefer)? I won't have much time to work on this until Sunday evening, but perhaps if you could look at it you will be able to see what I have been missing.
My immediate goal is to get it working such that http://build.squeak.org brings up the existing Jenkins home page (running on port 8080), and http://box3.squeak.org brings up the squeaksource home page (running on 8888).
This will confirm that both servers can run on box3 without conflict. Once this is done, we can change the configuration to support http://squeaksource.com and update DNS records accordingly.
Dave
On 20.09.2013, at 14:30, David T. Lewis lewis@mail.msen.com wrote:
On Fri, Sep 20, 2013 at 09:14:01AM +0200, Tobias Pape wrote:
On 20.09.2013, at 02:40, "David T. Lewis" lewis@mail.msen.com wrote:
On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
[…]
No, I meant: is there anything (interesting) in those logs? :)
The two log files exist but are empty so far.
better than errors…
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
:) I am on box-admins, too ;)
Best -Tobias
If I get you an account on box3, could you look at the apache configurations (what user ID do you prefer)?
tpape
I won't have much time to work on this until Sunday evening, but perhaps if you could look at it you will be able to see what I have been missing.
I will.
My immediate goal is to get it working such that http://build.squeak.org brings up the existing Jenkins home page (running on port 8080), and http://box3.squeak.org brings up the squeaksource home page (running on 8888).
So the public port in both cases is 80, and the names are build.… and box3.… respectively?
This will confirm that both servers can run on box3 without conflict. Once this is done, we can change the configuration to support http://squeaksource.com and update DNS records accordingly.
Yey.
Best -Tobias
On Sat, Sep 21, 2013 at 10:27:43AM +0200, Tobias Pape wrote:
On 20.09.2013, at 14:30, David T. Lewis lewis@mail.msen.com wrote:
On Fri, Sep 20, 2013 at 09:14:01AM +0200, Tobias Pape wrote:
On 20.09.2013, at 02:40, "David T. Lewis" lewis@mail.msen.com wrote:
On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
[…]
No, I meant: is there anything (interesting) in those logs? :)
The two log files exist but are empty so far.
better than errors…
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
:) I am on box-admins, too ;)
Best -Tobias
If I get you an account on box3, could you look at the apache configurations (what user ID do you prefer)?
tpape
Thanks, I created user account tpape on box3.squeak.org with sudo privileges. I will send the password to you in a separate email.
If you make any system changes, please edit the file /root/admin-log.txt to leave a record of what you did.
Squeaksource.com is running from user account ssdotcom.
I won't have much time to work on this until Sunday evening, but perhaps if you could look at it you will be able to see what I have been missing.
I will.
My immediate goal is to get it working such that http://build.squeak.org brings up the existing Jenkins home page (running on port 8080), and http://box3.squeak.org brings up the squeaksource home page (running on 8888).
So the public port in both cases is 80, and the names are build.… and box3.… respectively?
Yes, we have build.squeak.org pointing to the Jenkins home. That is the site that people expect to see, and we need to be careful not to break this during the transition.
The name box3.squeak.org points to the same machine, but just brings up a default web page. I want to temporarily arrange for http://box3.squeak.org to bring up the squeaksource.com site (running on 8888), and once we are comfortable that it works happily alongside build.squeak.org, we will change it to work with the URL for squeaksource.com.
This will confirm that both servers can run on box3 without conflict. Once this is done, we can change the configuration to support http://squeaksource.com and update DNS records accordingly.
Yey.
Best -Tobias
Thanks! Dave
Hi.
It works now. Two things:
- The squeaksource.com config file was present in /etc/apache2/sites-available but not linked into /etc/apache2/sites-enabled/.
- The default site linked into /etc/apache2/sites-enabled/000-default shadowed the squeaksource.com vhost, as it was the default (i.e., first) vhost and it shared the server name box3.squeak.org.
Best -Tobias
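The two problems above can be checked with a short script. This is only a sketch, assuming the Debian-style sites-available / sites-enabled layout mentioned in the message; the helper names site_is_enabled and first_enabled_site are made up for the illustration. On Debian the usual way to create the missing link is a2ensite, and apache2ctl -S prints the virtual hosts as Apache parsed them.

```shell
#!/bin/sh
# Sketch only, assuming the Debian-style layout
# <conf-dir>/sites-available and <conf-dir>/sites-enabled.
# site_is_enabled and first_enabled_site are illustrative helper names.

# usage: site_is_enabled <apache-conf-dir> <site-name>
# Succeeds if some entry in sites-enabled resolves to the same file as
# sites-available/<site-name>.
site_is_enabled() {
    conf_dir=$1
    site=$2
    target=$(readlink -f "$conf_dir/sites-available/$site")
    for link in "$conf_dir"/sites-enabled/*; do
        [ -e "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            return 0
        fi
    done
    return 1
}

# usage: first_enabled_site <apache-conf-dir>
# Prints the entry that sorts first in sites-enabled. With a wildcard
# Include, Apache reads these files in alphabetical order, so this is
# the vhost that acts as the default for its address and port.
first_enabled_site() {
    for link in "$1"/sites-enabled/*; do
        [ -e "$link" ] || continue
        basename "$link"
        return 0
    done
    return 1
}

# Example (not run here):
#   site_is_enabled /etc/apache2 squeaksource.com || echo "not enabled"
#   first_enabled_site /etc/apache2
```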
On Sat, Sep 21, 2013 at 08:37:03PM -0400, David T. Lewis wrote:
On Sat, Sep 21, 2013 at 10:54:59PM +0200, Tobias Pape wrote:
Hi.
It works now.
Levente, Tobias:
Brilliant! Thanks to both of you.
And as long as I am asking stupid questions, here is one more:
Squeaksource.com is currently using a proxy server that apparently is activated in the squeaksource image with this:
HTTPSocket useProxyServerNamed: 'proxy.unibe.ch' port: 80.
What does this do? Is it relevant to our copy of squeaksource.com that is currently running on http://box3.squeak.org?
I am assuming that it is not relevant, but better to ask a foolish question than prove myself a fool by not asking.
Thanks,
Dave
On 2013-09-23, at 01:12, "David T. Lewis" lewis@mail.msen.com wrote:
Squeaksource.com is currently using a proxy server that apparently is activated in the squeaksource image with this:
HTTPSocket useProxyServerNamed: 'proxy.unibe.ch' port: 80.
What does this do?
It sets up a proxy server for outgoing http requests.
Is it relevant to our copy of squeaksource.com that is currently running on http://box3.squeak.org?
It's irrelevant for serving, but e.g. if you use a Monticello http repository it would try to use that proxy. Better to clean it out.
- Bert -
On Mon, Sep 23, 2013 at 12:37:47PM +0200, Bert Freudenberg wrote:
On 2013-09-23, at 01:12, "David T. Lewis" lewis@mail.msen.com wrote:
Squeaksource.com is currently using a proxy server that apparently is activated in the squeaksource image with this:
HTTPSocket useProxyServerNamed: 'proxy.unibe.ch' port: 80.
What does this do?
It sets up a proxy server for outgoing http requests.
Is it relevant to our copy of squeaksource.com that is currently running on http://box3.squeak.org?
It's irrelevant for serving, but e.g. if you use a Monticello http repository it would try to use that proxy. Better to clean it out.
- Bert -
Good, thanks.
The setting is disabled in our squeaksource image that is currently running at http://box3.squeak.org, so I'll leave it that way.
Monticello http repository access is working for me without problems. Yesterday I made some updates to this repository, and it worked as expected:
MCHttpRepository location: 'http://box3.squeak.org/CommandShell' user: 'myuser' password: 'mypasswd'
Dave
On 23.09.2013, at 14:04, "David T. Lewis" lewis@mail.msen.com wrote:
On Mon, Sep 23, 2013 at 12:37:47PM +0200, Bert Freudenberg wrote:
On 2013-09-23, at 01:12, "David T. Lewis" lewis@mail.msen.com wrote:
Squeaksource.com is currently using a proxy server that apparently is activated in the squeaksource image with this:
HTTPSocket useProxyServerNamed: 'proxy.unibe.ch' port: 80.
What does this do?
It sets up a proxy server for outgoing http requests.
Is it relevant to our copy of squeaksource.com that is currently running on http://box3.squeak.org?
It's irrelevant for serving, but e.g. if you use a Monticello http repository it would try to use that proxy. Better to clean it out.
- Bert -
Good, thanks.
The setting is disabled in our squeaksource image that is currently running at http://box3.squeak.org, so I'll leave it that way.
Monticello http repository access is working for me without problems. Yesterday I made some updates to this repository, and it worked as expected:
MCHttpRepository location: 'http://box3.squeak.org/CommandShell' user: 'myuser' password: 'mypasswd'
What Bert meant here is Monticello access _from_ the image on the box _to_ some other repo. As long as http is working, MC _to_ the image will be fine, anyway.
Best -Tobias
On 2013-09-23, at 14:22, Tobias Pape Das.Linux@gmx.de wrote:
On 23.09.2013, at 14:04, "David T. Lewis" lewis@mail.msen.com wrote:
On Mon, Sep 23, 2013 at 12:37:47PM +0200, Bert Freudenberg wrote:
On 2013-09-23, at 01:12, "David T. Lewis" lewis@mail.msen.com wrote:
Squeaksource.com is currently using a proxy server that apparently is activated in the squeaksource image with this:
HTTPSocket useProxyServerNamed: 'proxy.unibe.ch' port: 80.
What does this do?
It sets up a proxy server for outgoing http requests.
Is it relevant to our copy of squeaksource.com that is currently running on http://box3.squeak.org?
It's irrelevant for serving, but e.g. if you use a Monticello http repository it would try to use that proxy. Better to clean it out.
- Bert -
Good, thanks.
The setting is disabled in our squeaksource image that is currently running at http://box3.squeak.org, so I'll leave it that way.
Monticello http repository access is working for me without problems. Yesterday I made some updates to this repository, and it worked as expected:
MCHttpRepository location: 'http://box3.squeak.org/CommandShell' user: 'myuser' password: 'mypasswd'
What Bert meant here is Monticello access _from_ the image on the box _to_ some other repo. As long as http is working, MC _to_ the image will be fine, anyway.
Best -Tobias
I'm pretty sure Dave understood :) It's just accidental he uses a repo that happens to be on the same machine, perhaps even served by the same image.
- Bert -
On Mon, Sep 23, 2013 at 02:36:08PM +0200, Bert Freudenberg wrote:
On 2013-09-23, at 14:22, Tobias Pape Das.Linux@gmx.de wrote:
What Bert meant here is Monticello access _from_ the image on the box _to_ some other repo. As long as http is working, MC _to_ the image will be fine, anyway.
Best -Tobias
I'm pretty sure Dave understood :) It's just accidental he uses a repo that happens to be on the same machine, perhaps even served by the same image.
Bert,
You are giving me way too much credit ;-)
FYI, this is my "punch list" of things to do for the squeaksource.com migration, where '#' means that the item is done. We now have a running copy of squeaksource.com, and I think we are close to a point where we can consider making the DNS switchover. Some of the items on the list (e.g. configure and enable outbound mail) can reasonably be finished after the switchover is done.
# - Obtain SS copy (complete)
# - Unpack and run locally (complete, brought image up to date with repo)
# - Fix problems with reloading repository at image start (worked around
#   it, conflict between author initials as they appear in file name versus
#   in the MCZ zip versus registered user initials, multi-byte characters
#   trigger the bug).
# - Identify server (not done, awaiting box-admin guidance) (no help, so
#   took an executive decision to run it on box.squeak.org, should be good
#   enough for the transition)
# - Upload to squeak.org (partly done, full copy is on box3 in ~lewis/SqueakSource)
# - Verify headless operation on squeak.org (complete)
# - Create squeaksource account on boxN, install in suitable directory
#   (using box3 for now, account is called 'ssdotcom')
# - Verify and/or provide suitable VM on boxN (use the installed interpreter VM)
# - Implement auto-start / restart (use procedure from source.squeak.org
#   if possible) (wrote a new one for now, called runimage.sh).
  - Wire the auto start into cron or system start scripts
# - Implement apache front end (help needed, I don't know how this is
#   done) (Thanks Tobias and Levente)
# - Configure and enable proxy server in image (related to apache?) (not required)
  - Test mail agent, configure email targets
  - Set time zone in image (property in the repository)
# - Add some kind of image save for persistence (fixed - needed to schedule
#   save in UI thread)
  - Update introText (need input from SCG, provide credit and thanks)
  - Turn on mail delivery
  - Switch the DNS entry
  - Update DNS entry owner/caretaker?
  - Announce availability
  - Document the support process
  - Identify process owner for support ongoing (may be dtl for now)
  - Kudos and thanks to SCG
The transition of SqueakSource to our box3 server was completed earlier today, with the DNS record transition being completed about 11 hours ago. Everything seems to be working fine, but I want to note that I did some disruptive things about a half hour ago in case anyone has noticed any problems.
Right before the transition (yesterday), I ran some wget comparisons to look for MCZ commits that were present on the SCG server but not yet updated on our copy of SqueakSource. I found two missing commits, one in Chronos, and the other in Cryptography. After allowing our SqueakSource to run without interruption today, I copied those two files into our ~ssdotcom/Squeaksource/ss repository, and killed the SqueakSource image to allow it to restart normally and pick up the two missing commits. I actually restarted the image several times, because I noticed high CPU loads that I could not explain. This turned out to be normal processing of the repository cache thread, but it looked alarming to me at the time.
So ... if anyone noticed some brief outages on our squeaksource.com service, it was just me fumbling around trying to load the missing updates.
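The thread does not show how the wget comparison was done. As a rough sketch of one way to find commits present upstream but missing locally, assume two plain-text listings of MCZ file names, one name per line, however they were obtained. The function name missing_files and the listing file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: print names that appear in the upstream listing but not
# in the local listing. missing_files and both listing files are
# assumptions for illustration; the real comparison is not shown in
# the thread.

# usage: missing_files <upstream-listing> <local-listing>
missing_files() {
    up_sorted=$(mktemp)
    local_sorted=$(mktemp)
    # comm requires sorted input, so sort both listings first
    sort -u "$1" > "$up_sorted"
    sort -u "$2" > "$local_sorted"
    # -23 suppresses lines unique to the second file and lines common
    # to both, leaving only lines unique to the first file
    comm -23 "$up_sorted" "$local_sorted"
    rm -f "$up_sorted" "$local_sorted"
}

# Example (not run here):
#   missing_files scg-listing.txt box3-listing.txt
```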
We will want to keep an eye on overall system performance on box3 for a while. The squeaksource.com service is competing for resources with our Jenkins jobs, and I don't yet know if this will lead to any issues.
In the event that squeaksource.com requires a restart, as will be the case whenever box3 is rebooted, see the instructions in ~ssdotcom/README and copied below.
Dave
HOW TO START THE SYSTEM
-----------------------
Starting SqueakSource after a system boot:
1) Use sudo to log in to the system with the ssdotcom user:
lewis@box3-squeak:~$ sudo su - ssdotcom
ssdotcom@box3-squeak:~$
2) Verify that runimage.sh is not currently running. When it is running, you will see the following, which is NOT what you want (so make sure the line with "/bin/sh ./runimage.sh" is NOT present):
ssdotcom@box3-squeak:~$ ps -aef | grep runimage.sh
ssdotcom  6005  5999  0 20:35 pts/1    00:00:00 grep runimage.sh
ssdotcom 20344     1  0 Sep15 ?        00:00:12 /bin/sh ./runimage.sh
You can also do a similar check to satisfy yourself that the SqueakSource image is not running (so you should NOT see the line with "squeaksource.2.image" in the example below):
ssdotcom@box3-squeak:~$ ps -aef | grep squeaksource
ssdotcom  3255 20344  8 Sep22 ?        03:40:47 /usr/local/lib/squeak/4.10.5-2619/squeakvm -vm-display-null squeaksource.2.image
ssdotcom  6008  5999  0 20:35 pts/1    00:00:00 grep squeaksource
3) Change to administrative directory
ssdotcom@box3-squeak:~$ cd SqueakSource
ssdotcom@box3-squeak:~/SqueakSource$
4) Run the shell script using nohup.
ssdotcom@box3-squeak:~/SqueakSource$ nohup ./runimage.sh&
Note the trailing ampersand, which puts the script into background mode. The nohup command keeps the script from being terminated by the hangup signal when you log out, and also directs all of the script output (stdout and stderr) to a file called nohup.out.
Once started, the shell script will keep track of the running SqueakSource image, and will restart the image within 60 seconds of any unexpected exit.
It is generally safe to kill the image (i.e. kill the squeakvm process) at any time, and allow the runimage.sh script to restart it automatically. This may be necessary, for example, if the VNC server process within the image becomes inaccessible for some reason.
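The real runimage.sh is not reproduced in this thread. The following is only a hypothetical sketch of the kind of restart loop described above: run the command, and when it exits for any reason, wait and start it again. The names supervise, RESTART_DELAY and MAX_RUNS are made up for the illustration; MAX_RUNS exists only so the loop can be exercised without running forever.

```shell
#!/bin/sh
# Hypothetical restart loop; NOT the real runimage.sh, which is not
# shown in this thread. supervise, RESTART_DELAY and MAX_RUNS are
# made-up names.
RESTART_DELAY=${RESTART_DELAY:-60}   # seconds to wait before restarting
MAX_RUNS=${MAX_RUNS:-0}              # 0 = restart forever

# usage: supervise <command> [args...]
supervise() {
    runs=0
    while :; do
        status=0
        "$@" || status=$?
        runs=$((runs + 1))
        echo "run $runs exited with status $status"
        if [ "$MAX_RUNS" -gt 0 ] && [ "$runs" -ge "$MAX_RUNS" ]; then
            return 0
        fi
        sleep "$RESTART_DELAY"
    done
}

# Example (not run here), using the VM path and image name from the ps
# output above:
#   supervise /usr/local/lib/squeak/4.10.5-2619/squeakvm \
#       -vm-display-null squeaksource.2.image
```

Because the loop restarts the command after any exit, killing the squeakvm process is enough to trigger a clean restart, which matches the behaviour the README relies on.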
I'm watching our squeaksource.com service, which is now running for real on box3. Not unexpectedly, I see that it has a socket leak problem. I know this has been discussed in the past, but I don't really recall the status, hence my question:
Does our source.squeak.org service have the socket leak problem?
The symptoms are an accumulation of socket file handles as displayed in the /proc/<vmpid>/fd/ directory, while the image itself does not have a corresponding accumulation of Socket instances. This is an indication of the image discarding socket references without having properly closed them.
Thanks, Dave
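The symptom described above can be tracked over time by counting the socket entries in the VM's fd directory. This is a minimal sketch, assuming a Linux /proc filesystem where each entry is a symlink whose target looks like socket:[inode]; the function name count_socket_fds is made up for the illustration.

```shell
#!/bin/sh
# Sketch only: count file descriptors that refer to sockets.
# count_socket_fds is an illustrative name (assumption).

# usage: count_socket_fds <fd-directory>
count_socket_fds() {
    fd_dir=$1
    count=0
    for fd in "$fd_dir"/*; do
        # each entry under /proc/<pid>/fd is a symlink; skip anything else
        [ -L "$fd" ] || continue
        case $(readlink "$fd") in
            socket:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Example (not run here), with <vmpid> replaced by the squeakvm pid:
#   count_socket_fds /proc/<vmpid>/fd
```

A count that keeps growing while the number of Socket instances in the image stays flat would point to the leak described here.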
On 2013-10-02, at 14:57, "David T. Lewis" lewis@mail.msen.com wrote:
I'm watching our squeaksource.com service, which is now running for real on box3. Not unexpectedly, I see that it has a socket leak problem. I know this has been discussed in the past, but I don't really recall the status, hence my question:
Does our source.squeak.org service have the socket leak problem?
The symptoms are an accumulation of socket file handles as displayed in the /proc/<vmpid>/fd/ directory, while the image itself does not have a corresponding accumulation of Socket instances. This is an indication of the image discarding socket references without having properly closed them.
Thanks, Dave
Nope:
box2:~# ps ax | grep squeaksource
 2224 ?        S      0:00 supervise squeaksource
 2231 ?        S    889:25 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
24768 pts/0    S+     0:00 grep squeaksource
box2:~# ll /proc/2231/fd/
total 8
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 0 -> /dev/null
l-wx------ 1 squeaksource squeaksource 64 Oct 2 13:07 1 -> pipe:[4421]
l-wx------ 1 squeaksource squeaksource 64 Oct 2 13:07 2 -> pipe:[4421]
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 3 -> /home/squeaksource/ss/trunk
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 4 -> /home/squeaksource/SqueakV39.sources
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 5 -> /home/squeaksource/Squeak3.11-8824-SS.changes
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 6 -> socket:[19488796]
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 8 -> socket:[10334046]
- Bert -
On Wed, Oct 02, 2013 at 03:09:21PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 14:57, "David T. Lewis" lewis@mail.msen.com wrote:
I'm watching our squeaksource.com service, which is now running for real on box3. Not unexpectedly, I see that it has a socket leak problem. I know this has been discussed in the past, but I don't really recall the status, hence my question:
Does our source.squeak.org service have the socket leak problem?
The symptoms are an accumulation of socket file handles as displayed in the /proc/<vmpid>/fd/ directory, while the image itself does not have a corresponding accumulation of Socket instances. This is an indication of the image discarding socket references without having properly closed them.
Thanks, Dave
Nope:
box2:~# ps ax | grep squeaksource
 2224 ?        S      0:00 supervise squeaksource
 2231 ?        S    889:25 /usr/local/lib/squeak/3.11.3-2135/squeakvm -pathenc UTF-8 -encoding UTF-8 -plugins /usr/local/lib/squeak/3.11.3-2135 -vm-display-none /home/squeaksource/Squeak3.11-8824-SS.image
24768 pts/0    S+     0:00 grep squeaksource
box2:~# ll /proc/2231/fd/
total 8
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 0 -> /dev/null
l-wx------ 1 squeaksource squeaksource 64 Oct 2 13:07 1 -> pipe:[4421]
l-wx------ 1 squeaksource squeaksource 64 Oct 2 13:07 2 -> pipe:[4421]
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 3 -> /home/squeaksource/ss/trunk
lr-x------ 1 squeaksource squeaksource 64 Oct 2 13:07 4 -> /home/squeaksource/SqueakV39.sources
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 5 -> /home/squeaksource/Squeak3.11-8824-SS.changes
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 6 -> socket:[19488796]
lrwx------ 1 squeaksource squeaksource 64 Oct 2 13:07 8 -> socket:[10334046]
Thanks Bert,
I was hoping that was the case. It sounds like I need to educate myself as to how to migrate the old squeaksource.com to an image like the one we are using for source.squeak.org.
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Meanwhile I'll keep an eye on the socket leak and restart the squeaksource.com image as needed. I'm guessing that this will be required about once per week at the current rate of leakage.
Dave
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
- Bert -
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
Got it, thanks!
Dave
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
I tried exporting the squeaksource.com repository from a copy of the old image that has the socket leak problem, and I imported it into a copy of the newer source.squeak.org image. The export and import process takes some time, but based on a few minutes playing around with the resulting image on my PC at home, it seems to work just fine.
This looks much too easy, what's the catch? ;-)
I will be traveling for a few days, so I'm not going to make any radical changes to the real squeaksource running on box3. But unless I'm missing something, it seems like moving the squeaksource.com repository into the newer image is a no-brainer. So I'll plan to do that some time next week.
Dave
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
I have replaced the older squeaksource.com image with a newer image based on the source.squeak.org image that Bert provided. I activated the new image on box3.squeak.org today, and am monitoring it for problems.
I started the repository export yesterday and completed the import to the new image today. During that period there have been no commits to squeaksource.com, so I anticipate no loss of data.
The new image is called ~ssdotcom/SqueakSource/squeaksource.3.image. In the event of problems, the rollback plan is to reactivate the older ~ssdotcom/SqueakSource/squeaksource.2.image (see ~ssdotcom/README for details).
Currently the new image is running and appears to work fine (although users may have noticed some brief outages, for which I apologize). However it is currently consuming a heavy CPU load, so I am watching to see if this goes down (I have seen similar patterns in the past related to cache updates, which seem to settle down after a while). If the CPU load does not go back down within the next hour, I will revert back to the old image and try this again on another day.
Dave
On Sat, Oct 12, 2013 at 05:58:06PM -0400, David T. Lewis wrote:
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
I have replaced the older squeaksource.com image with a newer image based on the source.squeak.org image that Bert provided. I activated the new image on box3.squeak.org today, and am monitoring it for problems.
I started the repository export yesterday and completed the import to the new image today. During that period there have been no commits to squeaksource.com, so I anticipate no loss of data.
The new image is called ~ssdotcom/SqueakSource/squeaksource.3.image. In the event of problems, the rollback plan is to reactivate the older ~ssdotcom/SqueakSource/squeaksource.2.image (see ~ssdotcom/README for details).
Currently the new image is running and appears to work fine (although users may have noticed some brief outages, for which I apologize). However it is currently consuming a heavy CPU load, so I am watching to see if this goes down (I have seen similar patterns in the past related to cache updates, which seem to settle down after a while). If the CPU load does not go back down within the next hour, I will revert back to the old image and try this again on another day.
All is well now. The initial CPU load was associated with the Cache-Thread process, which runs a statistics gathering job once daily. This is normal, and the system is now running with 98% idle as expected. I will continue to monitor, but I think the upgrade is successful.
Dave
On Sat, Oct 12, 2013 at 05:58:06PM -0400, David T. Lewis wrote:
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
Would you mind putting a copy of the image and changes files for source.squeak.org on box3 so I can take a look at it (I don't have access to box2)? Thanks.
Done: ~bertfreudenberg/
I have replaced the older squeaksource.com image with a newer image based on the source.squeak.org image that Bert provided. I activated the new image on box3.squeak.org today, and am monitoring it for problems.
I started the repository export yesterday and completed the import to the new image today. During that period there have been no commits to squeaksource.com, so I anticipate no loss of data.
The new image is called ~ssdotcom/SqueakSource/squeaksource.3.image. In the event of problems, the rollback plan is to reactivate the older ~ssdotcom/SqueakSource/squeaksource.2.image (see ~ssdotcom/README for details).
Currently the new image is running and appears to work fine (although users may have noticed some brief outages, for which I apologize). However it is currently consuming a heavy CPU load, so I am watching to see if this goes down (I have seen similar patterns in the past related to cache updates, which seem to settle down after a while). If the CPU load does not go back down within the next hour, I will revert back to the old image and try this again on another day.
The new squeaksource image seems to be running well. It is too early to say for sure, but the runaway socket leaks that I saw in the old image do not appear to be occurring in the new image. There is, however, one socket leak that I have found to be associated with the VNC server. This apparently leaves one open file descriptor when a client closes its connection (BTW, this is the first time I've really had occasion to use Ian's RFBServer, it is really quite amazing). This does not directly impact squeaksource.com stability so I'm not going to worry about it for now.
Overall performance seems similar to that of the previous image. I made several commits to the repository today with OSProcess updates, and I watched the effect on squeaksource (using top). CPU and memory impact was negligible.
However, there are other things going on in the image that generate a lot of load. There is a once-daily statistics update that consumes CPU for about half a minute or so. This is likely a much heavier impact than we would see with source.squeak.org due to the size of the repository. We also saw one incident in the old image in which memory utilization went through the roof, and at this point I have no clue what may have caused it. Aside from the socket leak problem, this was the most important problem affecting squeaksource stability (and the rest of the system, e.g. Jenkins, as well), so I'm watching for evidence of a recurrence.
As per guidance from Ken, squeaksource.com will remain on box3 (good, I don't really want to move it again). Frank, this does present a risk to the Jenkins jobs that are the primary purpose of box3. Please let me know if you see problems that may be related to squeaksource load.
Dave
On 13 October 2013 22:34, David T. Lewis lewis@mail.msen.com wrote:
On Sat, Oct 12, 2013 at 05:58:06PM -0400, David T. Lewis wrote:
On Wed, Oct 02, 2013 at 05:33:31PM +0200, Bert Freudenberg wrote:
On 2013-10-02, at 17:17, "David T. Lewis" lewis@mail.msen.com wrote:
I'll keep an eye out. It sounds like the load-causing jobs are intermittent though, so shouldn't present a massive problem. It's also a problem that can be mitigated by farming out the work to non-box3 build slaves.
I set up another slave, running off the laptop that's turned into my sons' Scratch-pad, but haven't yet mastered the unix-fu to make the build slave start automatically. That will help a bit, as do Tony's slaves (now that we beat their memory usage into shape).
frank
On Sun, Oct 13, 2013 at 05:34:11PM -0400, David T. Lewis wrote:
To follow up - the new squeaksource.com image has continued to run without problems since I wrote the above. I intentionally restarted it two days ago (testing signal handlers for SIGHUP and SIGTERM, based on a suggestion from Chris), but otherwise it has run reliably and no manual intervention has been needed.
The socket leak problem is resolved, aside from the minor VNC bug mentioned above, and there has been no recurrence of the unexplained system overload that had occurred one time with the old image.
I still need to set up squeaksource.com under daemontools (tips or examples from source.squeak.org welcome, I've never done it before). Once that is done, I expect that the squeaksource.com image can run indefinitely with little or no manual intervention.
Dave
On 2013-10-17, at 14:56, "David T. Lewis" lewis@mail.msen.com wrote:
Do you know how much traffic it has?
- Bert -
No, I don't know how much traffic we are getting. There is probably some information in the apache logs and/or the ss/ss.log file, but I have only cell phone access now so I cannot look at it today.
Dave
On Thu, Oct 17, 2013 at 04:29:09PM +0200, Bert Freudenberg wrote:
Do you know how much traffic it has?
The current apache log /var/log/apache2/squeaksource-access.log covers over five days of access logging (beginning at 13/Oct/2013:05:32:19 +0200) and it shows a good deal of read access activity:
    root@box3-squeak:/var/log/apache2# grep GET squeaksource-access.log | wc -l
    54163
    root@box3-squeak:/var/log/apache2#
So we are seeing around 10,000 GETs per day.
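As a rough cross-check of that per-day figure, the same log can be broken down by calendar day. This is only a minimal sketch, assuming the log uses the standard Apache common or combined format (timestamp in the fourth whitespace-separated field, request method in the sixth). The function name gets_per_day is made up for illustration. Unlike the grep above, it matches only the request method field rather than any line that contains the string GET.

```shell
# Count GET requests per calendar day in an Apache common/combined format log.
# Assumes field 4 looks like "[13/Oct/2013:05:32:19" and field 6 like "\"GET".
gets_per_day() {
    awk '$6 == "\"GET" { day = substr($4, 2, 11); count[day]++ }
         END { for (d in count) print d, count[d] }' "$1" | sort
}
```

It would be called as `gets_per_day /var/log/apache2/squeaksource-access.log`. The output is sorted as plain text, so it is in chronological order only when the log stays within a single month.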
Here are the repository updates that have been made during the last seven days:
    2013-10-11T13:13:22+00:00 STORED SmaccDevelopment/SmaCCDev-ThierryGoubier.34.mcz
    2013-10-11T15:37:48+00:00 STORED ProcessWrapper/ProcessWrapper-Core-GustavoSantos.3.mcz
    2013-10-13T17:21:39.327+00:00 STORED OSProcess/OSProcess-Base-ThierryGoubier.38.mcz
    2013-10-13T17:22:49.056+00:00 STORED OSProcess/OSProcess-Base-dtl.39.mcz
    2013-10-13T17:23:33.093+00:00 STORED OSProcess/OSProcess-dtl.85.mcz
    2013-10-16T01:19:17.699+00:00 STORED OSProcessPlugin/VMConstruction-Plugins-OSProcessPlugin.oscog-eem.43.mcz
    2013-10-17T08:26:45.651+00:00 STORED DaliotsPlayground/ConfigurationOfDaliotsPlayground-HwaJong.350.mcz
    2013-10-17T11:59:46.884+00:00 STORED DaliotsPlayground/ConfigurationOfDaliotsPlayground-HwaJong.351.mcz
    2013-10-17T18:07:40.466+00:00 STORED DaliotsPlayground/ConfigurationOfDaliotsPlayground-HwaJong.352.mcz
    2013-10-17T18:34:03.74+00:00 STORED DaliotsPlayground/ConfigurationOfDaliotsPlayground-HwaJong.353.mcz
Dave
On 2013-10-18, at 14:33, "David T. Lewis" lewis@mail.msen.com wrote:
Interesting, thanks! Hope it holds up :)
- Bert -
Dave, did you happen to merge the latest versions of SqueakSource packages hosted at source.squeak.org into your new server for SqueakSource.com?
Also, would you please commit back any additional fixes you made (e.g., socket leak problem?) to same repository?
Thanks.
At this point, the SqueakSource code in our squeaksource.com image should be identical to that of our source.squeak.org image. If I fix anything, I'll certainly commit the changes, but someone else fixed the socket leak problem and all I did is get squeaksource.com updated to take advantage of those fixes.
I know you are also working on the updates for source.squeak.org, and what I'd like to do if you agree is be a "close follower" of that work. When the source.squeak.org update goes live, I would like to apply the identical updates to our squeaksource.com image shortly afterwards. Does that make sense?
Dave
On Fri, Oct 18, 2013 at 12:59 PM, David T. Lewis lewis@mail.msen.com wrote:
At this point, the SqueakSource code in our squeaksource.com image should be identical to that of our source.squeak.org image. If I fix anything, I'll certainly commit the changes, but someone else fixed the socket leak problem and all I did is get squeaksource.com updated to take advantage of those fixes.
What are those fixes? I would like to ensure they're part of the new-trunk SS image at box4.squeak.org:8888.
I do not know what the fixes were, and I cannot say if they were fixes to SqueakSource, Seaside, or something in Squeak itself. I would certainly expect that the new image you are preparing on box4 will already contain the necessary fixes, but the only way to find out for sure is to keep an eye on your new image and watch for socket leaks. That's just a matter of watching /proc/<squeakpid>/fd/* and looking at how many sockets are open. If the number grows over time, that's not good. If the total number of open file descriptors approaches 1024, it is a Very Bad Thing.
Dave
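The descriptor check described above can be sketched as a small shell function. This is only an illustration, assuming a Linux /proc filesystem and that the caller is allowed to read the target process's fd directory. The name count_fds is made up.

```shell
# Print "<total> <sockets>" for the process with the given PID.
# Each entry under /proc/<pid>/fd is a symlink; sockets link to "socket:[inode]".
count_fds() {
    pid=$1
    total=0
    sockets=0
    for fd in /proc/"$pid"/fd/*; do
        # Skip the unexpanded glob when the directory is missing or unreadable.
        [ -e "$fd" ] || [ -L "$fd" ] || continue
        total=$((total + 1))
        case $(readlink "$fd") in
            socket:*) sockets=$((sockets + 1)) ;;
        esac
    done
    echo "$total $sockets"
}
```

Running `count_fds <squeakpid>` every so often and comparing the two numbers over time should show whether sockets are accumulating.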
On 18 October 2013 21:17, David T. Lewis lewis@mail.msen.com wrote:
Obviously you want to address the root cause - leaking descriptors - but a mitigation is to up the fd quota through /etc/security/limits.conf
frank
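For reference, a sketch of checking the current limit, with the kind of limits.conf entry that would raise it shown as comments. The user name ssdotcom and the value 4096 are assumptions for illustration only. limits.conf is applied by pam_limits when a session starts, so a process launched some other way may not pick up the new limit.

```shell
# Print the soft and hard open-file limits of the current shell.
fd_limits() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

fd_limits

# Hypothetical /etc/security/limits.conf entries (format: domain type item value):
#   ssdotcom  soft  nofile  4096
#   ssdotcom  hard  nofile  4096
```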
On Sat, Oct 19, 2013 at 08:02:22AM +0100, Frank Shearar wrote:
Obviously you want to address the root cause - leaking descriptors - but a mitigation is to up the fd quota through /etc/security/limits.conf
One more update - the file descriptor leak is not gone, although it is clearly much improved compared to the old image. Within the last day or so, the open descriptor count went up from about 40 to about 340. So the problem still happens, but much less frequently.
I am not going to restart the image, as I want to keep monitoring it and see how long it can go unattended. I am running a process in the image that will check fd count every few hours, and restart it if the count goes over 800. That should protect against image lockups if the count goes too high while I am not paying attention.
For the record, the socket leak process is:
    [[vmFileCount := (FileDirectory on: '/proc/', OSProcess thisOSProcess pid asString, '/fd') entries size.
      OSProcess trace: DateAndTime now asString, ' squeakvm has ', vmFileCount asString, ' open file descriptors'.
      vmFileCount > 800 ifTrue: [
          OSProcess trace: 'Too many open file handles, save image and exit'.
          "Save the image, exit and wait for the supervisory script to restart"
          Smalltalk snapshot: true andQuit: true].
      (Delay forSeconds: 3 * 3600) wait] repeat] fork name: 'the Socket leak monitor'.
Dave
On 2013-10-19, at 14:35, "David T. Lewis" lewis@mail.msen.com wrote:
Wouldn't it be better to snapshot in the UI process?
- Bert -
On Mon, Oct 21, 2013 at 11:56:23AM +0200, Bert Freudenberg wrote:
Wouldn't it be better to snapshot in the UI process?
- Bert -
Eeek! Thanks. I changed it to:
    [[vmFileCount := (FileDirectory on: '/proc/', OSProcess thisOSProcess pid asString, '/fd') entries size.
      OSProcess trace: DateAndTime now asString, ' squeakvm has ', vmFileCount asString, ' open file descriptors'.
      vmFileCount > 800 ifTrue: [
          OSProcess trace: 'Too many open file handles, save image and exit'.
          "Save the image, exit and wait for the supervisory script to restart"
          WorldState addDeferredUIMessage: [Smalltalk snapshot: true andQuit: true]].
      (Delay forSeconds: 3 * 3600) wait] repeat] fork name: 'the Socket leak monitor'.
Dave
Why is it better to snapshot in the UI process than a background process?
On Mon, Oct 21, 2013 at 7:32 AM, David T. Lewis lewis@mail.msen.com wrote:
On 2013-10-21, at 16:44, Chris Muller asqueaker@gmail.com wrote:
Why is it better to snapshot in the UI process than a background process?
Because the startup code will normally be executed in the UI process, not a background process. I'm pretty sure some weird things could happen if the UI process interrupts the startup code.
- Bert -
Today I discovered SeasidePlatformSupport class>>#deliverMailFrom:to: is part of the error-handler for SqueakSource.
When IT has a problem, however, the same error handler calls #deliverMailFrom:to: again as a means for handling the error which occurred while trying to handle the original error.
So, it's a runaway stack in a background process that is also inside some Mutex's critical: block which causes other requests to not be processed.
deliverMailFrom:to: opens up a socket. Hmmm.....
I've duplicated the problem in my localhost and applied a patch which I'll commit to source.squeak.org/ss shortly.
- Chris
On Sat, Oct 19, 2013 at 7:35 AM, David T. Lewis lewis@mail.msen.com wrote:
On Sat, Oct 19, 2013 at 08:02:22AM +0100, Frank Shearar wrote:
On 18 October 2013 21:17, David T. Lewis lewis@mail.msen.com wrote:
On Fri, Oct 18, 2013 at 12:59 PM, David T. Lewis lewis@mail.msen.com wrote:
At this point, the SqueakSource code in our squeaksource.com image should be identical to that of our source.squeak.org image. If I fix anything, I'll certainly commit the changes, but someone else fixed the socket leak problem and all I did is get squeaksource.com updated to take advantage of those fixes.
What are those fixes? I would like to ensure they're part of the new-trunk SS image at box4.squeak.org:8888.
I do not know what the fixes were, and I cannot say if they were fixes to SqueakSource, Seaside, or something in Squeak itself. I would certainly expect that the new image you are preparing on box4 will already contain the necessary fixes, but the only way find out for sure is to keep an eye on your new image and watch for socket leaks. That's just a matter of watching /proc/<squeakpid>/fd/* and looking at how many sockets are open. If the number grows over time, that's not good. If the total number of open file descriptors approaches 1024, it is a Very Bad Thing.
Obviously you want to address the root cause - leaking descriptors - but a mitigation is to up the fd quota through /etc/security/limits.conf
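Whatever quota is configured, it can help to compare the current descriptor count against the limit that actually applies to the process. The sketch below uses Python's standard `resource` module for that; the function name `near_fd_limit` and the 80% threshold are assumptions chosen for illustration.

```python
import resource


def near_fd_limit(open_count, soft_limit=None, fraction=0.8):
    """Return True if open_count has reached the given fraction of the fd limit.

    When soft_limit is None, the soft RLIMIT_NOFILE of the current process is used.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return False  # no limit configured, so nothing to approach
    return open_count >= fraction * soft_limit
```

With a limit of 1024, a count of 340 is still well below the threshold, while 820 is not.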
One more update - the file descriptor leak is not gone, although it is clearly much improved compared to the old image. Within the last day or so, the open descriptor count went up from about 40 to about 340. So the problem still happens, but much less frequently.
I am not going to restart the image, as I want to keep monitoring it and see how long it can go unattended. I am running a process in the image that will check fd count every few hours, and restart it if the count goes over 800. That should protect against image lockups if the count goes too high while I am not paying attention.
For the record, the socket leak process is:
[[vmFileCount := (FileDirectory on: '/proc/', OSProcess thisOSProcess pid asString, '/fd') entries size.
  OSProcess trace: DateAndTime now asString, ' squeakvm has ', vmFileCount asString, ' open file descriptors'.
  vmFileCount > 800 ifTrue: [
    OSProcess trace: 'Too many open file handles, save image and exit'.
    "Save the image, exit and wait for the supervisory script to restart"
    Smalltalk snapshot: true andQuit: true].
  (Delay forSeconds: 3 * 3600) wait] repeat] fork name: 'the Socket leak monitor'.
Dave
We have squeaksource.com set up on box3, but with mail disabled (so updates to the repository do not currently generate email notifications). After box3 becomes the official host box for squeaksource.com next week, I would like to be able to re-enable mail notifications.
box3 does not have mail installed, but box2 does. Until or unless box2 gets moved, we may as well use box2 for mail notifications, as opposed to installing mail on box3 just to support squeaksource.
So ....
I need someone with access to box2 to configure it to accept smtp requests from box3, so that I do not get the following:
'553 sorry, that domain isn''t in my list of allowed rcpthosts (#5.7.1)'
Can someone please enable this mail transport from box3 to box2? I don't know what mail transport is running on box2, but I expect this would be a matter of editing some /etc/foo file to allow inbound smtp from box3.
TIA,
Dave
I've checked the config file. I think the following lines should be added:
ProxyRequests Off #We don't want to be an open proxy
ProxyPreserveHost On #Seaside might need this
ProxyPassReverse / http://127.0.0.1:8888/ #Seaside might need this
I prefer to use 127.0.0.1 instead of "localhost", because the image listens only on the IPv4 address, and I have run into situations where "localhost" resolved to an IPv6 address. So I'd replace line 41 with:
RewriteRule ^/(.*)$ http://127.0.0.1:8888/$1 [P,L]
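The concern about "localhost" can be checked on a given machine by asking the resolver which address families a name maps to. A minimal Python sketch, assuming port 8888 as in the rule above; the helper name `address_families` is made up for this example.

```python
import socket


def address_families(host, port=8888):
    """Return the sorted address families (IPv4/IPv6) that host resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    # Each entry is (family, type, proto, canonname, sockaddr).
    return sorted({names.get(family, str(family)) for family, *_ in infos})
```

`address_families("127.0.0.1")` can only ever report IPv4, whereas the result of `address_families("localhost")` depends on the hosts file and resolver configuration of the machine, which is the reason for preferring the numeric address in the proxy target.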
Using vhost_combined for log format is not necessary, since squeaksource will have its own log files. So I'd replace line 23 with:
CustomLog ${APACHE_LOG_DIR}/squeaksource-access.log combined
www.box3.squeak.org doesn't resolve to the same IP as box3.squeak.org, so lines 9 and 10 should be replaced for now with:
ServerName box3.squeak.org
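To make the effect of the RewriteRule suggested above more concrete, here is a small Python sketch that applies the same pattern to a request path. It only imitates the pattern substitution, not Apache's proxy handling; `proxy_target` is a hypothetical helper and the sample path in the usage note is invented.

```python
import re

# Same pattern as in the RewriteRule: capture everything after the leading slash.
RULE = re.compile(r"^/(.*)$")


def proxy_target(path, backend="http://127.0.0.1:8888"):
    """Return the backend URL a request path would be mapped to, or None if no match."""
    match = RULE.match(path)
    if match is None:
        return None
    return f"{backend}/{match.group(1)}"
```

For example, `proxy_target("/OSProcess.html")` gives `http://127.0.0.1:8888/OSProcess.html`, and `proxy_target("/")` gives `http://127.0.0.1:8888/`, which is why squeaksource needs to be the default application answering on `/`.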
Levente
On Fri, 20 Sep 2013, David T. Lewis wrote:
On Fri, Sep 20, 2013 at 09:14:01AM +0200, Tobias Pape wrote:
On 20.09.2013 at 02:40, "David T. Lewis" lewis@mail.msen.com wrote:
On Thu, Sep 19, 2013 at 07:41:41PM +0200, Tobias Pape wrote:
[…]
Well, the ServerName and ServerAlias really have to match the public DNS name when we use <VirtualHost *:80>. Apache then checks the host header and matches against that ServerName/Alias. What are the logs? (squeaksource-error.log)
Ah, I think I see now. Thanks.
I'm not sure what those logs are, but it looks like they will be just the normal apache logging (whatever that might happen to be) renamed so you can see that they came from the squeaksource.com virtual host.
No, I meant, is there something (interesting) in those logs :)
The two log files exist but are empty so far.
shouldn't we move this to box-admins?
Yes (I cc'ed box-admins this time). I've been trying to work this out on the box-admins list but I suspect there may be a few more Apache gurus here on squeak-dev, so I decided to hijack this thread and see if I could get some tips. It worked :-)
:) I am on box-admins, too ;)
Best -Tobias
If I get you an account on box3, could you look at the apache configurations (what user ID do you prefer)? I won't have much time to work on this until Sunday evening, but perhaps if you could look at it you will be able to see what I have been missing.
My immediate goal is to get it working such that http://build.squeak.org brings up the existing Jenkins home page (running on port 8080), and http://box3.squeak.org brings up the squeaksource home page (running on 8888).
This will confirm that both servers can run on box3 without conflict. Once this is done, we can change the configuration to support http://squeaksource.com and update DNS records accordingly.
Dave
On 21.09.2013 at 21:30, Levente Uzonyi leves@elte.hu wrote:
I've checked the config file. I think the following lines should be added:
ProxyRequests Off #We don't want to be an open proxy
this is off by default.
ProxyPreserveHost On #Seaside might need this
ProxyPassReverse / http://127.0.0.1:8888/ #Seaside might need this
this is correct.
best -tobias
On Sat, 21 Sep 2013, Tobias Pape wrote:
On 21.09.2013 at 21:30, Levente Uzonyi leves@elte.hu wrote:
I've checked the config file. I think the following lines should be added:
ProxyRequests Off #We don't want to be an open proxy
this is off by default.
It wasn't off by default, which caused some trouble on this box.
Levente
ProxyPreserveHost On #Seaside might need this
ProxyPassReverse / http://127.0.0.1:8888/ #Seaside might need this
this is correct.
best -tobias
On 21.09.2013 at 23:18, Levente Uzonyi leves@elte.hu wrote:
On Sat, 21 Sep 2013, Tobias Pape wrote:
On 21.09.2013 at 21:30, Levente Uzonyi leves@elte.hu wrote:
I've checked the config file. I think the following lines should be added:
ProxyRequests Off #We don't want to be an open proxy
this is off by default.
It wasn't off by default, which caused some trouble on this box.
ah. is it off now? I didn't check…
best -tobias
On Sat, 21 Sep 2013, Tobias Pape wrote:
On 21.09.2013 at 23:18, Levente Uzonyi leves@elte.hu wrote:
On Sat, 21 Sep 2013, Tobias Pape wrote:
On 21.09.2013 at 21:30, Levente Uzonyi leves@elte.hu wrote:
I've checked the config file. I think the following lines should be added:
ProxyRequests Off #We don't want to be an open proxy
this is off by default.
It wasn't off by default, which caused some trouble on this box.
ah. is it off now? I didn't check…
I think so, but I still added it to the config file. Also removed the duplicate "ProxyPreserveHost On" directive.
Levente
best -tobias