The answer is yes, but of course source.squeak.org is much smaller in terms of the data it has to handle, and its traffic level is much lower, so its swings in memory usage are rarely a significant problem.
I should have said something more when you started discussing this, and I apologize for my level of silence. Given the amount of trouble the original owners had with SqueakSource.com, I don't see how we can expect to do better, particularly with a virtual server with only 1GB of RAM allocated to it. Setting it to read-only (and I'm not sure, but perhaps that is how it is set now) will of course reduce the load. But by how much? That's not an easy question to answer.
Ken
On 10/08/2013 01:08 PM, David T. Lewis wrote:
Aha. That would do it for sure. So something is going on in the squeaksource image that is using a *lot* of object memory for some period of time. The use of 957m resident memory would very likely be enough to cause the symptoms that we saw.
Do you know if we see any similar pattern of memory usage on the source.squeak.org server? I'm already convinced that the squeaksource.com image badly needs to be updated to the same level of Squeak/Seaside/SqueakSource as our source.squeak.org server (due to socket leak problems if nothing else).
I also recall that squeaksource.com on the SCG server had horrible performance problems whenever we tried to commit a large MCZ to the VMMaker repository, and I always assumed that it was memory related in some way or another.
Thanks a lot Ken,
Dave
On Tue, Oct 08, 2013 at 12:37:17PM -0500, Ken Causey wrote:
oom_killer
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12929 ssdotcom 20 0 1028m 957m 784 R 99.9 95.0 12:05.45 /usr/local/lib/squeak/4.10.5-2619/squeakvm -vm-display-null squeaksource.2.image
Note the 6th field (RES, the resident memory): 957m.
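[Editorial sketch, not a command from the thread: one way to catch a spike like the 957m one above is to log the VM's resident set size periodically. The image name "squeaksource.2.image" is taken from the top line above; the log path is my own invention.]

```shell
#!/bin/sh
# Sketch only: log the resident set size (the RES column in top) of the
# squeaksource VM once a minute, so a memory spike leaves a timestamped trace.

rss_kb() {
    # ps reports RSS in kilobytes on Linux
    ps -o rss= -p "$1" | tr -d ' '
}

# Find the VM running the squeaksource image (name taken from the top output)
PID=$(pgrep -f 'squeaksource.2.image' | head -n 1)

while [ -n "$PID" ] && kill -0 "$PID" 2>/dev/null; do
    echo "$(date '+%F %T') $(rss_kb "$PID") kB" >> /var/tmp/squeak-rss.log
    sleep 60
done
```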
Ken
On 10/08/2013 12:22 PM, Frank Shearar wrote:
On 8 October 2013 17:48, David T. Lewis <lewis@mail.msen.com> wrote:
On Tue, Oct 08, 2013 at 04:33:28PM +0100, Frank Shearar wrote:
On 6 October 2013 19:00, David T. Lewis <lewis@mail.msen.com> wrote:
On Sun, Oct 06, 2013 at 11:18:25AM -0400, David T. Lewis wrote:
> On Sun, Oct 06, 2013 at 04:52:21PM +0200, Tobias Pape wrote:
>>
>> So, uptime said:
>> root@box3-squeak:/home/ssdotcom# uptime
>>  16:32:49 up 166 days, 22 min, 1 user, load average: 0.70, 4.74, 8.52
>>
>> And the _last two_ numbers are concerning. Basically, the server was
>> overloaded. The Squeak VM uses about a gig of virtual memory (really?)
>> and seems to compete with the jenkins running on the server. htop says
>> jenkins uses 25% of the system's memory while Squeak uses 19% (both of
>> which I deem high).
>> So in the event of some jenkins jobs firing off and SqueakSource
>> answering some requests, the server might become unresponsive?
>>
>
> Something like that, I think. I'm not sure what was generating the load,
> although there is no question that adding squeaksource to box3 adds a
> significant resource demand above that of the Jenkins jobs.
>
> Allocating a big address space (1G) is normal for the VM, and in this
> case the image is actually using a bit under 200MB, which is 20% of the
> system memory. If there is some combination of squeaksource and jenkins
> activity that pushes the total demand to the point of requiring swapping,
> then it's possible that this would make the system unresponsive as I was
> seeing.
>
> A number of the Jenkins jobs run squeak VMs in addition to the Java
> stuff, so some combination of these might add up to a problem.
>
I am now running top every 30 seconds for the next 24 hours, with output directed to ~ssdotcom/tmp/top.out. Possibly this will show us something interesting.
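[Editorial sketch: the thread does not show the exact invocation Dave used, but a 30-second sample over 24 hours maps naturally onto top's batch mode. The arithmetic below (2880 iterations) is my reconstruction.]

```shell
#!/bin/sh
# Sketch of a 24-hour top capture at 30-second intervals (assumed
# invocation, not quoted from the thread).

INTERVAL=30                                  # seconds between samples
HOURS=24
ITERATIONS=$(( HOURS * 3600 / INTERVAL ))    # 2880 samples over 24 hours

# -b: batch mode (plain text, no terminal control sequences)
# -d: delay between samples; -n: number of samples, then exit
top -b -d "$INTERVAL" -n "$ITERATIONS" > "$HOME/tmp/top.out"
```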
If that's still running, it's probably saying "ow! ow! stop it!" right now. If Tony Garnock-Jones & I could figure out why jobs are failing on his slaves, I'd suggest moving builds off the box entirely. I'll probably turn my old laptop into a build slave... once I can get it up & running again. That too will help with keeping work off the box.
No worries, I only ran it for a 24 hour period. I saw occasional load increases, but nothing like the "load average: 0.70, 4.74, 8.52" that Tobias spotted right after the outage. I think we just need to keep our eyes open for problems in case it comes back ... usually if a thing can fail once, it will fail again eventually ;-)
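[Editorial sketch: scanning a batch-mode top log for load spikes like the "0.70, 4.74, 8.52" one can be done with a short awk filter. The filename top.out and the 2.0 threshold are my own choices.]

```shell
# Print only the top header lines whose 1-minute load average exceeds 2.0.
# Batch-mode top emits a header like:
#   top - 16:32:49 up 166 days, ... load average: 0.70, 4.74, 8.52
awk -F'load average: ' '/load average:/ {
    split($2, la, ", ")        # la[1..3] = 1-, 5-, 15-minute loads
    if (la[1] + 0 > 2.0)       # "+ 0" forces numeric comparison
        print $0
}' top.out
```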
Hm, OK, so build.squeak.org could be unavailable for a different reason!
frank
Dave