[Box-Admins] Long running build process on build.squeak.org?

David T. Lewis lewis at mail.msen.com
Mon Jan 14 13:24:55 UTC 2013


Good idea to add a watchdog timer. Another good practice is to use
the 'nice' command (/usr/bin/nice) in the command lines that run Squeak.
This runs the tests at lower scheduling priority, so if a process gets
stuck consuming close to 100% cpu, its impact on other system users will
be reduced (it will still gobble up all the cpu, it just won't drag the
system down so badly).
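
As a rough sketch of combining the two ideas (this is not the actual
Jenkins job script; the function name, the niceness value and the
timeout are assumptions for illustration), a small Python wrapper could
start the command under 'nice' and kill it if it runs too long:

```python
import subprocess
import sys

def run_niced_with_watchdog(cmd, timeout_seconds, niceness=10):
    """Run cmd under nice(1) and kill it if it outlives the timeout.

    Returns (exit_code, timed_out). Name and defaults are hypothetical.
    """
    # nice(1) lowers the scheduling priority and then execs the command,
    # so the PID we hold is the PID of the command itself.
    proc = subprocess.Popen(["nice", "-n", str(niceness)] + list(cmd))
    try:
        return proc.wait(timeout=timeout_seconds), False
    except subprocess.TimeoutExpired:
        proc.kill()  # watchdog fired: send SIGKILL to the stuck process
        return proc.wait(), True
```

With a 15 minute limit this would be called as
run_niced_with_watchdog(squeak_command, 15 * 60), where squeak_command
is the list of arguments for the VM. Note that it only kills the process
it started, not any children that process may have spawned.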

I don't know what the problem was in this particular case, but one
thing that can result in Squeak consuming 100% is an error in the image
that causes too much memory usage, such as a recursion error. Squeak
keeps asking for more memory, the VM asks the OS for more, and eventually
you are swapping. If this turns out to have been the problem, you can
prevent the runaway memory condition with the '-memory' command line
option to the VM (but don't do that unless we can confirm that it really
*is* the problem, I'm just mentioning it for future reference).
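
Also for future reference only: a runaway memory condition can be capped
from outside the VM as well. The sketch below does not use the VM's
'-memory' option; it swaps in a different technique, an OS-level
address space limit (RLIMIT_AS, as enforced on Linux) applied to the
child process. The function name and the limit are assumptions.

```python
import resource
import subprocess
import sys

def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with its address space capped at max_bytes; return exit code.

    Hypothetical helper: an OS-level limit, not the VM's own -memory option.
    """
    def limit_child():
        # Runs in the child between fork and exec, so only the child is capped.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=limit_child).returncode
```

A process that keeps asking for memory then fails its allocation once it
reaches the cap, instead of pushing the whole machine into swap.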

Dave

On Mon, Jan 14, 2013 at 08:06:27AM +0000, Frank Shearar wrote:
> Ah, no, that's not a debugger then.
> 
> I'm going to slap a 15 minute kill time on the jobs later today: our
> longest running jobs so far are around 9 minutes.
> 
> frank
> 
> On 13 January 2013 20:26, Ken Causey <ken at kencausey.com> wrote:
> > Great, also I think I should point out that I don't think it was just that
> > an exception had not been caught.  The process was pegging the CPU (running
> > full out, 99%+ CPU usage).
> >
> > Ken
> >
> >
> > On 01/13/2013 02:18 PM, Frank Shearar wrote:
> >>
> >> I just killed the job. I'll need to add more output to the script,
> >> like the precise Cog version involved. I expect that particular job to
> >> be less stable than SqueakTrunk - it _is_ bleeding edge on both image
> >> _and_ VM side, after all.
> >>
> >> frank
> >>
> >> On 13 January 2013 19:37, Ken Causey<ken at kencausey.com>  wrote:
> >>>
> >>> Sorry, that process line was unintentionally chopped off
> >>>
> >>> jenkins  29126 99.6  2.3 1054380 24552 ?       R    03:20 1032:16
> >>> /var/lib/jenkins/workspace/CogVM/tmp/lib/squeak/4.0-2636/squeak
> >>> -vm-sound-null -vm-display-null
> >>>
> >>> /var/lib/jenkins/workspace/SqueakTrunkOnBleedingEdgeCog/target/TrunkImage.image
> >>> /var/lib/jenkins/workspace/SqueakTrunkOnBleedingEdgeCog/tests.st
> >>>
> >>> Ken
> >>>
> >>>
> >>> On 01/13/2013 01:10 PM, Ken Causey wrote:
> >>>>
> >>>>
> >>>> Roughly every day or two I login to box3 and check things out and check
> >>>> for package updates. With rare exception the system is quiet, I check
> >>>> for updates, apply any found, and move on. But today I find this (from
> >>>> ps auwx)
> >>>>
> >>>> jenkins 29126 99.7 2.3 1054380 24552 ? R 03:20 1000:40
> >>>> /var/lib/jenkins/workspace/CogVM/tmp/lib/squeak/4.0-2636/squeak
> >>>> -vm-sound-null -vm-display-null
> >>>>
> >>>>
> >>>> /var/lib/jenkins/workspace/SqueakTrunkOnBleedingEdgeCog/target/TrunkImage.image
> >>>> /var/lib/jenkins/workspace/Sq
> >>>>
> >>>> As you can see this has used 1000+ minutes of CPU time (which is less
> >>>> than the actual running time). I've not seen this before on the server.
> >>>> Is it perhaps the result of a new build project and expected? Or an
> >>>> actual problem? Out of caution and since the system is already busy I
> >>>> haven't checked for package updates yet today (I think the last time I
> >>>> did so was Friday).
> >>>>
> >>>> Ken
> >>>>
> >>>>
> >>>
> >>
> >>
> >

