[Box-Admins] Disk space usage on box3
David T. Lewis
lewis at mail.msen.com
Mon Nov 18 14:53:45 UTC 2013
On Mon, Nov 18, 2013 at 01:26:26PM +0000, Frank Shearar wrote:
> On 18 November 2013 12:38, David T. Lewis <lewis at mail.msen.com> wrote:
> > On Mon, Nov 18, 2013 at 10:02:17AM +0000, Frank Shearar wrote:
> >> On 17 November 2013 00:32, David T. Lewis <lewis at mail.msen.com> wrote:
> >> > On Sat, Nov 16, 2013 at 05:10:18PM -0600, Ken Causey wrote:
> >> >> The disk space on box3 is beginning to get dangerously low on filesystem
> >> >> space. At the moment it is 94% full with 3.8GB of free space. Lately
> >> >> it seems to increase 1% every 2 or 3 days.
> >> >>
> >> >> Primary offenders seem to be
> >> >>
> >> >> /var/lib/jenkins/ 33GB
> >> >>
> >> >> /home/ssdotcom/ 18GB
> >> >>
> >> >> I hope one or both of you can find something to delete.
> >> >
> >> > Most of the variation in disk usage is related to our Jenkins jobs. This
> >> > is to be expected, but it does mean that we will need to tend to the garden
> >> > and make sure that weeds do not take over. I have two suggestions:
> >> >
> >> > 1) Every Jenkins job has a description that is set up when we configure
> >> > the job. The description should (of course) explain the purpose of the job,
> >> > but it should also have some sort of tag line to identify the person who
> >> > is responsible for maintaining that job. For example, the description for
> >> > the InterpreterVM job includes this:
> >> >
> >> > "This Jenkins project is maintained by Dave Lewis (lewis at mail.msen.com)"
> >> >
> >> > 2) All of the jobs consume a fair amount of disk space, and it is pretty
> >> > easy to let this get out of control. I think this can usually be managed
> >> > in the Jenkins project configurations, so we need to keep an eye on the
> >> > high-usage jobs and fix up their settings accordingly.
> >> >
> >> > Here is the current disk utilization for our Jenkins jobs:
> >> >
> >> > jenkins at box3-squeak:~/workspace$ du -s *
> >> > 204660 CogVM
> >> > 260320 ExternalPackage-AndreasSystemProfiler
> >> > 262948 ExternalPackage-Control
> >> > 286952 ExternalPackage-FFI
> >> > 392016 ExternalPackage-FileSystem
> >> > 456600 ExternalPackage-Fuel
> >> > 427560 ExternalPackage-Magma
> >> > 256144 ExternalPackage-Nebraska
> >> > 256184 ExternalPackage-Nutcracker
> >> > 263092 ExternalPackage-OSProcess
> >> > 228668 ExternalPackage-Phexample
> >> > 256732 ExternalPackage-Quaternion
> >> > 255692 ExternalPackage-RoelTyper
> >> > 417300 ExternalPackages
> >> > 414608 ExternalPackages-Metacello
> >> > 255580 ExternalPackage-SqueakCheck
> >> > 412592 ExternalPackages-Squeak4.3
> >> > 338264 ExternalPackages-Squeak4.4
> >> > 256288 ExternalPackage-Universes
> >> > 255752 ExternalPackage-WebClient
> >> > 256268 ExternalPackage-XML-Parser
> >> > 444996 ExternalPackage-Xtreams
> >> > 387160 ExternalPackage-Xtreams-FileSystem
> >> > 256144 ExternalPackage-Zippers
> >> > 384448 InterpreterVM
> >> > 205396 LatestReleasedVM
> >> > 5370500 ReleaseSqueakTrunk
> >> > 127408 Squeak 64-bit image
> >> > 766392 SqueakTrunk
> >> > 291772 SqueakTrunkOnBleedingEdgeCog
> >> > 412972 SqueakTrunkOnInterpreter
> >> > 419892 SqueakTrunkPerformance
> >> >
> >> > At the moment, the ReleaseSqueakTrunk job is using a lot of space, and
> >> > this is mostly due to saved images in its ./target directory. Most likely
> >> > we can purge out some of the older images to free up some space.
> >> With the exception of ReleaseSqueakTrunk, the CI jobs _should_ be
> >> fairly careful with disk space. One way they'd leak disk space is
> >> through target/package-cache, which will tend to accumulate MCZs over
> >> time. The SqueakTrunk* and ExternalPackage* jobs are set up to work
> >> from a blank slate, so they should always be in a position to have
> >> their workspaces wiped.
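The package-cache accumulation you describe could presumably be pruned on a
schedule rather than by wiping whole workspaces. A minimal sketch (the cache
path layout and the 30-day retention are my assumptions, not anything the
jobs currently do):

```shell
# prune_mcz_cache: delete cached MCZ files older than a given number of
# days from a job's package-cache directory. Purely illustrative -- the
# retention period and path layout are assumptions from this thread.
prune_mcz_cache() {
    cache="$1"
    days="${2:-30}"    # assumed retention window
    # -mtime +N matches files last modified more than N days ago.
    find "$cache" -name '*.mcz' -mtime +"$days" -exec rm -f {} +
}
```

Run against, say, workspace/SqueakTrunk/target/package-cache, this would keep
recent MCZs available for builds while dropping stale ones.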
> > The issue is that we are running out of disk space and need to take action.
> > In the project configuration for ReleaseSqueakTrunk, the following option
> > is *not* selected:
> > Discard all but the last successful/stable artifact to save disk space
> > Should this be changed?
> I don't know. The job's set up to consider _all_ Squeak-*-*.zip files
> as part of the artifact, which is itself a problem. I suspect that if
> this switch is flagged, you won't be able to go to, say,
> ReleaseSqueakTrunk 22, and get the artifact. I _think_.
> > The /var/lib/jenkins/workspace/ReleaseSqueakTrunk/target directory is using
> > the majority of the disk space for this job, and it looks to me like most
> > of this consists of transient files that could be deleted (or compressed).
> What are the transient files? The only things possibly worth keeping
> are Squeak-*-*.zip files. The rest of the files should be either
> interim build steps (TrunkImage.* and so on), XML files for
> test/performance coverage, and Squeak cruft like update logs,
> package-cache and so on. All of those can go.
> > I don't want to touch anything here without your approval.
> You have my permission to wipe any of "my" workspaces at any time: if
> they break because of a wipe, I didn't do my job properly! :)
OK, if we actually get dangerously close to running out of disk space,
one of us can delete /var/lib/jenkins/workspace/ReleaseSqueakTrunk/target
to relieve the problem.
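For the record, the one-off fix could be wrapped in a small shell function so
any of us can run it the same way. This is just a sketch using the paths from
this thread; it is not an existing script on box3:

```shell
# clean_target: wipe the ReleaseSqueakTrunk target directory to reclaim
# disk space. Jenkins recreates target/ on the next build, so deleting
# it should be safe per Frank's note above.
clean_target() {
    workspace="${1:-/var/lib/jenkins/workspace}"
    target="$workspace/ReleaseSqueakTrunk/target"
    if [ -d "$target" ]; then
        du -sh "$target"    # show how much space we are about to reclaim
        rm -rf "$target"
    fi
}
```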
But we need to deal with this in a sustainable way that does not require
manual intervention. In the case of ReleaseSqueakTrunk, possibly this can
be done by adding one more build step to the job that cleans up files
when the job is complete.
Right now you have these two build steps:
DEBUG=1 bundle exec rake release
I don't know anything about ruby or rake, but I am guessing that you might
be able to add one more build step that looks something like this:
bundle exec rake cleanup
Or maybe it can be just a shell command that cleans up unneeded files
in the workspace/ReleaseSqueakTrunk/target directory. I see that there
is a Cog VM executable in that directory, so I'm assuming that you do not
want it completely wiped out. But perhaps the rest of the files can be
purged as part of the job steps.
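If we go the shell-command route, the "keep the VM, purge the rest" idea
might look roughly like this. The name of the executable to keep is a
guess on my part and would need to match whatever is actually in target/:

```shell
# purge_target: delete everything under a job's target directory except
# one file we assume must survive between builds (the Cog VM executable
# name here is hypothetical -- adjust "keep" to the real filename).
purge_target() {
    dir="$1"
    keep="${2:-squeak}"    # assumed VM executable name
    # Remove all regular files except the keeper...
    find "$dir" -type f ! -name "$keep" -exec rm -f {} +
    # ...then prune any directories left empty (package-cache etc.).
    find "$dir" -mindepth 1 -type d -empty -delete
}
```

This could run as a final build step so no manual intervention is needed.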
I'll be happy to help with a shell command if you like, but I'm not
familiar enough with rake and ruby to know if that is the right thing
to do in this case. If there is some way that you can do this in the
ruby scripts, that would probably be a better approach.