Build.squeak.org and squeaksource.com in danger (was Re: [Box-Admins] Disk space usage on box3)

Chris Muller asqueaker at gmail.com
Sat Dec 7 22:58:51 UTC 2013


Hi guys -- First, Ken, thanks again for setting a reference example of
how we need to be thinking about our systems from an admin-support
perspective.  Taking a proactive tack is not just excellent service,
there are many intangible benefits like image and community
constitution.

I totally agree with Dave.  Pruning SqueakSource may be something we
might want to consider in the future, but not now, and never simply
because we're low on disk space.  First because pruning empty projects
won't recover anything significant, secondly because we're in a
"archival preservation" mode right now with SqueakSource -- not a mode
to be making significant changes to it.

How many times have we said, "disk space is cheap"?  We cannot go back
on that now!  :)

Frank wrote:

> As a separate step, we can think about how to both produce versioned
> artifacts (i.e., zipfiles with versions in their file names) and not
> eat all the disk space.

I thought about it 4 years ago and made an object model to take care
of it.  The problem with zip-files is how wasteful they are.  When
someone changes one single method of the Morphic package, the other 1K
definitions (however many there are) are duplicated in the new zip (mcz)
file.  By contrast, the object model refers to the same canonicalized
MCDefinition instances across Versions, adding only one new
MCDefinition to the bulk of the model in that example.
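
To make the difference concrete, here is a minimal Python sketch of the
idea.  It is not the actual Magma or Monticello code: the DefinitionStore
class, the version names and the sample method sources are all
hypothetical, and a content hash stands in for canonicalization.  Each
distinct definition is stored once, and every version just refers to it.

```python
import hashlib


class DefinitionStore:
    """Hypothetical store that keeps one copy of each distinct definition."""

    def __init__(self):
        self._by_hash = {}   # content hash -> definition source text
        self.versions = {}   # version name -> list of content hashes

    def add_version(self, name, definitions):
        """Record a version; only definitions not seen before are stored."""
        hashes = []
        for source in definitions:
            digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
            self._by_hash.setdefault(digest, source)
            hashes.append(digest)
        self.versions[name] = hashes

    def stored_definition_count(self):
        """Number of distinct definitions held across all versions."""
        return len(self._by_hash)

    def definitions_of(self, name):
        """Rebuild the full list of definitions for one version."""
        return [self._by_hash[h] for h in self.versions[name]]


# Illustrative data only: a package with 1000 method definitions.
v1 = [f"method{i} ^ {i}" for i in range(1000)]
v2 = list(v1)
v2[42] = "method42 ^ 4200"   # one method changed in the second version

store = DefinitionStore()
store.add_version("Morphic-1", v1)
store.add_version("Morphic-2", v2)

# Two full snapshots would hold 2000 definitions; the shared store holds 1001.
print(store.stored_definition_count())
```

In this toy case the second version costs one extra stored definition
plus a list of references, where a second zip file would repeat all 1000.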

The result is that the redundant Magma-backed copy of
source.squeak.org consumes less than 1/4th the space of the original
file-based version.  A Magma-backed copy of squeaksource.com would
take about 12GB of space.

For now, though, Frank has the ability and responsibility to trim up
the jenkins stuff.  Thanks Frank.

PS -- For interest, I just kicked off a bulk-load of entire
squeaksource.com repository into Magma to see how much space it will
take.

On Sat, Dec 7, 2013 at 9:37 AM, David T. Lewis <lewis at mail.msen.com> wrote:
> On Sat, Dec 07, 2013 at 10:17:06AM +0000, Frank Shearar wrote:
>> I mailed Ken separately, but the time zone happy conjunction must have
>> just passed. I'd have solved the problem (wipe out the
>> ReleaseSqueakTrunk/target/ directory (5.9GB)) except that I can't sudo
>> because I don't know the password for the account (because I don't
>> think I ever actually set it)! If some kind soul can change it and let
>> me know the password, I'll (a) change it to something only I know and
>> (b) wipe out the directory causing the problem.
>>
>> As a separate step, we can think about how to both produce versioned
>> artifacts (i.e., zipfiles with versions in their file names) and not
>> eat all the disk space.
>>
>> frank
>
> Hi Frank,
>
> Cool, I think we just came to the identical conclusions :) I was just
> about to send the following email to you before I read this, so here
> is what I was going to say:
>
> I think we can pretty easily free up a bunch of disk space. It turns out
> that our SqueakTrunk job is currently using 44% of the entire disk space
> on box3, and a lot of that can probably be freed without harming anything
> for the SqueakTrunk job itself.
>
> Here are some things I think we can do, but I want to run it by you before
> actually changing anything:
>
> 1) The build artifacts are TrunkImage.changes, TrunkImage.image,
> TrunkImage.manifest and TrunkImage.version. The image and changes files
> take most of the space, so we could add a build step to compress them
> like this:
>
> $ zip TrunkImage.zip TrunkImage.changes TrunkImage.image TrunkImage.manifest TrunkImage.version
>   adding: TrunkImage.changes (deflated 77%)
>   adding: TrunkImage.image (deflated 54%)
>   adding: TrunkImage.manifest (deflated 54%)
>   adding: TrunkImage.version (stored 0%)
>
> Then we can specify TrunkImage.zip as the build artifact. This will save
> a lot of disk space in the future.
>
> 2) The Jenkins job is saving all of the build artifacts since we began
> running the job.  I'm not sure if that's what you want it to do, but if
> the old artifacts are not needed, then we can change the job configuration.
> In the "Archive the Artifacts" section of the job configuration, there
> is a setting for "Discard all but the last successful/stable artifact to
> save disk space". That might be more aggressive than we want, but there must
> be some setting that would let us trim the archives down to what we really
> need.
>
> 3) If we run out of disk space and need to take emergency action, we can
> just compress the older build artifacts (from the unix command line). It's
> probably not good to do this outside of Jenkins tools, but at least we would
> not lose the actual data, and it would free up a lot of disk space right away.
>
> 4) If we don't need all of the historical artifacts, and if we can't figure
> out how to trim them down through Jenkins job configurations, then I can
> delete the older ones from the unix command line.
>
> 5) Not directly related to disk space, but we should probably also enable the
> "Abort the build if it's stuck" option under "Build Environment". We can set
> it to time out after 30 minutes or so, and I think that might cure our problem
> with stuck ruby and squeakvm processes.
>
> Dave
>
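
Points 1 and 3 in Dave's list could be sketched roughly as follows in
Python.  This is only an illustration under assumptions: it supposes the
archived builds sit in numbered subdirectories of one root directory,
and the function names and the keep_uncompressed parameter are made up
for the example.  As Dave notes, touching the archives outside of
Jenkins is probably not ideal, so treat it as an emergency measure.

```python
import zipfile
from pathlib import Path

# The four build artifacts named in point 1.
ARTIFACTS = ("TrunkImage.changes", "TrunkImage.image",
             "TrunkImage.manifest", "TrunkImage.version")


def compress_build(build_dir):
    """Zip the artifacts found in build_dir into TrunkImage.zip, then remove the originals."""
    build_dir = Path(build_dir)
    present = [build_dir / name for name in ARTIFACTS
               if (build_dir / name).is_file()]
    if not present:
        return None  # nothing to do, e.g. already compressed
    archive = build_dir / "TrunkImage.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in present:
            zf.write(path, arcname=path.name)
    # Verify the archive before deleting anything, so no data is lost.
    with zipfile.ZipFile(archive) as zf:
        if zf.testzip() is not None:
            raise RuntimeError(f"corrupt archive: {archive}")
    for path in present:
        path.unlink()
    return archive


def compress_old_builds(builds_root, keep_uncompressed=5):
    """Compress every numbered build directory except the newest few."""
    numbered = sorted(
        (d for d in Path(builds_root).iterdir()
         if d.is_dir() and d.name.isdigit()),
        key=lambda d: int(d.name),
    )
    old = numbered[:-keep_uncompressed] if keep_uncompressed else numbered
    return [a for a in (compress_build(d) for d in old) if a is not None]
```

Calling compress_old_builds on the directory that holds the numbered
builds would leave the five newest builds untouched and replace the
artifacts in the older ones with a single TrunkImage.zip each.
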
>
>>
>> On 7 December 2013 01:23, David T. Lewis <lewis at mail.msen.com> wrote:
>> > On Fri, Dec 06, 2013 at 11:40:54AM -0600, Ken Causey wrote:
>> >> On 12/06/2013 11:19 AM, David T. Lewis wrote:
>> >> >Thanks Ken,
>> >> >
>> >> >I will look at it as soon as I get home, about 8 hours from now.
>> >> >
>> >> >Frank,
>> >> >
>> >> >There's not much I can do on the squeaksource.com side other than move the
>> >> >repository to another box (which is not easy to do). Short term we need to
>> >> >tidy up the build.squeak.org jobs where we can.
>> >> >
>> >> >Dave
>> >>
>> >> Thanks Dave,
>> >>
>> >> I agree that it is probably easier to find some space to clear up under
>> >> build.squeak.org but I think the community as a whole has to give some
>> >> serious thought to the future of squeaksource.com.
>> >
>> > I fully agree that the community should give some consideration to how
>> > squeaksource.com should be managed moving forward. But please do not
>> > portray this as a disk space problem. If that is the problem, then I'll
>> > pay for the disk space myself, just tell me where to send the check.
>> >
>> > The disk utilization problem is due to unnecessary accumulation of build
>> > artifacts from Jenkins jobs. It looks to me like most of this is accumulating
>> > by accident rather than by intent, and this can probably be easily fixed
>> > with some changes to the job configurations, with no loss of useful data
>> > from the jobs themselves. Clearly this needs to be addressed anyway,
>> > because if you doubled our available disk space we would be having the
>> > same discussion 12 months from now. So we need to fix it.
>> >
>> > I'll try to get with Frank over the weekend and see if we can clean up
>> > some easy stuff (Frank, I am "dtlewis290 at gmail.com" on gmail, so I'll
>> > try to connect with you there).
>> >
>> > Meanwhile I deleted a few unnecessary backup files under ~ssdotcom, which
>> > gives us another 1% free disk space to keep things going for another day
>> > or so ;-)
>> >
>> > Dave
>> >

