[Webteam] File cleanups, yes... again

Ken Causey ken at kencausey.com
Thu Dec 20 21:16:39 UTC 2007


So?  What's the consensus?  I would like to make progress on this issue
and lighten the webteam's disk usage.

Here is my understanding of the system, based on an examination of what
I can see on the server.  There appear to be three levels of backups
currently.

1.  I can see that for every site image there is also a
secondary .backup image.  My assumption is that whenever someone
requests that the image be saved (after changes to the website), the
previously saved image is renamed and the current image is saved.

2.  There is a cron script under the website account that periodically
builds a tarball from all .image and .changes files in the site
directories.  This unfortunately includes old temporary and failed
images that are not in fact used.  If nothing else, I would appreciate
this being cleaned up.

3.  The entire server is backed up locally on a short-term, frequent
(every 4 hours) schedule and remotely on a longer-term, less frequent
(daily) schedule.  This of course includes the entire contents of the
website account.
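
To make points 1 and 2 concrete, here is a rough Python sketch of how I
imagine the rotation and a tidier tarball could work.  This is my guess
at the behavior, not the actual code on the server: the function names,
the reliance on the ".backup" naming seen in the listings, and the idea
of an explicit list of live image names are all assumptions on my part.

```python
import os
import tarfile
from pathlib import Path

EXTENSIONS = (".image", ".changes")


def rotate_backup(site_dir, name):
    """Rename NAME.image/.changes to NAME.backup.image/.changes.

    Meant to run just before a fresh save, so the previous save
    survives as the single .backup copy (any older backup is replaced).
    """
    moved = []
    for ext in EXTENSIONS:
        current = Path(site_dir) / f"{name}{ext}"
        backup = Path(site_dir) / f"{name}.backup{ext}"
        if current.exists():
            os.replace(current, backup)  # overwrites an existing backup
            moved.append(backup.name)
    return moved


def build_backup(site_dir, live_names, out_path):
    """Tar only the live images and their .backup copies.

    live_names is a hypothetical allow-list of image base names; any
    other .image/.changes file in site_dir is left out of the tarball.
    """
    wanted = []
    for name in live_names:
        for variant in (name, f"{name}.backup"):
            for ext in EXTENSIONS:
                path = Path(site_dir) / f"{variant}{ext}"
                if path.is_file():
                    wanted.append(path)
    with tarfile.open(out_path, "w:gz") as tar:
        for path in wanted:
            tar.add(str(path), arcname=path.name)
    return [path.name for path in wanted]
```

With an allow-list of just "smallwikiSnapshot", for example, a file such
as smallwikiSnapshot.1.image would no longer end up in the nightly
tarball, while the current and .backup sets would.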

Again as a first priority I would appreciate a simple cleanup of old
undesirable files from the website account with a focus on those extra
image and changes files being repeatedly backed up.  Secondarily I
certainly support reexamining the webteam's internal backup policy,
particularly in light of the fact that the entire server is backed up
regularly.
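
For the first-priority cleanup, a dry-run report along these lines might
help decide what to remove.  It is only a sketch: the 90-day threshold
and the function name are my own assumptions, and it deletes nothing.

```python
import time
from pathlib import Path


def stale_candidates(root, max_age_days=90, now=None):
    """List .image/.changes files under root not modified recently.

    Returns sorted (relative_path, size_in_bytes) pairs.  Nothing is
    deleted; the result is meant for a human to review.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in (".image", ".changes"):
            info = path.stat()
            if info.st_mtime < cutoff:
                found.append((str(path.relative_to(root)), info.st_size))
    return sorted(found)


if __name__ == "__main__":
    # Report on the current directory; sizes are shown in megabytes.
    for rel_path, size in stale_candidates("."):
        print(f"{size / 1e6:10.1f} MB  {rel_path}")
```

An old modification time does not by itself prove a file is unused, so
whoever owns the files should confirm before anything is removed.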

Ken

On Tue, 2007-12-18 at 15:05 -0500, Jason Rogers wrote:
> Oh... I misunderstood.  I was thinking of forking a process that would
> snapshot the image every so often and create/update the backup at
> that time.  I see what you are getting at with buttons in the Web Page
> and will do it that way.
> 
> I don't think we need more than one copy of the backup, though.  Can
> you tell me why we should have multiple copies?
> 
> Thanks.
> 
> On Dec 18, 2007 2:58 PM, Karl <karl.ramberg at comhem.se> wrote:
> > Jason Rogers wrote:
> > > I could implement it.  It shouldn't be too hard right?  I don't know
> > > about how to save the image as another image name though.  Once I find
> > > how to do that it will be easy:
> > >
> > >   1. Capture current image name
> > >   2. Snapshot as backup image first
> > >   3. Snapshot as current image
> > >
> > > Right?  There aren't any gotchas are there?
> > >
> > Keep the backup and snapshot as two different buttons or issues. I think
> > most bad things happen to the image while editing and adding or deleting
> > features so it would be good to snapshot, see that everything is working
> > for a few days, then do a backup. Or do a backup before starting to
> > edit, and then edit, snapshot and wait a few days and then backup again ?
> >
> > Another issue is how many backup images do we need to keep ?  2 or 3 of
> > the most recent and delete the older ones ?
> >
> > Karl
> >
> > > On Dec 18, 2007 10:59 AM, Karl <karl.ramberg at comhem.se> wrote:
> > >
> > >> Jason Rogers wrote:
> > >>
> > >>> I will hop on as soon as I can to take care of this.  I am in New York
> > >>> right now and unable (company firewall) to access the box.  We really
> > >>> need a better backup policy in general, but I don't know what to do.
> > >>> Perhaps we don't use a Unix process at all.  We could schedule a
> > >>> process in the images that will snapshot the image as a current and a
> > >>> backup.
> > >>>
> > >>> What do you all think?
> > >>>
> > >>>
> > >> Sounds good. We already do manual image save on each change on the
> > >> Smallwiki process. Maybe a similar backup button would be enough ? Do
> > >> you want to implement it?
> > >> Karl
> > >>
> > >>
> > >>> On Dec 17, 2007 6:56 PM, Ken Causey <ken at kencausey.com> wrote:
> > >>>
> > >>>
> > >>>> On Tue, 2007-12-18 at 00:45 +0100, karl wrote:
> > >>>>
> > >>>>
> > >>>>> Ken Causey wrote:
> > >>>>>
> > >>>>>
> > >>>>>> We are climbing up above 90% disk usage on box2 so time for another
> > >>>>>> audit.  Previously I managed to talk you into a more conservative backup
> > >>>>>> schedule.  Now I would like to ask you to cleanup what is being backed
> > >>>>>> up.  A little nosing around indicates that you are backing up a lot of
> > >>>>>> files that I suspect were just used in setting up the sites/testing and/or
> > >>>>>> are just junk at this point:
> > >>>>>>
> > >>>>>> # tar ztf backups/foundation/2007-12-17-0005.tgz
> > >>>>>> SqF-Pier-1.5-maybe-bad.image
> > >>>>>> SqF-Pier-1.5-safe.image
> > >>>>>> SqF-Pier-1.5-safe.old.image
> > >>>>>> SqF-Pier-1.5.image
> > >>>>>> SqF-Pier-1.5.old.image
> > >>>>>> SqueakFoundation.image
> > >>>>>> SqF-Pier-1.5-maybe-bad.changes
> > >>>>>> SqF-Pier-1.5-safe.changes
> > >>>>>> SqF-Pier-1.5-safe.old.changes
> > >>>>>> SqF-Pier-1.5.changes
> > >>>>>> SqF-Pier-1.5.old.changes
> > >>>>>> SqueakFoundation.changes
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>> Squeak foundation images are not used at all. Brad Fuller put a lot of
> > >>>>> effort into it but the foundation is a few pages in the squeak.org
> > >>>>> image. You can delete foundation directory and backups
> > >>>>>
> > >>>>>
> > >>>> I'd prefer if Brad could confirm he has no more interest in any of that
> > >>>> content and one of you take care of it.
> > >>>>
> > >>>>
> > >>>>
> > >>>>>> # tar ztf backups/2007-12-17-0005.tgz
> > >>>>>> smallwikiSnapshot.1.image
> > >>>>>> smallwikiSnapshot.backup.image
> > >>>>>> smallwikiSnapshot.image
> > >>>>>> smallwikiSnapshot.1.changes
> > >>>>>> smallwikiSnapshot.backup.changes
> > >>>>>> smallwikiSnapshot.changes
> > >>>>>>
> > >>>>>> # tar ztf backups/testing/2007-12-17-0005.tgz
> > >>>>>> smallwikiSnapshot.backup.image
> > >>>>>> smallwikiSnapshot.image
> > >>>>>> wwwtest.squeak.org.backup.image
> > >>>>>> wwwtest.squeak.org.image
> > >>>>>> smallwikiSnapshot.backup.changes
> > >>>>>> smallwikiSnapshot.changes
> > >>>>>> wwwtest.squeak.org.backup.changes
> > >>>>>> wwwtest.squeak.org.changes
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>> I'm not at all sure how backup is run. I screwed up the squeak.org image
> > >>>>> a few years back and found that the images backed up were useless
> > >>>>> because they were copied from a unix process on a running image I think.
> > >>>>> We need backup of squeak.org image.
> > >>>>>
> > >>>>>
> > >>>> That's fine, but even the backup of the main site involves backing up 3
> > >>>> image and changes file sets.  I can maybe imagine 2 sets (current and
> > >>>> previous to last modification), but 3?
> > >>>>
> > >>>>
> > >>>>
> > >>>>> The wwwtest.squeak.org image we hardly use anymore, but it is good for
> > >>>>> testing major changes to style scripts etc. wwwtest.squeak.org does not
> > >>>>> need backup now. I guess we can turn backup on when someone gets the urge
> > >>>>> to hack at stuff.
> > >>>>>
> > >>>>>
> > >>>> Either that or just scale back the extent to which wwwtest is backed up.
> > >>>> Again, I'm primarily concerned about the backing up of files which never
> > >>>> change.
> > >>>>
> > >>>>
> > >>>>
> > >>>>>> Are all of these files needed at all, much less needing to be backed up
> > >>>>>> over and over again?
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>> I guess not
> > >>>>>
> > >>>>>
> > >>>>>> The home directory for the website team totals 4.3GB.  Since there is an
> > >>>>>> rsync backup also on the server that is doubled, and then any images
> > >>>>>> that change are backed up in their entirety again.  So in effect the
> > >>>>>> website team ends up using perhaps as much as 10GB on the server.
> > >>>>>> Anything you can do to lower this I would greatly appreciate.
> > >>>>>>
> > >>>>>>
> > >>>>> I think you can delete all the files I mentioned.
> > >>>>>
> > >>>>>
> > >>>> I'd rather not delete anything myself.  Whoever designed the backup
> > >>>> process would of course need to change that.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Ken
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Karl
> > >>>>>
> > >>>>>
> > >>>>>