[Box-Admins] RE: Losing rsync mirror of squeak.org

Ken Causey ken at kencausey.com
Fri Apr 30 16:21:25 UTC 2010


As an addendum:  What is actually more useful from a remote backup
standpoint is weekly backups.  We have a week's worth of daily backups
locally.  In fact I think Göran was previously keeping something like 4
weeks' worth of weekly backups.  I expect the changes from one week to
the next, at the level of complete files, are not much more than the
daily changes, although a large update to the FTP site could of course
change that.

Ken

On Thu, 2010-04-29 at 11:39 -0700, Ken Causey wrote:
> > -------- Original Message --------
> > Subject: Re: Losing rsync mirror of squeak.org
> > From: Göran Krampe <goran at krampe.se>
> > Date: Wed, April 28, 2010 3:45 pm
> > To: Ken Causey <ken at kencausey.com>
> > Cc: box-admins Support <box-admins at lists.squeakfoundation.org>,  Squeak
> > Oversight Board <board at lists.squeakfoundation.org>
> > 
> > 
> > On 04/28/2010 06:32 PM, Ken Causey wrote:
> > > OK, I understand.  Thanks for providing this service for so long.  So
> > > this eliminates our remote backup.  Does anyone have any ideas for a
> > > replacement?  An rsync/rsnapshot setup would be nicest (easiest).
> > 
> > I wonder how much traffic this is, I mean, I do have a server here at 
> > home that could sync it at night time. And so could probably someone 
> > else too.
> > 
> > regards, Göran
> 
> Related to this I've been thinking more and more that rsync/rsnapshot
> are wasteful, especially for us with images that are often saved daily
> if not more often.  The problem is that the granularity of rsnapshot is
> at the file level.
> 
> Another problem is that rsnapshot continues to retain files after they
> have been deleted.  At first glance this is very desirable.  But in
> practice this means that the snapshots continuously grow and contain
> lots of files that we don't expect to ever see again.
> 
> I've looked into the idea of using something like git instead (see
> eigenclass for example) which should be significantly better.  But this
> is complex and I, like you, only have limited time to look into this. 
> If anyone has related experience I would love to hear about it.
> 
> Regarding size...  Some quick estimates:
> 
> I believe as a rough estimate the entire installation, excepting the
> backups themselves, comes to about 40GB currently.  So that is something
> of an upper estimate, say if the entire thing was transferred each day.
> 
> As another estimate, the sizes of the local backups (once a day, 7
> days' worth) are:
> 
> box2:~# rsnapshot du
> 49G     /var/cache/rsnapshot/daily.0/
> 2.0G    /var/cache/rsnapshot/daily.1/
> 2.0G    /var/cache/rsnapshot/daily.2/
> 2.6G    /var/cache/rsnapshot/daily.3/
> 2.1G    /var/cache/rsnapshot/daily.4/
> 2.0G    /var/cache/rsnapshot/daily.5/
> 2.0G    /var/cache/rsnapshot/daily.6/
> 61G     total
> 
> The .0 is the most recent and represents a complete backup, along with
> the accumulated history of past backups.  The others represent the
> files that have changed from the previous day.  So in theory, if only
> changed files are transferred in whole, it should average less than 3GB
> per day.  If instead we had a system which transferred only the changes,
> it would be far, far less.
> 
> As far as using rsnapshot itself goes, I'm not clear on what is
> transferred.  Each of these backups represents a complete backup; it's
> simply that when a file is not modified a hard link is created.  I'm
> not certain, when a remote backup is made, on which side that decision
> is made.  It may be that every file is transferred and only the backup
> server decides whether to hard link.  Hopefully not.
> 
> You will note that the .0 backup is quite a bit larger than my estimate
> of the actual size of the current content on disk; this shows the
> problem with the continued collection of deleted files.  My research of
> git-based solutions indicates that this is a problem there as well.  The
> only solution is to clean out files that really should be forgotten.  I
> would expect the problem to be far less significant with a system that
> stores inter-file diffs though.
> 
> Ken
> 
> 
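
As a rough check on the "less than 3GB per day" estimate in the quoted
message, here is a minimal sketch that averages the daily.1 through
daily.6 figures from the rsnapshot du output.  It takes those figures
at face value as the per-day changed-file volume, which is the reading
given in the message; the variable names are only illustrative.

```python
# Sizes in GB, copied from the `rsnapshot du` output in the quoted message.
daily_sizes_gb = {
    "daily.0": 49.0,  # most recent snapshot: full copy plus retained history
    "daily.1": 2.0,
    "daily.2": 2.0,
    "daily.3": 2.6,
    "daily.4": 2.1,
    "daily.5": 2.0,
    "daily.6": 2.0,
}

# Assumption: every snapshot except daily.0 approximates one day's worth
# of files that changed and would be transferred whole.
incremental = [size for name, size in daily_sizes_gb.items() if name != "daily.0"]

average_gb = sum(incremental) / len(incremental)
peak_gb = max(incremental)

print(f"average changed per day: {average_gb:.2f} GB")
print(f"largest single day:      {peak_gb:.1f} GB")
```

On these numbers the average comes out a little over 2GB per day, with
no single day above 2.6GB, which is consistent with the estimate.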

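The hard-link behaviour described in the quoted message can be
illustrated with a small sketch.  This is not rsnapshot's own code, just
a demonstration, under the assumption of a filesystem that supports
hard links, that two directory entries can share one copy of the data.
The directory and file names are hypothetical.

```python
import os
import tempfile


def demo_hard_link():
    """Create a file in one 'snapshot' and hard-link it into another."""
    with tempfile.TemporaryDirectory() as root:
        older = os.path.join(root, "daily.1")
        newer = os.path.join(root, "daily.0")
        os.mkdir(older)
        os.mkdir(newer)

        # Hypothetical unchanged file present in the older snapshot.
        original = os.path.join(older, "squeak.image")
        with open(original, "wb") as handle:
            handle.write(b"x" * 4096)

        # An unchanged file gets a second name, not a second copy.
        linked = os.path.join(newer, "squeak.image")
        os.link(original, linked)

        first = os.stat(original)
        second = os.stat(linked)
        return first.st_ino == second.st_ino, first.st_nlink


same_inode, link_count = demo_hard_link()
print("same inode:", same_inode)
print("link count:", link_count)
```

Because both names refer to the same inode, the data is stored once no
matter how many snapshots contain the file, and it is only freed when
the last snapshot holding a link to it is removed.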