[Box-Admins] RE: Losing rsync mirror of squeak.org
Ken Causey
ken at kencausey.com
Fri Apr 30 16:21:25 UTC 2010
As an addendum: What is actually more useful from a remote backup
standpoint is weekly backups. We have a week's worth of daily backups
locally. In fact I think Göran was previously keeping something like 4
weeks' worth of weekly backups. I expect the changes from one week to
the next, at the level of complete files, are not much more than the
daily changes, although a large update to the FTP site could of course
change that.
Ken
On Thu, 2010-04-29 at 11:39 -0700, Ken Causey wrote:
> > -------- Original Message --------
> > Subject: Re: Losing rsync mirror of squeak.org
> > From: Göran Krampe <goran at krampe.se>
> > Date: Wed, April 28, 2010 3:45 pm
> > To: Ken Causey <ken at kencausey.com>
> > Cc: box-admins Support <box-admins at lists.squeakfoundation.org>, Squeak
> > Oversight Board <board at lists.squeakfoundation.org>
> >
> >
> > On 04/28/2010 06:32 PM, Ken Causey wrote:
> > > OK, I understand. Thanks for providing this service for so long. So
> > > this eliminates our remote backup. Does anyone have any ideas for a
> > > replacement? An rsync/rsnapshot setup would be nicest (easiest).
> >
> > I wonder how much traffic this is, I mean, I do have a server here at
> > home that could sync it at night time. And so could probably someone
> > else too.
> >
> > regards, Göran
>
> Related to this I've been thinking more and more that rsync/rsnapshot
> are wasteful, especially for us with images that are often saved daily
> if not more often. The problem is that the granularity of rsnapshot is
> at the file level.
>
> Another problem is that rsnapshot continues to retain files after they
> have been deleted. At first glance this is very desirable. But in
> practice this means that the snapshots continuously grow and contain lots
> of files which we don't expect to ever see again.
>
> I've looked into the idea of using something like git instead (see
> eigenclass for example) which should be significantly better. But this
> is complex and I, like you, only have limited time to look into this.
> If anyone has related experience I would love to hear about it.
>
> Regarding size... Some quick estimates:
>
> I believe as a rough estimate the entire installation, excepting the
> backups themselves, comes to about 40GB currently. So that is something
> of an upper estimate, say if the entire thing was transferred each day.
>
> As another estimate the sizes of the local backups (once a day, 7 days
> worth) are:
>
> box2:~# rsnapshot du
> 49G /var/cache/rsnapshot/daily.0/
> 2.0G /var/cache/rsnapshot/daily.1/
> 2.0G /var/cache/rsnapshot/daily.2/
> 2.6G /var/cache/rsnapshot/daily.3/
> 2.1G /var/cache/rsnapshot/daily.4/
> 2.0G /var/cache/rsnapshot/daily.5/
> 2.0G /var/cache/rsnapshot/daily.6/
> 61G total
>
> The .0 is the most recent and represents a complete backup, along with
> the accumulated history of past backups. The others represent the
> files that have changed from the previous day. So in theory if only
> changed files are transferred in whole, it should average less than 3GB
> per day. If instead we had a system which transferred only the changes,
> it would be far, far less.
>
> As far as using rsnapshot itself goes I'm not clear on what is
> transferred. Each of these backups represents a complete backup; it's
> simply that when a file is not modified a hard link is created. I'm
> not certain, when a remote backup is made, on which side that decision
> is made. Hopefully it is not the case that every file is transferred
> and the decision to hard link is made only on the backup server.
>
> You will note that the .0 backup is quite a bit larger than my estimate
> of the actual size of the current content on disk. This shows the problem
> with the continued collection of deleted files. My research of git-based
> solutions indicates that this is a problem there as well. The
> only solution is to clean out files that really should be forgotten. I
> would expect the problem to be far less significant with a system that
> stores inter-file diffs though.
>
> Ken
>
>
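
As a rough check on the size estimates in the quoted message, the
`rsnapshot du` figures can be averaged directly. This is only a sketch:
the numbers are copied from the output above, while the function name and
the weekly upper bound (seven times the daily average, assuming a weekly
transfer never moves more than the sum of that week's daily changes) are
illustrative assumptions, not something measured on box2.

```python
# Sizes in GB, copied from the `rsnapshot du` output quoted above.
# daily.0 is the full snapshot; daily.1 to daily.6 hold only changed files.
snapshot_sizes_gb = {
    "daily.0": 49.0,
    "daily.1": 2.0,
    "daily.2": 2.0,
    "daily.3": 2.6,
    "daily.4": 2.1,
    "daily.5": 2.0,
    "daily.6": 2.0,
}


def average_incremental_gb(sizes):
    """Mean size of the incremental snapshots (everything except daily.0)."""
    incrementals = [gb for name, gb in sizes.items() if name != "daily.0"]
    return sum(incrementals) / len(incrementals)


avg_daily = average_incremental_gb(snapshot_sizes_gb)

# Assumption (not from the thread): a weekly whole-file transfer moves at
# most the sum of one week's daily changes.
weekly_upper_bound = avg_daily * 7

print(f"average changed-file volume per day: {avg_daily:.1f} GB")
print(f"upper bound for a weekly transfer:   {weekly_upper_bound:.1f} GB")
```

The daily average comes out a little over 2 GB, consistent with the
"less than 3GB per day" figure above. If the same image files are rewritten
every day, the real weekly figure should sit much closer to the daily
average than to the upper bound.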