Date:      Wed, 20 May 2009 10:00:56 +0300
From:      Valentin Bud <valentin.bud@gmail.com>
To:        vogelke+unix@pobox.com
Cc:        Kelly Jones <kelly.terry.jones@gmail.com>, freebsd-questions@freebsd.org
Subject:   Re: Backing up FreeBSD and other Unix systems securely
Message-ID:  <139b44430905200000l250c9ae6p1ddcb1a6ac10bef8@mail.gmail.com>
In-Reply-To: <20090518183829.D0E7BBEBB@kev.msw.wpafb.af.mil>
References:  <26face530905170912m3ca8b762nd0cfadc7db34da6f@mail.gmail.com>  <20090518183829.D0E7BBEBB@kev.msw.wpafb.af.mil>

On Mon, May 18, 2009 at 9:38 PM, Karl Vogel <vogelke+unix@pobox.com> wrote:

> >> On Sun, 17 May 2009 09:12:57 -0700,
> >> Kelly Jones <kelly.terry.jones@gmail.com> said:
>
> K> I like this plan because it does versioned backups, and doesn't back up
> K> identical files twice. I dislike it because I lose Mozy's unlimited disk
> K> space.
>
> K> % Is there software that already does this?
>
>   I have a 3-Tbyte server running FreeBSD-6.1 that does something very
>   similar.  I don't bother with encrypting the filenames or hashes
>   because we control the box, and if I'm not at work, other admins
>   might need to restore something quickly.
>
>   We have around 3.7 million files from 5 other servers backed up
>   under two 1.5-Tbyte filesystems, /mir01 and /mir02.  My setup looks
>   like this:
>
>     +-----mir01
>     |      +-----HASH
>     |      |      +-----00
>     |      |      |      +-----00
>     |      |      |      +-----01
>                          ...
>     |      |      +-----01
>                   ...
>     |      |      +-----fe
>     |      |      +-----ff
>     |      +-----server1
>     |      +-----server2
>     +-----mir02
>     |      +-----HASH
>     |      +-----server3
>     |      +-----server4
>     |      +-----server5
>
>   The HASH directories have two levels of subdirectories 00-ff.
>   That's been more than sufficient to keep directories from getting
>   too big; I average around 25 files per directory.
>
>   I do hourly backups on the other fileservers using something like the
>   find and timestamp method you mentioned, but I ignore 0-length files
>   because they always hash to the same value.  The backup directories
>   for the second fileserver look like this for 5 May 2009:
>
>     +-----mir01
>     |      +-----server2
>     |      |      +-----2009
>     |      |      |      +-----0505
>     |      |      |      |      +-----070700
>     |      |      |      |      |      +-----doc      (filesystem)
>     |      |      |      |      |      +-----home
>     |      |      |      |      +-----080700
>     |      |      |      |      |      +-----doc
>     |      |      |      |      |      +-----home
>     ...
>     |      |      |      |      +-----190700
>     |      |      |      |      |      +-----home
>
>   After the backups are rsynced to the backup server, I find any regular
>   files with only one link, compute the RMD160 hash of the contents, and
>   make a hardlink to the appropriate filename under the HASH directory.
>   People love to make copies of copies of files, so this really cuts down
>   on the disk space used.
>
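
The hardlink step described above could look roughly like the following
Python sketch. This is only my reading of the description, not Karl's
scripts: the function names, the choice to name each HASH entry after its
full digest, and the step that swaps a duplicate for a link to the stored
copy are all assumptions. It defaults to SHA-256 because
hashlib.new("ripemd160") only works where the underlying OpenSSL build
still provides RIPEMD-160.

```python
import hashlib
import os
import stat


def hash_path(hash_root, digest):
    """Map a hex digest to <hash_root>/xx/yy/<digest>.

    The two 00-ff levels follow the layout in the mail; using the full
    digest as the leaf filename is an assumption.
    """
    return os.path.join(hash_root, digest[0:2], digest[2:4], digest)


def file_digest(path, algorithm="sha256"):
    """Hash file contents in chunks so large files are not read whole."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_tree(backup_root, hash_root, algorithm="sha256"):
    """Hardlink each single-link, non-empty regular file into the hash store.

    Returns how many files were replaced by a link to an already-stored copy.
    """
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Skip non-regular files, 0-length files, and files already linked.
            if not stat.S_ISREG(st.st_mode) or st.st_size == 0 or st.st_nlink != 1:
                continue
            target = hash_path(hash_root, file_digest(path, algorithm))
            if os.path.exists(target):
                # Same contents already stored: swap this copy for a link
                # (assumed behaviour, the mail does not spell this out).
                tmp = path + ".dedupe-tmp"
                os.link(target, tmp)
                os.replace(tmp, path)
                replaced += 1
            else:
                os.makedirs(os.path.dirname(target), exist_ok=True)
                os.link(path, target)
    return replaced
```

Something like dedupe_tree("/mir01/server2", "/mir01/HASH") would process
one server's tree; both paths have to live on the same filesystem, since
hardlinks cannot cross filesystems.
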
>   The hardlinks make it easy to avoid restoring things that aren't what
>   the user had in mind; if a file's been corrupted, I can tell when it
>   happened just by looking at the inode, so I don't restore an earlier
>   version that's also junk.  I can also tell if there were duplicates
>   anywhere on the fileserver at the time the user lost the good version;
>   it's a lot faster for them to get a known good copy from somewhere
>   else on the fileserver than it is to restore over the network.
>
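
The inode checks described above might be sketched like this in Python.
Again this is an illustration under my own assumptions, not the actual
scripts: inode_history and same_content_paths are hypothetical names, and
the sketch assumes every snapshot has already been hardlinked into the
HASH store, so that equal contents share one inode.

```python
import os


def inode_history(snapshot_dirs, relative_path):
    """Return [(snapshot_dir, inode or None)] for one file across snapshots.

    A change of inode between two consecutive snapshots marks the point
    where the file's contents changed; None means the file is absent from
    that snapshot.
    """
    history = []
    for snap in snapshot_dirs:
        try:
            st = os.lstat(os.path.join(snap, relative_path))
            history.append((snap, st.st_ino))
        except FileNotFoundError:
            history.append((snap, None))
    return history


def same_content_paths(tree_root, path):
    """List the other paths under tree_root that are hardlinks to `path`."""
    wanted = os.lstat(path)
    here = os.path.abspath(path)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            candidate = os.path.abspath(os.path.join(dirpath, name))
            if candidate == here:
                continue
            st = os.lstat(candidate)
            if (st.st_dev, st.st_ino) == (wanted.st_dev, wanted.st_ino):
                matches.append(candidate)
    return sorted(matches)
```

Walking the whole tree for every lookup is slow on millions of files; an
index from inode number to paths would be the obvious refinement, but the
plain walk keeps the idea visible.
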
>   The software is just a few scripts to do things like find files with
>   just one link, compute hashes, do hardlinks, etc.  I can put up a tarball
>   if anyone's interested.
>

Hello Kelly,

I am doing something similar at a company I work for. I would be interested
to see your scripts to make a comparison.

thanks,
v

>
> --
> Karl Vogel                      I don't speak for the USAF or my company
>
> The best way for the Government to maintain its credit is to pay as it
> goes-not by resorting to loans, but by keeping out of debt-through an
> adequate income secured by a system of taxation, external or internal,
> or both.  --Pres. William McKinley's First Inaugural Address, March 4, 1897
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
>



-- 
network warrior since 2005


