Date:      Tue, 30 Aug 2005 12:50:42 -0400
From:      Garance A Drosihn <drosih@rpi.edu>
To:        questions@freebsd.org
Subject:   Re: rsync and moving files [Re: backup w/ snapshots]
Message-ID:  <p06230912bf3a35c8b887@[128.113.24.47]>
In-Reply-To: <20050830091919.J13913@maren.thelosingend.net>
References:  <20050828234043.H22315@maren.thelosingend.net> <20050829161506.E2522@maren.thelosingend.net> <43131C85.1070100@meijome.net> <20050829170053.M3014@maren.thelosingend.net> <43133BA5.2010608@scls.lib.wi.us> <20050830091919.J13913@maren.thelosingend.net>

At 9:32 AM +0200 8/30/05, Svein Halvor Halvorsen wrote:
>
>The solution: Somehow, I need to mirror all the move ops on the
>remote system before doing the rsync. This could probably be done
>by making a hash table of inode/filename pairs (or triplets, etc)
>each time I sync.  Then the next time, I could compare the old
>table with the new, to find out which files are the same only with
>new names, then find those names on the remote system, change them
>to the new ones, and then run rsync.

Fwiw, I understand the problem you're trying to describe.  The
basic issue is that rsync keeps no information between separate
runs.  It has no way of knowing that a given file on the source
volume used to be at a different location.  It does not even know
that the destination volume was sync'ed by a previous run of rsync,
so it cannot tell that the file at the old location on the
destination is the same file that used to be at that location on
the source.  It knows nothing more than what it can see at the
moment of any given run.

You could kinda fudge that information for rsync by creating a lot
of hard links, but that is probably going to create more of a mess
than it will solve.

So, you're left with doing something else outside of rsync.  The
script you are suggesting would probably be fairly easy to write
in something like ruby, perl, or python.  Use a key made up of the
inode number + lastchange date, or maybe inode number + file size.
Then save away the key-to-filename(s) mapping for every file.  Before
the next run of rsync, build the mapping again and compare it with
the saved one to see which files have moved in the source
directory.  If the destination volume has a file at the old location
which matches the file-size or lastchange date (depending on which
key you used...), then move it to the new location on the destination
volume.
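
To make that a bit more concrete, here is a rough sketch of such a
script in python.  It is only an illustration under some assumptions
that are mine: the key is inode number + file size, the destination
is reachable as a local path (for a remote host you would have to
run the renames over ssh instead), and files with more than one
hard link are simply skipped.  The function names (snapshot,
find_moves, apply_moves, save_table, load_table) are made up for
this example.

```python
import json
import os
import stat


def snapshot(root):
    """Map "inode:size" -> sorted list of paths relative to root."""
    table = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files are tracked
            key = f"{st.st_ino}:{st.st_size}"
            table.setdefault(key, []).append(os.path.relpath(full, root))
    for paths in table.values():
        paths.sort()
    return table


def save_table(table, filename):
    """Save the key-to-filename(s) mapping between runs."""
    with open(filename, "w") as fh:
        json.dump(table, fh)


def load_table(filename):
    with open(filename) as fh:
        return json.load(fh)


def find_moves(old, new):
    """Return sorted (old_path, new_path, size) for keys whose single
    name changed.  Keys with several names (hard links) are skipped."""
    moves = []
    for key, old_paths in old.items():
        new_paths = new.get(key)
        if new_paths is None or len(old_paths) != 1 or len(new_paths) != 1:
            continue
        if old_paths[0] != new_paths[0]:
            size = int(key.split(":")[1])
            moves.append((old_paths[0], new_paths[0], size))
    return sorted(moves)


def apply_moves(dest_root, moves):
    """Mirror the moves on the destination, but only when the file at
    the old location has the expected size and the new name is free."""
    done = []
    for old_rel, new_rel, size in moves:
        old_path = os.path.join(dest_root, old_rel)
        new_path = os.path.join(dest_root, new_rel)
        if not os.path.isfile(old_path) or os.path.lexists(new_path):
            continue
        if os.path.getsize(old_path) != size:
            continue
        os.makedirs(os.path.dirname(new_path) or ".", exist_ok=True)
        os.rename(old_path, new_path)
        done.append((old_rel, new_rel))
    return done
```

The idea would be to run apply_moves() just before rsync, and then
save a fresh snapshot() of the source for the next time.  Note that
inode numbers can be reused after a file is deleted, so a new file
that happens to get the same inode and size as a deleted one would
be mistaken for a move.  The rsync run that follows should repair
any such wrong guess, it just would not save any transfer time.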

   <vague_rambling>
Hmm. Thinking about this a little more, it's probably possible for
rsync to catch some of these cases itself.  It would require some
coding changes to rsync, but it could take the list of files that
it is deleting, compare it to the list of files that it is adding,
and if the MD5-checksum + size of some to-be-deleted file is the
same as some to-be-added file, it could try doing a 'mv' of that
file before it does the remainder of its processing.  I wonder how
hard that would be to do.
   </vague_rambling>
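
Just to show the matching step I have in mind, here is a small
python sketch.  This is not rsync code and not how rsync works
today; it only imitates the idea outside of rsync, and it assumes
both trees can be read as local paths.  The names file_key,
relative_files and plan_renames are invented for the example.

```python
import hashlib
import os


def file_key(path):
    """Return (size, MD5 hex digest) for one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def plan_renames(src_root, dst_root):
    """Pair each to-be-added file with a to-be-deleted file that has
    the same size and MD5.  Returns (old_path, new_path) pairs that
    are relative to dst_root."""
    src = relative_files(src_root)
    dst = relative_files(dst_root)
    to_add = sorted(src - dst)      # names only on the source
    to_delete = sorted(dst - src)   # names only on the destination

    by_key = {}
    for rel in to_delete:
        key = file_key(os.path.join(dst_root, rel))
        by_key.setdefault(key, []).append(rel)

    plan = []
    for rel in to_add:
        candidates = by_key.get(file_key(os.path.join(src_root, rel)))
        if candidates:
            # Use each to-be-deleted file for at most one rename.
            plan.append((candidates.pop(0), rel))
    return plan
```

Each pair in the plan would be a 'mv' on the destination before the
normal processing.  This sketch checksums every candidate file,
which is expensive; comparing the sizes first and only checksumming
the files whose sizes match would avoid most of that work.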

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu


