Date:      Mon, 16 Apr 2007 18:29:56 -0400
From:      Jerry McAllister <jerrymc@msu.edu>
To:        CyberLeo Kitsana <cyberleo@cyberleo.net>
Cc:        Roland Smith <rsmith@xs4all.nl>, FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: dump/restore corrupted filesystems
Message-ID:  <20070416222956.GA49919@gizmo.acns.msu.edu>
In-Reply-To: <462449C2.9000302@cyberleo.net>
References:  <4623843B.40006@cyberleo.net> <20070416170825.GA91459@slackbox.xs4all.nl> <462449C2.9000302@cyberleo.net>

On Mon, Apr 16, 2007 at 11:14:35PM -0500, CyberLeo Kitsana wrote:

> Roland Smith wrote:
> > On Mon, Apr 16, 2007 at 09:11:48AM -0500, CyberLeo Kitsana wrote:
> >> I have a 1.2TB UFS2 filesystem with irrecoverable corruption. As such, I
> >> must move all 500GB or so of data off of it and re-newfs it.
> > 
> > If the corruption is due to hardware failure, your data is probably lost.
> 
> Sorry if I wasn't clear. Most all of the data is readable and complete
> if I mount the filesystem read-only. It just panics the box when mounted
> read/write, and fsck can't fix the damage.
> 
> My question was more along the lines of whether or not dump/restore
> would see that those corrupted directory and file inodes were indeed
> corrupt and not bother attempting to back them up, or if it would
> happily back them up and restore them in their corrupted state to a new
> filesystem, thus trashing it.

It depends on how they are corrupted.  Really there are three situations.

In the first, something happened to cause a problem with the filesystem
structure - the blocks and their pointer chains/links.   That would make
fsck see errors and possibly refuse to complete.  If that also affects
the ability to read some actual file then neither dump/restore nor any
other copy method will fix the situation.  dump and other utilities will
fail when reading the files and abort.    

You might be able to tinker around a little, figure out which actual files 
are affected and delete them or set dump not to read them and then copy 
all the rest.   But, if you are unable to mount the filesystem read/write, 
this might not work.   If you are able to copy most, then those files 
would be uncorrupted in the new location.   You would just have to 
figure out what to do about the files you could not read.
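
One way to find out which files are affected, without writing anything
to the damaged filesystem, is to walk the read-only mount and try to
read every regular file to the end.  The following is only a minimal
sketch in Python; the function name find_unreadable and the mount point
in the note below are made up for illustration.

```python
import os
import stat


def find_unreadable(root, chunk_size=1 << 20):
    """Try to read every regular file under root.

    Returns (readable_count, bad), where bad is a list of
    (path, error message) pairs for anything that could not be
    listed, stat-ed or read to the end.
    """
    bad = []
    readable = 0

    def on_walk_error(err):
        # os.walk calls this when a directory cannot be listed.
        bad.append((err.filename, str(err)))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue  # skip symlinks, devices, sockets, FIFOs
                with open(path, "rb") as fh:
                    # Read and discard the contents; only errors matter.
                    while fh.read(chunk_size):
                        pass
                readable += 1
            except OSError as err:
                bad.append((path, str(err)))
    return readable, bad
```

Calling find_unreadable("/mnt/damaged") (a hypothetical mount point)
returns the number of files that read cleanly and the list of paths to
leave out of the copy.  It only catches files that produce read errors;
if merely touching a damaged inode panics the box, this will not help.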

Second would be a similar corruption to the filesystem structure 
blocks and links, but it happens to luckily not be in a place
currently being used by any actual files.   In this case, fsck
would fail, but you could still read the files enough to copy
them to some other space.   In this case, the copy process, whether
dump/restore or some other - dump/restore is probably best - would
fix the problem nicely.   The copy would be uncorrupted.

The third situation would be where the data itself was miswritten - 
maybe by a routine that botched some computation, or a database utility, 
or whatever.  In this case, fsck would not see any problem with the 
filesystem.  It would see that all the blocks and links were nicely 
accounted for.  But the data would be bad and no amount of copying
would fix it.  In fact, dump or any other copy utility would read
the files without errors just fine and dandy, because it would not
know of the corruptions - so they would just follow the data to the new copy.
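
To illustrate this third case: comparing checksums of the original and
the copy can confirm that the copy is byte-for-byte faithful, but that
says nothing about whether the original was right in the first place.
Only a digest recorded while the file was known to be good can tell you
that.  A rough sketch in Python follows; the function names and the
known_good parameter are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(src, dst, known_good=None):
    """Return (faithful, intact) for a source file and its copy.

    faithful: True if the copy has the same bytes as the source.
    intact:   True/False if the source matches a digest recorded
              earlier, or None if no such digest is available.
    """
    src_sum = sha256_of(src)
    faithful = src_sum == sha256_of(dst)
    intact = None if known_good is None else src_sum == known_good
    return faithful, intact
```

A miswritten file that was copied without read errors comes back as
faithful but not intact: the copy worked perfectly and the data is
still bad.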

dump/restore won't fix, or make any difference to, fsck-type errors.   
It works above that level - on the files' data itself.   fsck works
below the file level, on blocks and file chain links, etc.  If fsck
finds an unfixable error, dump or any other utility will fail too
if the error is in the area it is trying to read.

Once you have dump-ed, you then need to restore into a cleanly created
new filesystem.  Remember that newfs creates a filesystem on a partition.
The copy should then not be corrupted from an fsck point of view.  This is 
not because of anything that dump/restore would do, but because newfs 
made a clean new filesystem that fsck would be happy with.

Now, if the data itself is corrupt - but readable, then dump will
happily read the corrupt data and restore will happily write out
what dump created.   The data would be just as incorrect.   But,
again, that is not at the fsck level.   It is at the file and
directory level.   fsck works on blocks and links and doesn't care
anything about the actual data written in the blocks.   It can
find errors in blocks and links whether they are in a real file chain or
not currently part of any real file.   Generally fsck can fix those, 
but there are some things that it cannot make a reasonable guess on.

I hope this adds to the understanding rather than just confusing
you more.   Basically I am pointing out that there can be different
types or places for corruption.   No copying of files will fix a
problem if the errors are within the structure or data of the file
itself.   But, since fsck doesn't look at the actual data, but 
rather at structural integrity in the filesystem - the entity within
which the files reside - it is possible that it can find errors in
places that are not part of an actual current file.   If the latter
is the case, then copying the files out of the corrupt filesystem
into a nice new one, freshly newfs-ed, using dump/restore or some 
other method, can fix the problem.

But, if there are errors in the data, then no method of copying the
files will fix them.   And, if the filesystem corruption makes it
impossible to read some of the files, then no copying scheme will
fix them.   You might be able to tinker around, find which files
are unreadable, delete them, and then do the dump and get everything
else.   That could possibly make reconstruction a little easier.

> 
> If it does, I can always use rsync.
> 

I don't know how rsync will do it, but I am guessing that the result
will be somewhat similar to dump/restore.   If the files can be read
successfully, it will read them and write them in the new place.   If
actual data is corrupted, it will probably be transferred along with its
corruptions.   If the filesystem corruptions are not in an actual file,
then it should produce about the same results as dump/restore.

> > Dump examines the filesystem to see which files need to be backed up.
> > So dumping a corrupted FS will probably not produce the desired
> > results. If it did, we wouldn't need backups.

dump looks at the last modified date in the file header.  If it 
was changed (or created) after the last dump of the level you choose,
then it will add the file to its list of files to dump.  It does not
actually attempt to read file data until after making this index of
files to dump based on last dump data and dump level.
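
The selection rule described above can be sketched roughly as follows.
This is only an illustration in Python, not how dump is implemented:
the real dump reads inodes from the device itself and takes the date of
the previous dump from /etc/dumpdates.  The function name changed_since
is made up, and checking the inode change time as well as the
modification time is my assumption about what counts as "changed".

```python
import os


def changed_since(root, last_dump_epoch):
    """List files under root changed at or after last_dump_epoch.

    last_dump_epoch is a Unix timestamp.  Passing 0 selects
    everything, which is what a level 0 dump amounts to.
    """
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # metadata only; no file data is read
            except OSError:
                continue  # cannot stat it, so it cannot be indexed here
            # Assumption: either a newer modification time or a newer
            # inode change time marks the file as changed.
            if max(st.st_mtime, st.st_ctime) >= last_dump_epoch:
                selected.append(path)
    return sorted(selected)
```

Note that this pass only looks at file metadata.  A file whose data
blocks are unreadable can still end up on the list, and the trouble
only shows up later when its contents are actually read.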

> 
> Ironically, this is the machine that holds the backups.

Well, it can happen.
Anything that cannot easily be recreated should be backed up.

////jerry

> 
> --
> Fuzzy love,
> -CyberLeo
> Technical Administrator
> CyberLeo.Net Webhosting
> http://www.CyberLeo.Net
> <CyberLeo@CyberLeo.Net>
> 
> Furry Peace! - http://www.fur.com/peace/
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"


