Date:      Mon, 20 Oct 2008 10:32:21 -0700
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        JoaoBR <joao@matik.com.br>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: constant zfs data corruption
Message-ID:  <20081020173221.GA8889@icarus.home.lan>
In-Reply-To: <200810201518.01678.joao@matik.com.br>
References:  <200810171530.45570.joao@matik.com.br> <20081020164831.GA8016@icarus.home.lan> <45836B9A-CB6E-4B95-911E-0023230B8F82@mac.com> <200810201518.01678.joao@matik.com.br>

On Mon, Oct 20, 2008 at 03:18:01PM -0200, JoaoBR wrote:
> On Monday 20 October 2008 15:03:14 Chuck Swiger wrote:
> > On Oct 20, 2008, at 9:48 AM, Jeremy Chadwick wrote:
> > > Hm... I thought we determined earlier in this thread that the OP is
> > > not getting the benefits of ZFS checksums because he's not using
> > > raidz (only a single disk with a single pool)?
> >
> > He's not getting working filesystem redundancy with the existing
> > config and is vulnerable to losing data from a single drive failure,
> > agreed.  But the ZFS checksum mechanism should still be working to
> > detect data corruption, even though ZFS cannot recover the corrupted
> > data the way it otherwise would if redundancy was available.
> >
> 
> All right, understood, but shouldn't something like fsck correct the
> error?

No.  You're using ZFS, not UFS.  fsck will not work.

In the case of underlying data corruption, ZFS can detect the damage
through its checksums, but there is no way for it to repair the data
unless you have mirroring or raidz in use.

But before you say "then ZFS sucks", realise that you have this *exact
same problem* with any other filesystem -- FFS/UFS can't repair this
situation either.  You could fsck and it would "supposedly work" for a
while, but then data would get corrupted again, etc...

Silent data corruption is such a low-level problem that you cannot
expect the filesystem to be able to solve it for you without some form
of parity (raidz) or redundancy (mirroring) involved.
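
To make the detect-versus-repair distinction concrete, here is a minimal
Python sketch. It is not how ZFS is implemented: the function names are
hypothetical, SHA-256 merely stands in for a block checksum, and a
"mirror" is modelled as a plain list of copies. It only illustrates that
a checksum alone can tell you a block is bad, while a second good copy is
what makes repair possible.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 stands in for the filesystem's block checksum (an assumption).
    return hashlib.sha256(data).hexdigest()


def read_block(copies, expected):
    """Read a block stored as one or more copies (hypothetical helper).

    `expected` is the checksum recorded when the block was written.
    Returns (data, status), where status is "ok", "repaired" or
    "unrecoverable". Bad copies are overwritten in place with a good one.
    """
    # Look for any copy whose contents still match the recorded checksum.
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # Corruption is detected, but there is nothing to repair it from.
        return None, "unrecoverable"

    healed = False
    for i, c in enumerate(copies):
        if c != good:
            copies[i] = good  # rewrite the damaged copy from the good one
            healed = True
    return good, ("repaired" if healed else "ok")


if __name__ == "__main__":
    original = b"important data"
    expected = checksum(original)
    damaged = b"important dat\x00"

    # Single disk: the damage is noticed, but the data is gone.
    print(read_block([damaged], expected))

    # Two-way mirror: the damaged copy is rebuilt from the good one.
    print(read_block([damaged, original], expected))
```

With a single copy the only possible outcomes are "ok" and
"unrecoverable"; the "repaired" outcome needs at least one extra copy (or
parity from which one can be rebuilt).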

FYI, we run into this problem at work using Linux on ext3fs.  Some
systems will occasionally see data corruption -- ext3fs is a journalling
filesystem, so it detects the problem, but it cannot "solve it".  In
machines which have 1 disk and are not using mirroring or RAID-5, we
still have to shut the box off and replace the disk.

> Seems kind of problematic to me mounting zfs in single user mode, 
> deleting the file and restarting the OS ?

As I said, you don't use fsck on ZFS.  Booting into single-user won't do
you any good either.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



