FreeBSD Mail Archives

Date:      Sat, 25 Aug 2007 02:26:42 -0700
From:      "David Schwartz" <davids@webmaster.com>
To:        <tom@samplonius.org>
Cc:        freebsd-stable@freebsd.org
Subject:   RE: A little story of failed raid5 (3ware 8000 series)
Message-ID:  <MDEHLPKNGKAHNMBLJOLKAEJHGGAC.davids@webmaster.com>
In-Reply-To: <27560580.441188027503141.JavaMail.root@ly.sdf.com>


>   This isn't really accurate.  First of all, if the RAID
> controller isn't confirming checksums before giving the data to
> the OS, what is the checksum for exactly?

The checksum is used to recover the data in the event one piece of the
data is lost. With all of the data but one piece, and the checksum, the
data can be recovered. Confirming the checksum on every read would be a
waste of time since the individual drives already check the data for
errors.
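
To make the recovery step concrete, here is a minimal sketch in Python. It
assumes a toy stripe of three data blocks plus one XOR parity block; the
helper names `xor_blocks` and `reconstruct` are invented for this
illustration and do not describe how any 3ware controller is implemented.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(stripe, missing):
    """Rebuild stripe[missing] by XOR-ing every other block in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)

# Hypothetical stripe: three data blocks and their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# Pretend the drive holding block 1 is gone; rebuild it from the rest.
rebuilt = reconstruct(stripe, 1)
```

Note that the rebuild needs every surviving block, parity included, to be
readable. The parity block plays no part in an ordinary read.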

> It is supposed to be
> for detecting data corruption, so if the card isn't using the
> checksum, it's kind of useless.

You are confused. Checking for data corruption is done by checking if
the *DATA* is corrupt. This does not require looking at the RAID 5
checksum since the data has its own data checksum.
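
The distinction can be sketched as follows. This is only an illustration:
`zlib.crc32` stands in for the drive's internal error-detection code (real
drives use much stronger ECC inside their firmware), and the `Sector` class
and `read_sector` function are hypothetical names made up for this example.

```python
import zlib

class Sector:
    """A toy sector: payload plus a check value stored at write time."""
    def __init__(self, payload: bytes):
        self.payload = payload
        self.crc = zlib.crc32(payload)  # stand-in for the drive's own ECC

def read_sector(sector):
    """Return the payload, or raise IOError if the stored check fails."""
    if zlib.crc32(sector.payload) != sector.crc:
        raise IOError("unrecoverable read error")
    return sector.payload

good = Sector(b"hello")
bad = Sector(b"hello")
bad.payload = b"hellp"  # simulate corruption after the write

ok = read_sector(good)
try:
    read_sector(bad)
    detected = False
except IOError:
    detected = True
```

Corruption is detected from the sector's own check value alone. No parity
block from another drive is consulted on this path.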

> I know some RAID systems do fake
> their checksums, as they don't actually validate data against the
> checksums during normal reads because they don't have the
> processing power.  I'd stay away from these type of systems
> (cough ... Blue Arc ... cough).

It has nothing to do with processing power. It's simply a waste. The
RAID 5 checksum isn't for verifying the data, it's for recovering the
data if it can't be read.

> Second, most RAID systems don't use their own checksums
> anymore.  Netapp is quite famous for their ZCS (zone checksum)
> drives, and still uses a variation of this technology on their
> newer systems (which are using 512-byte sectors).  But most RAID
> vendors just rely on the drives' own error detection and
> correction systems (hamming code based usually, which is actually
> pretty solid).  I'm pretty sure that 3ware doesn't use any
> checksums.

You are seriously confused. You are confusing the RAID 5 checksum with
the drive data checksum. We are talking about making sure the RAID 5
checksums are readable so that, if a drive fails, the data can be
reconstructed from the checksum.

> However, in this particular case, validating checksums would
> have been unhelpful, since the disk was unreadable.  diskcheckd
> would have detected this issue.  It would probably have prevented
> the problem, if it had been running previously.

No, it would have saved him. The problem was he lost a drive, and
checksums *ON* *OTHER* *DRIVES* were unreadable. Quite possibly they had
been unreadable for months, but were never checked, since they are only
*needed* to reconstruct the data.

> ZFS is also a good option.  It has file level checksumming.
> ZFS never trusts the disks, and is super paranoid.  And ZFS can
> do background scrubbing too.  I can't wait for ZFS in FreeBSD 7,
> because ZFS in software is going to be 10x better than anything 3ware
> has.

That would not have helped him one bit. When the drive failed, the RAID
5 checksums on the other drives still would not have been scrubbed. The
RAID 5 checksum (technically an XOR) is only needed to recover the RAID
5 array if a drive (or sector) fails.

The only thing that will fix this is specifically verifying the RAID 5
checksum blocks. If a controller provides no way to do this, it is badly
broken. If a verify operation does not verify the checksum blocks, it is
broken.
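
The failure mode and the fix can be simulated with a small sketch. It makes
several simplifying assumptions: parity always sits on the last drive (real
RAID 5 rotates it across drives), an unreadable block is modelled as `None`,
and the function names `scrub` and `rebuild_after_failure` are hypothetical
and do not correspond to any controller's actual interface.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def make_stripe(data_blocks):
    """Data blocks followed by their XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def scrub(stripes):
    """Read every block, parity included, while the array is still redundant.
    A single unreadable block per stripe is rebuilt from the others.
    Returns the (stripe, drive) positions that were repaired."""
    repaired = []
    for s, stripe in enumerate(stripes):
        bad = [d for d, blk in enumerate(stripe) if blk is None]
        if len(bad) > 1:
            raise IOError(f"stripe {s}: more than one unreadable block")
        if len(bad) == 1:
            d = bad[0]
            stripe[d] = xor_blocks([b for b in stripe if b is not None])
            repaired.append((s, d))
        elif xor_blocks(stripe) != bytes(len(stripe[0])):
            # Data XOR parity must be all zeros in a consistent stripe.
            raise IOError(f"stripe {s}: parity mismatch")
    return repaired

def rebuild_after_failure(stripes, failed_drive):
    """Lose one whole drive, then rebuild its blocks from the survivors."""
    for s, stripe in enumerate(stripes):
        stripe[failed_drive] = None
        survivors = [b for d, b in enumerate(stripe) if d != failed_drive]
        if any(b is None for b in survivors):
            raise IOError(f"stripe {s}: cannot rebuild, survivor unreadable")
        stripe[failed_drive] = xor_blocks(survivors)

def fresh_array():
    stripes = [make_stripe([b"AAAA", b"BBBB", b"CCCC"]),
               make_stripe([b"DDDD", b"EEEE", b"FFFF"])]
    stripes[1][3] = None  # latent unreadable parity block on drive 3
    return stripes

# Never scrubbed: the bad parity block is found only during the rebuild.
unscrubbed = fresh_array()
try:
    rebuild_after_failure(unscrubbed, failed_drive=0)
    rebuild_failed = False
except IOError:
    rebuild_failed = True

# Scrubbed first: the parity block is repaired while redundancy still exists.
scrubbed = fresh_array()
repaired = scrub(scrubbed)
rebuild_after_failure(scrubbed, failed_drive=0)
```

In the unscrubbed case the rebuild fails on the second stripe, because the
parity block it needs had gone bad unnoticed. In the scrubbed case the same
block is regenerated from the data blocks beforehand, so the later drive
loss is survivable.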

DS

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?MDEHLPKNGKAHNMBLJOLKAEJHGGAC.davids>
