Date:      Tue, 04 Mar 2008 17:15:22 -0700
From:      Joe Peterson <joe@skyrush.com>
To:        Eric Anderson <anderson@freebsd.org>
Cc:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Analysis of disk file block with ZFS checksum error
Message-ID:  <47CDE61A.8040102@skyrush.com>
In-Reply-To: <47CD4DCF.5070505@freebsd.org>
References:  <47ACD7D4.5050905@skyrush.com>		<D6B0BBFB-D6DB-4DE1-9094-8EA69710A10C@apple.com>		<47ACDE82.1050100@skyrush.com>		<20080208173517.rdtobnxqg4g004c4@www.wolves.k12.mo.us>		<47ACF0AE.3040802@skyrush.com>	<1202747953.27277.7.camel@buffy.york.ac.uk> <47B0A45C.4090909@skyrush.com> <47CD4DCF.5070505@freebsd.org>

Eric Anderson wrote:
> I'm starting to think there is a timing issue or some such problem with 
> ZFS, since I can use the same drives in a gmirror with UFS, and never 
> have any data problems (md5 checksums confirm it over-and-over).  I 
> highly doubt that everyone is seeing similar issues and it just is 
> because ZFS is so intense.  I've had plenty of systems under severe disk 
> load that have never exhibited corrupt files because of something like 
> this.

I also wondered about this, i.e. whether ZFS was triggering a certain
timing behavior that revealed the problem.  Still, if this is the case,
it seems to me that the problem lies in the ATA subsystem, since it
should prevent higher-level layers like ZFS from being able to create
bad timings (or am I not thinking of this correctly?).

Also, I think there were some reports of problems with DMA/ATA when
*not* using ZFS.

> I wish we could get our hands on this issue..  Seems like some common 
> threads are ATA/SATA disks.  Is your setup running 32bit or 64bit 
> FreeBSD?  (if you already mentioned it, I'm sorry, I missed it)

This was on 32-bit FreeBSD with PATA.  I am the one who had no SMART
issues and no DMA errors reported under Linux.  Changing the cable may
have "fixed" it, since I did not see errors in some further testing, but
even if so, my theory is that there is some edge case (timing?) that the
FreeBSD ATA drivers were sensitive to, and perhaps my change of cables
pushed the problem to the other side of the threshold.  Since I never
saw errors under Linux (and I've been using that cable for a couple of
years), I do not necessarily think the cable was actually "defective".

						-Joe


