Date:      Fri, 8 Feb 2008 14:49:54 -0800
From:      Mark Day <mday@apple.com>
To:        Joe Peterson <joe@skyrush.com>
Cc:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Analysis of disk file block with ZFS checksum error
Message-ID:  <D6B0BBFB-D6DB-4DE1-9094-8EA69710A10C@apple.com>
In-Reply-To: <47ACD7D4.5050905@skyrush.com>
References:  <47ACD7D4.5050905@skyrush.com>

On Feb 8, 2008, at 2:29 PM, Joe Peterson wrote:

> For one thing (as I mentioned), only 65536 bytes are bad (and it's
> exactly this many, with a few "good" bytes thrown in, but not far from
> what matches random chance would produce.  Also, all bad bytes have a
> zero in the high bit - interesting?  Also, near the end of the block,
> the bad bytes all go to zero, strangely coincident with the first
> "good" zero in that bad block - not sure if that's coincidence or not.
> Also, I calculated the number of "Bits same" (matching bits) in the
> good vs. bad bytes, and it appears fairly random, so it appears that
> the bad bytes are very random in nature and not correlated much at all
> with the good bytes.
>
> So except for the fact that the 2nd half (65536 bytes) of the ZFS block
> are good, the bad block seems to consist of random data, except for the
> string of zero bytes near the end and the zero high-bit.  It's not as
> if one bit on the disk flipped - it affects the whole (1/2) block.
> Does this seem like a disk error, controller error/bug, cable problem
> (I recently put a new cable on, so I doubt this).  It seems to me
> something more systemic rather than a random bit error - opinions are
> more than welcome.

Based on the subset of data you posted, the bad data looks like ASCII
text.  The bad data from offset a0000 to a000f is:

${138AFE{@
@$$}1

The bad data from offset af6c1 to af6c8 is:

392A9}@

I don't recognize the content beyond that, but I'd guess that somehow
the contents of some other file managed to overwrite that portion of
the bad file.  As for how that happened, I don't know.  But if someone
recognizes where the bad content came from, that might be a clue.
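
One rough way to check the "looks like ASCII text" impression across the
whole bad region, rather than just a few offsets, is to count how many
bytes have the high bit clear and how many are printable ASCII.  The
sketch below is only an illustration: the function name ascii_profile
and the use of Python's string.printable as the definition of
"printable" are assumptions made for this example, not part of the
original analysis.

```python
import string

# Byte values Python considers printable ASCII (assumed definition of "text").
PRINTABLE = set(string.printable.encode("ascii"))


def ascii_profile(block: bytes) -> dict:
    """Return the fraction of bytes with the high bit clear and the
    fraction that are printable ASCII (hypothetical helper)."""
    total = len(block)
    if total == 0:
        return {"total": 0, "high_bit_clear": 0.0, "printable": 0.0}
    high_bit_clear = sum(1 for b in block if b < 0x80)
    printable = sum(1 for b in block if b in PRINTABLE)
    return {
        "total": total,
        "high_bit_clear": high_bit_clear / total,
        "printable": printable / total,
    }


if __name__ == "__main__":
    # First line of the fragment quoted above, starting at offset a0000.
    sample = b"${138AFE{@"
    print(ascii_profile(sample))
```

For uniformly random bytes, about half would be expected to have the
high bit clear, so a block where nearly every byte does, and where most
bytes are printable, points toward text rather than noise.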

-Mark
