From: Joe Peterson
Date: Tue, 05 Feb 2008 09:12:14 -0700
To: Dag-Erling Smørgrav
Cc: freebsd-fs@freebsd.org
Subject: Re: Forcing full file read in ZFS even when checksum error encountered
Message-ID: <47A88ADE.7050503@skyrush.com>
In-Reply-To: <864pcnxz8f.fsf@ds4.des.no>

Dag-Erling Smørgrav wrote:
> There is no way to "read the bad data" since an unrecoverable checksum
> error means that ZFS has no idea which of the multiple versions of the
> affected block is the right one.

Nope, no mirror, no RAIDZ - just one partition. But as far as I know,
there were no read errors, just a checksum error. I've also done a
couple of surface scans of the drive, and no problems.
So all I can imagine is that either the data got "changed" on the disk
(due to who knows what), or the metadata got messed up (by either
hardware or some SW bug).

I'd like to figure out what ZFS thinks the bytes in the file really are
and why they are showing as a checksum error. So, since I only have one
copy (i.e. no RAID/mirror), I should be able to tell ZFS to "go ahead
and read the bytes, not stopping when it hits the checksum mismatch".
Then I could do an analysis of the data, compared to what the file
should contain.

I assume I can hack the ZFS source to "disable" stopping on the checksum
problem, but I figured there might be some debug mode that would let me
do this without delving into the code.

> (I assume this was a raidz pool; if not, imagine Nelson Muntz from the
> Simpsons yelling "ha ha!" at you)

Ah, don't worry, I have backups (I'm just playing around with ZFS at the
moment... :)

> My advice is to use 'dd conv=noerror' with a sufficiently small block
> size to recover what parts you can.

I haven't lost anything, so no need to do that. I just want to see
what's up with this particular ZFS issue. If it's a bug, at least I
could submit it to Sun.

Thanks, Joe
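
For reference, the block-wise approach quoted above ('dd conv=noerror'
with a small block size), combined with a comparison against a known-good
backup, could be sketched roughly as follows. This is only an
illustration under stated assumptions: the helper names
read_skipping_errors and differing_ranges are hypothetical, and it
assumes the checksum failure reaches the application as an ordinary read
error. It recovers the readable blocks around the failure and zero-fills
the rest; it does not return the bytes that failed the checksum.

```python
import os


def read_skipping_errors(path, block_size=512):
    """Read `path` block by block, zero-filling blocks that fail to read.

    Hypothetical helper, similar in spirit to 'dd conv=noerror,sync'.
    Returns (data, bad_offsets), where bad_offsets lists the starting
    offset of every block whose read raised an error.
    """
    size = os.path.getsize(path)
    chunks = []
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                block = os.pread(fd, want, offset)
            except OSError:
                # Assumption: an unreadable block surfaces as an OSError.
                block = b""
                bad_offsets.append(offset)
            # Pad failed or short reads so later offsets stay aligned.
            if len(block) < want:
                block += b"\0" * (want - len(block))
            chunks.append(block)
            offset += want
    finally:
        os.close(fd)
    return b"".join(chunks), bad_offsets


def differing_ranges(a, b):
    """Return half-open (start, end) byte ranges where `a` and `b` differ."""
    ranges = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))
    if len(a) != len(b):
        # Treat any length mismatch as one trailing differing range.
        ranges.append((n, max(len(a), len(b))))
    return ranges
```

With the damaged file read this way and the backup copy read normally,
differing_ranges(recovered, backup) would show which byte ranges were
lost, and the bad offsets should line up with those ranges.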