From owner-freebsd-fs@FreeBSD.ORG Sun Apr 15 15:39:09 2007
Message-ID: <46224706.4010704@barryp.org>
Date: Sun, 15 Apr 2007 10:38:46 -0500
From: Barry Pederson <bp@barryp.org>
To: Pawel Jakub Dawidek
Cc: freebsd-fs@FreeBSD.org
In-Reply-To: <20070415111955.GB16971@garage.freebsd.pl>
Subject: Re: ZFS raidz device replacement problem

Pawel Jakub Dawidek wrote:
> On Fri, Apr 13, 2007 at 11:06:16PM -0500, Barry Pederson wrote:
>> I've been playing with ZFS (awesome stuff, thanks PJD) and noticed
>> something funny when replacing a device under a raidz pool.  It seems
>> that even though ZFS says resilvering is complete, you still need to
>> manually do a "zpool scrub" to really get the pool into a good state.
>
> How do you tell it's not in a good state?

I took this last output, showing cksum errors on the replaced device,
to mean that the device wasn't really synced up with the rest of the
raidz members - it didn't have the data on it that ZFS expected it to.

>> # zpool status mypool
>>   pool: mypool
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>         attempt was made to correct the error.  Applications are
>>         unaffected.
>> action: Determine if the device needs to be replaced, and clear the
>>         errors using 'zpool clear' or replace the device with
>>         'zpool replace'.
>>    see: http://www.sun.com/msg/ZFS-8000-9P
>>  scrub: scrub completed with 0 errors on Fri Apr 13 22:43:46 2007
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         mypool      ONLINE       0     0     0
>>           raidz1    ONLINE       0     0     0
>>             md0     ONLINE       0     0     0
>>             md1     ONLINE       0     0     0
>>             md3     ONLINE       0     0     5
>
> If you are referring to this CKSUM count not being 0, that was a bug in
> ZFS itself; it was already fixed in OpenSolaris, and the fix has been
> merged to FreeBSD.

OK, I'll update and try again.

But the thing that got me going on this was that if you *don't* do the
scrub after the replace, but instead do something destructive to one of
the non-replaced devices, like:

   dd if=/dev/random bs=1m count=64 oseek=1 conv=notrunc of=/tmp/foo1

and *then* do a scrub and run "zpool status -v mypool", you get this
really alarming message mentioning data corruption.
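
(In case anyone wants to follow along: the pool lives on file-backed md
devices, and the sequence leading up to that dd is roughly the
following - I'm sketching this from memory, so the sizes, paths, and
exact device numbers are approximate:

   # create 256MB backing files and attach each one as an md device
   truncate -s 256m /tmp/foo0 /tmp/foo1 /tmp/foo2 /tmp/foo3
   mdconfig -a -t vnode -f /tmp/foo0     # -> md0
   mdconfig -a -t vnode -f /tmp/foo1     # -> md1
   mdconfig -a -t vnode -f /tmp/foo2     # -> md2
   mdconfig -a -t vnode -f /tmp/foo3     # -> md3

   # build the raidz pool, copy some files in, then swap one member out
   zpool create mypool raidz md0 md1 md2
   cp -R /usr/share/misc /mypool/misc
   zpool replace mypool md2 md3
   zpool status mypool                   # wait for resilvering to finish

md1 is the device backed by /tmp/foo1, so the dd above clobbers one of
the members that was *not* replaced.)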

---------------------
zpool status -v mypool
  pool: mypool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed with 184 errors on Sun Apr 15 09:55:43 2007
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0   761
          raidz1    ONLINE       0     0   761
            md0     ONLINE       0     0     0
            md1     ONLINE       0     0   521
            md3     ONLINE       0     0    25

errors: Permanent errors have been detected in the following files:

        mypool:<0x100>
        mypool:<0x104>
        mypool:<0x105>
        .
        .
---------------------

I suppose you have to actually have some files in /mypool for this to
show up.  (Are there supposed to be real filenames in the message
instead of things like "<0x100>"?)

Now that I look at this more closely and run "diff -r" between the files
I copied into the pool and the originals, I don't see any differences -
so ZFS's claim about detecting actual corrupted files seems wrong there
too.  ("zpool clear mypool" doesn't make that "data corruption" warning
go away, just the cksum counts.)

I'll keep poking at it.

	Barry
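
P.S. The "diff -r" check mentioned above is nothing fancy - just
comparing the copied-in tree against the original, then seeing what
"zpool clear" does and doesn't reset, using the same /usr/share/misc
tree from the sketch earlier (again, reconstructing paths from memory):

   diff -r /usr/share/misc /mypool/misc && echo "files match"
   zpool clear mypool          # zeroes the CKSUM counters...
   zpool status -v mypool      # ...but the corruption warning remains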