Date: Mon, 07 Jun 2010 11:55:24 +0300
From: Andriy Gapon <avg@icyb.net.ua>
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs i/o error, no driver error
Message-ID: <4C0CB3FC.8070001@icyb.net.ua>
In-Reply-To: <20100607083428.GA48419@icarus.home.lan>

on 07/06/2010 11:34 Jeremy Chadwick said the following:
> On Mon, Jun 07, 2010 at 11:15:54AM +0300, Andriy Gapon wrote:
>> During a recent zpool scrub one read error was detected and "128K
>> repaired".
>>
>> In the system log I see the following message:
>> ZFS: vdev I/O failure, zpool=tank
>> path=/dev/gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff offset=284456910848
>> size=131072 error=5
>>
>> On the other hand, there are no other errors, nothing from geom, ahci,
>> etc.  Why would that happen?  What kind of error could this be?
>
> I believe this indicates silent data corruption[1], which ZFS can
> auto-correct if the pool is a mirror or raidz (otherwise it can detect
> the problem but not fix it).

This pool is a mirror.

> This can happen for a lot of reasons, but tracking down the source is
> often difficult.  Usually it indicates the disk itself has some kind
> of problem (cache going bad, some sector remaps which didn't happen or
> failed, etc.).

Please note that this is not a CKSUM error, but a READ error (error=5
in the log message is EIO).

> What I'd need to determine the cause:
>
> - Full "zpool status tank" output before the scrub

This was "all clear".

> - Full "zpool status tank" output after the scrub

$ zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are
        unaffected.
action: Determine if the device needs to be replaced, and clear the
        errors using 'zpool clear' or replace the device with 'zpool
        replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 5h0m with 0 errors on Sat Jun  5 05:05:43 2010
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank                                          ONLINE       0     0     0
          mirror                                      ONLINE       0     0     0
            ada0p4                                    ONLINE       0     0     0
            gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff  ONLINE     1     0     0  128K repaired

> - Full "smartctl -a /dev/XXX" for all disk members of zpool "tank"

The smartctl output for both disks is "perfect".  I monitor them
regularly, and smartd is running with no complaints from it.
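To be concrete about "perfect", the checks amount to something like the
following.  This is only a sketch: ada0 carries one mirror member, and
I am assuming here that the gptid member sits on the other disk, e.g.
ada1; the attribute names are the usual ATA SMART ones.

$ smartctl -A /dev/ada0 | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
$ smartctl -A /dev/ada1 | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'

That is, no reallocated, pending, or offline-uncorrectable sectors on
either disk.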
> Furthermore, what made you decide to scrub the pool on a whim?

Why on a whim?  It was a regularly scheduled scrub (bi-weekly); a
sketch of the crontab entry is below.

-- 
Andriy Gapon
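P.S.  For the record, the scheduling is plain cron(8).  The entry in
root's crontab is along these lines (a sketch: running on the 1st and
15th approximates "bi-weekly" rather than being exactly every 14 days):

# scrub the pool "tank" around midnight on the 1st and 15th
0 0 1,15 * * /sbin/zpool scrub tank

Given that SMART looks clean on both disks, the follow-up suggested by
the "action" text above is presumably just "zpool clear tank", and then
watching whether the READ count climbs again.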