Date:      Mon, 07 Jun 2010 11:55:24 +0300
From:      Andriy Gapon <avg@icyb.net.ua>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: zfs i/o error, no driver error
Message-ID:  <4C0CB3FC.8070001@icyb.net.ua>
In-Reply-To: <20100607083428.GA48419@icarus.home.lan>
References:  <4C0CAABA.2010506@icyb.net.ua> <20100607083428.GA48419@icarus.home.lan>

on 07/06/2010 11:34 Jeremy Chadwick said the following:
> On Mon, Jun 07, 2010 at 11:15:54AM +0300, Andriy Gapon wrote:
>> During recent zpool scrub one read error was detected and "128K repaired".
>>
>> In system log I see the following message:
>> ZFS: vdev I/O failure, zpool=tank
>> path=/dev/gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff offset=284456910848
>> size=131072 error=5
>>
>> On the other hand, there are no other errors, nothing from geom, ahci, etc.
>> Why would that happen? What kind of error could this be?
> 
> I believe this indicates silent data corruption[1], which ZFS can
> auto-correct if the pool is a mirror or raidz (otherwise it can detect
> the problem but not fix it).

This pool is a mirror.

> This can happen for a lot of reasons, but
> tracking down the source is often difficult.  Usually it indicates the
> disk itself has some kind of problem (cache going bad, some sector
> remaps which didn't happen or failed, etc.).

Please note that this is not a CKSUM error, but READ error.

> What I'd need to determine the cause:
> 
> - Full "zpool status tank" output before the scrub

This was "all clear".

> - Full "zpool status tank" output after the scrub

zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 5h0m with 0 errors on Sat Jun  5 05:05:43 2010
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada0p4                                      ONLINE       0     0     0
            gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff  ONLINE       1     0     0  128K repaired
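As an aside, the error counters in output like the above are easy to watch mechanically. A minimal sketch (shell + awk, fed the same status table pasted above; the one-liner itself is just an illustration, not part of any ZFS tooling):

```shell
# Flag any vdev row whose READ/WRITE/CKSUM counters are nonzero.
# Columns in `zpool status` output: NAME STATE READ WRITE CKSUM.
zpool_status='        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada0p4                                      ONLINE       0     0     0
            gptid/536c6f78-e4f3-11de-b9f8-001cc08221ff  ONLINE       1     0     0'

# Skip the header (NR > 1) and print the name of any vdev with errors.
echo "$zpool_status" | awk 'NR > 1 && ($3 + $4 + $5) > 0 { print $1 }'
```

Run against the table above, this prints only the gptid/... vdev that took the read error.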

> - Full "smartctl -a /dev/XXX" for all disk members of zpool "tank"

The output for both disks is "perfect".
I monitor them regularly; smartd is also running and has no complaints.
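To illustrate the kind of check I mean: the attribute lines below are a hypothetical (clean) excerpt of `smartctl -a` output, not the real readings, and the one-liner just flags nonzero raw values for the attributes most relevant to read errors:

```shell
# Hypothetical smartctl attribute excerpt; both disks here report
# zero raw values, so the check prints nothing.
smart_attrs='  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0'

# The raw value is the last field; print the attribute name if nonzero.
echo "$smart_attrs" | awk '$NF > 0 { print $2, "raw value:", $NF }'
```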

> Furthermore, what made you decide to scrub the pool on a whim?

Not on a whim: it was a regularly scheduled scrub (bi-weekly).
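For reference, such a scrub is trivially driven from root's crontab; the entry below is only an illustration (the schedule shown and the pool name are assumptions, not my actual setup):

```
# min hour day-of-month month day-of-week  command
0     3    1,15         *     *            /sbin/zpool scrub tank
```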

-- 
Andriy Gapon


