Date: Mon, 28 Sep 2009 11:29:19 -0600
From: Kurt Touet <ktouet@gmail.com>
To: freebsd-fs@freebsd.org
Subject: Re: ZFS - Unable to offline drive in raidz1 based pool
Message-ID: <2a5e326f0909281029p17334ceeoff4bb3e7adeb5cef@mail.gmail.com>
In-Reply-To: <2a5e326f0909201500w1513aeb5ra644f1c748e22f34@mail.gmail.com>
References: <2a5e326f0909201500w1513aeb5ra644f1c748e22f34@mail.gmail.com>
I've run into a similar experience again with my zfs raidz1 array
reporting itself as healthy when it's not. This, again, was after some
drive spin_retry_count errors (and a power cycle when the box was unable
to shutdown -h).

The pattern goes as follows:

1) A hard drive in the zfs array (for whatever reason) repeatedly times
   out, in this case generating spin_retry_count errors in the SMART
   status.
2) The box is semi-frozen because it cannot deal with activity on the
   zfs array, so it won't gracefully shutdown -h now.
3) The box is power cycled.
4) Everything spins up fine on the box, and the array is accessible
   again.
5) zpool status - shows the array as online with no degraded status
6) zpool scrub - shows the drives to be desynced and resilvers a couple
   of them
7) Presumably, everything is fine.

monolith# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad4     ONLINE       0     0     0
        spares
          ad22      AVAIL

errors: No known data errors

monolith# zpool scrub storage
monolith# zpool status
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Sep 28 11:17:05 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  1.17M resilvered
            ad6     ONLINE       0     0     0  1.50K resilvered
            ad12    ONLINE       0     0     0  2K resilvered
            ad4     ONLINE       0     0     0  2K resilvered
        spares
          ad22      AVAIL

errors: No known data errors

So, my question still stands: how does zfs, upon scrubbing, instantly
know that the drives need to be resilvered (it completes in a few
seconds), when it previously declared the array to be fine with no
known data errors?

Cheers,
-kurt
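For comparing the status output before and after a scrub, a small filter along these lines could pull out the per-device resilver amounts. This is only a sketch: resilvered_devices is a hypothetical helper name, it only reads text on stdin, and it assumes the status layout shown in this message, where a resilvered device line ends in "<amount> resilvered". Other ZFS versions may format this differently.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> <amount>" for every line of
# `zpool status` output whose last field is the word "resilvered".
# Assumes the column layout quoted in this message.
resilvered_devices() {
    awk '$NF == "resilvered" { print $1, $(NF - 1) }'
}
```

It would be used as `zpool status storage | resilvered_devices`; an empty result would mean the report shows no device as resilvered.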