Date: Mon, 25 Jan 2010 20:12:48 -0800
From: Steven Schlansker <stevenschlansker@gmail.com>
To: Tommi Lätti <sty@iki.fi>
Cc: freebsd-fs@freebsd.org
Subject: Re: slight zfs problem after playing with WDIDLE3 and WDTLER
Message-ID: <02786740-7076-4C92-89EE-E1EFC2120E33@gmail.com>
In-Reply-To: <f43ef3191001252007j4fb54a96l843f4515ad87bedd@mail.gmail.com>
References: <f43ef3191001251043n3a2d2780jfb2aa24be5f5371d@mail.gmail.com> <3F785019-DB0E-4385-97EB-7CE69A11647A@gmail.com> <f43ef3191001252007j4fb54a96l843f4515ad87bedd@mail.gmail.com>
On Jan 25, 2010, at 8:07 PM, Tommi Lätti wrote:

> 2010/1/26 Steven Schlansker <stevenschlansker@gmail.com>:
>>
>> On Jan 25, 2010, at 10:43 AM, Tommi Lätti wrote:
>>> After checking the logs carefully, it seems that the ada1 device
>>> permanently lost some sectors. Before twiddling with the parameters,
>>> it was 1953525168 sectors (953869MB); now it reports 1953523055
>>> (953868MB). So, would removing it and maybe export/import get me back
>>> to degraded state, so that I could then just replace the now
>>> suddenly-lost-some-sectors drive?
>>
>> That will probably work. I had a similar problem a while ago where
>> suddenly my drives were too small, causing the UNAVAIL
>> corrupted-data problem. I managed to fix it by using gconcat to stitch
>> an extra MB of space from the boot drive onto it. Not a very good
>> solution, but the best I found until FreeBSD gets shrink support
>> (which sadly seems like it may be a long while).
>>
>> Failing that, you could use OpenSolaris to import it (as it does have
>> minimal support for opening mismatched-sized vdevs), copy the data
>> off, destroy, and restore.
>
> After thinking overnight, I'm a bit curious why the whole filesystem
> failed on that single vdev, causing loss of the whole pool. Shouldn't
> ZFS just disregard the disk and go to degraded state? I've had normal
> catastrophic disk failures on this setup before, and the usual
> replace-drive-plus-resilver has worked just fine.

I poked through the code - the problem is that ZFS identifies the drive
as valid (due to correct metadata and checksums) and then tries to
assemble the array. At some point it checks the size, realizes that the
drive is smaller, and rejects the entire array. It isn't smart enough
(yet?) to realize that rejecting only the one drive would leave the pool
merely degraded...
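For reference, the sector counts quoted above work out to just over 1 MiB
of lost capacity (assuming the usual 512-byte sectors), which is why
stitching on "an extra MB" with gconcat is enough to bring the device back
over the size ZFS expects:

```python
SECTOR_BYTES = 512  # assumed sector size for these WD drives

before = 1953525168  # sectors reported before the WDIDLE3/WDTLER tweaks
after = 1953523055   # sectors reported afterwards

lost_sectors = before - after
lost_bytes = lost_sectors * SECTOR_BYTES

print(lost_sectors)              # sectors lost
print(lost_bytes / 2**20)        # roughly 1 MiB of capacity gone
```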
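The gconcat workaround mentioned above might look roughly like this
sketch. The pad partition name (ada0s4) and pool name (tank) are
illustrative assumptions, not details from the thread; don't run this
without adapting it to your own layout:

```shell
# Sketch only: ada0s4 and "tank" are hypothetical names.
kldload geom_concat                    # load GEOM_CONCAT if not already loaded
gconcat label -v padded ada1 ada0s4    # concatenate the shrunken disk with a
                                       # spare >=1MB partition; the combined
                                       # provider appears as /dev/concat/padded
zpool replace tank ada1 concat/padded  # hand ZFS the now-big-enough device
```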