From: "Daniel Eriksson"
To: "'Benjamin P. Keating'"
Delivered-To: freebsd-questions@freebsd.org
Date: Sat, 26 Jun 2004 05:35:09 +0200
Subject: RE: fsck'ing a Vinum RAID5 volume (and a stale drive)
In-Reply-To: <1d54d54404062518474bbb3494@mail.gmail.com>

Benjamin P. Keating wrote:
> I've found on the net that I can switch the state by doing:
>
> $ vinum setstate up backup.p0 backup.p0.s3

Ouch, this is a bad move.
You just told vinum to start using the stale (= out-of-date data) disc
as if it were up to date and nothing was wrong with it. Basically you
have trashed your data, possibly beyond repair.

Why was the disc in a stale state? If it developed bad blocks that could
not be remapped, then vinum marked it as stale and continued to use the
other discs in degraded mode (just as it should). Even if the disc did
not break (maybe just connection problems or something), once vinum
marked it as stale, any further write to the array immediately
invalidated the data on the disc, which means the only way to bring it
back up is to go through a proper rebuild of the data.

> I rebooted, the state is "up" so I unmounted the volume to fsck it. Is
> this approach correct? does this do anything productive or just forces
> the state label to change and do nothing to the drives? I don't feel
> confident that it did anything and Im having a VERY hard time finding
> documentation on this.

Your approach is not correct. You should have paid attention to the
vinum manpage, which says this about setstate: "This bypasses the usual
consistency mechanism of vinum and should be used only for recovery
purposes. It is possible to crash the system by incorrect use of this
command."

It is unfortunate that the manpage says "for recovery", since people can
misunderstand and think you can recover from a crashed disc this way.
setstate should not need to be used during normal operation, even if a
disc in a RAID-5 array crashes.

> Im assuming I'll want to answer yes to at least some of those. Can I
> say yes to all of them? What errors should I say no to? I have no idea
> whats bad bad and whats correctable.

An fsck of a degraded RAID-5 array should not normally report any
errors. The errors you are seeing are there because you have forced
out-of-date (stale) data into the middle of the filesystem, messing up
pretty much the entire filesystem.

> Now.
> because this is a raid5 volume, are some of these fsck prompts
> false positives? ie; fsck is giving a error but really it's fine as
> it's raid5?

No, that is not how RAID-5 works. RAID-5 protects the integrity of the
filesystem by ensuring that the stored data can be read/written even if
one disc fails. RAID arrays work at a level below the filesystem, and
they generally don't know anything about the actual filesystem.

Your only hope now is if you haven't actually allowed anything to be
written to the array. If fsck changed things around then you are
probably out of luck. If nothing has been written, then reset the failed
disc to the "stale/down" state (which should hopefully put the array in
degraded mode) and then try fsck again. If you are lucky you should see
no errors at all.

If the disc isn't physically broken, then the proper way to get your
array back to the original "all up" state is to run "vinum start
backup.p0.s3". I'm not sure if this will rebuild all the data
automatically (it does for RAID-1 arrays), but if it doesn't then I
guess you also need to run "vinum rebuildparity backup.p0".

/Daniel Eriksson
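
To illustrate why a stale subdisk cannot simply be marked "up", here is
a rough toy model of a single RAID-5 stripe using XOR parity. It is only
a sketch: the block contents, the four-subdisk layout and the helper
xor_blocks are made up for the example and say nothing about vinum's
actual on-disk format.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-subdisk plex: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The subdisk holding the third block drops out and is marked stale.
# Its platters still contain the old contents.
stale_copy = data[2]

# The array keeps running in degraded mode. A write to the block that
# lived on the stale subdisk can only be recorded through the parity.
new_block = b"DDDD"
parity = xor_blocks([data[0], data[1], new_block])

# Degraded read: the missing block is rebuilt from the survivors + parity.
rebuilt = xor_blocks([data[0], data[1], parity])

print("rebuilt from parity:", rebuilt)
print("read from stale disc:", stale_copy)
```

In this model the degraded array still returns the current block,
because it is recomputed from the surviving subdisks and the parity.
Forcing the stale subdisk to "up" makes reads come straight from its old
contents instead, which no longer match the rest of the stripe.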