From owner-freebsd-hardware@FreeBSD.ORG Wed Oct 15 18:46:02 2008
Date: Wed, 15 Oct 2008 11:45:58 -0700
From: Jeremy Chadwick
To: Dieter
Cc: freebsd-questions@freebsd.org, freebsd-hardware@freebsd.org
Subject: Re: RAID 5 - serious problem
Message-ID: <20081015184558.GA84665@icarus.home.lan>
References: <8f82c35c0810150532o52ae50b5kef7c685fd23a0af4@mail.gmail.com>
 <200810151714.RAA23995@sopwith.solgatos.com>
In-Reply-To: <200810151714.RAA23995@sopwith.solgatos.com>
List-Id: General discussion of FreeBSD hardware

On Wed,
Oct 15, 2008 at 10:14:42AM +0100, Dieter wrote:
> > FreeBSD 7.0-Release
> > Intel D975XBX2 motherboard (Intel Matrix Storage Technology)
> > 3 WD Raptor 74 GB in a RAID 5 array
> > 1 WD Raptor 150 GB as a standalone disk
> > / and /var mounted on the standalone, /usr on the RAID 5
>
> > I believe what happened was that one of the disks didn't respond for such a
> > long time, that it was marked "bad". And afterwards the same thing happened
> > for the other disks. When I try to boot the system, all three disks are
> > marked "Offline".
>
> > I am very desperate not to lose my data,
>
> In that case, step one is to use dd(1) to make a bit-for-bit copy of the
> three drives to some trusted media. Since they are marked bad/offline,
> you might need to move them to a controller that doesn't know anything
> about RAID. (Note that there is risk here, and in almost anything you do
> at this point.) Once you have this bit-for-bit backup, you can run any
> experiment you like to attempt to recover your data. If the experiment
> goes bad, you can copy the exact original contents back using dd, then
> try a different experiment. While you're at it, make a normal backup
> using dump(8) or whatever you normally use, of / and /var. Once you have
> *everything* backed up, you can do risky experiments like booting linux.
>
> My personal approach to avoiding data loss is (a) avoid buggy things like
> inthell and linux.

Interesting, given that another recent thread seems to link transparent
data loss to AMD AM2-based systems using certain models of Adaptec, and
possibly LSI Logic, controller cards. I like Intel as much as I like
AMD -- but it's important to remember that it's becoming more and more
difficult to provide "flawless" stability as complexity increases.

And I have no idea what your beef is with Linux.
If the OP is able to bring his array on-line using Linux, I would think
that says something about the state of things in FreeBSD, would you
agree? Both OSes have their pros and cons.

> (b) FFS with softdeps and the disk write cache turned off,

This has been thoroughly discussed by developers, particularly Matt
Dillon. I can point you to a thread discussing why doing this is not
only silly, but a bad idea. And if you'd like, I can show you just how
bad the performance is on disks with WC disabled using UFS2 +
softupdates. When I say bad, I'm serious -- we're talking horrid. And
yes, I have tried it -- see PR 127717 for evidence that I *have* tried
it. :-)

There *may* be advantages to disabling a disk's write cache when using a
hardware RAID controller that offers its own on-board cache (DIMMs,
etc.), but that cache should be battery-backed for safety reasons.

> (c) full backups.

I'm curious what your logic is here too -- this one is debatable, so I'd
like to hear your view.

> I don't have enough ports to run RAID. :-( The downside is that
> FreeBSD doesn't have NCQ support yet (when? when? when?) so writes are
> slow. :-(

NCQ will not necessarily improve write performance. Numerous studies
have shown this, and I can point you to those as well. TCQ, on the
other hand, does offer performance benefits when there are a large
number of simultaneous transactions occurring (think: it's more like
SCSI's command queueing).

I believe Andrey Elsukov is working on NCQ support for when AHCI is in
use (assuming I remember correctly).

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |