Date: Mon, 18 Apr 2016 17:05:22 -0500 (CDT)
Subject: Re: Raid 1+0
From: "Valeri Galtsev" <galtsev@kicp.uchicago.edu>
To: "Kevin P. Neal"
Cc: "Shamim Shahriar", freebsd-questions@freebsd.org

On Mon, April 18, 2016 4:02 pm, Kevin P. Neal wrote:
> On Mon, Apr 18, 2016 at 09:07:07PM +0100, Shamim Shahriar wrote:
>> On 18/04/2016 20:22, Bernt Hansson wrote:
>> > Hello list
>> >
>> > Used gstripe to stripe the arrays raid/r0 + r1 into stripe0
>> >
>> Hi
>>
>> I'm sure there are people with more expertise than I have, and they can
>> confirm either way. But in my mind, given that you used RAID1 first
>> (mirror) and then used those two RAID1 arrays to create a RAID0, this is
>> logically RAID 1+0. In other words, if you lose one disc from each of
>> the RAID1 arrays you are still safe. If you lose both from one single
>> mirror array (highly unlikely), the stripe is unlikely to be of any use.
>
> Not that unlikely. If you take identical disks from the same company and
> subject them to identical load then the probability that they will fail
> around the same time is much higher than random.

Not correct. First of all, in most cases the failures of the individual
drives are independent events. (The exception is the odd setup where you
have a huge RAID with many drives and never scan the whole surface of each
drive; then the failure of one drive and the resulting rebuild may trigger
the discovery of another "bad" drive. Strictly speaking, though, that other
drive was already in a bad state earlier, say with a large bad area you had
never tried to access, so the failures are not simultaneous, and the
situation can be avoided by correct RAID configuration: scan the surface of
the drives periodically!)

Now, let's look at the event of a simultaneous failure of two drives. Let
me call "simultaneous" a failure of the second drive within the time needed
to rebuild the array after the first drive fails. Let's take that window to
be as long as 3 days (my RAIDs rebuild much faster), and let's consider an
array of 6-year-old drives.

What percentage of drives die between their 6th and 7th year in service?
Let's take a really big number: 1% (I'm lucky; my drives last longer). So
the probability that a given drive dies during a particular 3-day window
near its 6th year of age is about 0.01 * 3 (days) / 365 (days), roughly ten
to the minus fourth power (0.0001). The probability of two drives dying
simultaneously (or rather, the second drive dying within 3 days of the
first), these being independent events, is just the product of the
individual probabilities, i.e. the single-drive probability squared: about
ten to the minus eighth power (0.00000001).
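If you want to check the arithmetic, here is a minimal back-of-envelope
sketch in Python. It simply restates the estimate above; the 1% annual
failure rate and the 3-day rebuild window are the illustrative assumptions
from the previous paragraphs, not measured numbers.

  # Back-of-envelope estimate: two drives dying within one rebuild window.
  # The inputs are the illustrative assumptions from the text, not data.
  annual_failure_rate = 0.01   # assumed: 1% of drives die in their 7th year
  rebuild_window_days = 3      # assumed: rebuild takes at most 3 days

  # Probability that one drive dies inside a particular 3-day window.
  p_single = annual_failure_rate * rebuild_window_days / 365

  # Treating the failures as independent, the chance that a second drive
  # dies within the rebuild window of the first is the product of the
  # single-drive probabilities, i.e. p_single squared.
  p_double = p_single ** 2

  print(f"one drive in a 3-day window:  {p_single:.1e}")   # ~8e-05
  print(f"two drives within the window: {p_double:.1e}")   # ~7e-09

Running it prints roughly 8e-05 and 7e-09, which matches the powers of ten
estimated above.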
Note that while the failure of a single drive, though not that likely, is
still something you do observe, the event of two drives dying
simultaneously (we estimated it as one dying within 3 days of the other) is
far less likely. So I would say that, provided the RAID is configured well,
it is quite unlikely you will have two drives die on you, whether they are
the same manufacturer/model/batch or something completely different.

The RAID horror stories in which several drives die "simultaneously" - and
I have heard them more than once, including from people it happened to -
turn out under closer scrutiny to be just poorly configured RAIDs. Namely,
several drives are already half dead and sit like that until the moment one
drive is actually kicked out of the array, triggering a rebuild, and thus
access to the whole surface of all drives, and the discovery of the other
drives that had been sitting "almost dead" undetected simply because RAID
verification was not run every so often. I find that even with a
full-surface verification scan as rare as once a month I am reasonably safe
from multiple drives dying and causing data loss. Somebody with better
knowledge of probability theory will correct me if I'm wrong somewhere.

This, of course, has nothing to do with occasional "small" data loss if no
measures beyond simply keeping the data on RAID are taken. What I mean is
that RAID has no truly reliable mechanism to discover which drive returns a
corrupted stripe on read (due to a bad block whose content the drive
firmware did not recover correctly). If the parity doesn't match in RAID-5,
there is no way to determine mathematically which stripe is corrupted (one
could suspect a particular drive if its response time is slower, as its
firmware might be busy trying to recover the data by multiple reads and
superimposition of the results). With RAID-6 the situation is better: by
excluding one drive at a time you can find a self-consistent group of N-1
drives, and thus know which drive's stripe needs to be corrected. But I'm
not sure RAID adapter firmware actually does this. Which leads us to the
conclusion: we need to take advantage of filesystems that ensure data
integrity (ZFS comes to mind).

Valeri

> That's why when I set up a mirror I always build it with drives from
> different companies. And I make it a three way mirror if I can.
> --
> Kevin P. Neal                                http://www.pobox.com/~kpn/
>
> "Good grief, I've just noticed I've typed in a rant. Sorry chaps!"
>                           Keir Finlow Bates, circa 1998

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++