From owner-freebsd-stable Tue Jan  2 16:06:30 2001
Date: Wed, 3 Jan 2001 10:36:21 +1030
From: Greg Lehey
To: Josef Karthauser, Matraquilla@cs.com, Roman Shterenzon,
    freebsd-stable@FreeBSD.ORG
Subject: RAID-5 reliability (was: vinum malfunction!)
Message-ID: <20010103103621.G40453@wantadilla.lemis.com>
References: <20010102140616.B1391@tao.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010102140616.B1391@tao.org.uk>; from joe@tao.org.uk on Tue, Jan 02, 2001 at 02:06:16PM +0000
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tuesday, 2 January 2001 at 14:06:16 +0000, Josef Karthauser wrote:
>
> The problem with vinum RAID5 in -stable is that in my experience
> there are some nasty bugs in it,

Well, one nasty bug.

> and I don't believe that Greg has managed to reproduce these himself
> and so is a bit confused as to what is causing the trouble.

Well, I don't know if I'd use the word "confused". More "uninformed".

> I'm worried that he may be inclined to believe that it is only a
> small minority of people who are having problems with this. I would
> challenge this view because I don't know _anyone_ who is
> successfully using RAID5 under vinum.
> They've all migrated away to
> using dedicated hardware and thus solved their vinum problems that
> way.

Yes, I do believe it's a small minority of people. I have had exactly
four reports of this problem, yours included. In none of the cases
have I been able to get enough information to reproduce it. On the
other hand, I know plenty of people who are successfully using RAID-5.

> I don't know what it's going to take for Greg to get enough
> information to fix the problem. I spent a week trying to extract a
> set of debug information for him that was useful enough that he
> could work from it, but it seems that my week was just wasted
> because it looks like I didn't capture the bug that he was expecting
> :(.

Well, it's not wasted, but it didn't help enough. As I said, it's
elusive. If anybody else out there has experienced the following
problem, please contact me:

  The system runs fine most of the time, but under heavy load it dies
  with a trap 12 (page fault in kernel mode). The dump shows the
  system is trying to call the specific iodone routine from biodone.
  More careful analysis shows that the buffer header in question has
  had some fields zeroed out.

> To summarise. The idea of software raid 5 is great, but it's got to
> work otherwise it is dangerous.

Like anything else. Note that a large number of these comments could
apply equally well to soft updates: it worked fine for most people,
but a small minority of people have had trouble with it.

> I note that the BUGS section of the manual page doesn't exist.
> There should be a warning to potential users that there are known
> problems that can cause the data to become corrupted in some
> configurations.

It looks as if you have missed it:

BUGS
     1. vinum is a new product. Bugs can be expected. The configuration
        mechanism is not yet fully functional. If you have difficulties,
        please look at the section DEBUGGING PROBLEMS WITH VINUM before
        reporting problems.

     2. Kernels with the vinum pseudo-device appear to work, but are not
        supported. If you have trouble with this configuration, please
        first replace the kernel with a non-Vinum kernel and test with
        the kld module.

     3. Detection of differences between the version of the kernel and
        the kld is not yet implemented.

     4. The RAID-5 functionality is new in FreeBSD 3.3. Some problems
        have been reported with vinum in combination with soft updates,
        but these are not reproducible on all systems. If you are
        planning to use vinum in a production environment, please test
        carefully.

Looking at this, I suppose it should be updated. But the section
definitely exists.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message