Date:      Wed, 3 Jan 2001 10:36:21 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Josef Karthauser <joe@tao.org.uk>, Matraquilla@cs.com, Roman Shterenzon <roman@harmonic.co.il>, freebsd-stable@FreeBSD.ORG
Subject:   RAID-5 reliability (was: vinum malfunction!)
Message-ID:  <20010103103621.G40453@wantadilla.lemis.com>
In-Reply-To: <20010102140616.B1391@tao.org.uk>; from joe@tao.org.uk on Tue, Jan 02, 2001 at 02:06:16PM +0000
References:  <d6.7493e7.278332c6@cs.com> <20010102140616.B1391@tao.org.uk>

On Tuesday,  2 January 2001 at 14:06:16 +0000, Josef Karthauser wrote:
>
> The problem with vinum RAID5 in -stable is that in my experience
> there are some nasty bugs in it, 

Well, one nasty bug.

> and I don't believe that Greg has managed to reproduce these himself
> and so is a bit confused as to what is causing the trouble.

Well, I don't know if I'd use the word "confused".  More "uninformed".

> I'm worried that he may be inclined to believe that it is only a
> small minority of people who are having problems with this.  I would
> challenge this view because I don't know _anyone_ who is
> successfully using RAID5 under vinum.  They've all migrated away to
> using dedicated hardware and thus solved their vinum problems that
> way.

Yes, I do believe it's a small minority of people.  I have had exactly
four reports of this problem, yours included.  In none of the cases
have I been able to get enough information to reproduce it.  On the
other hand, I know plenty of people who are successfully using RAID-5.

> I don't know what it's going to take for Greg to get enough
> information to fix the problem.  I spent a week trying to extract a
> set of debug information for him that was useful enough that he
> could work from it, but it seems that my week was just wasted
> because it looks like I didn't capture the bug that he was expecting
> :(.

Well, it's not wasted, but it didn't help enough.  As I said, it's
elusive.

If anybody else out there has experienced the following problem,
please contact me:

  The system runs fine most of the time, but under heavy load it dies
  with a trap 12 (page fault in kernel mode).  The dump shows the
  system is trying to call the specific iodone routine from biodone.
  More careful analysis shows that the buffer header in question has
  had some fields zeroed out.

> To summarise.  The idea of software raid 5 is great, but it's got to
> work otherwise it is dangerous.

Like anything else.  Note that a large number of these comments could
apply equally well to soft updates: it worked fine for most people,
but a small minority of people have had trouble with it.

> I note that the BUGS section of the manual page doesn't exist.
> There should be a warning to potential users that there are known
> problems that can cause the data to become corrupted in some
> configurations.

It looks as if you have missed it:

BUGS
     1.   vinum is a new product.  Bugs can be expected.  The configuration
          mechanism is not yet fully functional.  If you have difficulties,
          please look at the section DEBUGGING PROBLEMS WITH VINUM before
          reporting problems.

     2.   Kernels with the vinum pseudo-device appear to work, but are not
          supported.  If you have trouble with this configuration, please
          first replace the kernel with a non-Vinum kernel and test with the
          kld module.

     3.   Detection of differences between the version of the kernel and the
          kld is not yet implemented.

     4.   The RAID-5 functionality is new in FreeBSD 3.3.  Some problems have
          been reported with vinum in combination with soft updates, but these
          are not reproducible on all systems.  If you are planning to use
          vinum in a production environment, please test carefully.

Looking at this, I suppose it should be updated.  But the section
definitely exists.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message