Date:      Fri, 31 Mar 2000 05:50:33 +0200
From:      "Niels Chr. Bank-Pedersen" <ncbp@bank-pedersen.dk>
To:        Greg Lehey <grog@lemis.com>
Cc:        freebsd-questions@FreeBSD.ORG, Jesper Skriver <jesper@skriver.dk>
Subject:   Re: Can one do a striped volume with mirrored plexes
Message-ID:  <20000331055033.D50664@bank-pedersen.dk>
In-Reply-To: <20000331130455.F6764@mojave.worldwide.lemis.com>; from grog@lemis.com on Fri, Mar 31, 2000 at 01:04:55PM +1000
References:  <20000330165535.C48576@bank-pedersen.dk> <20000331111613.A6178@mojave.worldwide.lemis.com> <20000331045100.A50664@bank-pedersen.dk> <20000331130455.F6764@mojave.worldwide.lemis.com>

On Fri, Mar 31, 2000 at 01:04:55PM +1000, Greg Lehey wrote:
> On Friday, 31 March 2000 at  4:51:01 +0200, Niels Chr. Bank-Pedersen wrote:
> > On Fri, Mar 31, 2000 at 11:16:13AM +1000, Greg Lehey wrote:
> >> On Thursday, 30 March 2000 at 16:55:35 +0200, Niels Chr. Bank-Pedersen wrote:
> >>> Hi,
> >>>
> >>> I have 3 Kingston DS-500 chassis connected to 3 Tekram controllers,
> >>> and I want to create one volume/filesystem that must be resilient to
> >                          ^^^
> >>> disk failure, chassis failure and controller failure.
> >>>
> >>> In vinum(8) it says that I can create a mirrored volume with two
> >>> striped plexes, but that would only give me resilience against disk
> >>> failure, so what I was thinking of was something like this:
> >>>
> >>> Chassis		disks		controller
> >>>
> >>> A	0 1 3 4 6 7 9 a x	I
> >>> B	0 2 3 5 6 8 9 b x	II
> >>> C	1 2 4 5 7 8 a b x	III
> >>>
> >>>  - where each number, 0-b (x being spare disks), represents a small
> >>> mirror of 2 disks, and then all mirrors should be striped together
> >>> to form one volume.
> >>>
> >>> With the above setup I believe I will have resilience
> >>> against all "single-unit failures" - be it disk, chassis or
> >>> controller, but I don't see any way to configure this with
> >>> vinum.
> >>
> >> Hmm.  If I understand what you want here, this would do it:
> >>
> >> drive 0A device /dev/chassisA/drive0
> >> <snip>
> >> volume 0
> >>   plex org striped 512k
> >>     sd len 0b drive 0A
> >>     sd len 0b drive 0B
> >> volume 1
> >>   plex org striped 512k
> >>     sd len 0b drive 0C
> >>     sd len 0b drive 1A
> >>
> >> (etc).
> >
> > I'm not sure I get this (or else I wasn't being clear about what
> > I want in the first place): Wouldn't this give me 12 separate
> > volumes => 12 mountpoints (with the last 3 drives excluded)?
> 
> Yes.
> 
> >  - And if I omit all but the first volume statement like:
> >
> >  volume 0
> >    plex org striped 512k
> >      sd len 0b drive 0A
> >      sd len 0b drive 0B
> >    plex org striped 512k
> >      sd len 0b drive 0C
> >      sd len 0b drive 1A
> >  (etc).
> >
> > then it is my understanding that I would get one volume mirrored
> > on 12 plexes (8, I believe, is the maximum number allowed).
> 
> Correct, but you don't want more than 2 plexes.
> 
> > Hmm, probably me missing something obvious here.
> 
> Or I am.  I understood that you wanted 12 separate volumes.  Ah, yes,
> you did.  But you also want a single volume, and you want to make it
> out of the volumes you first thought of.  Why do you want to do
> that?

I didn't specifically want to call the first instance "volume", but
yes, I wanted one volume striped across 12 fully resilient "entities"
because I thought that was the only way to achieve the best of both
worlds given the hardware at hand.

> The only real issue you have here is that you need to ensure that the
> failure of any one component doesn't cause an overlapping hole in both
> plexes.  You should be able to do this by taking the first chassis and
> the first half of the second chassis for the first plex, and the
> second half of the second chassis and the third chassis for the second
> plex.  Then you have:
> 
>  - failure of first controller or chassis: the first plex loses the
>    first two thirds of the address space.  The second plex covers the
>    entire address space, so there's no failure.
> 
>  - failure of third controller or chassis: the second plex loses the
>    last two thirds of the address space.  The first plex covers the
>    entire address space, so there's no failure.
> 
>  - failure of the second controller: first plex loses last third of
>    the address space.  The second plex loses the first third of the
>    address space.  We still have complete coverage, since the losses
>    don't overlap.

This is what I was missing - that vinum is intelligent enough (well,
of course) to survive losing subdisks in both plexes as long as the
full address space can be found *anywhere* on the remaining subdisks.

In my case this will require that the first part of the first plex is
mirrored in the first part of the second plex, but that can be
controlled by the layout in the configuration file.
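
For the two-plex split Greg describes above, a minimal sketch of such a
configuration might look something like this (the drive names and device
paths are hypothetical, and "len 0b" simply follows the form of the
earlier example):

 drive a0 device /dev/chassisA/drive0
 <snip>
 volume data
   plex org striped 512k
     sd len 0b drive a0
     sd len 0b drive a1
     <snip - the rest of chassis A and the first half of chassis B>
     sd len 0b drive b3
   plex org striped 512k
     sd len 0b drive b4
     <snip - the rest of chassis B and all of chassis C>
     sd len 0b drive c7

With this layout a failure of chassis A (or controller I) only punches a
hole in the first plex, a failure of chassis C (or controller III) only in
the second, and a failure of chassis B (or controller II) punches
non-overlapping holes in both, so the complete address space always
remains covered.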

My understanding of:

 "When laying out a mirrored volume, it is important to ensure that
  the subdisks of each plex are on different drives, so that a drive
  failure will not take down both plexes."

was that vinum wouldn't survive losing subdisks in both plexes.
Now I take it that the above only refers to situations where a drive
failure takes down two or more subdisks covering the same address
space, one from each plex.
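
For contrast, the situation the vinum(8) warning is aimed at would be a
layout where the same drive backs subdisks in both plexes, e.g. (drive
definitions omitted, names hypothetical):

 volume bad
   plex org striped 512k
     sd len 0b drive d0
     sd len 0b drive d1
   plex org striped 512k
     sd len 0b drive d0
     sd len 0b drive d2

If d0 dies, both plexes lose the stripes held by their first subdisk,
i.e. the same address ranges, and no combination of the surviving
subdisks covers the whole volume.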

> Greg

Thanks.

> --
> When replying to this message, please copy the original recipients.
> For more information, see http://www.lemis.com/questions.html
> Finger grog@lemis.com for PGP public key
> See complete headers for address and phone numbers


/Niels Chr.

-- 
 Niels Christian Bank-Pedersen, NCB1-RIPE.
 Network Manager, Tele Danmark NET, IP-section.

 "Hey, are any of you guys out there actually *using* RFC 2549?"





