Date:      Wed, 8 Mar 2000 15:31:14 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Ken <webweaver@rmci.net>
Cc:        Kevin Day <toasty@dragondata.com>, freebsd-isp@FreeBSD.ORG
Subject:   Re: Vinum Stripe Size
Message-ID:  <20000308153114.A52208@freebie.lemis.com>
In-Reply-To: <4.2.0.58.20000307155338.00974100@mail.rmci.net>
References:  <4.2.0.58.20000307152825.00956e20@mail.rmci.net> <200003072239.QAA93953@celery.dragondata.com> <4.2.0.58.20000307155338.00974100@mail.rmci.net>

On Tuesday,  7 March 2000 at 15:58:00 -0700, Ken wrote:
> At 04:39 PM 3/7/00 -0600, Kevin Day wrote:
>>>
>>> Greets:
>>>
>>> Vinum's simple config uses a 256 kB stripe size.  However, the examples
>>> in _The Complete FreeBSD_ and Greg's web pages use a 512 kB stripe size.
>>> It doesn't
>>> appear that this will make a whole lot of difference one way or the other,
>>> but I am curious about what others are using.  This application is for
>>> striped mirror on four 9 GB drives.
>>>
>>> Thanks for your output-- Ken
>>
>> Actually, I discovered that with 4 drives, you're much better off using an
>> odd stripe size (not a power of two).  This is because of how the cylinder
>> groups are laid out: they'll all end up on one drive.
>>
>> You may want to ask Greg Lehey (grog@lemis.com) for more info about this, as
>> I can't remember exactly what he came up with for an optimum stripe size.
>
> I figured he'd end up seeing this and I didn't want to bug him at
> his private address ;)  The docs suggest 256 kB to 512 kB.  I think I read
> somewhere to use 512 kB with larger drives, but cannot recall precisely
> where.  BTW, this is for a web hosting box.  I could probably get by
> with RAID-1, but figured I might as well go with RAID-10 since I
> have the drives to spare.

Indeed, Kevin is right.  At the FreeBSDCon he showed me some
interesting results running ufs on striped plexes with different
stripe sizes.

There are two basic issues here:

1.  We want to avoid generating more I/O requests than necessary.  This is
    described in more detail at
    http://www.lemis.com/vinum/Performance-issues.html.  Basically,
    the larger the stripe size, the fewer user-level requests get
    turned into multiple I/O requests.  The largest I/O request on a
    FreeBSD machine is currently 128 kB, but it's extremely unusual to
    see more than 64 kB, and the average is around 8 to 16 kB.  You can
    keep track of what Vinum is doing here with the 'lp -s' command
    (list statistics for all plexes):

vinum -> lp -s root.p2
Object            Reads         Bytes   Average Recover  Writes         Bytes   Average   Mblock  Mstripe
root.p2             345         4308480   12488       0    2124         8675328    4084      16       3

    The column 'Mblock' specifies the number of transfers that go
    beyond a single block in a stripe (i.e. they involve more than one
    disk).  The column 'Mstripe' specifies the number of transfers
    that go across a stripe boundary.  In practice, with a large
    stripe size, these two values have the same implications.

2.  You want to avoid putting all your superblocks on the same disk.
    Nowadays cylinder groups are almost always 32 MB, so any power of
    2 stripe size will give rise to this undesirable situation.  Kevin
    gave me some interesting input on the effect this has on
    performance, but it's quite a difficult problem.  I'm planning to
    provide a program to work out optimal stripe sizes, but it's not
    ready yet.  The principle is simple, though: spread the
    superblocks equally across all disks; a sketch of the idea is
    shown below.
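
    Here's a rough sketch of the idea, just an illustration with some
    assumed numbers rather than the program mentioned above: four
    disks and 32 MB cylinder groups as above, a typical request size
    of 16 kB, and 479 kB purely as an arbitrary example of a
    non-power-of-2 stripe size, not a recommendation:

/*
 * Sketch only, not the program mentioned above: evaluate a candidate
 * stripe size for an assumed 4-disk striped plex with 32 MB cylinder
 * groups.  It estimates how often a request of a typical size gets
 * split across two disks, and counts which subdisk the start of each
 * cylinder group falls on (ignoring where the superblock sits inside
 * the cylinder group).
 */
#include <stdio.h>

#define KB      1024LL
#define NDISKS  4                       /* four drives, as in the question */
#define CGSIZE  (32 * 1024 * KB)        /* cylinder group size, ~32 MB */
#define NCG     16                      /* cylinder groups to examine */
#define REQSIZE (16 * KB)               /* assumed typical request size */

static void
evaluate(long long stripe)
{
        int count[NDISKS] = { 0 };
        int cg, d;

        /*
         * A request of REQSIZE bytes at a random offset crosses a
         * stripe boundary, and so needs transfers on two disks, with
         * probability (REQSIZE - 1) / stripe.
         */
        printf("stripe %4lld kB: %5.2f%% of %lld kB requests split,",
            stripe / KB, 100.0 * (REQSIZE - 1) / stripe, REQSIZE / KB);

        /* Count which subdisk each cylinder group start maps to. */
        for (cg = 0; cg < NCG; cg++)
                count[(cg * CGSIZE / stripe) % NDISKS]++;
        printf(" cg starts per disk:");
        for (d = 0; d < NDISKS; d++)
                printf(" %d", count[d]);
        printf("\n");
}

int
main(void)
{
        evaluate(256 * KB);     /* power of 2: every cg start on one disk */
        evaluate(512 * KB);     /* also a power of 2: same problem */
        evaluate(479 * KB);     /* arbitrary odd size: starts spread out */
        return 0;
}

    With the power-of-2 sizes every cylinder group start lands on the
    same subdisk; with the odd size they get spread around, and the
    larger stripe sizes also split a smaller proportion of requests
    across two disks.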

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

