FreeBSD Mail Archives

Date:      Sat, 27 Oct 2001 11:11:45 +0930
From:      Greg Lehey <grog@FreeBSD.org>
To:        Matthew Jacob <mjacob@feral.com>
Cc:        Doug Rabson <dfr@nlsystems.com>, Luigi Rizzo <rizzo@aciri.org>, John Baldwin <jhb@FreeBSD.org>, Jonathan Lemon <jlemon@FreeBSD.org>, cvs-all@FreeBSD.org, cvs-committers@FreeBSD.org
Subject:   Re: RAID-5 parity calculations (was: cvs commit: src/sys/dev/fxp if_fx)
Message-ID:  <20011027111145.A7846@wantadilla.lemis.com>
In-Reply-To: <Pine.BSF.4.21.0110261757260.12956-100000@beppo>; from mjacob@feral.com on Fri, Oct 26, 2001 at 05:59:29PM -0700
References:  <Pine.BSF.4.21.0110261753401.12956-100000@beppo> <Pine.BSF.4.21.0110261757260.12956-100000@beppo>

On Friday, 26 October 2001 at 17:59:29 -0700, Matt Jacob wrote:
> On Fri, 26 Oct 2001, Matthew Jacob wrote:
>> On Sat, 27 Oct 2001, Greg Lehey wrote:
>>> On Thursday, 25 October 2001 at 15:24:06 -0700, Matt Jacob wrote:
>>>>
>>>> And the fastest software RAID-V I've known was at NASA/Ames on the
>>>> Convex 3280s - they used the otherwise unused vector units for parity
>>>> calculations - this gave write performance for a 22 wide stripe on a
>>>> terabyte filesystem to be at about 88% of theoretical maximum, which
>>>> sure ain't bad.
>>>
>>> The parity calculations for RAID-5 are several orders of magnitude
>>> faster than the disk accesses.  Even on a 486, they took hardly any
>>> time.
>>
>> Sorry, that seems wrong to me.

Have you done measurements?

>> Typical RAID write performance for something like a Sun A1000,
>> which has a Pentium in it, is about 50% of theoretical.

That doesn't mean that's because of the parity calculations.

> I guess the real question is: 'can you get the parity calculations
> done in time so that the entire stripe can go out together'.

Why?

> This obviously doesn't really work for the first request unless you
> delay it. If you have a hugely deep queue, you will burn your
> central processor doing things that are not germane to regular
> systems work- you can't help but be assisted by a coprocessor doing
> that work (it's like bcopy h/w).

I don't see much similarity.  As you say, it's the steady state
performance that's important.  Basically, the resource you want to
optimize is disk bandwidth.  Even if you do the calculations
instantaneously, the data doesn't get written out until the disk has
time to do it, and in a heavy load situation that will mean queueing
behind other requests.

Instantaneously?  Well, how long does it take to checksum 6 kB (the
average request size)?  That's 1500 words, say 10000 instructions.  At
1 GHz, that's 10 µs, which is completely negligible compared to the
minimum four I/O transactions needed, which add up to about 25 ms.
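
The arithmetic above can be checked with a short sketch.  The figures
come from the message; the names are hypothetical, "6 kB" is read as
6000 bytes so that it matches the 1500 four-byte words, and one
instruction per clock cycle is an assumption.  The XOR helper shows
the usual read-modify-write parity update for a small RAID-5 write
(read old data, read old parity, write new data, write new parity),
which is one plausible reading of the "four I/O transactions"; the
message does not spell it out.

```python
import math

# Figures taken from the message; everything else is an assumption.
REQUEST_BYTES = 6000    # "6 kB" average request, read as 6000 bytes
WORD_BYTES = 4          # 32-bit words, implied by "1500 words"
INSTRUCTIONS = 10_000   # the message's round estimate for the checksum
CLOCK_HZ = 1e9          # 1 GHz
IO_SECONDS = 25e-3      # about 25 ms for the four I/O transactions


def parity_cpu_seconds(instructions: int, clock_hz: float) -> float:
    """CPU time, assuming one instruction completes per clock cycle."""
    return instructions / clock_hz


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Small-write parity update: P_new = P_old XOR D_old XOR D_new."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


words = REQUEST_BYTES // WORD_BYTES
cpu_seconds = parity_cpu_seconds(INSTRUCTIONS, CLOCK_HZ)
ratio = IO_SECONDS / cpu_seconds

print(f"{words} words, {cpu_seconds * 1e6:.0f} us of CPU, "
      f"I/O takes about {ratio:.0f} times longer")
```

Under these assumptions the I/O time exceeds the parity calculation by
a factor in the thousands, so the estimate is not sensitive to the
exact instruction count.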

Greg
--
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20011027111145.A7846>
