From owner-freebsd-scsi  Tue Feb  1 17:29:29 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from caspian.plutotech.com (caspian.plutotech.com [206.168.67.80])
	by builder.freebsd.org (Postfix) with ESMTP
	id ECC3E3FD2; Tue,  1 Feb 2000 17:29:23 -0800 (PST)
Received: from caspian.plutotech.com (localhost [127.0.0.1])
	by caspian.plutotech.com (8.9.3/8.9.1) with ESMTP id SAA00438;
	Tue, 1 Feb 2000 18:29:30 -0700 (MST)
	(envelope-from gibbs@caspian.plutotech.com)
Message-Id: <200002020129.SAA00438@caspian.plutotech.com>
X-Mailer: exmh version 2.1.0 09/18/1999
To: Greg Lehey <grog@lemis.com>
Cc: "Justin T. Gibbs" <gibbs@FreeBSD.org>,
	Gary Palmer <gjp@in-addr.com>, scsi@FreeBSD.org, up@3.am,
	Wilko Bulte <wilko@yedi.iaf.nl>
Subject: Re: hardware vs software stripping 
In-reply-to: Your message of "Wed, 02 Feb 2000 11:27:55 +1030."
             <20000202112755.L55303@freebie.lemis.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Tue, 01 Feb 2000 18:29:30 -0700
From: "Justin T. Gibbs" <gibbs@FreeBSD.org>
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

[summary of RAID types]

>Is this all they say about it?

That's from the summary section.

>It begs the question why RAID-3 must
>access all members of the disk at a time.  The only reason I can think
>of is that the data is interleaved in such a manner that you can't get
>*any* useful data without reading them all.  This rather agrees with
>the idea that the data is spread in units of less than a sector.  It
>also doesn't say why RAID-4 is less suitable for large file
>transfers.

The point is that the complexity of RAID 4 buys little if all you
want to do is write large files.

>My understanding is that RAID-3, effectively striping at a sub-sector
>level, can give much higher data rates without buffering, and that's
>its raison d'être.

If you stripe at the sub-sector level, you must perform RMW.  This makes
absolutely no sense.

>>>> In RAID4, it is supposed to be a multiple of your transaction size
>>>
>>> Where do you get the term "transaction" from?  I haven't seen it in
>>
>> From the dictionary?  8-)
>>
>> The point is that your system is such that you may be able to
>> satisfy a request by only reading one component of the stripe.
>
>That's one point.  My point is that a transaction may be of various
>sizes, whereas the stripe has a fixed size.

If your transaction is larger, perhaps you satisfy it by modifying 1 or
more full stripes and only partially modifying the border stripes.
The point is still the same.
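
A rough sketch of that split, in Python.  The function name, the
byte-offset interface, and the 4 x 64k stripe in the usage line are
assumptions made for illustration; none of them come from a real
RAID implementation.

```python
KB = 1024

def split_transaction(offset, length, stripe_size):
    """Split a byte range into (head, full_stripes, tail).

    head and tail are the byte counts that land in partially covered
    border stripes; full_stripes is the number of stripes covered
    completely.  All names here are hypothetical.
    """
    end = offset + length
    first_full = -(-offset // stripe_size) * stripe_size  # offset rounded up
    last_full = (end // stripe_size) * stripe_size        # end rounded down
    if first_full > last_full:
        # The whole range sits inside a single stripe.
        return (length, 0, 0)
    head = first_full - offset
    full = (last_full - first_full) // stripe_size
    tail = end - last_full
    return (head, full, tail)

# Assumed layout: 4 data disks x 64k strips = 256k stripe.
print(split_transaction(64 * KB, 1024 * KB, 256 * KB))  # (196608, 3, 65536)
```

Only the head and tail pieces would need the partial-stripe path; the
three full stripes in the middle can be handled whole.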

>>> any RAID documentation.  In ufs, there is no fixed size.
>>
>> Sure there is, the block size (i.e. 8k.)
>
>ufs has a block size, sure, but the transfers are very seldom equal to
>the block size.

Let's say that you do 64k "strips" on each drive.  To satisfy an 8k
transaction, you only need to touch one drive (and the parity disk on a
write).  To satisfy a 128k transaction, you touch at most 4 (3 if your
transaction is aligned).  You don't need to touch all N.  That is the
difference.
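
The arithmetic can be sketched in Python as follows.  The function
name, the n_data parameter and its default of 8 are hypothetical, and
counting the parity disk as one extra drive on a write is an assumed
reading; it is one reading that reproduces the 4 and 3 figures for
the 128k case.

```python
KB = 1024

def drives_touched(offset, length, strip_size, n_data=8, write=False):
    """Count drives touched by one transaction on a RAID-4 style layout.

    Assumes consecutive strips live on consecutive data drives and that
    a write also touches the single parity drive.  Hypothetical helper,
    not taken from any real driver.
    """
    first = offset // strip_size
    last = (offset + length - 1) // strip_size
    data = min(last - first + 1, n_data)  # cannot exceed the data drives
    return data + (1 if write else 0)

print(drives_touched(0, 8 * KB, 64 * KB))                      # 1
print(drives_touched(0, 128 * KB, 64 * KB, write=True))        # 3
print(drives_touched(8 * KB, 128 * KB, 64 * KB, write=True))   # 4
```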

>>> I'd call both of these RAID-4, considering that RAID doesn't use the
>>> term "transaction".
>>
>> Sure it does.
>
>Is this in The Book as well?  How is it defined?

The same way it is defined in the dictionary.  The way you determine
which RAID type is appropriate for you is by looking at the number of
disks you have, the efficient disk strip size, as well as the transaction
type and size of your application.  That's what this is all about.

>> In RAID-3, your transaction size *is* the stripe size.  In RAID-4,
>> it may be less than the stripe size.
>
>So what is it in the Pluto implementation that stops you from
>reading only part of a RAID-3 stripe?

We could read part of a RAID-3 stripe if we decided the software
complexity warranted it.  In our application, it makes more sense
to read the entire stripe and cache it rather than read individual
chunks.
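
A minimal sketch of the "read the entire stripe and cache it"
approach, in Python.  This is not the Pluto implementation: the class,
the read_stripe callback, and the unbounded dictionary cache are all
assumptions made to keep the example short.

```python
class StripeCache:
    """Serve byte-range reads by fetching and caching whole stripes."""

    def __init__(self, read_stripe, stripe_size):
        # read_stripe is a hypothetical callback: stripe index -> bytes
        self.read_stripe = read_stripe
        self.stripe_size = stripe_size
        self.cache = {}  # no eviction, for brevity

    def read(self, offset, length):
        out = bytearray()
        end = offset + length
        while offset < end:
            idx, start = divmod(offset, self.stripe_size)
            if idx not in self.cache:
                # Always fetch the full stripe, never a partial chunk.
                self.cache[idx] = self.read_stripe(idx)
            n = min(end - offset, self.stripe_size - start)
            out += self.cache[idx][start:start + n]
            offset += n
        return bytes(out)
```

A small read costs one full-stripe fetch, and later reads that fall in
the same stripe are served from memory with no further disk access.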

--
Justin

