To: "Greg 'groggy' Lehey" <grog@FreeBSD.org>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Thu, 19 Aug 2004 16:37:16 +0930."
             <20040819070716.GS85432@wantadilla.lemis.com> 
Date: Thu, 19 Aug 2004 09:16:08 +0200
Message-ID: <12437.1092899768@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
cc: Scott Long <scottl@samsco.org>
cc: src-committers@FreeBSD.org
cc: Pawel Jakub Dawidek <pjd@FreeBSD.org>
cc: cvs-src@FreeBSD.org
cc: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
cc: cvs-all@FreeBSD.org
cc: Wilko Bulte <wb@freebie.xs4all.nl>
Subject: Re: RAID-3? 

In message <20040819070716.GS85432@wantadilla.lemis.com>, "Greg 'groggy' Lehey"
 writes:

>> Every write takes exactly the same amount of time.
>
>Which, including aggregate seek time, is longer than for RAID-5,
>because more disks are involved.

RAID-3 write time is within epsilon of a single disk's, because all
the disks work in unison.  (Spindle sync is a good idea, btw.)

>> There is no waiting for data to be read off of any disks.
>
>Sure there is.  There's always waiting for data to be read off disks.
>That's part of the way disks are built.  You've got to seek first,
>then you've got to get the head over the data.  That's why I said that
>RAID-3 is only useful for sequential transfers.

You're wrong.  RAID-3 is good for normal usage.  The point is that
you don't have to do the complicated "I have disk1 in my cache but
that is not the parity and not the one I'm writing so I need to
read 2,3 and the parity which is 4 and then write my data to disk
5 and calculate and update the parity on 4" dance.
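
For illustration only, here is a rough Python sketch of one way a
RAID-5 partial-stripe write can go (recomputing parity from the other
data blocks in the stripe).  The function names, the `cached` argument
and the I/O counting are my own assumptions, not any real
implementation:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def raid5_small_write(stripe, parity_idx, target_idx, new_data, cached=()):
    """Update one data block of a stripe (hypothetical helper).

    stripe     -- list of byte strings, one block per disk
    parity_idx -- index of the block holding parity for this stripe
    target_idx -- index of the data block being overwritten
    cached     -- indices of blocks already in memory (no read needed)

    Returns (new_stripe, disk_reads, disk_writes).
    """
    # Every data block except the target is needed to recompute parity.
    others = [i for i in range(len(stripe))
              if i not in (parity_idx, target_idx)]
    reads = sum(1 for i in others if i not in cached)
    new_parity = xor_blocks([new_data] + [stripe[i] for i in others])
    new_stripe = list(stripe)
    new_stripe[target_idx] = new_data
    new_stripe[parity_idx] = new_parity
    # Two writes: the new data block and the updated parity block.
    return new_stripe, reads, 2
```

The point is only that the number of reads depends on what happens to
be cached, so the cost of a write varies from one request to the next.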

RAID3 works by:

    A write-request:
	first 1/4 goes to disk1
	second 1/4 goes to disk2
	third 1/4 goes to disk3
	fourth 1/4 goes to disk4
	parity is calculated and sent to disk5

    A read-request:
	read 1st 1/4 from disk1
	read 2nd 1/4 from disk2
	read 3rd 1/4 from disk3
	read 4th 1/4 from disk4
	read parity from disk5 and check

And that is _all_ there is to it.
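
As a minimal sketch of the two lists above, assuming four data disks
plus one parity disk, XOR parity, and requests whose length is a
multiple of four (the names raid3_write and raid3_read are made up for
the example):

```python
from functools import reduce

NDATA = 4  # data disks in the example; a fifth disk holds parity

def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def raid3_write(request):
    """Split a request into NDATA equal pieces plus one parity piece.

    Returns a list of NDATA + 1 byte strings, one per disk, with the
    parity piece last.
    """
    if len(request) % NDATA:
        raise ValueError("request length must be a multiple of NDATA")
    n = len(request) // NDATA
    pieces = [request[i * n:(i + 1) * n] for i in range(NDATA)]
    return pieces + [xor_blocks(pieces)]

def raid3_read(disks):
    """Read every piece, check it against parity, return the request."""
    *pieces, parity = disks
    if xor_blocks(pieces) != parity:
        raise IOError("parity mismatch")
    return b"".join(pieces)

disks = raid3_write(b"ABCDEFGH")
assert raid3_read(disks) == b"ABCDEFGH"
```

Every request touches every disk in the same way, which is why the
cost does not depend on cache contents or on where in the stripe the
data falls.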

>Note, of course, that RAID-5 is relatively good on reading.  The big
>disadvantage of RAID-5 is when you write.

RAID-3 doesn't suffer on writes and is very predictable in either mode.

>Of course it has.  Once you spread your data out over more than one
>disk, you need some kind of mapping.  What we're talking about here
>appears to be an implicit one sector stripe size, though the original
>paper talked of a stripe size of one byte.

Forget the original paper.  Original papers are full of things
people have not found out yet.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.