From: "Poul-Henning Kamp"
To: "Greg 'groggy' Lehey"
cc: Scott Long
cc: src-committers@FreeBSD.org
cc: Pawel Jakub Dawidek
cc: cvs-src@FreeBSD.org
cc: John-Mark Gurney
cc: cvs-all@FreeBSD.org
cc: Wilko Bulte
Subject: Re: RAID-3?
In-Reply-To: Your message of "Thu, 19 Aug 2004 16:37:16 +0930." <20040819070716.GS85432@wantadilla.lemis.com>
Date: Thu, 19 Aug 2004 09:16:08 +0200
Message-ID: <12437.1092899768@critter.freebsd.dk>
List-Id: CVS commit messages for the src tree

In message <20040819070716.GS85432@wantadilla.lemis.com>, "Greg 'groggy' Lehey" writes:

>> Every write takes exactly the same amount of time.
>
>Which, including aggregate seek time, is longer than for RAID-5,
>because more disks are involved.

RAID3 is within epsilon of the single disk because all the disks
work in unison.  (Spindle-sync is a good idea btw).

>> There is no waiting for data to be read off of any disks.
>
>Sure there is.  There's always waiting for data to be read off disks.
>That's part of the way disks are built.  You've got to seek first,
>then you've got to get the head over the data.  That's why I said that
>RAID-3 is only useful for sequential transfers.

You're wrong.  RAID-3 is good for normal usage.

The point is that you don't have to do the complicated "I have disk1
in my cache but that is not the parity and not the one I'm writing
so I need to read 2,3 and the parity which is 4 and then write my
data to disk 5 and calculate and update the parity on 4" dance.

RAID3 works by:

	A write-request:
		first 1/4 goes to disk1
		second 1/4 goes to disk2
		third 1/4 goes to disk3
		fourth 1/4 goes to disk4
		Calculate parity send to disk 5

	A read-request:
		read 1st 1/4 from disk1
		read 2nd 1/4 from disk2
		read 3rd 1/4 from disk3
		read 4th 1/4 from disk4
		read parity from disk5 and check

And that is _all_ there is to it.

>Note, of course, that RAID-5 is relatively good on reading.  The big
>disadvantage of RAID-5 is when you write.

RAID3 doesn't suffer and is very predictable in either mode.

>Of course it has.  Once you spread your data out over more than one
>disk, you need some kind of mapping.  What we're talking about here
>appears to be an implicit one sector stripe size, though the original
>paper talked of a stripe size of one byte.

Forget the original paper.  Original papers are full of things
people have not found out yet.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.