From: Dan Nelson
To: Andrea Venturoli
Cc: freebsd-questions@freebsd.org
Date: Sun, 13 Jul 2003 14:14:53 -0500
Subject: Re: vinum and hot-swapping

In the last episode (Jul 13), Andrea Venturoli said:
> ** Reply to note from "Greg 'groggy' Lehey" Sat, 12 Jul 2003 17:13:29 +0930
> > The real performance penalty for RAID-5 is simply that writes require
> > so much I/O.  Expect 25% of the write performance of RAID-0.
>
> Ok, I must ask this: Shouldn't SCSI system allow parallel writes on
> different disks? If so, why so much penalty?

Parallel I/Os are already being used.
A short write on a RAID-5 array requires you to

 1) Read the original block and the parity block (done in parallel)
 2) XOR the parity block with the original block and the new block
 3) Write the new block and the parity block (done in parallel)

Which means that you're doing 4 times the I/O (two reads plus two
writes) that a plain RAID-5 read would do.  There's no getting around
this problem for small random writes.  Repeated writes to the same
locations only cost two writes, since the original and parity blocks
are probably still in cache.

There is a threshold point where this stops being an issue, however.
When your write covers at least the full RAID-5 stripe width (stripe
size * number of data disks), you can simply calculate the parity block
directly from the new data and not have to read anything.  At this
point, RAID-5 magically becomes as efficient as RAID-0 :)

I don't believe vinum can optimize full-stripe writes, though, since
FreeBSD can only do I/O in 64k max chunks, and since vinum is software
instead of battery-backed hardware RAID, it cannot hold off on multiple
writes until the stripe fills up.  Most hardware RAIDs do parity-block
caching and long write optimizations.

-- 
	Dan Nelson
	dnelson@allantgroup.com
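
For anyone who wants to see the three steps above in concrete terms,
here is a minimal Python sketch of the parity arithmetic.  It is only an
illustration, not vinum's code: the function names, the 4-byte blocks,
the 3-data-disk layout and the 256k stripe size are all made up for the
example.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Step 2 of a short write: XOR the parity block with the original
    block and the new block."""
    return xor_blocks(old_parity, old_data, new_data)


def full_stripe_parity(data_blocks: Sequence[bytes]) -> bytes:
    """Full-stripe write: parity comes straight from the new data,
    so nothing has to be read back first."""
    return xor_blocks(*data_blocks)


def stripe_width(stripe_size: int, data_disks: int) -> int:
    """Stripe width = stripe size * number of data disks."""
    return stripe_size * data_disks


# Hypothetical stripe: 3 data blocks plus 1 parity block (tiny blocks for clarity).
stripe = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
parity = full_stripe_parity(stripe)

# Short write to the second block: read old data + old parity (2 reads),
# compute the new parity, write new data + new parity (2 writes).
new_block = bytes([42, 42, 42, 42])
updated_parity = small_write_parity(stripe[1], parity, new_block)
stripe[1] = new_block
SMALL_WRITE_IOS = 2 + 2

# The incrementally updated parity matches a from-scratch recalculation.
assert updated_parity == full_stripe_parity(stripe)

# Assumed 256k stripe size on 3 data disks: writes of this size or larger
# (and stripe-aligned) can skip the reads entirely.
FULL_STRIPE_BYTES = stripe_width(256 * 1024, 3)
```

The assert shows why the short-write path needs the two reads: without
the old data and old parity, the only other way to get the new parity
is to have every data block of the stripe in hand, which is exactly the
full-stripe case.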