Date: Wed, 6 Aug 2008 02:22:26 -0700
From: Jeremy Chadwick <jdc@parodius.com>
To: Matt Simerson
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS Advice
Message-ID: <20080806092226.GA49691@eos.sc1.parodius.com>
In-Reply-To: <46EF69A7-349C-4E15-928C-D305E67273CA@spry.com>

On Wed, Aug 06, 2008 at 12:32:43AM -0700, Matt Simerson wrote:
> Then I tested ZFS with RAIDZ in various configs (raidz, raidz2, 4, 6,
> and 8 disk arrays) on FreeBSD. When using raidz and FreeBSD, the
> difference in performance of the controllers is much smaller. It's bad
> with the Areca controller and worse with the Marvell. My overall
> impression is that ZFS performance under FreeBSD is poor.

Performance has been significantly improved by a patch from pjd@ posted
about a week ago. I have not tested it personally, but there have been a
couple of reports so far that the improvement is substantial. I recommend
you read the *entire thread* -- and yes, it is very long. The subject is
"ZFS patches".

http://lists.freebsd.org/pipermail/freebsd-fs/2008-July/thread.html

It continues into August; some people are using mail clients that don't
properly utilise mail reference IDs, so their replies are scattered.
Again, look for "ZFS patches".

http://lists.freebsd.org/pipermail/freebsd-fs/2008-August/thread.html

> I say this because I also tested one of the systems with OpenSolaris on
> the Marvell card (OpenSolaris doesn't support the Areca). Read
> performance with ZFS and RAIDZ was not just 2-3x but 10-12x faster on
> Solaris. OpenSolaris write performance was about 50% faster than
> FreeBSD on the Areca controller and 100% faster than FreeBSD on the
> Marvell.
>
> The only way I could get decent performance out of FreeBSD and ZFS was
> to use the Areca as a RAID controller and then have ZFS stripe the data
> across the two RAID arrays. I haven't tried it, but I'm willing to bet
> that if I used UFS and geom_stripe to do the same thing, I'd get better
> performance with UFS. If you are looking for performance, then raidz
> and ZFS is not where you want to be looking.

Do you have any actual numbers showing the performance differential? If
so, how did you obtain them under FreeBSD? Also, what tuning parameters
did you use on FreeBSD (specifically the kernel/loader.conf settings)?
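For comparison, the sort of loader.conf tunables that come up in these
discussions looks roughly like the below. Treat the values as
illustrative only -- they depend heavily on how much RAM is installed
(these assume something on the order of a 2GB amd64 box), so adjust to
taste:

  # /boot/loader.conf -- example ZFS tuning, values are illustrative
  # Enlarge the kernel memory map; the stock 7.x sizing is tight for ZFS
  vm.kmem_size="1024M"
  vm.kmem_size_max="1024M"
  # Cap the ARC so it stays well inside kmem_size
  vfs.zfs.arc_max="512M"
  # Disable file-level prefetch; see the discussion of hangs further down
  vfs.zfs.prefetch_disable="1"

The main point is keeping arc_max comfortably below kmem_size; the exact
numbers matter less than that relationship.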
It seems that "zpool iostat" is only useful if one wishes to see the
amount of I/O that actually hits the physical disks in the pool; if the
data is cached in memory, "zpool iostat" won't show any I/O. I can get
performance data from gstat(8), but that won't tell me how ZFS itself is
performing, only the rate at which the kernel reads/writes data from the
physical disks.

On my system (a 3-disk raidz pool of WDC WD5000AAKS disks, SATA300, on an
Intel ICH7 controller), I can get about 70MB/sec from each disk when
reading, and somewhere around 55-60MB/sec when writing. But again, that
is raw disk I/O and isn't a test of ZFS performance.

> As far as workload with prefetch: under my workloads (heavy network &
> file system I/O), prefetch = almost instant crash and burn. As soon as
> I put any heavy load on it, it hangs (as I've described previously on
> this list).

I assume by "hangs" you mean the system becomes unresponsive while disk
reads/writes are being performed, then recovers, then stalls again,
recovers, lather, rinse, repeat? If so -- yes, that's exactly the
behaviour others and I have reported. Disabling prefetch makes the system
much more usable during heavy I/O.

> 3ware controllers = cheap, and you get what you pay for. At my last job
> we had thousands of 3ware cards deployed because they were so
> inexpensive, and RAID = RAID, right? Well, they were the controllers
> most likely to result in catastrophic data loss for our clients. Maybe
> it's because the interface confuses the NOC technicians, maybe it's
> because their recovery tools suck, or because when the controller fails
> it hoses the disks in interesting ways. For various reasons, our luck
> at recovering failed RAID arrays on 3ware cards was poor.

This story is pretty much the norm. Once in a while you'll find someone
praising 3ware controllers, but I often wonder what kind of workload and
failure testing they've done prior to that praise. A friend of mine at
Rackable told me horror stories involving 3ware controllers. I'm thankful
3ware cares about FreeBSD (most vendors do not), but given their history
of firmware/BIOS bugs and the sensitive nature of their cards, I choose
to stay away from them. I've heard nothing but praise when it comes to
Areca controllers.

All that said -- have you actually performed a hard failure test with an
Areca controller on FreeBSD (using both UFS and ZFS)? Assuming you have
hot-swap enclosures/carriers, what happens if you yank a disk on the
Areca controller? How does FreeBSD behave in that case? (A rough sketch
of what I'd look at is in the P.S. below.)

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
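P.S. If you do get around to yanking a disk, the kind of thing I'd be
looking at is sketched below. The commands are just stock ZFS tooling;
"tank" and the da3/da4 device names are placeholders for whatever your
pool and disks are actually called:

  # what does ZFS think happened to the pool?
  zpool status -v tank
  # what did the kernel/driver log when the disk went away?
  dmesg | tail -n 30
  # after re-seating the disk (or swapping in a new one):
  zpool online tank da3         # re-attach the same disk, or
  zpool replace tank da3 da4    # replace it with a new device
  zpool status -v tank          # watch the resilver progress

Whether the Areca firmware presents the surviving disks cleanly to the
driver through all of that is exactly the part I'd be curious about.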