Date: Sun, 11 Jul 2010 11:25:12 -0700
From: Richard Lee
To: freebsd-stable@freebsd.org
Subject: Serious zfs slowdown when mixed with another file system (ufs/msdosfs/etc.).
Message-ID: <20100711182511.GA21063@soda.CSUA.Berkeley.EDU>

This is on a clean FreeBSD 8.1 RC2 install, amd64, with 4GB of memory.

The closest thing I found by Googling was this:

http://forums.freebsd.org/showthread.php?t=9935

It talks about all kinds of little tweaks, but in the end, the only thing that actually works is the stupid 1-line perl code that forces the kernel to free the memory allocated to the (non-zfs) disk cache, which is the "Inact"ive memory in "top."

I have a 4-disk raidz pool, but that's unlikely to matter.

Try to copy large files from a non-zfs disk to a zfs disk.
FreeBSD will cache the data read from the non-zfs disk in memory, and free memory will go down. This is as expected, obviously. Once there's very little free memory, one would expect whatever is more important to kick out the cached data (Inact) and make memory available.

But when almost all of the memory is taken by the disk cache (of the non-zfs file system), the ZFS disks start thrashing like mad and the write throughput drops to single-digit MB/s.

I believe it should be extremely easy to reproduce. Just plug in a big USB drive formatted with UFS (msdosfs will likely do the same), and copy large files from that USB drive to a zfs pool.

Right after a clean boot, gstat will show something like 20+MB/s of traffic from the USB device (da*), and occasional bursts of activity on the zpool devices at a very high rate. Once free memory is exhausted, the zpool devices will change to constant low-speed activity, with the disks thrashing about constantly.

I tried enabling/disabling prefetch, messing with vnode counts, zfs.vdev.min/max_pending, etc. The only thing that works is that stupid perl 1-liner (perl -e '$x="x"x1500000000'), which returns the activity to that seen right after a clean boot. It doesn't last very long, though, as the disk cache again consumes all the memory.

Copying files between zfs devices doesn't seem to affect anything.

I understand the zfs subsystem has its own memory/cache management. Can a zfs expert please comment on this?

And is there a way to force the kernel to not cache non-zfs disk data?

--rich
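
For anyone who would rather not use perl, here is a minimal sketch of the same workaround in Python. It only mirrors what the 1-liner above does: build one very large string and then let it go. The function name touch_and_release and its parameters are made up for illustration, and whether this reclaims Inact memory exactly the way the perl version does on a given system is an assumption, not something I have measured.

```python
def touch_and_release(nbytes=1_500_000_000, fill=b"x"):
    """Allocate nbytes filled with `fill`, then drop the buffer.

    Hypothetical helper mirroring: perl -e '$x="x"x1500000000'
    The default size is taken from that 1-liner; pick something
    smaller than physical memory on the machine in question.
    """
    # Repeating a one-byte string writes every byte of the buffer,
    # so the pages are really touched and not just reserved.
    buf = fill * nbytes
    size = len(buf)
    # Drop the only reference so the interpreter can free the buffer.
    del buf
    return size
```

Calling touch_and_release() with no arguments asks for about 1.5GB, the same as the perl version. As with the perl version, any effect would presumably be temporary, since the disk cache fills memory again as the copy continues.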