From owner-freebsd-current@FreeBSD.ORG Sun Dec 7 10:21:16 2008
Message-Id: <5C97E586-4296-4835-A103-FD273B2D7A4F@lassitu.de>
From: Stefan Bethke
To: FreeBSD Current
In-Reply-To: <31C70CBC-488A-4A9A-A642-37855E8F1DD1@lassitu.de>
Date: Sun, 7 Dec 2008 11:21:13 +0100
References: <20081117205526.GC1733@garage.freebsd.pl> <20081202203308.GA13818@hyperion.scode.org> <200812021254.21242.fjwcash@gmail.com> <20081202232924.GA19134@hyperion.scode.org> <31C70CBC-488A-4A9A-A642-37855E8F1DD1@lassitu.de>
Subject: ZFS gets stuck
List-Id: Discussions about the use of FreeBSD-current

For the past week, I've been stress testing two new boxes by running make -j4 universe. /usr/src is on ufs, /usr/obj on zfs, backed by a single-disk pool. Every so often (about once every one or two days), processes get wedged accessing the zfs file systems.

Just this morning I found:

output from make universe:
>> sparc64 started on Sun Dec 7 01:34:41 UTC 2008
>> amd64 completed on Sun Dec 7 02:18:39 UTC 2008
>> pc98 buildworld completed on Sun Dec 7 02:24:35 UTC 2008
>> sun4v started on Sun Dec 7 02:30:45 UTC 2008

ps shows me these processes that apparently are waiting on some zfs operation:

    0 81836 81830 0 47 0 6732 4056 zio->i D ?? 0:13.11 find -sx / /tank /tank/ports /usr/obj /dev/null -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -perm -g+s ) -exec ls -liTd {} +
    0 86345 86344 0 76 0 11392 10300 zio->i D+ 0 0:00.53 make all DIRPRFX=kerberos5/lib/libasn1/
    0 91920 91420 0 76 0 7144 1792 zfsvfs D+ 0 0:00.00 sh -ev
    0 91923 91902 0 76 0 2180 400 zfsvfs D+ 0 0:00.00 [cc]
    0 91924 91900 0 76 0 10456 7572 zfsvfs D+ 0 0:00.02 /usr/obj/pc98/usr/src/tmp/usr/libexec/cc1 -E -quiet -nostdinc -I. -I@ -I@/contrib/altq -I/usr/obj/pc98/usr/src/sys/MAC -M -D_LONGLONG -DPC98 -D_KERNEL -DKLD_MODULE -DHAVE_KERNEL_OPTION_HEADERS /usr/src/sys/modules/smbfs/../../netsmb/smb_dev.c

df still works, ls /tank blocks.

FreeBSD lokschuppen.lassitu.de 8.0-CURRENT FreeBSD 8.0-CURRENT #1: Wed Dec 3 07:05:03 UTC 2008 root@lokschuppen.lassitu.de:/usr/obj/usr/src/sys/EISENBOOT amd64

So far, I've had this in loader.conf:

    vfs.zfs.arc_max="512M"
    vfs.zfs.prefetch_disable="1"

I'm now adding vfs.zfs.zil_disable="1" to see if that makes a difference.

Is there anything in particular people would want me to check out? Kernel is GENERIC minus a number of devices, and without INVARIANTS and WITNESS.

Stefan

-- 
Stefan Bethke    Fon +49 170 346 0140
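
PS: For anyone who wants to pull the same kind of list of stuck processes out of their own ps output, something along these lines might do. It is only a rough sketch. It assumes the column order of the ps lines quoted above (PID in the second field, wait channel in the ninth, state in the tenth), and the names find_wedged and ZFS_WCHAN_PREFIXES are made up for illustration. The prefix list covers only the two wait channels seen in this report, and the init line in the sample is invented for contrast.

```python
# Wait-channel prefixes taken from the report above ("zio->i", "zfsvfs").
# Assumption: this is not an exhaustive list of ZFS wait channels.
ZFS_WCHAN_PREFIXES = ("zio", "zfs")


def find_wedged(ps_lines):
    """Return (pid, wchan, command) for processes in state D whose
    wait channel starts with one of ZFS_WCHAN_PREFIXES."""
    wedged = []
    for line in ps_lines:
        # Split into 12 leading columns plus the full command as the 13th.
        fields = line.split(None, 12)
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # header line or something we cannot parse
        wchan, stat = fields[8], fields[9]
        if stat.startswith("D") and wchan.startswith(ZFS_WCHAN_PREFIXES):
            wedged.append((int(fields[1]), wchan, fields[12]))
    return wedged


# Two lines copied from the report, plus one invented line (init) that
# should not match because it is neither in state D nor on a ZFS channel.
SAMPLE = [
    "0 86345 86344 0 76 0 11392 10300 zio->i D+ 0 0:00.53 make all DIRPRFX=kerberos5/lib/libasn1/",
    "0 91920 91420 0 76 0 7144 1792 zfsvfs D+ 0 0:00.00 sh -ev",
    "0 1 0 0 8 0 3200 600 wait ILs ?? 0:00.01 /sbin/init --",
]

if __name__ == "__main__":
    for pid, wchan, command in find_wedged(SAMPLE):
        print(pid, wchan, command)
```

On a live box the same function could be fed the lines of ps -axl instead of SAMPLE, provided the columns come out in the order assumed here.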