From owner-freebsd-stable@FreeBSD.ORG Fri Dec 4 20:33:40 2009
Date: Fri, 4 Dec 2009 12:33:38 -0800
From: Jeremy Chadwick
To: freebsd-stable@freebsd.org
Message-ID: <20091204203338.GA30364@icarus.home.lan>
References: <831421F9-6344-4E68-BD64-9C013EB86523@lassitu.de> <06D8F596-649B-4478-8A2F-F9EA133B8DDC@lassitu.de>
In-Reply-To: <06D8F596-649B-4478-8A2F-F9EA133B8DDC@lassitu.de>
Subject: Re: Fatal trap 9 triggered by zfs?

On Fri, Dec 04, 2009 at 08:56:05PM +0100, Stefan Bethke wrote:
> On 04.12.2009 at 17:52, Stefan Bethke wrote:
>
> > I'm getting panics like this every so often (couple weeks, sometimes just a few days.)
> > A second machine that has identical hardware and is running the same source has no such problems.
> >
> > FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 root@XXX.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64
> >
> > # zpool status
> >   pool: tank
> >  state: ONLINE
> >  scrub: none requested
> > config:
> >
> > 	NAME        STATE     READ WRITE CKSUM
> > 	tank        ONLINE       0     0     0
> > 	  ad4s1d    ONLINE       0     0     0
> > # cat /boot/loader.conf
> > vfs.zfs.arc_max="512M"
> > vfs.zfs.prefetch_disable="1"
> > vfs.zfs.zil_disable="1"
>
> Got another, different one. Any tuning suggestions or similar?
>
> #0  doadump () at pcpu.h:223
> 223	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) #0  doadump () at pcpu.h:223
> #1  0xffffffff80337bd9 in boot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:416
> #2  0xffffffff8033802c in panic (fmt=Variable "fmt" is not available.
> )
>     at /usr/src/sys/kern/kern_shutdown.c:579
> #3  0xffffffff805cc2ad in trap_fatal (frame=0x9, eva=Variable "eva" is not available.
> )
>     at /usr/src/sys/amd64/amd64/trap.c:857
> #4  0xffffffff805cce12 in trap (frame=0xffffff80625db030)
>     at /usr/src/sys/amd64/amd64/trap.c:644
> #5  0xffffffff805b2943 in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:224
> #6  0xffffffff80586c7a in vm_map_entry_splay (addr=Variable "addr" is not available.
> )
>     at /usr/src/sys/vm/vm_map.c:771
> #7  0xffffffff80587f37 in vm_map_lookup_entry (map=0xffffff00010000e8,
>     address=18446743523979624448, entry=0xffffff80625db170)
>     at /usr/src/sys/vm/vm_map.c:1021
> #8  0xffffffff80588aa3 in vm_map_delete (map=0xffffff00010000e8,
>     start=18446743523979624448, end=18446743523979689984)
>     at /usr/src/sys/vm/vm_map.c:2685
> #9  0xffffffff80588e61 in vm_map_remove (map=0xffffff00010000e8,
>     start=18446743523979624448, end=18446743523979689984)
>     at /usr/src/sys/vm/vm_map.c:2774
> #10 0xffffffff8057db85 in uma_large_free (slab=0xffffff005fcc7000)
>     at /usr/src/sys/vm/uma_core.c:3021
> #11 0xffffffff80325987 in free (addr=0xffffff80018b0000,
>     mtp=0xffffffff80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
> #12 0xffffffff80a36d03 in vdev_cache_evict (vc=0xffffff0001723ce0,
>     ve=0xffffff003dd52200)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
> #13 0xffffffff80a372ad in vdev_cache_read (zio=0xffffff005f5ca2d0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
> #14 0xffffffff80a4a954 in zio_vdev_io_start (zio=0xffffff005f5ca2d0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
> #15 0xffffffff80a4ae87 in zio_execute (zio=0xffffff005f5ca2d0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
> #16 0xffffffff80a3a080 in vdev_mirror_io_start (zio=0xffffff005f811b40)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
> #17 0xffffffff80a4ae87 in zio_execute (zio=0xffffff005f811b40)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
> #18 0xffffffff809ff45a in arc_read_nolock (pio=0xffffff005f66d5a0,
>     spa=0xffffff000150a000, bp=0xffffff800a91c440,
>     done=0xffffffff80a02630 , private=Variable "private" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
> #19 0xffffffff809ff8ec in arc_read (pio=0xffffff005f66d5a0,
>     spa=0xffffff000150a000, bp=0xffffff800a91c440, pbuf=0xffffff0042a3ca20,
>     done=0xffffffff80a02630 , private=0xffffff005fbfc620,
>     priority=0, zio_flags=1, arc_flags=0xffffff80625db5ec,
>     zb=0xffffff80625db5c0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
> #20 0xffffffff80a02aba in dbuf_read (db=0xffffff005fbfc620,
>     zio=0xffffff005f66d5a0, flags=2)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
> #21 0xffffffff80a0602c in dmu_buf_hold (os=Variable "os" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
> #22 0xffffffff80a40db5 in zap_lockdir (os=0xffffff005f937610, obj=247890,
>     tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xffffff80625db888)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
> #23 0xffffffff80a41724 in zap_cursor_retrieve (zc=0xffffff80625db880,
>     za=0xffffff80625db8c0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
> #24 0xffffffff80a61b66 in zfs_freebsd_readdir (ap=Variable "ap" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:2157
> #25 0xffffffff803cfde9 in kern_getdirentries (td=0xffffff0057bfe000, fd=Variable "fd" is not available.
> )
>     at vnode_if.h:758
> #26 0xffffffff803d0093 in getdirentries (td=Variable "td" is not available.
> )
>     at /usr/src/sys/kern/vfs_syscalls.c:4051
> #27 0xffffffff805cc906 in syscall (frame=0xffffff80625dbc80)
>     at /usr/src/sys/amd64/amd64/trap.c:989
> #28 0xffffffff805b2c21 in Xfast_syscall ()
>     at /usr/src/sys/amd64/amd64/exception.S:373
> #29 0x0000000800724cdc in ?? ()
> Previous frame inner to this frame (corrupt stack?)
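
A side note on reading the trace: kgdb prints the start/end arguments in
frames #7 to #9 in decimal, which makes them hard to compare with the
hex pointer passed to free() in frame #11. The following is only a
minimal sketch in Python (not part of the original report; the helper
name describe_range is made up for illustration) that converts the
values copied from the backtrace and computes the size of the range.

```python
# Decimal values copied from frames #7-#9 of the backtrace.
START = 18446743523979624448
END = 18446743523979689984
# Pointer passed to free() in frame #11 of the backtrace.
FREE_ADDR = 0xffffff80018b0000


def describe_range(start: int, end: int) -> dict:
    """Hypothetical helper: hex forms and size of the [start, end) range."""
    size = end - start
    return {
        "start_hex": hex(start),
        "end_hex": hex(end),
        "size_bytes": size,
        "size_kib": size // 1024,
    }


info = describe_range(START, END)
print(info)
print("start matches free() addr:", START == FREE_ADDR)
```

Decoded this way, the range handed to vm_map_delete() starts at the same
address that vdev_cache_evict() passed to free() and is 64 KiB long, so
the arguments themselves look sane. That says nothing about the state
of the map being walked when the trap occurred.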

Another user already proposed bad hardware, which is possible, although
the stack trace doesn't appear corrupt in any way. It looks like the
trap is taken in vm_map_entry_splay() (frame #6), which is called from
vm_map_lookup_entry(). I don't know why.

> #7  0xffffffff80587f37 in vm_map_lookup_entry (map=0xffffff00010000e8,
>     address=18446743523979624448, entry=0xffffff80625db170)
>     at /usr/src/sys/vm/vm_map.c:1021
> #8  0xffffffff80588aa3 in vm_map_delete (map=0xffffff00010000e8,

You only have one disk in your pool. I'm not sure how long your system
stays up before it panics, but could you try doing "zpool scrub tank"
and let that run for a while? For the first ~5 minutes, the estimated
time to completion (from "zpool status") may get worse and worse, but it
should then decrease/catch up. If the scrub is able to finish, look for
any non-zero counts in the resulting READ/WRITE/CKSUM columns.

The only other recommendation I have is to pull in pjd@ and ask about
vm.kmem_size or vm.kmem_size_max tuning to see if that makes any
difference, or maybe kmacy@ and see if he has any advice.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |