From: Thomas Backman
To: Pawel Dawidek Jakub
Date: Fri, 21 Aug 2009 11:47:35 +0200
Message-Id: <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org>
In-Reply-To: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org>
References: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org>
Cc: freebsd-fs@freebsd.org, FreeBSD current
Subject: Re: Yet another ZFS recv panic; old but rarely seen

On Aug 21, 2009, at 08:51, Thomas Backman wrote:

> Ugh. Bad news again: another zfs send/recv panic during an
> incremental backup.
>
> Unread portion of the kernel message buffer:
> panic: dirtying dbuf obj=b213 lvl=1 blkid=2 but not tx_held
>
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x182
> dmu_tx_dirty_buf() at dmu_tx_dirty_buf+0x28f
> dbuf_dirty() at dbuf_dirty+0x69
> dnode_free_range() at dnode_free_range+0x80d
> dnode_reallocate() at dnode_reallocate+0x131
> dmu_object_reclaim() at dmu_object_reclaim+0x99
> dmu_recv_stream() at dmu_recv_stream+0x1446
> zfs_ioc_recv() at zfs_ioc_recv+0x25a
> zfsdev_ioctl() at zfsdev_ioctl+0x8a
> devfs_ioctl_f() at devfs_ioctl_f+0x77
> kern_ioctl() at kern_ioctl+0xf6
> ioctl() at ioctl+0xfd
> syscall() at syscall+0x28f
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8fb8, rbp = 0x7fffffff9cf0 ---
> KDB: enter: panic
> panic: from debugger
> cpuid = 0
> Uptime: 4h52m26s
>
> Looks *eerily* similar to this panic from OpenSolaris: http://mail.opensolaris.org/pipermail/zfs-code/2008-September/000694.html
>
> GDB backtrace isn't of that much more use, I guess:
> #11 0xffffffff8036d02b in panic (fmt=Variable "fmt" is not available.
> )
> at /usr/src/sys/kern/kern_shutdown.c:562
> #12 0xffffffff80b4765f in dmu_tx_dirty_buf () from /boot/kernel/zfs.ko
> #13 0xffffffff80b3a519 in dbuf_dirty () from /boot/kernel/zfs.ko
> #14 0xffffffff80b4b68d in dnode_free_range () from /boot/kernel/zfs.ko
> #15 0xffffffff80b4c461 in dnode_reallocate () from /boot/kernel/zfs.ko
> #16 0xffffffff80b42569 in dmu_object_reclaim () from /boot/kernel/zfs.ko
> #17 0xffffffff80b421b6 in dmu_recv_stream () from /boot/kernel/zfs.ko
> #18 0xffffffff80ba430a in zfs_ioc_recv () from /boot/kernel/zfs.ko
> #19 0xffffff002ac13d68 in ?? ()
> #20 0xffffff002aa6c320 in ?? ()
> #21 0xffffff002ae15000 in ?? ()
> #22 0xffffff0002891400 in ?? ()
> #23 0xffffff00028f2800 in ?? ()
> #24 0xffffff00744a1ab8 in ?? ()
> ...
> #34 0xffffff803e7fc860 in ?? ()
> #35 0xffffffff805b699f in uma_zalloc_arg (zone=0xffffff00183c6600, udata=0xffffff00744a1000, flags=-128) at /usr/src/sys/vm/uma_core.c:1990
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
> Apparently, I've gotten this once before, at r195910 (+ patches, not
> sure which ones at that time), on July 30th. Same DDB backtrace,
> same broken GDB backtrace.
>
> Regards,
> Thomas

I found some more info mere minutes after posting this (figures; that's why I prefer media where you can edit your posts!), but had other things to do. So, here's some more:

OpenSolaris bug ID: 6754448
( http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6754448 )
Fixed in build 108: http://dlc.sun.com/osol/on/downloads/b108/on-changelog-b108.html

The changelog entries are on that page (just search for "6754448"), with a history/diff link on each source file's page.

Unfortunately (unless FreeBSD suffers from both, that is), they apparently fixed two bugs in the same batch, making it harder - at least for *me* - to see which changes relate to *this* panic. Still, I'm guessing this will help, unless the code is too far out of sync with OpenSolaris.
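For anyone who would rather script the search for the bug ID than do it by eye, here is a minimal sketch. It assumes the build 108 changelog has already been saved locally as text; the function name `find_bug_entries` is made up for illustration and is not part of any OpenSolaris or FreeBSD tooling. It only returns the lines that mention the ID as a whole number, so a longer ID that merely contains the same digits is not matched.

```python
import re


def find_bug_entries(changelog_text, bug_id):
    """Return the stripped changelog lines that mention bug_id as a whole number."""
    # The lookarounds reject matches embedded in a longer run of digits.
    pattern = re.compile(r"(?<!\d)" + re.escape(bug_id) + r"(?!\d)")
    return [
        line.strip()
        for line in changelog_text.splitlines()
        if pattern.search(line)
    ]
```

Called as `find_bug_entries(saved_page_text, "6754448")`, this should narrow the page down to the entries for that bug, though it cannot tell which of the touched files belong to which of the two fixes in the batch.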
I'm also guessing Pawel already knows waaaaaaay more about their system than I do (... which is about nothing), so I'll probably shut up now... ;)

Regards,
Thomas