From: John Baldwin <jhb@freebsd.org>
To: freebsd-stable@freebsd.org
Cc: Johan Kuuse
Date: Tue, 12 Aug 2008 09:05:48 -0400
Subject: Re: kernel panic
Message-Id: <200808120905.48528.jhb@freebsd.org>
In-Reply-To: <200808120842.52899.kuuse@redantigua.com>
References: <200808110401.49953.kuuse@redantigua.com> <200808111704.30604.jhb@freebsd.org> <200808120842.52899.kuuse@redantigua.com>
List-Id: Production branch of FreeBSD source code

On Tuesday 12 August 2008 02:42:52 am Johan Kuuse wrote:
> On Monday 11 August 2008 23:04:30 John Baldwin wrote:
> > On Sunday 10 August 2008 10:01:49 pm Johan Kuuse wrote:
> > > Hi,
> > >
> > > I am a kgdb newbie, so please be patient.
> > > I suspect (just based on the fact that this is the 4th time I edit text
> > > files on my NTFS partition through ntfs-3g, using Emacs, and getting
> > > frequent I/O error messages inside Emacs, and then a kernel panic) that
> > > this is a ntfs-3g related problem.
> > > If you ask me exactly how to reproduce it, I'm sorry, I can't tell you
> > > exactly (but see the kgdb output below).
> > >
> > > Anyway, the kernel seems to panic at /usr/src/sys/kern/vfs_bio.c:1530
> > >
> > > Just a suggestion for a patch (without knowing the functionality
> > > of /usr/src/sys/kern/vfs_bio.c):
> > > The line where the kernel panics:
> > > /usr/src/sys/kern/vfs_bio.c:
> > > ----------------------------------
> > > VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
> > > ...
> > > ----------------------------------
> > >
> > > Comparing to another file, which does error checking before calling
> > > VM_OBJECT_LOCK:
> > > /usr/src/sys/kern/vfs_aio.c:
> > > ----------------------------------
> > > if (vp->v_object != NULL) {
> > >         VM_OBJECT_LOCK(vp->v_object);
> > > ...
> > > ----------------------------------
> > >
> > > Perhaps the kernel panic could be avoided with the following patch?
> > > /usr/src/sys/kern/vfs_bio.c (suggested patch):
> > > ----------------------------------
> > > if ((bp->b_bufobj != NULL) && (bp->b_bufobj->bo_object != NULL)) {
> > >         VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
> > > ...
> > > ----------------------------------
> > >
> > > Please let me know if you need more information.
> > >
> > > Regards,
> > > Johan Kuuse
> > >
> > > -----------------------------------------------------------------------
> > > kgdb kernel.debug /var/crash/vmcore.1
> > > [GDB will not be able to debug user-mode threads:
> > > /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and
> > > you are welcome to change it and/or distribute copies of it under
> > > certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB. Type "show warranty" for
> > > details.
> > > This GDB was configured as "i386-marcel-freebsd".
> > >
> > > Unread portion of the kernel message buffer:
> > >
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 0; apic id = 00
> > > fault virtual address = 0x34
> > > fault code = supervisor read, page not present
> > > instruction pointer = 0x20:0xc07b6de4
> > > stack pointer = 0x28:0xe79de7c8
> > > frame pointer = 0x28:0xe79de7e8
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, def32 1, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 1214 (opera)
> > > trap number = 12
> > > panic: page fault
> > > cpuid = 0
> > > Uptime: 5h20m30s
> > > Physical memory: 2035 MB
> > > Dumping 218 MB: 203 187 171 155 139 123 107 91 75 59 43 27 11
> > >
> > > #0 doadump () at pcpu.h:195
> > > 195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> > > (kgdb) list *0xc07b6de4
> > > 0xc07b6de4 is in vfs_vmio_release (/usr/src/sys/kern/vfs_bio.c:1530).
> > > 1525 vfs_vmio_release(struct buf *bp)
> > > 1526 {
> > > 1527         int i;
> > > 1528         vm_page_t m;
> > > 1529
> > > 1530         VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
> > > 1531         vm_page_lock_queues();
> > > 1532         for (i = 0; i < bp->b_npages; i++) {
> > > 1533                 m = bp->b_pages[i];
> > > 1534                 bp->b_pages[i] = NULL;
> > > (kgdb) bt
> > > #0 doadump () at pcpu.h:195
> > > #1 0xc0754457 in boot (howto=260)
> > >     at /usr/src/sys/kern/kern_shutdown.c:409
> > > #2 0xc0754719 in panic (fmt=Variable "fmt" is not available.
> > > ) at /usr/src/sys/kern/kern_shutdown.c:563
> > > #3 0xc0a4905c in trap_fatal (frame=0xe79de788, eva=52)
> > >     at /usr/src/sys/i386/i386/trap.c:899
> > > #4 0xc0a492e0 in trap_pfault (frame=0xe79de788, usermode=0, eva=52)
> > >     at /usr/src/sys/i386/i386/trap.c:812
> > > #5 0xc0a49c8c in trap (frame=0xe79de788)
> > >     at /usr/src/sys/i386/i386/trap.c:490
> > > #6 0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> > > #7 0xc07b6de4 in vfs_vmio_release (bp=0xd927e33c)
> > >     at /usr/src/sys/kern/vfs_bio.c:1530
> > > #8 0xc07b8a81 in getnewbuf (slpflag=0, slptimeo=0,
> > >     size=Variable "size" is not available.
> > > ) at /usr/src/sys/kern/vfs_bio.c:1847
> > > #9 0xc07ba118 in getblk (vp=0xc8891bb0, blkno=0, size=2048, slpflag=0,
> > >     slptimeo=0, flags=Variable "flags" is not available.
> > > ) at /usr/src/sys/kern/vfs_bio.c:2602
> > > #10 0xc0932815 in ffs_balloc_ufs2 (vp=0xc8891bb0,
> > >     startoffset=Variable "startoffset" is not available.
> > > ) at /usr/src/sys/ufs/ffs/ffs_balloc.c:699
> > > #11 0xc0952a85 in ffs_write (ap=0xe79debc4)
> > >     at /usr/src/sys/ufs/ffs/ffs_vnops.c:720
> > > #12 0xc0a5efc6 in VOP_WRITE_APV (vop=0xc0b93c60, a=0xe79debc4)
> > >     at vnode_if.c:691
> > > #13 0xc07dbf37 in vn_write (fp=0xc85f3168, uio=0xe79dec60,
> > >     active_cred=0xc61c6300, flags=0, td=0xc583fc60) at vnode_if.h:373
> > > #14 0xc07875e7 in dofilewrite (td=0xc583fc60, fd=17, fp=0xc85f3168,
> > >     auio=0xe79dec60, offset=-1, flags=0) at file.h:254
> > > #15 0xc07878c8 in kern_writev (td=0xc583fc60, fd=17, auio=0xe79dec60)
> > >     at /usr/src/sys/kern/sys_generic.c:401
> > > #16 0xc078793f in write (td=0xc583fc60, uap=0xe79decfc)
> > >     at /usr/src/sys/kern/sys_generic.c:317
> > > #17 0xc0a49635 in syscall (frame=0xe79ded38)
> > >     at /usr/src/sys/i386/i386/trap.c:1035
> > > #18 0xc0a2fc70 in Xint0x80_syscall ()
> > >     at /usr/src/sys/i386/i386/exception.s:196
> > > #19 0x00000033 in ?? ()
> > > Previous frame inner to this frame (corrupt stack?)
> >
> > FYI, you got the panic in ffs/ufs, not fuse. I've seen this at work on
> > 6.x with NFS with no clues on what causes it. You can start by going to
> > frame 7 and doing 'p *bp'.
>
> Thanks for the hints.
> See below for more debug output.
> I recognize that the bp struct members b_data and b_kvabase both point to a
> chunk of memory containing the text of the Opera web page I was reading
> when the kernel crashed. (This is indicated above: current process
> = 1214 (opera))
>
> But what is most interesting is that b_bufobj = 0x0
> Obviously, then trying to access bp->b_bufobj->bo_object will cause a
> crash. So I think it would be a good idea to NULL-check the struct member
> before trying to access it. How should I proceed? Should I post this as a
> possible bug somewhere else, to another list?

Unfortunately, it is a worse problem that b_bufobj is NULL. That means there
is a bug elsewhere.
I'll look at this some more. Hmm, can you reproduce this at all? If so, can
you try the patch below? Hopefully it panics here instead, which might help:

Index: vfs_subr.c
===================================================================
--- vfs_subr.c	(revision 181629)
+++ vfs_subr.c	(working copy)
@@ -1546,6 +1546,9 @@
 	CTR3(KTR_BUF, "brelvp(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags);
 	KASSERT(bp->b_vp != NULL, ("brelvp: NULL"));
 
+	if (bp->b_flags & B_VMIO)
+		panic("brelvp of B_VMIO buffer");
+
 	/*
 	 * Delete from old vnode list, if on one.
 	 */

-- 
John Baldwin
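
To illustrate why the reply prefers a diagnostic panic over the proposed NULL check, here is a minimal user-space sketch in C. It is not kernel code. The toy_ structures, the TOY_B_VMIO value, and both function names are hypothetical stand-ins, and it assumes that the brelvp-style detach is where b_bufobj gets cleared. The checked release only hides the symptom, while the detach-time check reports the broken invariant at the point where it is introduced.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-ins for the kernel structures. All names and values here are
 * illustrative assumptions, not the real definitions from sys/buf.h.
 */
#define TOY_B_VMIO 0x1

struct toy_bufobj {
	int bo_object;
};

struct toy_buf {
	int b_flags;
	struct toy_bufobj *b_bufobj;
};

/*
 * Defensive variant, modelled on the NULL check proposed in the thread.
 * Returns 1 if the release ran, 0 if it was silently skipped because
 * b_bufobj was already NULL. The crash is avoided, but nothing tells us
 * who cleared the pointer.
 */
static int
toy_vmio_release_checked(struct toy_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_bufobj == NULL)
		return (0);
	/* ... the pages would be released here ... */
	return (1);
}

/*
 * Diagnostic variant, modelled on the idea behind the patch: refuse to
 * detach a buffer that is still marked VMIO. Returns -1 where the kernel
 * patch would call panic(), and 0 after a normal detach.
 */
static int
toy_brelvp(struct toy_buf *bp)
{
	assert(bp != NULL);
	if (bp->b_flags & TOY_B_VMIO)
		return (-1);
	bp->b_bufobj = NULL;
	return (0);
}
```

In this model, calling toy_brelvp() on a buffer that still has TOY_B_VMIO set fails immediately and leaves b_bufobj intact, which is the moment a backtrace would be useful. The checked release, by contrast, returns 0 long after the pointer was lost.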