From owner-freebsd-bugs@FreeBSD.ORG Tue Oct 25 06:30:21 2005 Return-Path: X-Original-To: freebsd-bugs@hub.freebsd.org Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C16B316A422 for ; Tue, 25 Oct 2005 06:30:21 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id CDFE343D4C for ; Tue, 25 Oct 2005 06:30:20 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j9P6UK7d035521 for ; Tue, 25 Oct 2005 06:30:20 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j9P6UKsF035520; Tue, 25 Oct 2005 06:30:20 GMT (envelope-from gnats) Resent-Date: Tue, 25 Oct 2005 06:30:20 GMT Resent-Message-Id: <200510250630.j9P6UKsF035520@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, "Frank Mayhar" Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 40EC216A422 for ; Tue, 25 Oct 2005 06:20:18 +0000 (GMT) (envelope-from frank@exit.com) Received: from tinker.exit.com (tinker.exit.com [206.223.0.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id C3E8443D53 for ; Tue, 25 Oct 2005 06:20:09 +0000 (GMT) (envelope-from frank@exit.com) Received: from realtime.exit.com (realtime [206.223.0.5]) by tinker.exit.com (8.13.4/8.13.4) with ESMTP id j9P6K7Tu056296 for ; Mon, 24 Oct 2005 23:20:07 -0700 (PDT) (envelope-from frank@exit.com) Message-Id: <1130221206.0@realtime.exit.com> Date: Mon, 24 Oct 2005 23:20:06 -0700 From: "Frank Mayhar" To: "FreeBSD gnats submit" X-Send-Pr-Version: gtk-send-pr 0.4.6 Cc: Subject: kern/87967: Page fault panic on NFS client. X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Oct 2005 06:30:21 -0000 >Number: 87967 >Category: kern >Synopsis: Page fault panic on NFS client. >Confidential: no >Severity: critical >Priority: high >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Tue Oct 25 06:30:20 GMT 2005 >Closed-Date: >Last-Modified: >Originator: Frank Mayhar >Release: FreeBSD 6.0-RC1 i386 >Organization: Exit Consulting >Environment: System: FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005 frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME Copyright (c) 1992-2005 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005 frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME ACPI APIC Table: Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Athlon(tm) MP 1900+ (1600.07-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x662 Stepping = 2 Features=0x383fbff AMD Features=0xc0480800 real memory = 2146959360 (2047 MB) avail memory = 2091614208 (1994 MB) FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 1 cpu1 (AP): APIC ID: 0 MADT: Forcing active-low polarity and level trigger for SCI ioapic0 irqs 0-23 on motherboard bktr_mem: memory holder loaded npx0: [FAST] npx0: on motherboard npx0: INT 16 interface acpi0: on motherboard acpi0: Power Button (fixed) acpi0: Sleep Button (fixed) pci_link0: irq 10 on acpi0 pci_link1: irq 5 on acpi0 pci_link2: irq 11 on acpi0 pci_link3: irq 10 on acpi0 Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0 cpu0: on acpi0 cpu1: on acpi0 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0 pci0: on pcib0 pcib1: at device 1.0 on pci0 pci1: on pcib1 nvidia0: mem 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd1000000-0xd1ffffff irq 18 at device 5.0 on pci1 nvidia0: [GIANT-LOCKED] isab0: at device 7.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0 ata0: on atapci0 ata1: on atapci0 pci0: at device 7.3 (no driver attached) ahc0: port 0x1000-0x10ff mem 0xd0060000-0xd0060fff irq 20 at device 8.0 on pci0 ahc0: [GIANT-LOCKED] aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs em0: port 0x1400-0x143f mem 0xd0040000-0xd005ffff,0xd0000000-0xd003ffff irq 21 at device 9.0 on pci0 em0: Ethernet address: 00:04:23:a6:1e:38 em0: Speed:N/A Duplex:N/A pcib2: at device 16.0 on pci0 pci2: on pcib2 ohci0: mem 0xd3004000-0xd3004fff irq 19 at device 0.0 on pci2 ohci0: [GIANT-LOCKED] usb0: OHCI version 1.0, legacy support usb0: SMM does not respond, resetting usb0: on ohci0 usb0: USB revision 1.0 uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 4 ports with 4 removable, self powered pcm0: port 0x2080-0x209f irq 17 at device 5.0 on pci2 pcm0: fwohci0: <1394 Open Host Controller Interface> mem 0xd3007000-0xd30077ff,0xd3000000-0xd3003fff irq 18 at device 5.2 on pci2 fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:02:3c:00:50:00:60:21 fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: on fwohci0 fwohci0: Initiate bus reset fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) firewire0: bus manager 0 (me) bktr0: mem 0xf0000000-0xf0000fff irq 18 at device 6.0 on pci2 bktr0: [GIANT-LOCKED] bktr0: Hauppauge Model 37281 B223 bktr0: Detected a MSP3430G-A1 at 0x80 bktr0: Detected a DPL3418A-A2 at 0x84 bktr0: Hauppauge WinCast/TV, Philips FR1236 NTSC FM tuner, msp3400c stereo, dpl3518a dolby, remote control. pci2: at device 6.1 (no driver attached) ohci1: mem 0xd3005000-0xd3005fff irq 19 at device 7.0 on pci2 ohci1: [GIANT-LOCKED] usb1: OHCI version 1.0 usb1: on ohci1 usb1: USB revision 1.0 uhub1: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub1: 3 ports with 3 removable, self powered ohci2: mem 0xd3006000-0xd3006fff irq 16 at device 7.1 on pci2 ohci2: [GIANT-LOCKED] usb2: OHCI version 1.0 usb2: on ohci2 usb2: USB revision 1.0 uhub2: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub2: 2 ports with 2 removable, self powered ehci0: mem 0xd3007800-0xd30078ff irq 17 at device 7.2 on pci2 ehci0: [GIANT-LOCKED] usb3: EHCI version 1.0 usb3: companion controllers, 3 ports each: usb1 usb2 usb3: on ehci0 usb3: USB revision 2.0 uhub3: NEC EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 uhub3: 5 ports with 5 removable, self powered xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0x2000-0x207f mem 0xd3007c00-0xd3007c7f irq 19 at device 8.0 on pci2 miibus0: on xl0 ukphy0: on miibus0 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto xl0: Ethernet address: 00:e0:81:20:cc:de atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: model Generic PS/2 mouse, device ID 0 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A, console sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FAST] orm0: at iomem 0xc0000-0xce7ff,0xdc000-0xdffff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x100> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 Timecounters tick every 1.000 msec Waiting 15 seconds for SCSI devices to settle uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 da0 at ahc0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-2 device da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled da0: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C) da2 at ahc0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-2 device da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled da2: 34732MB (71132960 512 byte sectors: 64H 32S/T 34732C) da8 at ahc0 bus 0 target 8 lun 0 da8: Fixed Direct Access SCSI-3 device da8: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled da8: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C) da9 at ahc0 bus 0 target 9 lun 0 da9: Fixed Direct Access SCSI-3 device da9: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled da9: 35003MB (71687340 512 byte sectors: 64H 32S/T 35003C) SMP: AP CPU #1 Launched! >Description: This has happened twice in 24 hours, both times under load. Both times it was building OpenOffice in an NFS-mounted /usr/ports. In case it matters, it's a soft mount from another 6.0 box over an em(4) interface with an MTU of 9000. Both times it was a panic from vnlru while trying to flush a vnode and both times it was a null-pointer dereference in nfs_putpages() at nfs_bio.c:301. In both cases vp->v_data was null. The vnode itself looks fine to my eyes, although there may well be FreeBSD-specific subtleties that I'm missing. Below is the traceback and a dump of the vnode in question. I'm keeping the cores around, so if there's more information needed, I'll be happy to provide it. (kgdb) bt #0 doadump () at pcpu.h:165 #1 0xc05397b7 in boot (howto=0x104) at /usr/src/sys/kern/kern_shutdown.c:399 #2 0xc0539b0d in panic (fmt=0xc06bf3da "%s") at /usr/src/sys/kern/kern_shutdown.c:555 #3 0xc06a0c2a in trap_fatal (frame=0xe7e1c958, eva=0x4) at /usr/src/sys/i386/i386/trap.c:831 #4 0xc06a0937 in trap_pfault (frame=0xe7e1c958, usermode=0x0, eva=0x4) at /usr/src/sys/i386/i386/trap.c:742 #5 0xc06a054d in trap (frame= {tf_fs = 0x8, tf_es = 0x28, tf_ds = 0x28, tf_edi = 0x1000, tf_esi = 0x1, tf_ebp = 0xe7e1c9fc, tf_isp = 0xe7e1c984, tf_ebx = 0x0, tf_edx = 0x0, tf_ecx = 0x1, tf_eax = 0x1000, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xc05fad40, tf_cs = 0x20, tf_eflags = 0x10246, tf_esp = 0xe7e1cab0, tf_ss = 0xc43f8d20}) at /usr/src/sys/i386/i386/trap.c:432 #6 0xc068de8a in calltrap () at /usr/src/sys/i386/i386/exception.s:139 #7 0xc05fad40 in nfs_putpages (ap=0xe7e1ca20) at /usr/src/sys/nfsclient/nfs_bio.c:301 #8 0xc06aacc9 in VOP_PUTPAGES_APV (vop=0x1000, a=0x0) at vnode_if.c:2164 #9 0xc0667fa4 in vnode_pager_putpages (object=0xc806139c, m=0x1000, count=0x1000, sync=0x5, rtvals=0x1000) at vnode_if.h:1119 #10 0xc0662faf in vm_pageout_flush (mc=0xe7e1cab0, count=0x1, flags=0x5) at vm_pager.h:147 #11 0xc065f0d4 in vm_object_page_collect_flush (object=0xc806139c, p=0xc2b921b0, curgeneration=0x1, pagerflags=0x5) at /usr/src/sys/vm/vm_object.c:950 #12 0xc065eb44 in vm_object_page_clean (object=0xc806139c, start=0x0, end=Unhandled dwarf expression opcode 0x93 ) at /usr/src/sys/vm/vm_object.c:753 #13 0xc065e7e0 in vm_object_terminate (object=0xc806139c) at /usr/src/sys/vm/vm_object.c:608 #14 0xc06662ca in vnode_destroy_vobject (vp=0xc403baa0) at /usr/src/sys/vm/vnode_pager.c:166 #15 0xc05fdd69 in nfs_reclaim (ap=0x1000) at /usr/src/sys/nfsclient/nfs_node.c:247 #16 0xc06aa80c in VOP_RECLAIM_APV (vop=0x1000, a=0x0) at vnode_if.c:1589 #17 0xc059236d in vgonel (vp=0xc403baa0) at vnode_if.h:818 #18 0xc058ed4d in vlrureclaim (mp=0xc3d10000) at /usr/src/sys/kern/vfs_subr.c:611 #19 0xc058f1cf in vnlru_proc () at /usr/src/sys/kern/vfs_subr.c:724 #20 0xc052354d in fork_exit (callout=0xc058efbc , arg=0x0, frame=0xe7e1cd38) at /usr/src/sys/kern/kern_fork.c:789 #21 0xc068deec in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208 (kgdb) print /x *vp $1 = { v_type = 0x1, v_tag = 0xc06deecb, v_op = 0xc0715da0, v_data = 0x0, v_mount = 0xc3d10000, v_nmntvnodes = { tqe_next = 0x0, tqe_prev = 0xc7b959a4 }, v_un = { vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0 }, v_hashlist = { le_next = 0x0, le_prev = 0xc3763808 }, v_hash = 0x11463ccc, v_cache_src = { lh_first = 0x0 }, v_cache_dst = { tqh_first = 0xc7b8cd48, tqh_last = 0xc7b8cd58 }, v_dd = 0x0, v_cstart = 0x0, v_lasta = 0x0, v_lastw = 0x0, v_clen = 0x0, v_lock = { lk_interlock = 0xc072c824, lk_flags = 0x40080, lk_sharecount = 0x0, lk_waitcount = 0x0, lk_exclusivecount = 0x1, lk_prio = 0x50, lk_wmesg = 0xc06deecb, lk_timo = 0x33, lk_lockholder = 0xc35add80, lk_newlock = 0x0 }, v_interlock = { mtx_object = { lo_class = 0xc0707704, lo_name = 0xc06d8e89, lo_type = 0xc06d8e89, lo_flags = 0x30000, lo_list = { tqe_next = 0x0, tqe_prev = 0x0 }, lo_witness = 0x0 }, mtx_lock = 0x4, mtx_recurse = 0x0 }, v_vnlock = 0xc403baf8, v_holdcnt = 0x2, v_usecount = 0x0, v_iflag = 0x480, v_vflag = 0x0, v_writecount = 0x0, v_freelist = { tqe_next = 0x0, tqe_prev = 0x0 }, v_bufobj = { bo_mtx = 0xc403bb1c, bo_clean = { bv_hd = { tqh_first = 0x0, tqh_last = 0xc403bb64 }, bv_root = 0x0, bv_cnt = 0x0 }, bo_dirty = { bv_hd = { tqh_first = 0x0, tqh_last = 0xc403bb74 }, bv_root = 0x0, bv_cnt = 0x0 }, bo_numoutput = 0x0, bo_flag = 0x0, bo_ops = 0xc071608c, bo_bsize = 0x2000, bo_object = 0xc806139c, bo_synclist = { le_next = 0x0, le_prev = 0xc3e64b98 }, bo_private = 0xc403baa0, __bo_vnode = 0xc403baa0 }, v_pollinfo = 0x0, v_label = 0x0 } >How-To-Repeat: >Fix: None, alas. >Release-Note: >Audit-Trail: >Unformatted: