Date: Mon, 24 Oct 2005 23:20:06 -0700 From: "Frank Mayhar" <frank@exit.com> To: "FreeBSD gnats submit" <FreeBSD-gnats-submit@FreeBSD.org> Subject: kern/87967: Page fault panic on NFS client. Message-ID: <1130221206.0@realtime.exit.com> Resent-Message-ID: <200510250630.j9P6UKsF035520@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
>Number: 87967 >Category: kern >Synopsis: Page fault panic on NFS client. >Confidential: no >Severity: critical >Priority: high >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Tue Oct 25 06:30:20 GMT 2005 >Closed-Date: >Last-Modified: >Originator: Frank Mayhar >Release: FreeBSD 6.0-RC1 i386 >Organization: Exit Consulting >Environment: System: FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005 frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME Copyright (c) 1992-2005 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005 frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME ACPI APIC Table: <PTLTD APIC > Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Athlon(tm) MP 1900+ (1600.07-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x662 Stepping = 2 Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE> AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow+,3DNow> real memory = 2146959360 (2047 MB) avail memory = 2091614208 (1994 MB) FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 1 cpu1 (AP): APIC ID: 0 MADT: Forcing active-low polarity and level trigger for SCI ioapic0 <Version 1.1> irqs 0-23 on motherboard bktr_mem: memory holder loaded npx0: [FAST] npx0: <math processor> on motherboard npx0: INT 16 interface acpi0: <PTLTD RSDT> on motherboard acpi0: Power Button (fixed) acpi0: Sleep Button (fixed) pci_link0: <ACPI PCI Link LNKA> irq 10 on acpi0 pci_link1: <ACPI PCI Link LNKB> irq 5 on acpi0 pci_link2: <ACPI PCI Link LNKC> irq 11 on acpi0 pci_link3: <ACPI PCI Link LNKD> irq 10 on acpi0 Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0 cpu0: <ACPI CPU> on acpi0 cpu1: <ACPI CPU> on acpi0 acpi_button0: <Power Button> on acpi0 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0 pci0: <ACPI PCI bus> on pcib0 pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0 pci1: <ACPI PCI bus> on pcib1 nvidia0: <GeForce 6800 GT> mem 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd1000000-0xd1ffffff irq 18 at device 5.0 on pci1 nvidia0: [GIANT-LOCKED] isab0: <PCI-ISA bridge> at device 7.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <AMD 768 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0 ata0: <ATA channel 0> on atapci0 ata1: <ATA channel 1> on atapci0 pci0: <bridge> at device 7.3 (no driver attached) ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0x1000-0x10ff mem 0xd0060000-0xd0060fff irq 20 at device 8.0 on pci0 ahc0: [GIANT-LOCKED] aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port 0x1400-0x143f mem 0xd0040000-0xd005ffff,0xd0000000-0xd003ffff irq 21 at device 9.0 on pci0 em0: Ethernet address: 00:04:23:a6:1e:38 em0: Speed:N/A Duplex:N/A pcib2: <ACPI PCI-PCI bridge> at device 16.0 on pci0 pci2: <ACPI PCI bus> on pcib2 ohci0: <OHCI (generic) USB controller> mem 0xd3004000-0xd3004fff irq 19 at device 0.0 on pci2 ohci0: [GIANT-LOCKED] usb0: OHCI version 1.0, legacy support usb0: SMM does not respond, resetting usb0: <OHCI (generic) USB controller> on ohci0 usb0: USB revision 1.0 uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 4 ports with 4 removable, self powered pcm0: <Creative Audigy (EMU10K2)> port 0x2080-0x209f irq 17 at device 5.0 on pci2 pcm0: <SigmaTel STAC9721/23 AC97 Codec> fwohci0: <1394 Open Host Controller Interface> mem 0xd3007000-0xd30077ff,0xd3000000-0xd3003fff irq 18 at device 5.2 on pci2 fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:02:3c:00:50:00:60:21 fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: <IEEE1394(FireWire) bus> on fwohci0 fwohci0: Initiate bus reset fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) firewire0: bus manager 0 (me) bktr0: <BrookTree 878> mem 0xf0000000-0xf0000fff irq 18 at device 6.0 on pci2 bktr0: [GIANT-LOCKED] bktr0: Hauppauge Model 37281 B223 bktr0: Detected a MSP3430G-A1 at 0x80 bktr0: Detected a DPL3418A-A2 at 0x84 bktr0: Hauppauge WinCast/TV, Philips FR1236 NTSC FM tuner, msp3400c stereo, dpl3518a dolby, remote control. pci2: <multimedia> at device 6.1 (no driver attached) ohci1: <NEC uPD 9210 USB controller> mem 0xd3005000-0xd3005fff irq 19 at device 7.0 on pci2 ohci1: [GIANT-LOCKED] usb1: OHCI version 1.0 usb1: <NEC uPD 9210 USB controller> on ohci1 usb1: USB revision 1.0 uhub1: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub1: 3 ports with 3 removable, self powered ohci2: <NEC uPD 9210 USB controller> mem 0xd3006000-0xd3006fff irq 16 at device 7.1 on pci2 ohci2: [GIANT-LOCKED] usb2: OHCI version 1.0 usb2: <NEC uPD 9210 USB controller> on ohci2 usb2: USB revision 1.0 uhub2: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub2: 2 ports with 2 removable, self powered ehci0: <NEC uPD 720100 USB 2.0 controller> mem 0xd3007800-0xd30078ff irq 17 at device 7.2 on pci2 ehci0: [GIANT-LOCKED] usb3: EHCI version 1.0 usb3: companion controllers, 3 ports each: usb1 usb2 usb3: <NEC uPD 720100 USB 2.0 controller> on ehci0 usb3: USB revision 2.0 uhub3: NEC EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 uhub3: 5 ports with 5 removable, self powered xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0x2000-0x207f mem 0xd3007c00-0xd3007c7f irq 19 at device 8.0 on pci2 miibus0: <MII bus> on xl0 ukphy0: <Generic IEEE 802.3u media interface> on miibus0 ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto xl0: Ethernet address: 00:e0:81:20:cc:de atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: model Generic PS/2 mouse, device ID 0 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A, console sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0 sio1: type 16550A fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FAST] orm0: <ISA Option ROMs> at iomem 0xc0000-0xce7ff,0xdc000-0xdffff on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x100> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 Timecounters tick every 1.000 msec Waiting 15 seconds for SCSI devices to settle uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 uhub2: device problem (SET_ADDR_FAILED), disabling port 2 uhub2: port error, restarting port 2 da0 at ahc0 bus 0 target 0 lun 0 da0: <FUJITSU MAF3364L SUN36G 1213> Fixed Direct Access SCSI-2 device da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled da0: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C) da2 at ahc0 bus 0 target 2 lun 0 da2: <HP 36.4GB A 80-F313 > Fixed Direct Access SCSI-2 device da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled da2: 34732MB (71132960 512 byte sectors: 64H 32S/T 34732C) da8 at ahc0 bus 0 target 8 lun 0 da8: <QUANTUM ATLAS10K2-TY367J DA40> Fixed Direct Access SCSI-3 device da8: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled da8: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C) da9 at ahc0 bus 0 target 9 lun 0 da9: <IBM IC35L036UWD210-0 S5BS> Fixed Direct Access SCSI-3 device da9: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled da9: 35003MB (71687340 512 byte sectors: 64H 32S/T 35003C) SMP: AP CPU #1 Launched! >Description: This has happened twice in 24 hours, both times under load. Both times it was building OpenOffice in an NFS-mounted /usr/ports. In case it matters, it's a soft mount from another 6.0 box over an em(4) interface with an MTU of 9000. Both times it was a panic from vnlru while trying to flush a vnode and both times it was a null-pointer dereference in nfs_putpages() at nfs_bio.c:301. In both cases vp->v_data was null. The vnode itself looks fine to my eyes, although there may well be FreeBSD-specific subtleties that I'm missing. Below is the traceback and a dump of the vnode in question. I'm keeping the cores around, so if there's more information needed, I'll be happy to provide it. (kgdb) bt #0 doadump () at pcpu.h:165 #1 0xc05397b7 in boot (howto=0x104) at /usr/src/sys/kern/kern_shutdown.c:399 #2 0xc0539b0d in panic (fmt=0xc06bf3da "%s") at /usr/src/sys/kern/kern_shutdown.c:555 #3 0xc06a0c2a in trap_fatal (frame=0xe7e1c958, eva=0x4) at /usr/src/sys/i386/i386/trap.c:831 #4 0xc06a0937 in trap_pfault (frame=0xe7e1c958, usermode=0x0, eva=0x4) at /usr/src/sys/i386/i386/trap.c:742 #5 0xc06a054d in trap (frame= {tf_fs = 0x8, tf_es = 0x28, tf_ds = 0x28, tf_edi = 0x1000, tf_esi = 0x1, tf_ebp = 0xe7e1c9fc, tf_isp = 0xe7e1c984, tf_ebx = 0x0, tf_edx = 0x0, tf_ecx = 0x1, tf_eax = 0x1000, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xc05fad40, tf_cs = 0x20, tf_eflags = 0x10246, tf_esp = 0xe7e1cab0, tf_ss = 0xc43f8d20}) at /usr/src/sys/i386/i386/trap.c:432 #6 0xc068de8a in calltrap () at /usr/src/sys/i386/i386/exception.s:139 #7 0xc05fad40 in nfs_putpages (ap=0xe7e1ca20) at /usr/src/sys/nfsclient/nfs_bio.c:301 #8 0xc06aacc9 in VOP_PUTPAGES_APV (vop=0x1000, a=0x0) at vnode_if.c:2164 #9 0xc0667fa4 in vnode_pager_putpages (object=0xc806139c, m=0x1000, count=0x1000, sync=0x5, rtvals=0x1000) at vnode_if.h:1119 #10 0xc0662faf in vm_pageout_flush (mc=0xe7e1cab0, count=0x1, flags=0x5) at vm_pager.h:147 #11 0xc065f0d4 in vm_object_page_collect_flush (object=0xc806139c, p=0xc2b921b0, curgeneration=0x1, pagerflags=0x5) at /usr/src/sys/vm/vm_object.c:950 #12 0xc065eb44 in vm_object_page_clean (object=0xc806139c, start=0x0, end=Unhandled dwarf expression opcode 0x93 ) at /usr/src/sys/vm/vm_object.c:753 #13 0xc065e7e0 in vm_object_terminate (object=0xc806139c) at /usr/src/sys/vm/vm_object.c:608 #14 0xc06662ca in vnode_destroy_vobject (vp=0xc403baa0) at /usr/src/sys/vm/vnode_pager.c:166 #15 0xc05fdd69 in nfs_reclaim (ap=0x1000) at /usr/src/sys/nfsclient/nfs_node.c:247 #16 0xc06aa80c in VOP_RECLAIM_APV (vop=0x1000, a=0x0) at vnode_if.c:1589 #17 0xc059236d in vgonel (vp=0xc403baa0) at vnode_if.h:818 #18 0xc058ed4d in vlrureclaim (mp=0xc3d10000) at /usr/src/sys/kern/vfs_subr.c:611 #19 0xc058f1cf in vnlru_proc () at /usr/src/sys/kern/vfs_subr.c:724 #20 0xc052354d in fork_exit (callout=0xc058efbc <vnlru_proc>, arg=0x0, frame=0xe7e1cd38) at /usr/src/sys/kern/kern_fork.c:789 #21 0xc068deec in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208 (kgdb) print /x *vp $1 = { v_type = 0x1, v_tag = 0xc06deecb, v_op = 0xc0715da0, v_data = 0x0, v_mount = 0xc3d10000, v_nmntvnodes = { tqe_next = 0x0, tqe_prev = 0xc7b959a4 }, v_un = { vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0 }, v_hashlist = { le_next = 0x0, le_prev = 0xc3763808 }, v_hash = 0x11463ccc, v_cache_src = { lh_first = 0x0 }, v_cache_dst = { tqh_first = 0xc7b8cd48, tqh_last = 0xc7b8cd58 }, v_dd = 0x0, v_cstart = 0x0, v_lasta = 0x0, v_lastw = 0x0, v_clen = 0x0, v_lock = { lk_interlock = 0xc072c824, lk_flags = 0x40080, lk_sharecount = 0x0, lk_waitcount = 0x0, lk_exclusivecount = 0x1, lk_prio = 0x50, lk_wmesg = 0xc06deecb, lk_timo = 0x33, lk_lockholder = 0xc35add80, lk_newlock = 0x0 }, v_interlock = { mtx_object = { lo_class = 0xc0707704, lo_name = 0xc06d8e89, lo_type = 0xc06d8e89, lo_flags = 0x30000, lo_list = { tqe_next = 0x0, tqe_prev = 0x0 }, lo_witness = 0x0 }, mtx_lock = 0x4, mtx_recurse = 0x0 }, v_vnlock = 0xc403baf8, v_holdcnt = 0x2, v_usecount = 0x0, v_iflag = 0x480, v_vflag = 0x0, v_writecount = 0x0, v_freelist = { tqe_next = 0x0, tqe_prev = 0x0 }, v_bufobj = { bo_mtx = 0xc403bb1c, bo_clean = { bv_hd = { tqh_first = 0x0, tqh_last = 0xc403bb64 }, bv_root = 0x0, bv_cnt = 0x0 }, bo_dirty = { bv_hd = { tqh_first = 0x0, tqh_last = 0xc403bb74 }, bv_root = 0x0, bv_cnt = 0x0 }, bo_numoutput = 0x0, bo_flag = 0x0, bo_ops = 0xc071608c, bo_bsize = 0x2000, bo_object = 0xc806139c, bo_synclist = { le_next = 0x0, le_prev = 0xc3e64b98 }, bo_private = 0xc403baa0, __bo_vnode = 0xc403baa0 }, v_pollinfo = 0x0, v_label = 0x0 } >How-To-Repeat: >Fix: None, alas. >Release-Note: >Audit-Trail: >Unformatted:
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1130221206.0>