Date:      Wed, 21 Feb 2001 23:05:51 -0800
From:      "Ted Mittelstaedt" <tedm@toybox.placo.com>
To:        "Scott Macy" <scott@vicor-nb.com>, <freebsd-questions@FreeBSD.ORG>
Cc:        "milt" <milt@vicor-nb.com>, "Cayford Burrell" <cayford@vicor-nb.com>, "John Lynch" <jpl@vicor-nb.com>, "Josh Howard" <jrh@vicor-nb.com>, "Julian Elischer" <julian@vicor-nb.com>
Subject:   RE: Help with panic: page fault in FBSD 4.1.1
Message-ID:  <000001c09c9d$e3bd33c0$1401a8c0@tedm.placo.com>
In-Reply-To: <20010221004855.83D4729A@muir.vicor-nb.com>

Try swapping the RAM in one of the old systems with RAM from another
system and see if the problem follows the RAM.  Also compare the
BIOS settings on a crashing system against those on a good system - there
may be some setting that's different.
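
BIOS settings have to be read off the setup screens, but some differences
between two boxes may also show up in what the kernel probes at boot.  A
rough sketch in Python for comparing saved dmesg output from a good machine
and a crashing one; the function name and the idea of saving dmesg to text
first are my assumptions, not anything from the original report:

```python
import difflib


def dmesg_diff(good_text, bad_text):
    """Return the lines that differ between two saved dmesg outputs.

    Lines only in the good machine's output are prefixed with "- ",
    lines only in the crashing machine's output with "+ ".
    """
    good = good_text.splitlines()
    bad = bad_text.splitlines()
    # ndiff also emits "  " (common) and "? " (hint) lines; drop those.
    return [line for line in difflib.ndiff(good, bad)
            if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical two-line samples, just to show the call.
    good = "cpu: example-a\nreal memory  = 134217728 (131072K bytes)"
    bad = "cpu: example-b\nreal memory  = 134217728 (131072K bytes)"
    for line in dmesg_diff(good, bad):
        print(line)
```

This is only a supplement to looking at the BIOS screens themselves; plenty
of settings never show up in dmesg.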

Ted Mittelstaedt                      tedm@toybox.placo.com
Author of:          The FreeBSD Corporate Networker's Guide
Book website:         http://www.freebsd-corp-net-guide.com


> -----Original Message-----
> From: owner-freebsd-questions@FreeBSD.ORG
> [mailto:owner-freebsd-questions@FreeBSD.ORG]On Behalf Of Scott Macy
> Sent: Tuesday, February 20, 2001 4:49 PM
> To: freebsd-questions@FreeBSD.ORG
> Cc: milt; Cayford Burrell; John Lynch; Josh Howard; Scott Macy; Julian
> Elischer
> Subject: Help with panic: page fault in FBSD 4.1.1
> 
> 
> We have been upgrading about 300 PCs at our clients production sites
> from FreeBSD 2.2.6 to FreeBSD 4.1.1.  On two of the oldest machines we
> have had frequent (daily or so) Kernel crashes with "panic: page fault".
> We have two kernel core dumps which show the exact same stack trace.
> 
> We are not seeing these crashes on the newer machines with 4.1.1 running
> the same application, and the machines that are crashing were stable
> under 2.2.6.
> 
> I've looked through the mail archives and the GNATS database and have not
> found a match to our problem.
> 
> What's going on??  What can we do to eliminate the crashes??  Is there
> some Kernel config setting that might be causing the problem?  Is there
> another option than "buy new hw?"
> 
> 
> 
> Some Details:
> 
> The machines that are crashing are "server" machines, with 6x60GB
> RAIDs, mostly running our own "QFT" (Quick File Transfer)
> daemons that do *lots* of network and disk IO.  The crashing machines
> are Pentium (Pro?) 200's (see dmesg below).  We are not seeing the same
> crash problem on user workstations that are of the same vintage,
> same OS, but not as heavily loaded.
> 
> We swapped hardware on one of these older machines with a newer one, and
> the problem went away when the same disk was used with a Pentium II 350
> CPU.
> 
> We now have trapped two kernel core dumps and both give exactly the
> same information.  Below is Kernel Debug Stack Trace info, then Dmesg
> info.
> 
> The process that is running at the time of the panic is qftListener,
> which is the QFT daemon mentioned above.  We have two core
> dumps, and both times it's in a qftListener "open()" call.
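
If you want to confirm mechanically that the two dumps really do give the
same trace, one way is to pull the function names out of each kgdb "bt"
output and compare them frame by frame.  A minimal sketch in Python,
assuming each backtrace has been saved as plain text; the helper names are
made up for illustration:

```python
import re

# Matches kgdb frame lines such as "#5  0xc015129b in ahc_action ()".
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-fA-F]+ in (\S+) \(")


def frame_names(bt_text):
    """Return the function names from kgdb 'bt' output, in frame order."""
    return [m.group(2) for m in FRAME_RE.finditer(bt_text)]


def first_difference(bt_a, bt_b):
    """Return the index of the first differing frame, or None if identical."""
    a, b = frame_names(bt_a), frame_names(bt_b)
    for i, (fa, fb) in enumerate(zip(a, b)):
        if fa != fb:
            return i
    if len(a) != len(b):
        # One trace is a prefix of the other.
        return min(len(a), len(b))
    return None
```

Comparing names rather than whole lines means the check still works if the
two kernels were built differently and the addresses moved.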
> 
> Thanks,
>  -Scott Macy 
> 
> 
> 
> Gory Details:
> ===================================================================
> bigwoop Feb 20 10:54am  ~cayford/oos0b_crash 107: gdb -k kern* vm*
> GNU gdb 4.18
> Copyright 1998 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...
> (no debugging symbols found)...
> IdlePTD 4358144
> initial pcb at 387100
> panicstr: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x14
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc015129b
> stack pointer           = 0x10:0xc8f58a08
> frame pointer           = 0x10:0xc8f58a70
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 22304 (qftListener)
> interrupt mask          = 
> trap number             = 12
> panic: page fault
> 
> syncing disks... 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 
> giving up on 26 buffers
> Uptime: 1d9h32m48s
> (da1:ahc1:0:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0
> (da2:ahc1:0:1:0): Synchronize cache failed, status == 0xb, scsi status == 0x0
> (da3:ahc1:0:2:0): Synchronize cache failed, status == 0xb, scsi status == 0x0
> 
> dumping to dev #da/0x20001, offset 1310720
> dump 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113
> 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97
> 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81
> 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65
> 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49
> 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33
> 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17
> 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
> ---
> #0  0xc0193748 in boot ()
> (kgdb) bt
> #0  0xc0193748 in boot ()
> #1  0xc0193acc in poweroff_wait ()
> #2  0xc02ede85 in trap_fatal ()
> #3  0xc02edb5d in trap_pfault ()
> #4  0xc02ed717 in trap ()
> #5  0xc015129b in ahc_action ()
> #6  0xc012650f in xpt_run_dev_sendq ()
> #7  0xc01258ff in xpt_action ()
> #8  0xc012cd60 in dastart ()
> #9  0xc01262b8 in xpt_run_dev_allocq ()
> #10 0xc01261e7 in xpt_schedule ()
> #11 0xc012c2c8 in dastrategy ()
> #12 0xc019c9e9 in diskstrategy ()
> #13 0xc01c8ed0 in spec_strategy ()
> #14 0xc01c89a5 in spec_vnoperate ()
> #15 0xc0292ddd in ufs_vnoperatespec ()
> #16 0xc0292845 in ufs_strategy ()
> #17 0xc0292dad in ufs_vnoperate ()
> #18 0xc01b61e6 in bread ()
> #19 0xc0289653 in ffs_blkatoff ()
> #20 0xc028dfe9 in ufs_lookup ()
> #21 0xc0292dad in ufs_vnoperate ()
> #22 0xc01b9bf9 in vfs_cache_lookup ()
> ---Type <return> to continue, or q <return> to quit---
> #23 0xc0292dad in ufs_vnoperate ()
> #24 0xc01bc988 in lookup ()
> #25 0xc01bc484 in namei ()
> #26 0xc01c490a in vn_open ()
> #27 0xc01c0c85 in open ()
> #28 0xc02ee131 in syscall2 ()
> #29 0xc02dfca5 in Xint0x80_syscall ()
> #30 0x8050a69 in ?? ()
> #31 0x8051a98 in ?? ()
> #32 0x804a46d in ?? ()
> (kgdb) h
> 
> ===================================================================
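
One thing that stands out in the panic output above: the fault virtual
address is 0x14, on a supervisor read with the page not present.  A small
address like that is typically what you see when code reads a structure
member through a NULL pointer, the address being just the member's offset.
The trace doesn't say which pointer inside ahc_action() was involved, and a
bad pointer can come from flaky hardware as easily as from a driver bug.
Purely as an illustration in Python (the struct layout below is invented,
not taken from the ahc driver):

```python
import ctypes


class HypotheticalRequest(ctypes.Structure):
    # Invented layout: five 32-bit fields, then the field being read.
    _fields_ = [
        ("field0", ctypes.c_uint32),
        ("field1", ctypes.c_uint32),
        ("field2", ctypes.c_uint32),
        ("field3", ctypes.c_uint32),
        ("field4", ctypes.c_uint32),
        ("target", ctypes.c_uint32),
    ]


def fault_address(base, struct_type, field_name):
    """Address touched when reading field_name through a pointer equal to base."""
    return base + getattr(struct_type, field_name).offset


if __name__ == "__main__":
    # Reading "target" through a NULL (0) pointer touches address 0x14.
    print(hex(fault_address(0, HypotheticalRequest, "target")))
```

With debugging symbols in the kernel, the instruction pointer from the panic
could be matched to a source line to see which member was actually being read.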
> Dmesg:
> 
> Copyright (c) 1992-2000 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> 	The Regents of the University of California. All rights reserved.
> FreeBSD 4.1.1-RELEASE #0: Tue Sep 26 00:46:59 GMT 2000
>     jkh@narf.osd.bsdi.com:/usr/src/sys/compile/GENERIC
> Timecounter "i8254"  frequency 1193182 Hz
> CPU: Pentium/P55C (200.46-MHz 586-class CPU)
>   Origin = "GenuineIntel"  Id = 0x544  Stepping = 4
>   Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
> real memory  = 134217728 (131072K bytes)
> avail memory = 126521344 (123556K bytes)
> Preloaded elf kernel "kernel" at 0xc0416000.
> Intel Pentium detected, installing workaround for F00F bug
> md0: Malloc disk
> npx0: <math processor> on motherboard
> npx0: INT 16 interface
> pcib0: <Host to PCI bridge> on motherboard
> pci0: <PCI bus> on pcib0
> isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel PIIX4 ATA33 controller> port 0xf000-0xf00f at device 7.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci0
> ata1: at 0x170 irq 15 on atapci0
> uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0x6400-0x641f irq 11 at device 7.2 on pci0
> usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> chip1: <Intel 82371AB Power management controller> port 0x5f00-0x5f0f at device 7.3 on pci0
> ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0x6800-0x68ff mem 0xe4100000-0xe4100fff irq 10 at device 9.0 on pci0
> aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs
> ahc1: <Adaptec 2940 Ultra SCSI adapter> port 0x6c00-0x6cff mem 0xe4102000-0xe4102fff irq 5 at device 10.0 on pci0
> aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs
> pci0: <S3 ViRGE DX/GX graphics accelerator> at 11.0 irq 9
> fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0x7000-0x701f mem 0xe4000000-0xe40fffff,0xe4101000-0xe4101fff irq 11 at device 12.0 on pci0
> fxp0: Ethernet address 00:a0:c9:5b:f7:78
> eisa0: <EISA bus> on motherboard
> eisa0: unknown card ADP7881 (0x04907881) at slot 6
> fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
> fdc0: FIFO enabled, 8 bytes threshold
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: model Generic PS/2 mouse, device ID 0
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> sio0: type 16550A
> sio1 at port 0x2f8-0x2ff irq 3 on isa0
> sio1: type 16550A
> ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
> ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
> plip0: <PLIP network interface> on ppbus0
> lpt0: <Printer> on ppbus0
> lpt0: Interrupt-driven port
> ppi0: <Parallel I/O> on ppbus0
> Waiting 15 seconds for SCSI devices to settle
> da0 at ahc0 bus 0 target 0 lun 0
> da0: <SEAGATE ST19171N 0023> Fixed Direct Access SCSI-2 device 
> da0: 10.000MB/s transfers (10.000MHz, offset 15)
> da0: 8683MB (17783112 512 byte sectors: 255H 63S/T 1106C)
> da1 at ahc1 bus 0 target 0 lun 0
> da1: <Seek seek5x18a-P0 0204> Fixed Direct Access SCSI-2 device 
> da1: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da1: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> da2 at ahc1 bus 0 target 1 lun 0
> da2: <Seek seek5x18b-P0 0204> Fixed Direct Access SCSI-2 device 
> da2: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da2: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> da3 at ahc1 bus 0 target 2 lun 0
> da3: <Seek seek5x18c-P0 0204> Fixed Direct Access SCSI-2 device 
> da3: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da3: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> Mounting root from ufs:/dev/da0s1a
> WARNING: / was not properly dismounted
> ===================================================================
> 
> 
> And /var/log/messages gives no hint about the problem:
> ===================================================================
> Feb 15 20:12:26 oos0b inetd[149]: ntalk/udp: no such user 'tty', service ignored
> Feb 15 20:19:49 oos0b ntpd[105]: time reset -6.935659 s
> Feb 15 20:19:49 oos0b ntpd[105]: kernel pll status change 2041
> Feb 16 22:46:37 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:46:40 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:50:30 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:50:34 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 23:41:17 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 23:50:57 oos0b last message repeated 3 times
> Feb 16 23:54:51 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 17 07:29:58 oos0b /kernel: Copyright (c) 1992-2000 The FreeBSD Project.
> Feb 17 07:29:58 oos0b /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> Feb 17 07:29:58 oos0b /kernel: The Regents of the University of California. All rights reserved.
> Feb 17 07:29:58 oos0b /kernel: FreeBSD 4.1.1-RELEASE #0: Tue Sep 26 00:46:59 GMT 2000
> Feb 17 07:29:58 oos0b /kernel: jkh@narf.osd.bsdi.com:/usr/src/sys/compile/GENERIC
> Feb 17 07:29:58 oos0b /kernel: Timecounter "i8254"  frequency 1193182 Hz
> Feb 17 07:29:58 oos0b /kernel: CPU: Pentium/P55C (200.46-MHz 586-class CPU)
> Feb 17 07:29:58 oos0b /kernel: Origin = "GenuineIntel"  Id = 0x544  Stepping = 4
> Feb 17 07:29:58 oos0b /kernel: Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
> Feb 17 07:29:58 oos0b /kernel: real memory  = 134217728 (131072K bytes)
> Feb 17 07:29:58 oos0b /kernel: avail memory = 126521344 (123556K bytes)
> Feb 17 07:29:58 oos0b /kernel: Preloaded elf kernel "kernel" at 0xc0416000.
> Feb 17 07:29:58 oos0b /kernel: Intel Pentium detected, installing workaround for F00F bug
> Feb 17 07:29:58 oos0b /kernel: md0: Malloc disk
> .... etc with reboot.
> Feb 17 07:29:59 oos0b /kernel: Mounting root from ufs:/dev/da0s1a
> Feb 17 07:29:59 oos0b savecore: reboot after panic: page fault
> Feb 17 07:29:59 oos0b savecore: /var/crash/bounds: No such file or directory
> Feb 17 07:29:59 oos0b savecore: writing core to /var/crash/vmcore.0
> Feb 17 07:30:48 oos0b savecore: writing kernel to /var/crash/kernel.0
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message
> 
