From owner-freebsd-questions Wed Feb 21 23: 6:11 2001
Delivered-To: freebsd-questions@freebsd.org
Received: from mail.freebsd-corp-net-guide.com (mail.freebsd-corp-net-guide.com [206.29.169.15])
	by hub.freebsd.org (Postfix) with ESMTP id D040F37B491
	for ; Wed, 21 Feb 2001 23:05:59 -0800 (PST)
	(envelope-from tedm@toybox.placo.com)
Received: from tedm.placo.com (nat-rtr.freebsd-corp-net-guide.com [206.29.168.154])
	by mail.freebsd-corp-net-guide.com (8.11.1/8.11.1) with SMTP id f1M75p722084;
	Wed, 21 Feb 2001 23:05:51 -0800 (PST)
	(envelope-from tedm@toybox.placo.com)
From: "Ted Mittelstaedt"
To: "Scott Macy" ,
Cc: "milt" , "Cayford Burrell" , "John Lynch" , "Josh Howard" , "Julian Elischer"
Subject: RE: Help with panic: page fault in FBSD 4.1.1
Date: Wed, 21 Feb 2001 23:05:51 -0800
Message-ID: <000001c09c9d$e3bd33c0$1401a8c0@tedm.placo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
In-Reply-To: <20010221004855.83D4729A@muir.vicor-nb.com>
Importance: Normal
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Try exchanging ram in one of the old systems with another system and
see if the problem follows the ram. Also compare BIOS settings from
a crashing system with a good system - there may be some setting that's
different.

Ted Mittelstaedt tedm@toybox.placo.com
Author of: The FreeBSD Corporate Networker's Guide
Book website: http://www.freebsd-corp-net-guide.com

> -----Original Message-----
> From: owner-freebsd-questions@FreeBSD.ORG
> [mailto:owner-freebsd-questions@FreeBSD.ORG]On Behalf Of Scott Macy
> Sent: Tuesday, February 20, 2001 4:49 PM
> To: freebsd-questions@FreeBSD.ORG
> Cc: milt; Cayford Burrell; John Lynch; Josh Howard; Scott Macy; Julian
> Elischer
> Subject: Help with panic: page fault in FBSD 4.1.1
>
>
> We have been upgrading about 300 PCs at our clients' production sites
> from FreeBSD 2.2.6 to FreeBSD 4.1.1. On two of the oldest machines we
> have had frequent (daily or so) kernel crashes with "panic: page fault".
> We have two kernel core dumps which show the exact same stack trace.
>
> We are not seeing these crashes on the newer machines with 4.1.1 running
> the same application, and the machines that are crashing were stable
> under 2.2.6.
>
> I've looked through the mail archives and the GNATS database and have
> not found a match to our problem.
>
> What's going on?? What can we do to eliminate the crashes?? Is there
> some kernel config setting that might be causing the problem? Is there
> another option than "buy new hw?"
>
>
>
> Some Details:
>
> The machines that are crashing are "server" machines, with 6x60GB
> RAIDs, mostly running our own "QFT" (Quick File Transfer)
> daemons that do *lots* of network and disk IO. The crashing machines
> are Pentium (Pro?) 200s (see dmesg below). We are not seeing the same
> crash problem on user workstations that are of the same vintage,
> same OS, but not as heavily loaded.
>
> We swapped hardware on one of these older machines with a newer one, and
> the problem went away when the same disk was used with a Pentium II 350
> CPU.
>
> We now have trapped two kernel core dumps and both give exactly the
> same information. Below is Kernel Debug Stack Trace info, then Dmesg
> info.
>
> The process that is running at the time of the panic is qftListener,
> which is the QFT daemon mentioned above. We have two core
> dumps; both times it's in a qftListener "open()" call.
>
> Thanks,
> -Scott Macy
>
>
>
> Gory Details:
> ===================================================================
> bigwoop Feb 20 10:54am ~cayford/oos0b_crash 107: gdb -k kern* vm*
> GNU gdb 4.18
> Copyright 1998 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...
> (no debugging symbols found)...
> IdlePTD 4358144
> initial pcb at 387100
> panicstr: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x14
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xc015129b
> stack pointer = 0x10:0xc8f58a08
> frame pointer = 0x10:0xc8f58a70
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 22304 (qftListener)
> interrupt mask =
> trap number = 12
> panic: page fault
>
> syncing disks... 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
> giving up on 26 buffers
> Uptime: 1d9h32m48s
> (da1:ahc1:0:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0
> (da2:ahc1:0:1:0): Synchronize cache failed, status == 0xb, scsi status == 0x0
> (da3:ahc1:0:2:0): Synchronize cache failed, status == 0xb, scsi status == 0x0
>
> dumping to dev #da/0x20001, offset 1310720
> dump 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110
> 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87
> 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60
> 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33
> 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5
> 4 3 2 1
> ---
> #0 0xc0193748 in boot ()
> (kgdb) bt
> #0 0xc0193748 in boot ()
> #1 0xc0193acc in poweroff_wait ()
> #2 0xc02ede85 in trap_fatal ()
> #3 0xc02edb5d in trap_pfault ()
> #4 0xc02ed717 in trap ()
> #5 0xc015129b in ahc_action ()
> #6 0xc012650f in xpt_run_dev_sendq ()
> #7 0xc01258ff in xpt_action ()
> #8 0xc012cd60 in dastart ()
> #9 0xc01262b8 in xpt_run_dev_allocq ()
> #10 0xc01261e7 in xpt_schedule ()
> #11 0xc012c2c8 in dastrategy ()
> #12 0xc019c9e9 in diskstrategy ()
> #13 0xc01c8ed0 in spec_strategy ()
> #14 0xc01c89a5 in spec_vnoperate ()
> #15 0xc0292ddd in ufs_vnoperatespec ()
> #16 0xc0292845 in ufs_strategy ()
> #17 0xc0292dad in ufs_vnoperate ()
> #18 0xc01b61e6 in bread ()
> #19 0xc0289653 in ffs_blkatoff ()
> #20 0xc028dfe9 in ufs_lookup ()
> #21 0xc0292dad in ufs_vnoperate ()
> #22 0xc01b9bf9 in vfs_cache_lookup ()
> ---Type to continue, or q to quit---
> #23 0xc0292dad in ufs_vnoperate ()
> #24 0xc01bc988 in lookup ()
> #25 0xc01bc484 in namei ()
> #26 0xc01c490a in vn_open ()
> #27 0xc01c0c85 in open ()
> #28 0xc02ee131 in syscall2 ()
> #29 0xc02dfca5 in Xint0x80_syscall ()
> #30 0x8050a69 in ?? ()
> #31 0x8051a98 in ?? ()
> #32 0x804a46d in ?? ()
> (kgdb) h
>
> ===================================================================
> Dmesg:
>
> Copyright (c) 1992-2000 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 4.1.1-RELEASE #0: Tue Sep 26 00:46:59 GMT 2000
> jkh@narf.osd.bsdi.com:/usr/src/sys/compile/GENERIC
> Timecounter "i8254" frequency 1193182 Hz
> CPU: Pentium/P55C (200.46-MHz 586-class CPU)
> Origin = "GenuineIntel" Id = 0x544 Stepping = 4
> Features=0x8001bf
> real memory = 134217728 (131072K bytes)
> avail memory = 126521344 (123556K bytes)
> Preloaded elf kernel "kernel" at 0xc0416000.
> Intel Pentium detected, installing workaround for F00F bug
> md0: Malloc disk
> npx0: on motherboard
> npx0: INT 16 interface
> pcib0: on motherboard
> pci0: on pcib0
> isab0: at device 7.0 on pci0
> isa0: on isab0
> atapci0: port 0xf000-0xf00f at device 7.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci0
> ata1: at 0x170 irq 15 on atapci0
> uhci0: port 0x6400-0x641f irq 11 at device 7.2 on pci0
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> chip1: port 0x5f00-0x5f0f at device 7.3 on pci0
> ahc0: port 0x6800-0x68ff mem 0xe4100000-0xe4100fff irq 10 at device 9.0 on pci0
> aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs
> ahc1: port 0x6c00-0x6cff mem 0xe4102000-0xe4102fff irq 5 at device 10.0 on pci0
> aic7880: Wide Channel A, SCSI Id=7, 16/255 SCBs
> pci0: at 11.0 irq 9
> fxp0: port 0x7000-0x701f mem 0xe4000000-0xe40fffff,0xe4101000-0xe4101fff irq 11 at device 12.0 on pci0
> fxp0: Ethernet address 00:a0:c9:5b:f7:78
> eisa0: on motherboard
> eisa0: unknown card ADP7881 (0x04907881) at slot 6
> fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
> fdc0: FIFO enabled, 8 bytes threshold
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> atkbdc0: at port 0x60,0x64 on isa0
> atkbd0: flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> psm0: irq 12 on atkbdc0
> psm0: model Generic PS/2 mouse, device ID 0
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> sio0: type 16550A
> sio1 at port 0x2f8-0x2ff irq 3 on isa0
> sio1: type 16550A
> ppc0: at port 0x378-0x37f irq 7 on isa0
> ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
> plip0: on ppbus0
> lpt0: on ppbus0
> lpt0: Interrupt-driven port
> ppi0: on ppbus0
> Waiting 15 seconds for SCSI devices to settle
> da0 at ahc0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-2 device
> da0: 10.000MB/s transfers (10.000MHz, offset 15)
> da0: 8683MB (17783112 512 byte sectors: 255H 63S/T 1106C)
> da1 at ahc1 bus 0 target 0 lun 0
> da1: Fixed Direct Access SCSI-2 device
> da1: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da1: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> da2 at ahc1 bus 0 target 1 lun 0
> da2: Fixed Direct Access SCSI-2 device
> da2: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da2: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> da3 at ahc1 bus 0 target 2 lun 0
> da3: Fixed Direct Access SCSI-2 device
> da3: 20.000MB/s transfers (10.000MHz, offset 8, 16bit)
> da3: 61220MB (125380096 512 byte sectors: 255H 63S/T 7804C)
> Mounting root from ufs:/dev/da0s1a
> WARNING: / was not properly dismounted
> ===================================================================
>
>
> And /var/log/messages gives no hint to the problem:
> ===================================================================
> Feb 15 20:12:26 oos0b inetd[149]: ntalk/udp: no such user 'tty', service ignored
> Feb 15 20:19:49 oos0b ntpd[105]: time reset -6.935659 s
> Feb 15 20:19:49 oos0b ntpd[105]: kernel pll status change 2041
> Feb 16 22:46:37 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:46:40 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:50:30 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 22:50:34 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 23:41:17 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 16 23:50:57 oos0b last message repeated 3 times
> Feb 16 23:54:51 oos0b mountd[112]: umountall request from 192.168.40.39 from unprivileged port
> Feb 17 07:29:58 oos0b /kernel: Copyright (c) 1992-2000 The FreeBSD Project.
> Feb 17 07:29:58 oos0b /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> Feb 17 07:29:58 oos0b /kernel: The Regents of the University of California. All rights reserved.
> Feb 17 07:29:58 oos0b /kernel: FreeBSD 4.1.1-RELEASE #0: Tue Sep 26 00:46:59 GMT 2000
> Feb 17 07:29:58 oos0b /kernel: jkh@narf.osd.bsdi.com:/usr/src/sys/compile/GENERIC
> Feb 17 07:29:58 oos0b /kernel: Timecounter "i8254" frequency 1193182 Hz
> Feb 17 07:29:58 oos0b /kernel: CPU: Pentium/P55C (200.46-MHz 586-class CPU)
> Feb 17 07:29:58 oos0b /kernel: Origin = "GenuineIntel" Id = 0x544 Stepping = 4
> Feb 17 07:29:58 oos0b /kernel: Features=0x8001bf
> Feb 17 07:29:58 oos0b /kernel: real memory = 134217728 (131072K bytes)
> Feb 17 07:29:58 oos0b /kernel: avail memory = 126521344 (123556K bytes)
> Feb 17 07:29:58 oos0b /kernel: Preloaded elf kernel "kernel" at 0xc0416000.
> Feb 17 07:29:58 oos0b /kernel: Intel Pentium detected, installing workaround for F00F bug
> Feb 17 07:29:58 oos0b /kernel: md0: Malloc disk
> .... etc with reboot.
> Feb 17 07:29:59 oos0b /kernel: Mounting root from ufs:/dev/da0s1a
> Feb 17 07:29:59 oos0b savecore: reboot after panic: page fault
> Feb 17 07:29:59 oos0b savecore: /var/crash/bounds: No such file or directory
> Feb 17 07:29:59 oos0b savecore: writing core to /var/crash/vmcore.0
> Feb 17 07:30:48 oos0b savecore: writing kernel to /var/crash/kernel.0
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message
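
A note on reading the trap message quoted in this thread: the fault
virtual address is 0x14, and the faulting instruction pointer
0xc015129b is the same address shown for frame #5 (ahc_action) in the
backtrace. A fault address that small in kernel mode is commonly what a
read through a NULL structure pointer looks like (NULL plus a field
offset), although that is an inference and not something the dump itself
states. Below is a minimal Python sketch of that check, assuming the trap
text is available as a string. The helper names and the one-page
threshold are hypothetical and are not part of any FreeBSD tool.

```python
import re

# Excerpt of the trap message quoted in this thread.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x14
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc015129b
current process = 22304 (qftListener)
"""

# Assumed threshold: treat any address inside the first 4 KB page as
# "near NULL". This is a rule of thumb, not a kernel-defined limit.
PAGE_SIZE = 4096


def parse_trap(text):
    """Collect the 'key = value' lines of a trap message into a dict."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([a-z ]+?)\s*=\s*(.*)$", line)
        if m:
            fields[m.group(1)] = m.group(2).strip()
    return fields


def looks_like_null_deref(fields):
    """True if the fault address falls inside the first page of memory."""
    addr = int(fields["fault virtual address"], 16)
    return addr < PAGE_SIZE


fields = parse_trap(PANIC_TEXT)
# The instruction pointer is printed as selector:offset; keep the offset.
ip = int(fields["instruction pointer"].split(":")[1], 16)
print(hex(ip), looks_like_null_deref(fields))
```

If a kernel built with debugging symbols is available, looking up that
instruction pointer in the debugger should narrow the fault to a source
line inside ahc_action.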