Date:      Tue, 31 Mar 1998 10:40:01 +0930
From:      Greg Lehey <grog@lemis.com>
To:        "M. Monninger" <markem@primenet.com>, questions@FreeBSD.ORG
Subject:   Re: spontaneous reboot / panic
Message-ID:  <19980331104001.07907@freebie.lemis.com>
In-Reply-To: <3.0.5.32.19980330171937.0098be20@pop.primenet.com>; from M. Monninger on Mon, Mar 30, 1998 at 05:19:37PM +0000
References:  <Pine.BSF.3.96.980330131219.360F-100000@ls.wustl.edu> <Pine.BSF.3.96.980330154859.24859p-100000@gdi.uoregon.edu> <3.0.5.32.19980330171937.0098be20@pop.primenet.com>

On Mon, 30 March 1998 at 17:19:37 +0000, M. Monninger wrote:
> At 03:49 PM 3/30/98 -0800, Doug White wrote:
>>
>> Should be a lot more to this.  A panic in idle state would be *very*
>> suspicious; it may be a hardware problem, bad memory or corrupted swap.
>>
> Here's one from my system:
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x4
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xf0129f5a
> stack pointer           = 0x10:0xefbffcac
> frame pointer           = 0x10:0xefbffcb8
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 4254 (find)
> interrupt mask          =
> panic: page fault
>
> It does this every once in a while, maybe once a month. I also see disk
> errors every few days. Hmmm...wonder if it's related???

Related to the disk errors, or to other panics?  It could be related
to the disk errors, but there's no reason to believe that any of these
panics are related.  This is one FreeBSD equivalent of the dreaded
"General Protection Error" you see so often with Microsoft.  It means
that something has gone wrong with the kernel's addressing, and that
no specific routines exist to handle it.

The only reliable way to find out what is going on here is a dump.
The only useful way to analyse a dump is to build a debug kernel and
use that kernel for the analysis.  These are the steps (page 290 of
"The Complete FreeBSD", second edition):

  To prepare yourself for possible problems, you should build kernels
  which include debug symbols.  The resultant kernel is about 10 MB in
  size, but it will make debugging with ddb (the kernel debugger) or
  gdb much easier.  Even if you don't intend to do this yourself, the
  information will be of great use to anybody you may call in to help.

  Building a debug kernel is pretty much the same process as building
  a normal kernel.  Here are the differences:

  o Run config with the -g option:

    # config -g FREEBIE

  o After building the kernel, rename it to kernel.gdb:

    # mv kernel kernel.gdb

  o Make a copy of kernel.gdb called kernel, and strip debug symbols:

    # cp kernel.gdb kernel
    # strip -d kernel

  o Install the kernel.
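
Put together with the normal build steps, the whole sequence might look
like the sketch below.  The paths assume a 2.2-era source tree under
/sys, FREEBIE is just the config name used above, and the function name
build_debug_kernel is made up for illustration; adjust all of them for
your own system.

```shell
# Sketch only: full debug-kernel build, to be run as root.
# Assumes kernel sources under /sys and a config file named FREEBIE
# (pass your own config name as the first argument).
build_debug_kernel() (
    conf=${1:-FREEBIE}
    cd /sys/i386/conf &&
    config -g "$conf" &&            # -g: build with debug symbols
    cd "../../compile/$conf" &&
    make depend &&
    make &&
    mv kernel kernel.gdb &&         # keep the full-symbol kernel for gdb
    cp kernel.gdb kernel &&
    strip -d kernel &&              # the bootable copy loses its debug symbols
    make install
)
```

The body runs in a subshell, so your own shell stays in its current
directory, and each step only runs if the one before it succeeded.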

You should also enable dumping in /etc/rc.conf.
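
As far as I know, on a 2.2-era system this means setting dumpdev to a
swap partition that is at least as large as main memory.  The device
name below is only a placeholder, not the one from the system in this
thread:

```shell
# /etc/rc.conf fragment (sketch).  /dev/sd0b is a placeholder for your
# own swap partition; the default value "NO" disables dumping.
dumpdev="/dev/sd0b"
```

At the next boot, /etc/rc should then run savecore and save any dump it
finds to /var/crash, provided that directory exists.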

When you have a dump (in /var/crash), run gdb against it:

# gdb -k kernel.gdb vmcore.21
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc...
IdlePTD 2be000
initial pcb at 25b6d8
panicstr: pmap_zero_page: CMAP busy
panic messages:
panic: pmap_zero_page: CMAP busy

dumping to dev 20001, offset 172032
dump 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
During symbol reading, debug info mismatch between compiler and debugger.
#0  boot (howto=0x104) at ../../kern/kern_shutdown.c:286
286                                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=0x104) at ../../kern/kern_shutdown.c:286
#1  0xf011c1e7 in panic (fmt=0xf01013e8 "from debugger") at ../../kern/kern_shutdown.c:426
#2  0xf0101405 in db_panic (addr=0xf01de574, have_addr=0x0, count=0x1, modif=0xf2ac3a40 "")
    at ../../ddb/db_command.c:432
#3  0xf01012e5 in db_command (last_cmdp=0xf0244aa4, cmd_table=0xf0244904, aux_cmd_tablep=0xf0258a34)
    at ../../ddb/db_command.c:332
#4  0xf0101472 in db_command_loop () at ../../ddb/db_command.c:454
#5  0xf0103b33 in db_trap (type=0xa, code=0x0) at ../../ddb/db_trap.c:71
#6  0xf01de341 in kdb_trap (type=0xa, code=0x0, regs=0xf2ac3b2c) at ../../i386/i386/db_interface.c:157
#7  0xf01eb7c8 in trap (frame={tf_es = 0x10, tf_ds = 0x10, tf_edi = 0x0, tf_esi = 0xf01e9a0e, tf_ebp = 0xf2ac3b70, 
      tf_isp = 0xf2ac3b54, tf_ebx = 0x100, tf_edx = 0xf01de535, tf_ecx = 0x0, tf_eax = 0x12, tf_trapno = 0xa, 
      tf_err = 0x0, tf_eip = 0xf01de574, tf_cs = 0x8, tf_eflags = 0x346, tf_esp = 0xf01de525, tf_ss = 0xf011c17c})
    at ../../i386/i386/trap.c:474
#8  0xf01de574 in Debugger (msg=0xf011c17c "panic") at ../../i386/i386/db_interface.c:317
#9  0xf011c1de in panic (fmt=0xf01e9a0e "pmap_zero_page: CMAP busy") at ../../kern/kern_shutdown.c:424
#10 0xf01e9a3f in pmap_zero_page (phys=0xee3000) at ../../i386/i386/pmap.c:2716
#11 0xf01c84cf in vm_fault (map=0xf02667e4, vaddr=0xf1e30000, fault_type=0x1, fault_flags=0x0)
    at ../../vm/vm_fault.c:532
#12 0xf01eba28 in trap_pfault (frame=0xf2ac3ca0, usermode=0x0) at ../../i386/i386/trap.c:724
#13 0xf01eb6a7 in trap (frame={tf_es = 0xf0140010, tf_ds = 0xf2ac0010, tf_edi = 0x80779eb9, tf_esi = 0xf1e2e000, 
      tf_ebp = 0xf2ac3d18, tf_isp = 0xf2ac3cc8, tf_ebx = 0x2000, tf_edx = 0xf07d2000, tf_ecx = 0x2000, 
      tf_eax = 0x2000, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xf2a9b830, tf_cs = 0xf2a90008, tf_eflags = 0x10216, 
      tf_esp = 0x4000, tf_ss = 0x150}) at ../../i386/i386/trap.c:363
#14 0xf2a9b830 in ?? ()
#15 0xf2a996b4 in ?? ()
#16 0xf01363aa in biodone (bp=0xf05a4694) at ../../kern/vfs_bio.c:1838
#17 0xf01b1fa4 in scsi_done (xs=0xf04e1e80) at ../../scsi/scsi_base.c:447
#18 0xf01f22db in aha_done (aha=0xf04e4000, ccb=0xf04e4b18) at ../../i386/isa/aha1542.c:947
#19 0xf01f1dc9 in ahaintr (unit=0x0) at ../../i386/isa/aha1542.c:753
#20 0xf01ea59a in generic_bcopy ()
#21 0xf01c85e1 in vm_fault (map=0xf29f0440, vaddr=0x5f000, fault_type=0x3, fault_flags=0x8) at ../../vm/vm_fault.c:625
#22 0xf01eba0e in trap_pfault (frame=0xf2ac3fbc, usermode=0x1) at ../../i386/i386/trap.c:716
#23 0xf01eb523 in trap (frame={tf_es = 0xefbf0027, tf_ds = 0xf2a80027, tf_edi = 0xe5513, tf_esi = 0x2, 
      tf_ebp = 0xefbfdaec, tf_isp = 0xf2ac3fe4, tf_ebx = 0x0, tf_edx = 0xffffffff, tf_ecx = 0xc99b0, tf_eax = 0x2, 
      tf_trapno = 0xc, tf_err = 0x7, tf_eip = 0x1f01c, tf_cs = 0x1f, tf_eflags = 0x10246, tf_esp = 0xefbfdae0, 
      tf_ss = 0x27}) at ../../i386/i386/trap.c:287
#24 0x1f01c in ?? ()

This looks pretty much like gibberish--see the online handbook for
more details.  In particular, though, the last part is what we need:
where did this happen?  The command 

(kgdb) bt

tells gdb to give you a backtrace of the function calls.  The #1, #2
and so on are the frame numbers.  In this particular dump, frames #13
and #23 are trap frames.  As the function name trap_pfault in frames
#12 and #22 shows, in each case it was a page fault.
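
If you save the backtrace to a file, a small filter can pick out the
page fault frames for you.  This is only a sketch: the function name
pfault_frames and the file name bt.txt are made up for illustration,
and it assumes the backtrace has the format shown above, with each
frame starting with #N at the beginning of a line.

```shell
# pfault_frames: read a saved gdb backtrace on stdin and print the
# frame number of every frame whose function is trap_pfault.
pfault_frames() {
    awk '/^#[0-9]+/ && $3 == "in" && $4 == "trap_pfault" { print $1 }'
}
```

Run it as "pfault_frames < bt.txt".  The frame just above each one it
reports (higher number) is the corresponding trap frame.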

Nobody's asking you to solve this dump, of course (I wish I could),
but a backtrace like this will usually give an accurate idea of what
caused the panic.

Greg

