Date:      Fri, 29 Jul 2005 11:09:50 -0700
From:      Frank McConnell <fmc@reanimators.org>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: RELENG_5 PAE panic
Message-ID:  <200507291809.j6TI9p37035628@lots.reanimators.org>
In-Reply-To: <20050729091624.R74149@fledge.watson.org> (Robert Watson's message of "Fri, 29 Jul 2005 09:20:34 +0100 (BST)")
References:  <200507290034.j6T0YLdZ014411@lots.reanimators.org> <20050729091624.R74149@fledge.watson.org>

Robert Watson wrote:
> This appears to be a NULL pointer dereference in
> propagate_priority(). Often a panic in propagate_priority is actually
> a symptom of a slightly earlier problem which is discovered by
> propagate_priority when it trips over, for example, a bad mutex.  If
> you're set up with a serial port to copy and paste debugging output,
> the output of 'ps' and 'show pcpu' for each of the cpus (as well as
> 'show pcpu' without a cpu argument) would be helpful.  It wouldn't hurt
> also to use gdb on a copy of the kernel with debugging symbols to map
> 'vm_pageout+0x280' into a line number.  Details on these various
> activities can be found in the handbook.

Thanks, that's helpful.  It's been a while since I've needed to debug
a FreeBSD kernel (good work, y'all!).  I have worked on RTOSs and
TCP/IP stacks and drivers, but I've looked at the code enough to
figure that I probably don't have the right clues to make sense of the
entrails as seen through the debugger.  It'd be interesting and probably
fun, but I have other stuff that needs doing too.

It's a single-CPU system with hyperthreading disabled in the firmware
setup, so I'm thinking 'show pcpu 1' will not be meaningful.  I'll do it
anyway.  

--- begin crash ---
splat# /usr/sbin/named -c /etc/namedb/named.conf
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc03db1cf
stack pointer           = 0x10:0xeb328c64
frame pointer           = 0x10:0xeb328c78
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 70 (pagedaemon)
[thread pid 70 tid 100080 ]
Stopped at      0xc03db1cf = propagate_priority+0x7f:   movl    0x24(%eax),%eax
db> ps
  pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  597 c6a66e20    0   596   596 0000002 new [INACTIVE] named
  596 c6cd61c4    0   585   596 0004002 [SLPQ user map 0xc7080b24][SLP] named
  585 c717d8d4    0   555   585 0004002 [SLPQ pause 0xc717d90c][SLP] csh
  563 c6cd954c    0     1   563 0004002 [SLPQ ttyin 0xc6a61810][SLP] getty
  562 c6cd9c5c    0     1   562 0004002 [SLPQ ttyin 0xc6a61a10][SLP] getty
  561 c6cd9710    0     1   561 0004002 [SLPQ ttyin 0xc6a61c10][SLP] getty
  560 c6cd91c4    0     1   560 0004002 [SLPQ ttyin 0xc6a61e10][SLP] getty
  559 c6cd9a98    0     1   559 0004002 [SLPQ ttyin 0xc6adb010][SLP] getty
  558 c7067000    0     1   558 0004002 [SLPQ ttyin 0xc6adb210][SLP] getty
  557 c6cd9388    0     1   557 0004002 [SLPQ ttyin 0xc6adb410][SLP] getty
  556 c6cd6710    0     1   556 0004002 [SLPQ ttyin 0xc6a60a10][SLP] getty
  555 c7067c5c    0     1   555 0004102 [SLPQ wait 0xc7067c5c][SLP] login
  541 c7067388    0     1   541 0000000 [SLPQ select 0xc0628624][SLP] inetd
  534 c70671c4  125   522   522 0004100 [SLPQ select 0xc0628624][SLP] qmgr
  533 c6cd9e20  125   522   522 0004100 [SLPQ select 0xc0628624][SLP] pickup
  522 c6cd98d4    0     1   522 0004100 [SLPQ select 0xc0628624][SLP] master
  421 c6a66710    0     1   421 0000000 [SLPQ nanslp 0xc0624fec][SLP] cron
  408 c6cd654c    0     1   408 0000100 [SLPQ select 0xc0628624][SLP] sshd
  391 c6cd6000    0     1   391 0000000 [SLPQ select 0xc0628624][SLP] ntpd
  289 c6cd6388    0     1   289 0000000 [SLPQ select 0xc0628624][SLP] syslogd
  271 c6a668d4    0     1   271 0000000 [SLPQ select 0xc0628624][SLP] devd
  214 c6a6654c    0     1   214 0000000 [SLPQ pause 0xc6a66584][SLP] adjkerntz
   80 c6cd68d4    0     0     0 0000204 [RUNQ] schedcpu
   79 c6cd6a98    0     0     0 0000204 [SLPQ - 0xc063084c][SLP] nfsiod 3
   78 c6cd6c5c    0     0     0 0000204 [SLPQ - 0xc0630848][SLP] nfsiod 2
   77 c6cd6e20    0     0     0 0000204 [SLPQ - 0xc0630844][SLP] nfsiod 1
   76 c6cd9000    0     0     0 0000204 [SLPQ - 0xc0630840][SLP] nfsiod 0
   75 c69f3a98    0     0     0 0000204 [RUNQ] vnlru
   74 c69f3c5c    0     0     0 0000204 [RUNQ] syncer
   73 c69f3e20    0     0     0 0000204 [RUNQ] bufdaemon
   72 c6a63000    0     0     0 000020c [SLPQ pgzero 0xc06370f4][SLP] pagezero
   71 c6a631c4    0     0     0 0000204 [SLPQ psleep 0xc0637148][SLP] vmdaemon
   70 c6a63388    0     0     0 0000204 [LOCK vm page queue mutex c6a57240] pagedaemon
   69 c6a6354c    0     0     0 0000204 [IWAIT] swi0: sio
    9 c6a63710    0     0     0 0000204 [SLPQ actask 0xc061c3cc][SLP] acpi_task2
    8 c6a638d4    0     0     0 0000204 [SLPQ actask 0xc061c3cc][SLP] acpi_task1
    7 c6a63a98    0     0     0 0000204 [SLPQ actask 0xc061c3cc][SLP] acpi_task0
   68 c6a63c5c    0     0     0 0000204 [IWAIT] swi6:+
    6 c6a63e20    0     0     0 0000204 [SLPQ - 0xc6a3fd80][SLP] thread taskq
   67 c6a66000    0     0     0 0000204 [IWAIT] swi6:+
   66 c6a661c4    0     0     0 0000204 [IWAIT] swi6: task queue
    5 c6a66388    0     0     0 0000204 [SLPQ - 0xc6a57500][SLP] kqueue taskq
   65 c69e41c4    0     0     0 0000204 [IWAIT] swi6: acpitaskq
   64 c69e4388    0     0     0 0000204 [IWAIT] swi3: cambio
   63 c69e454c    0     0     0 0000204 [IWAIT] swi2: camnet
   62 c69e4710    0     0     0 0000204 [SLPQ - 0xc061ca20][SLP] yarrow
    4 c69e48d4    0     0     0 0000204 [SLPQ - 0xc061f648][SLP] g_down
    3 c69e4a98    0     0     0 0000204 [SLPQ - 0xc061f644][SLP] g_up
    2 c69e4c5c    0     0     0 0000204 [SLPQ - 0xc061f63c][SLP] g_event
   61 c69e4e20    0     0     0 0000204 [IWAIT] swi4: vm
   60 c69f3000    0     0     0 000020c [IWAIT] swi5: clock sio
   59 c69f31c4    0     0     0 0000204 [IWAIT] swi1: net
   58 c69f3388    0     0     0 0000204 [IWAIT] irq0: clk
   57 c69f354c    0     0     0 0000204 [IWAIT] irq47:
   56 c69f3710    0     0     0 0000204 [IWAIT] irq46:
   55 c69f38d4    0     0     0 0000204 [IWAIT] irq45:
   54 c69cca98    0     0     0 0000204 [IWAIT] irq44:
   53 c69ccc5c    0     0     0 0000204 [IWAIT] irq43:
   52 c69cce20    0     0     0 0000204 [IWAIT] irq42:
   51 c69e0000    0     0     0 0000204 [IWAIT] irq41:
   50 c69e01c4    0     0     0 0000204 [IWAIT] irq40:
   49 c69e0388    0     0     0 0000204 [IWAIT] irq39:
   48 c69e054c    0     0     0 0000204 [IWAIT] irq38:
   47 c69e0710    0     0     0 0000204 [IWAIT] irq37:
   46 c69e08d4    0     0     0 0000204 [IWAIT] irq36:
   45 c69e0a98    0     0     0 0000204 [IWAIT] irq35:
   44 c69e0c5c    0     0     0 0000204 [IWAIT] irq34:
   43 c69e0e20    0     0     0 0000204 [IWAIT] irq33:
   42 c69e4000    0     0     0 0000204 [IWAIT] irq32:
   41 c69bc54c    0     0     0 0000204 [IWAIT] irq31:
   40 c69bc710    0     0     0 0000204 [IWAIT] irq30:
   39 c69bc8d4    0     0     0 0000204 [IWAIT] irq29:
   38 c69bca98    0     0     0 0000204 [IWAIT] irq28:
   37 c69bcc5c    0     0     0 0000204 [IWAIT] irq27:
   36 c69bce20    0     0     0 0000204 [IWAIT] irq26:
   35 c69cc000    0     0     0 0000204 [IWAIT] irq25:
   34 c69cc1c4    0     0     0 0000204 [IWAIT] irq24:
   33 c69cc388    0     0     0 0000204 [IWAIT] irq23:
   32 c69cc54c    0     0     0 0000204 [IWAIT] irq22:
   31 c69cc710    0     0     0 0000204 [IWAIT] irq21:
   30 c69cc8d4    0     0     0 0000204 [IWAIT] irq20:
   29 c696c1c4    0     0     0 0000204 [IWAIT] irq19:
   28 c696c388    0     0     0 0000204 [IWAIT] irq18:
   27 c696c54c    0     0     0 0000204 [IWAIT] irq17: em0
   26 c696c710    0     0     0 0000204 [IWAIT] irq16:
   25 c696c8d4    0     0     0 0000204 [IWAIT] irq15: ata1
   24 c696ca98    0     0     0 0000204 [IWAIT] irq14: ata0
   23 c696cc5c    0     0     0 0000204 [IWAIT] irq13:
   22 c696ce20    0     0     0 0000204 [IWAIT] irq12:
   21 c69bc000    0     0     0 0000204 [IWAIT] irq11:
   20 c69bc1c4    0     0     0 0000204 [IWAIT] irq10:
   19 c69bc388    0     0     0 0000204 [IWAIT] irq9: acpi0
   18 c6964000    0     0     0 0000204 [IWAIT] irq8: rtc
   17 c69641c4    0     0     0 0000204 [IWAIT] irq7:
   16 c6964388    0     0     0 0000204 [IWAIT] irq6:
   15 c696454c    0     0     0 0000204 [IWAIT] irq5:
   14 c6964710    0     0     0 0000204 [IWAIT] irq4: sio0
   13 c69648d4    0     0     0 0000204 [IWAIT] irq3: sio1
   12 c6964a98    0     0     0 0000204 [IWAIT] irq1: atkbd0
   11 c6964c5c    0     0     0 000020c [Can run] idle
    1 c6964e20    0     0     1 0004200 [SLPQ wait 0xc6964e20][SLP] init
   10 c696c000    0     0     0 0000204 [SLPQ ktrace 0xc0622f98][SLP] ktrace
    0 c061f740    0     0     0 0000200 [SLPQ sched 0xc061f740][SLP] swapper
db> show pcpu
cpuid        = 0
curthread    = 0xc6a65000: pid 70 "pagedaemon"
curpcb       = 0xeb328d90
fpcurthread  = none
idlethread   = 0xc6965480: pid 11 "idle"
APIC ID      = 0
currentldt   = 0x28
db> show pcpu 0
cpuid        = 0
curthread    = 0xc6a65000: pid 70 "pagedaemon"
curpcb       = 0xeb328d90
fpcurthread  = none
idlethread   = 0xc6965480: pid 11 "idle"
APIC ID      = 0
currentldt   = 0x28
db> trace
Tracing pid 70 tid 100080 td 0xc6a65000
propagate_priority(c6a65000,c0628280,c0636c60,c6a65000,c6cd7782) at 0xc03db1cf = propagate_priority+0x7f
turnstile_wait(c6a57240,c0636c60,c6cd7780) at 0xc03db84a = turnstile_wait+0x266
_mtx_lock_sleep(c0636c60,c6a65000,0,0,0) at 0xc03b4c25 = _mtx_lock_sleep+0xad
msleep(c0637104,c0636c60,44,c059aa74,1f4) at 0xc03c37ea = msleep+0x39a
vm_pageout(0,eb328d38) at 0xc04fb0e4 = vm_pageout+0x280
fork_exit(c04fae64,0,eb328d38) at 0xc03a8680 = fork_exit+0x74
fork_trampoline() at 0xc0539d9c = fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xeb328d6c, ebp = 0 ---
db> show pcpu 1
cpuid        = 2130835587
curthread    = 
db> reset
--- end crash ---
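
A quick check of the NULL-pointer reading of the trap: the faulting
instruction is 'movl 0x24(%eax),%eax' and the fault virtual address is
0x24, so the base register must have held 0.  What follows is only a
sketch of that arithmetic.  The helper name base_register_value is made
up for illustration, and it assumes 32-bit addresses and that the first
disp(%reg) operand in the ddb disassembly line is the one that faulted.

```python
import re


def base_register_value(fault_addr: int, instruction: str) -> int:
    """Return the value the base register must have held for a
    disp(%reg) operand to touch fault_addr.

    Assumption: AT&T syntax as printed by ddb, 32-bit address space,
    and the first disp(%reg) operand is the faulting access.
    """
    m = re.search(r"(0x[0-9a-fA-F]+)\(%(\w+)\)", instruction)
    if m is None:
        raise ValueError("no disp(%reg) operand found")
    disp = int(m.group(1), 16)
    # Effective address = base + displacement, modulo 2**32 on i386.
    return (fault_addr - disp) & 0xFFFFFFFF


# Values copied from the trap output above.
eax = base_register_value(0x24, "movl    0x24(%eax),%eax")
print(hex(eax))  # prints 0x0
```

A result of 0 is consistent with a load through a NULL structure
pointer at offset 0x24.  It does not say which structure or which
field; that needs the source for propagate_priority+0x7f.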

--- begin gdb ---
splat# gdb /usr/obj/usr/src/sys/EAST1-PAE/kernel.debug
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
(gdb) list *vm_pageout+0x280
0xc04fb0e4 is in vm_pageout (/usr/src/sys/vm/vm_pageout.c:1466).
1461                                    pass = 1;
1462                            else
1463                                    pass = 0;
1464                            error = msleep(&vm_pages_needed, &vm_page_queue_mtx, PVM,
1465                                        "psleep", vm_pageout_stats_interval * hz);
1466                            if (error && !vm_pages_needed) {
1467                                    pass = 0;
1468                                    vm_pageout_page_stats();
1469                                    vm_page_unlock_queues();
1470                                    continue;
(gdb) 
--- end gdb ---

Based on the 'ps' output, I'm thinking the foreground named has loaded
up the zone files and is forking a copy of itself to run as a daemon.
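
The reading of the 'ps' output above can be checked mechanically: look
for a process still in the "new" state and report its parent.  This is
a rough sketch, not a general ddb parser.  The function names are
invented here, and it assumes the command name is the last
whitespace-separated field, which holds for the three lines used but
not for entries such as "nfsiod 3".

```python
def parse_ps_line(line):
    """Split one ddb 'ps' line into the fields needed here.

    Assumption: columns are pid, proc, uid, ppid, pgrp, flag, then the
    state text, then a one-word command name.
    """
    fields = line.split()
    return {
        "pid": int(fields[0]),
        "ppid": int(fields[3]),
        "state": " ".join(fields[6:-1]),
        "cmd": fields[-1],
    }


def new_children(rows):
    """Return (child pid, parent pid, parent cmd) for 'new' processes."""
    by_pid = {r["pid"]: r for r in rows}
    found = []
    for r in rows:
        parent = by_pid.get(r["ppid"])
        if r["state"].startswith("new") and parent is not None:
            found.append((r["pid"], parent["pid"], parent["cmd"]))
    return found


# Three lines copied from the 'ps' output in the crash log.
ps_lines = [
    "  597 c6a66e20    0   596   596 0000002 new [INACTIVE] named",
    "  596 c6cd61c4    0   585   596 0004002 [SLPQ user map 0xc7080b24][SLP] named",
    "  585 c717d8d4    0   555   585 0004002 [SLPQ pause 0xc717d90c][SLP] csh",
]
rows = [parse_ps_line(line) for line in ps_lines]
for child, parent, cmd in new_children(rows):
    print(f"pid {child} is new; parent is pid {parent} ({cmd})")
```

For these lines it reports pid 597 as a not-yet-running child of pid
596, both named, which fits a fork in progress.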

There's more output (dmesg, gdb list for other trace frames, &c).
Mostly I don't want to flood you or the list.
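
For anyone who wants the other frames, the lookups can be generated
from the ddb trace instead of typed by hand.  The sketch below turns
each "symbol+offset" in the trace into a gdb 'list' command like the
one used above.  The regular expression and the function name
gdb_list_commands are my own, written against the trace format shown
in this message only.

```python
import re

# Matches the "at 0xADDR = symbol+0xOFF" tail of a ddb trace line.
FRAME_RE = re.compile(r"at 0x[0-9a-f]+ = (\w+)\+(0x[0-9a-f]+)")


def gdb_list_commands(trace_text):
    """Build one gdb 'list *symbol+offset' command per trace frame."""
    cmds = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            cmds.append(f"list *{m.group(1)}+{m.group(2)}")
    return cmds


# Frames copied from the 'trace' output in the crash log.
trace = """\
propagate_priority(c6a65000,c0628280,c0636c60,c6a65000,c6cd7782) at 0xc03db1cf = propagate_priority+0x7f
turnstile_wait(c6a57240,c0636c60,c6cd7780) at 0xc03db84a = turnstile_wait+0x266
_mtx_lock_sleep(c0636c60,c6a65000,0,0,0) at 0xc03b4c25 = _mtx_lock_sleep+0xad
msleep(c0637104,c0636c60,44,c059aa74,1f4) at 0xc03c37ea = msleep+0x39a
vm_pageout(0,eb328d38) at 0xc04fb0e4 = vm_pageout+0x280
"""

for cmd in gdb_list_commands(trace):
    print(cmd)
```

The printed commands can be pasted at the (gdb) prompt of the
kernel.debug session.  Below the top frame the addresses are return
addresses, so gdb may report the line after the call, as it did with
line 1466 following the msleep() call at 1464.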

When it was running 5.4-RELEASE, I had at one point added options
INVARIANTS and INVARIANT_SUPPORT to the PAE-based configuration, and
it panic'd during startup, either during fsck of the root filesystem
(if multi-user startup) or immediately after I pressed return at the
prompt for a single-user shell (during single-user startup), but I
didn't take good notes.  I'm willing to try that again if it would be
helpful.

-Frank McConnell


