Date:      Wed, 7 Nov 2007 11:16:11 -0800
From:      Jeremy Chadwick <koitsu@FreeBSD.org>
To:        freebsd-stable@freebsd.org
Subject:   RELENG_6 kernel panic + savecore(8) problem
Message-ID:  <20071107191611.GA1400@eos.sc1.parodius.com>

Two things:

This morning our production RELENG_6 box, after 133 days of uptime,
kernel panic'd:

panic: handle_written_inodeblock: live inodedep
KDB: enter: panic
[thread pid 3 tid 100001 ]
Stopped at      kdb_enter+0x30: leave
db> bt
Tracing pid 3 tid 100001 td 0xc7c6ad80
kdb_enter(3228441820,3228796672,3228487817,3867634632,256,...) at kdb_enter+48
panic(3228487817,3426817152,256,3228643296,0,...) at panic+206
handle_written_inodeblock(3459887104,3688934424,3226775710,3228787204,3228175693,...) at handle_written_inodeblock+1503
softdep_disk_write_complete(3688934424,3227842097,3356275348,3867634836,3226342800,...) at softdep_disk_write_complete+241
bufdone(3688934424,0,3867634856,3226352850,3356275348,...) at bufdone+126
g_vfs_done(3356275348,0,0,3352445440,3355957180) at g_vfs_done+198
biodone(3356275348,3228786984,588,3228423470,100,...) at biodone+178
g_io_schedule_up(3351686528,76,3351679512,3226344072,3867634980,...) at g_io_schedule_up+137
g_up_procbody(0,3867635000,0,0,0,...) at g_up_procbody+122
fork_exit(3226344072,0,3867635000) at fork_exit+122
fork_trampoline() at fork_trampoline+8
--- trap 1, eip = 0, esp = 3867635052, ebp = 0 ---
db> panic
panic: from debugger
Uptime: 133d3h3m31s
Dumping 3062 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 3062MB (783840 pages) 3046 3030 3014 2998 2982 2966 2950 2934 2918 2902 2886 2870 2854 2838 2822 2806 2790 2774 2758 2742 2726 2710 2694 2678 2662 2646 2630 2614 2598 2582 2566 2550 2534 2518 2502 2486 2470 2454 2438 2422 2406 2390 2374 2358 2342 2326 2310 2294 2278 2262 2246 2230 2214 2198 2182 2166 2150 2134 2118 2102 2086 2070 2054 2038 2022 2006 1990 1974 1958 1942 1926 1910 1894 1878 1862 1846 1830 1814 1798 1782 1766 1750 1734 1718 1702 1686 1670 1654 1638 1622 1606 1590 1574 1558 1542 1526 1510 1494 1478 1462 1446 1430 1414 1398 1382 1366 1350 1334 1318 1302 1286 1270 1254 1238 1222 1206 1190 1174 1158 1142 1126 1110 1094 1078 1062 1046 1030 1014 998 982 966 950 934 918 902 886 870 854 838 822 806 790 774 758 742 726 710 694 678 662 646 630 614 598 582 566 550 534 518 502 486 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6 ... ok
Dump complete

Based on the contents of rc.conf, dumpdev was indeed set correctly:

dumpdev="auto"
dumpdir="/var/crash"

Our dump device is /dev/ad0s1b (swap), which the kernel knew of prior to
the panic, as can be seen from the dmesg 133 days ago:

Trying to mount root from ufs:/dev/ad0s1a
Loading configuration files.
kernel dumps on /dev/ad0s1b
Entropy harvesting: interrupts ethernet point_to_point kickstart.

Swap size is 8GB:

Device          1K-blocks     Used    Avail Capacity
/dev/ad0s1b       8388608        0  8388608     0%

When the machine came back up after the panic this morning, I expected
it to use savecore(8) to save the dump to /var/crash.  What I saw
instead was:

kernel dumps on /dev/ad0s1b
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/ad0s1b as swap device
{...skip...}
Starting syslogd.
Checking for core dump on /dev/ad0s1b...
savecore: no dumps found

Our /var filesystem had 13GB free prior to the crash:

Filesystem  1024-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1d    16244334 1487758 13457030    10%    /var

/var/crash/minfree contains 2048, which was definitely met at the time
of the panic.  So, um, where's my vmcore?  :-)
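
As a sanity check on the space figures, here is a minimal sketch that
redoes the arithmetic from the numbers quoted above.  It assumes 4 KiB
pages (i386) and that minfree is counted in kilobytes, as savecore(8)
documents; the variable names are purely illustrative.

```python
# Figures copied from the console output, swapinfo and df above.
PAGE_KB = 4                        # assumption: 4 KiB pages (i386)
dump_pages = 159 + 783840          # chunk 0 + chunk 1 from the dump output
swap_kb = 8388608                  # /dev/ad0s1b, 1K-blocks
var_avail_kb = 13457030            # /var, Avail column
minfree_kb = 2048                  # /var/crash/minfree, read as kilobytes

dump_kb = dump_pages * PAGE_KB
dump_mb = dump_kb / 1024

# The dump has to fit on the swap device first, and /var must still
# have at least minfree kilobytes free after the vmcore is written.
fits_in_swap = dump_kb <= swap_kb
fits_in_var = var_avail_kb - dump_kb >= minfree_kb

print(f"dump size: {dump_kb} KB (~{dump_mb:.0f} MB)")
print(f"fits in swap: {fits_in_swap}")
print(f"fits in /var with minfree left over: {fits_in_var}")
```

Under those assumptions the page counts agree with the "Dumping 3062 MB"
line, and neither the swap size nor the free space on /var looks like
the limiting factor.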

And of course, if anyone has any details about what caused the crash,
I'd be grateful.  It may be something that was fixed in the past 133
days, but since I don't have a vmcore, I can't debug it after-the-
fact...  :-(

Thanks.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



