Date: Mon, 9 Feb 2009 01:13:08 -0500 (EST)
From: Charles Sprickman <spork@bway.net>
To: freebsd-stable@freebsd.org
Subject: 7.1 Panic on degraded disk w/mpt
Message-ID: <alpine.OSX.2.00.0902090102060.37588@toasty.nat.fasttrackmonkey.com>
Howdy,

I dug around and can't find a PR on this, and the only other report I saw was in this mailing list post that has no replies:

http://www.nabble.com/7.1-BETA2-panic-on-mpt-degrade-td20183173.html

The hardware is a Dell PowerEdge 860 with the Dell/LSI SAS5 controller:

mpt0: <LSILogic SAS/SATA Adapter> port 0xec00-0xecff mem 0xfe9fc000-0xfe9fffff,0xfe9e0000-0xfe9effff irq 16 at device 8.0 on pci2
mpt0: MPI Version=1.5.13.0

The panic is repeatable by forcing the array into a degraded state.

Here's my best shot at getting info out of kgdb:

[root@uniweb /home/spork]# cd /usr/obj/usr/src/sys/BWAY7/
[root@uniweb /usr/obj/usr/src/sys/BWAY7]# kgdb kernel.debug /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x14
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc044b09b
stack pointer           = 0x28:0xe6ee5b80
frame pointer           = 0x28:0xe6ee5b9c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 17 (swi2: cambio)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 3m7s
Physical memory: 3575 MB
Dumping 94 MB: 79 63 47 31 15

Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0  doadump () at pcpu.h:196
196             __asm __volatile("movl %%fs:0,%0" : "=r" (td));

(kgdb) list *0xc044b09b
0xc044b09b is in xpt_done (/usr/src/sys/cam/cam_xpt.c:4832).
4827            if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) != 0) {
4828                    /*
4829                     * Queue up the request for handling by our SWI handler
4830                     * any of the "non-immediate" type of ccbs.
4831                     */
4832                    sim = done_ccb->ccb_h.path->bus->sim;
4833                    switch (done_ccb->ccb_h.path->periph->type) {
4834                    case CAM_PERIPH_BIO:
4835                            TAILQ_INSERT_TAIL(&sim->sim_doneq, &done_ccb->ccb_h,
4836                                              sim_links.tqe);

(kgdb) backtrace
#0  doadump () at pcpu.h:196
#1  0xc061d0f7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc061d3c9 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0865fcc in trap_fatal (frame=0xe6ee5b40, eva=20) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc0866230 in trap_pfault (frame=0xe6ee5b40, usermode=0, eva=20) at /usr/src/sys/i386/i386/trap.c:852
#5  0xc0866bc2 in trap (frame=0xe6ee5b40) at /usr/src/sys/i386/i386/trap.c:530
#6  0xc084d45b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#7  0xc044b09b in xpt_done (done_ccb=0xc6bf5000) at /usr/src/sys/cam/cam_xpt.c:4832
#8  0xc044eee9 in xpt_scan_bus (periph=0xc6984b00, request_ccb=0xc6bf5000) at /usr/src/sys/cam/cam_xpt.c:5395
#9  0xc044d241 in camisr_runqueue (V_queue=Variable "V_queue" is not available.
) at /usr/src/sys/cam/cam_xpt.c:7316
#10 0xc044d39e in camisr (dummy=0x0) at /usr/src/sys/cam/cam_xpt.c:7216
#11 0xc05fb41b in ithread_loop (arg=0xc699d770) at /usr/src/sys/kern/kern_intr.c:1088
#12 0xc05f7f69 in fork_exit (callout=0xc05fb260 <ithread_loop>, arg=0xc699d770, frame=0xe6ee5d38) at /usr/src/sys/kern/kern_fork.c:810
#13 0xc084d4d0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264

I can supply dmesg, more info, make it crash more, etc. I suspect it will panic again when the rebuild completes, I'll capture that one as well.

Please let me know how to proceed - I can open a PR if this is truly a bug, or bring it over to freebsd-scsi if more appropriate.
Thanks,

Charles

___
Charles Sprickman
NetEng/SysAdmin
Bway.net - New York's Best Internet - www.bway.net
spork@bway.net - 212.655.9344