Date:      Wed, 11 Feb 2009 09:30:15 -0500
From:      Matt Hempel <matt@biglist.com>
To:        gavin@FreeBSD.org
Cc:        freebsd-bugs@FreeBSD.org
Subject:   Re: kern/131571: Running with APIC enabled crashes a Supermicro server running 7.0/7.1
Message-ID:  <4992E0F7.5040308@biglist.com>
In-Reply-To: <200902111152.n1BBqnSc002745@freefall.freebsd.org>
References:  <200902111152.n1BBqnSc002745@freefall.freebsd.org>

In answer to your questions:

This server began as a 4.11 production machine; it now serves as a 
testbed for 7.1.

6.x never ran on the box.  We performed a fresh install of 7.0, then an 
update to 7.1.  The server has an Adaptec RAID controller that we wanted 
to bypass (its management utility wouldn't run on 7.x), so we started 
from scratch and used gmirror instead.

Answering "N" to the Panic question yielded these results:

panic y/n? [y] INTR: Adding local APIC 6 as a target
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  6
 cpu3 (AP): APIC ID:  7
bios32: Found BIOS32 Service Directory header at 0xc00f7070
bios32: Entry = 0xfd6a0 (c00fd6a0)  Rev = 0  Len = 1
pcibios: PCI BIOS entry at 0xfd6a0+0x225
pnpbios: Found PnP BIOS data at 0xc00f70d0
pnpbios: Entry = f0000:9a25  Rev = 1.0
Other BIOS signatures found:
APIC: CPU 0 has ACPI ID 0
APIC: CPU 1 has ACPI ID 2
APIC: CPU 2 has ACPI ID 1


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xbc
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0dec6dc
stack pointer           = 0x28:0xc1020d5c
frame pointer           = 0x28:0xc1020d6c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
cpuid = 0
kernel trap 12 with interrupts disabled

A number of Fatal trap 12s quickly followed.  The machine didn't recover.

We looked hard, but could not find a pattern in when the boot process 
fails and when it succeeds.  We rely on remote hands for this box, so we 
cannot control the process completely, but at one point we were advised 
to attempt cold boots, and the results were the same.

What I can report is that the box initially hung repeatedly at 
"mounting root," so often that I assumed it was a gmirror problem.  
The APIC (?) failure appeared after that, and the boot failures have 
been predominantly, perhaps exclusively, of that kind ever since.

If I'm not forwarding this email properly, please let me know.

thanks

M

> Also, if you answer "N" to the "Panic? (y/n)" question, does the
> machine boot and run successfully?
> Lastly, is there a pattern as to when the machine does and doesn't boot?
> For example, does it always boot or not from a power on (rather than a
> reset)?
>
>
> Responsible-Changed-From-To: freebsd-bugs->gavin
> Responsible-Changed-By: gavin
> Responsible-Changed-When: Wed Feb 11 11:49:11 UTC 2009
> Responsible-Changed-Why: 
> Track
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=131571
>
>   



