Date:      Wed, 27 Feb 2008 10:43:18 -0500
From:      Jim Pingle <lists@pingle.org>
To:        stable@freebsd.org
Subject:   Re: 7.0-PRERELEASE Fatal Trap 12 with sysctl and acpi
Message-ID:  <47C58516.6010403@pingle.org>
In-Reply-To: <47B6192A.3030507@pingle.org>
References:  <47B4AAA3.6060501@pingle.org> <47B6192A.3030507@pingle.org>

Jim Pingle wrote:
> Jim Pingle wrote:
>> I'm having some trouble with a SuperMicro SuperServer 6022L-6 that 
>> previously ran 7.0-BETA4 without problems. Today, I updated this 
>> machine to 7.0-PRERELEASE and now it will not fully boot unless I 
>> disable ACPI. A quick search of the PR database didn't turn up 
>> anything similar with sysctl and ACPI.

I wiped the machine, installed from the RC3 CD, and it did not crash. If I 
update to RELENG_7, the crash comes back. If I go back to RELENG_7_0, there 
is no crash.

> Kernel config is GENERIC, with ULE scheduler and "options ASR_COMPAT"

This happens with GENERIC, with no extra options, as well as with my custom 
kernel.

>> If I get some time next week I might try a binary search of commits 
>> between BETA4 and now, to pinpoint where it stopped working.
> 
> As a buildworld/buildkernel takes about an hour and a half on this 
> hardware (2x2GHz Xeon), I haven't fully narrowed this down yet. It is 
> somewhere between 12/15/2007 (works) and 12/25/2007 (crashes). I glanced 
> at the archives between those points but I didn't see any similar 
> complaints. The only ACPI references I saw in the archives were 
> referring to thermal zone problems, and a commit relating to those.
> 
> I'll return to this early next week to see if I can narrow this down 
> more precisely.

I tried a binary search of the source tree to narrow down the crash. I found 
that one vector for the crash was introduced between 2007/12/19 20:00:00 and 
2007/12/19 23:59:00, which left me with only a handful of files to test.
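For anyone wanting to repeat this kind of search, the date-based bisection described above can be sketched roughly as follows. This is only an illustration in Python, not the procedure I actually scripted: the function name bisect_by_date, the crashes() callback, the four-hour resolution, and the CULPRIT timestamp are all hypothetical. In practice crashes() would mean checking out the tree as of the given timestamp, rebuilding, and booting with ACPI enabled.

```python
from datetime import datetime, timedelta

def bisect_by_date(good, bad, crashes, resolution=timedelta(hours=4)):
    """Halve the window between a known-good and a known-bad timestamp.

    crashes(ts) must return True if the tree as of ts crashes.
    Stops once the window is no wider than `resolution`.
    Returns (last_good, first_bad, number_of_builds_tested).
    """
    steps = 0
    while bad - good > resolution:
        mid = good + (bad - good) / 2
        if crashes(mid):
            bad = mid      # breakage is at or before mid
        else:
            good = mid     # breakage is after mid
        steps += 1
    return good, bad, steps

# Hypothetical stand-in for "check out, build, boot": pretend the
# breaking change landed at this (made-up) time on 2007-12-19.
CULPRIT = datetime(2007, 12, 19, 21, 30)

def fake_crashes(ts):
    return ts >= CULPRIT

last_good, first_bad, builds = bisect_by_date(
    datetime(2007, 12, 15), datetime(2007, 12, 25), fake_crashes)
```

Each pass through the loop costs one full build and boot test, so halving the window matters a lot on hardware where a build takes an hour and a half. The search assumes a single transition from working to crashing inside the window; if the crash has more than one cause, the result only identifies one of them.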

By process of elimination, I found that if I backed some changes out in 
machdep.c, the crash stopped.

machdep.c v1.658 2007/08/09 njl - Boots OK
machdep.c v1.658.2.1 2007/12/19 rpaulo - Crashes

The confusing part (to me) is that my next step was to update all the way to 
RELENG_7 as of yesterday, then back out those same changes, but the crash 
still happened. So either I misidentified the cause of the crash -- which is 
quite possible -- or it was reintroduced in some other change (or both!).

I have a debug kernel built now, and I can generate vmcore files at will. 
Does anyone have any ideas? Is there some more information that I can gather 
that will help find the cause?

Now that I have some more solid information, I'll open a PR.

Jim


