Date:      Wed, 10 Aug 2011 16:46:17 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Jeremy Chadwick" <freebsd@jdc.parodius.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID:  <ADF5E597D1C0428D8FB838D94BDEB3A4@multiplay.co.uk>
References:  <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk> <20110810151256.GA38601@icarus.home.lan>

----- Original Message ----- 
From: "Jeremy Chadwick" <freebsd@jdc.parodius.com>


> On Wed, Aug 10, 2011 at 03:22:52PM +0100, Steven Hartland wrote:
>> The base stack reported is a double fault with no additional
>> details, and CTRL+ALT+ESC fails to break to the debugger, as
>> does an NMI, even though it at least tries printing the
>> following many times, some of them quite jumbled:-
>> NMI ... going to debugger

> If you're generating the NMI yourself (possibly via the KVM, etc.) then
> okay, that's different.  I'm trying to discern whether or not *you're*
> generating the NMI, or if the NMI just happens and causes a panic for
> you and that's what you're worried about.

Yes, I'm generating it after the panic to try and get to the debugger :)

> Now to discuss the "jumbled console output":
...
> The default (assuming your kernel configs are based off of GENERIC
> within the past 4-5 years) is 128.  However, the same developers stated
> that they have great reservations over increasing this number
> dramatically (meaning, something like 256 will probably work, but larger
> "may have repercussions which are unknown at this time").

I might try that if it will help, but with so many production machines to
action I'd like to avoid it if possible.

>> The machines are single disk ZFS root install and the dump
>> device is configured using the gptid, could this be what's
>> preventing the dump happening?
> 
> I can tell you that others have reported this problem where the kernel
> panic/dump begins but either locks up after showing the first progress
> metre/amount, or during the dumping itself.

Ahh, so possibly not a gptid issue.

> I give everyone the same advice: please make sure that you have a swap
> partition that's large enough to fit your entire memory contents
> (preferably a swap that's 2x or 1.5x the amount of physical RAM), and
> please make sure it's on a dedicated slice (e.g. ada0s1b).  I do not
> advise any sort of "abstraction" layer between swap and the rest of the
> system.  It might seem like a great/fun/awesome idea followed by
> "whatever jdc, it works!" but when a crash happens -- which is when you
> need it most -- and it doesn't work, I won't sympathise.  :-)
> 
> As for the GPT aspects of things: I'm still not familiar with GPT (as a
> technology I am, but when it comes to actual usability I am not).
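
The swap sizing advice quoted above can be checked roughly as follows. This is
only a sketch: the function name dump_fits is hypothetical, and it assumes the
conservative rule from the quote (the dump device should hold all of physical
RAM), comparing a RAM size in bytes against a swap size in 1K blocks.

```shell
#!/bin/sh
# dump_fits PHYSMEM_BYTES SWAP_KIB   (hypothetical helper, not a system tool)
# Exit status 0 if the swap device is at least as large as physical RAM,
# non-zero otherwise.
dump_fits() {
    physmem_kib=$(( $1 / 1024 ))    # convert bytes to 1K blocks
    [ "$2" -ge "$physmem_kib" ]
}

# Possible use on a FreeBSD box with a single swap device (left commented
# out because it depends on the local system):
#   dump_fits "$(sysctl -n hw.physmem)" \
#             "$(swapinfo -k | awk 'NR==2 {print $2}')" \
#       && echo "swap can hold a full dump" \
#       || echo "swap is smaller than RAM"
```

With more than one swap device the awk expression would need adjusting, since
it reads only the first device line.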

I've just managed to get a crash dump from one machine, so hopefully I will be
able to make some progress if someone can point me in the right direction.

> # Debugging options
> options         BREAK_TO_DEBUGGER       # Sending a serial BREAK drops to DDB
> options         ALT_BREAK_TO_DEBUGGER   # Permit <CR>~<Ctrl-b> to drop to DDB
> options         KDB                     # Enable kernel debugger support
> options         KDB_TRACE               # Print stack trace automatically on panic
> options         DDB                     # Support DDB
> options         GDB                     # Support remote GDB

Cheers 

> In combination with this, we use the following in /etc/rc.conf (the
> dumpdev line is important, else savecore won't pick up anything):
> 
> dumpdev="auto"

I thought this was meant to be the default from back in the 6.x days, but
it didn't seem to work, so I added the gptid device from /etc/fstab.

> ddb_enable="yes"

Thanks :)
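
To confirm what a machine actually has set for the rc.conf knobs discussed
above, something like the following might do. It is a sketch only: rc_setting
is a hypothetical helper, and it assumes the file contains nothing but plain
shell variable assignments, since it is sourced in a subshell.

```shell
#!/bin/sh
# rc_setting FILE NAME   (hypothetical helper, not a system tool)
# Prints the value assigned to NAME in an rc.conf-style FILE, or an empty
# line if NAME is not set there. The file is sourced in a subshell, so it
# must be trusted and FILE should be given as a path containing a slash.
rc_setting() {
    (
        . "$1" || exit 1
        eval "printf '%s\n' \"\${$2}\""
    )
}

# Possible use (left commented out because it depends on the local system):
#   rc_setting /etc/rc.conf dumpdev
#   rc_setting /etc/rc.conf ddb_enable
```

This reads only the one file given; values not set in /etc/rc.conf come from
/etc/defaults/rc.conf, which would have to be checked separately.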

    Regards
    Steve
