Date:      Wed, 8 Dec 2004 12:38:13 +0000 (GMT)
From:      Robert Watson <rwatson@freebsd.org>
To:        Michael Nottebrock <michaelnottebrock@gmx.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: crashdumps not working
Message-ID:  <Pine.NEB.3.96L.1041208123155.98791J-100000@fledge.watson.org>
In-Reply-To: <200412081316.50578.michaelnottebrock@gmx.net>

On Wed, 8 Dec 2004, Michael Nottebrock wrote:

> > > I don't have kdb enabled in my kernel configuration at all...
> >
> > I'm guessing it might be useful at this point, if possible :-).
> 
> Useful for what exactly? I'm mainly interested in getting this machine
> to auto-reboot after a (watchdog-triggered) panic, crashdumps are a
> bonus. At the moment, it will just hang on a panic (even if I do not
> enable crashdumps in rc.conf, it won't reset), and since it's usually
> running X, it will just stand there while the CRTs burn in. If you think
> you can get a clue as to why it wouldn't crashdump or reset by something
> I can do in kdb, I will enable it ... 

The primary goal in using KDB would be to see what parts of the crash,
dump, and reset work separately.  For example, by entering KDB using the
sysctl, we can see if dumps work on your system when not in a potentially
sticky situation (i.e., not in an interrupt handler, or with interrupts
disabled, after a controller wedge, or the like).  So I'm thinking it
would be nice to know:

- Can you enter and continue from kdb normally using the sysctl?
- If you can enter kdb using the sysctl, does "call doadump()" work from
  kdb?
- If you can enter kdb using the sysctl, does "reset" work from kdb?

I.e., do the individual elements work from the debugger?  If they do, then
we can try the same after entering the debugger following the panic, and
see how things differ.
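
A rough sketch of that sequence, for reference.  It assumes a kernel built
with "options KDB" and "options DDB", that the debug.kdb.enter sysctl is
present on your 5.x system, and that you run it as root from the text
console rather than under X (the debugger prompt only appears on the
console).  The db> lines are typed at the debugger prompt, not in the shell,
so they are shown as comments; each of the three checks is best done as a
separate run.

```shell
# Assumed kernel configuration (rebuild and reboot first):
#   options KDB
#   options DDB

# Drop into the debugger from an otherwise healthy, idle system.
sysctl debug.kdb.enter=1

# Check 1: does the system resume normally?
#   db> continue
#
# Check 2: is a dump written to the configured dump device?
#   db> call doadump()
#
# Check 3: does the machine actually reboot?
#   db> reset

# After rebooting from check 2, with dumpdev set in rc.conf, savecore
# should recover the dump into the crash directory (/var/crash by default).
ls -l /var/crash
```

If all three behave from a quiet system, repeating checks 2 and 3 from the
debugger after the real panic should show which step is getting stuck.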

> > This is a NULL pointer dereference; you can use addr2line or gdb on your
> > kernel.debug to turn it into a line number even without a core.  That
> > might well be worth doing, as we might be able to debug that even without
> > getting dumping working on the box.
> 
> It's a SCHED_ULE + PREEMPTION triggered panic, probably there's no point
> in investigating it at this point, as _ULE has been demoted to
> abandonware :-(. 

ULE is temporarily without an owner, but Jeff and others have expressed
interest in working on it further.  I'd not run it for the time being, but
it's probably not a hopeless case.  Does the above statement mean that the
hangs or panics you are experiencing don't happen at all if you just use
4BSD?
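
For reference, the address-to-line lookup mentioned earlier can be sketched
roughly as below.  The kernel.debug path and the address are placeholders,
not values from your system: substitute the location of your own debug
kernel and the instruction pointer printed in the panic message.

```shell
# Placeholder values: replace both before running.
KERNEL=/usr/obj/usr/src/sys/MYKERNEL/kernel.debug   # hypothetical path
ADDR=0xc0123456                                     # hypothetical fault address

# Print the function name (-f) and source file:line for the address,
# using the debug symbols in the given kernel image (-e).
addr2line -f -e "$KERNEL" "$ADDR"

# Roughly equivalent with gdb, typed at the (gdb) prompt:
#   gdb "$KERNEL"
#   (gdb) list *0xc0123456
```

This only works if kernel.debug was built from the same sources and
configuration as the kernel that panicked.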

> > Syncing on panic is, in general, probably not going to make it work better
> > than not.  I guess there's no chance the box has an NMI button?
> 
> Right. I just enabled it for the SW_WATCHDOG experiments (which made me
> discover that this machine would just get stuck on panics in the first
> place), I already turned it off again. 

Thanks.  Just trying to keep track of and reduce the number of variables.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
