Date: Tue, 13 Jan 2009 00:09:24 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Pete French <petefrench@ticketswitch.com>
Cc: freebsd-stable@freebsd.org, drosih@rpi.edu, rblayzor.bulk@inoc.net
Subject: Re: Big problems with 7.1 locking up :-(
Message-ID: <alpine.BSF.2.00.0901130005130.16794@fledge.watson.org>
In-Reply-To: <E1LMS1C-0002x6-Je@dilbert.ticketswitch.com>
References: <E1LMS1C-0002x6-Je@dilbert.ticketswitch.com>

On Mon, 12 Jan 2009, Pete French wrote:

>> I'm not sure if you've done this already, but the normal suggestions apply:
>> have you compiled with INVARIANTS/WITNESS/DDB/KDB/BREAK_TO_DEBUGGER, and do
>> any results / panics / etc result? Sometimes these debugging tools are
>> able to convert hangs into panics, which gives us much more ability to
>> debug them.
>
> OK, I have now had a machine hang again, with the correct debug options in
> the kernel. The screen looked like this when I went to restart it:
>
> http://toybox.twisted.org.uk/~pete/71_lor2.png
>
> It had not, however, dropped into any kind of debugger. Also there appear to
> be console messages after the lock order reversal - is that normal ?

Lock order reversals are warnings of potential deadlock due to a lock cycle,
but deadlocks may not actually result, either because it's a false positive
(some locking construct that is deadlock free but involves lock cycles), or
because a cycle didn't actually form. The message is suggestive, but if you
have significant system activity after the message, then it may be unrelated.

> The machine did stay up for a significant amount of time before doing this. I
> notice that it is more or less identical to the one I posted when I had
> WITNESS_KDB in the kernel too, so maybe those results aren't entirely
> spurious after all ?
>
> Given it hasn't dropped to a debugger, is there anything else I can try ?

Features like WITNESS and INVARIANTS may change the timing of the kernel,
making certain race conditions less likely; I'd run with them for a bit and
see if you can reproduce the hang with them present, as they will make
debugging the problem a lot easier, if it's possible.

Robert N M Watson
Computer Laboratory
University of Cambridge
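
To illustrate why a lock order reversal is a warning rather than proof of a deadlock, here is a minimal Python sketch of an order checker. It is only a toy model under stated assumptions: the class LockOrderTracker and its methods are hypothetical names made up for this example, and this is not how WITNESS is actually implemented. It records which locks were acquired while others were held, and flags an acquisition that contradicts an order seen earlier.

```python
class LockOrderTracker:
    """Toy lock-order checker (hypothetical; not WITNESS's real code)."""

    def __init__(self):
        self.order = {}      # order[a] = set of locks acquired while a was held
        self.held = []       # locks currently held, in acquisition order
        self.warnings = []   # reversals seen, as (held_lock, new_lock) pairs

    def _reachable(self, src, dst):
        # Depth-first search: is there a recorded order path src -> ... -> dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.order.get(node, ()))
        return False

    def acquire(self, lock):
        for h in self.held:
            if self._reachable(lock, h):
                # An earlier path took `lock` before `h`; this one takes it after.
                self.warnings.append((h, lock))
            else:
                self.order.setdefault(h, set()).add(lock)
        self.held.append(lock)

    def release(self, lock):
        self.held.remove(lock)


tracker = LockOrderTracker()

# First code path: takes A, then B, and releases both.
tracker.acquire("A"); tracker.acquire("B")
tracker.release("B"); tracker.release("A")

# Second code path, run later and on its own: takes B, then A.
tracker.acquire("B"); tracker.acquire("A")
tracker.release("A"); tracker.release("B")

print(tracker.warnings)  # prints [('B', 'A')]
```

Both paths ran to completion here, one after the other, so nothing deadlocked; the checker still reports the reversal because the two orders could deadlock if the paths ever interleaved. That is the sense in which the message is suggestive but not conclusive.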