Date: Wed, 7 Jul 2004 16:25:29 -0400 (EDT)
From: Robert Watson
To: "Conrad J. Sabatier"
cc: freebsd-current@freebsd.org
Subject: Re: Any way to debug a hard lockup?

(Trimmed a slightly long CC list)

On Wed, 7 Jul 2004, Conrad J. Sabatier wrote:

> For the last few days, my amd64 box (Athlon 64 3200+, 5.2-CURRENT,
> updating just about daily lately) has been locking up hard fairly often,
> and I have no clue as to why.
>
> Checking the logs after rebooting reveals nothing at all out of the
> ordinary. The box just "seizes up" and that's that. Only thing to do
> is a power-down and restart.
>
> Changing from SCHED_ULE to SCHED_4BSD appears to make no difference at
> all.
>
> My other box, an older (32-bit) Athlon, has been rock stable.
> I've been keeping both machines updated simultaneously to exactly the
> same revisions, using as near-identical system settings as possible
> also, the main difference being I'm running X.org/GNOME on the amd64
> box and no X on the i386 box.
>
> Any suggestions? Instinct tells me it may be a problem with some of the
> ports I'm running (X.org/GNOME, hence the Cc:s), but how to verify?
> Sure, I could run without X a while and see how it goes, but that still
> won't help to isolate the problem, if it is, in fact, port-related.

As others in this thread have already pointed out, a serial console with
BREAK_TO_DEBUGGER enabled is invaluable when debugging hangs, and even
more so when X11 may be involved. If you're still unable to get into a
debugger using a serial break, much new server hardware supports
generating an NMI, sometimes through a nice web interface or IPMI. This
will often drop you into a debugger even when nothing else works. You
can find a pretty decent chapter on setting up a serial console, serial
port debugging, etc., in the handbook.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research