From owner-freebsd-current@FreeBSD.ORG  Mon Jun 16 13:57:20 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 81B5B37B401
	for <current@FreeBSD.org>; Mon, 16 Jun 2003 13:57:20 -0700 (PDT)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DD36C43FBF
	for <current@FreeBSD.org>; Mon, 16 Jun 2003 13:57:19 -0700 (PDT)
	(envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
	by gw.catspoiler.org (8.12.9/8.12.9) with ESMTP id h5GKvCM7049856;
	Mon, 16 Jun 2003 13:57:16 -0700 (PDT)
	(envelope-from truckman@FreeBSD.org)
Message-Id: <200306162057.h5GKvCM7049856@gw.catspoiler.org>
Date: Mon, 16 Jun 2003 13:57:12 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: chris@shenton.org
In-Reply-To: <87isr69nc7.fsf@PECTOPAH.shenton.org>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: current@FreeBSD.org
Subject: Re: 5.1-CURRENT hangs on disk i/o? sysctl_old_user()
 non-sleepable locks
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
X-List-Received-Date: Mon, 16 Jun 2003 20:57:20 -0000

On 16 Jun, Chris Shenton wrote:
> (I don't know if this has any relation to the problems I reported
> yesterday with qmail-send consuming 100% cpu after 5.0 to 5.1 upgrade.)

I doubt it.  I checked in a fix for the qmail-send problem today, so you
should get the fix when you next cvsup.

> After booting 5.1-CURRENT the system runs fine for a while.  Then
> later most disk i/o related actions seem to hang.  E.g., system works
> but when cron kicks off a glimpseindex in the middle of the night, the
> system is useless by the morning.  If I login on the console as me, it
> takes my username and password then hangs (trying to run
> /usr/local/bin/bash?). If I do this as root, I do get a shell
> (/bin/csh).  After a point, asking for "top" will hang, even as root.
> Even a "reboot" hung this morning with nothing in the logs.

Can you break into ddb and do a ps to find out what state all the
processes are in?  You might want to try adding the DEBUG_VFS_LOCKS
option to your kernel config to see if that turns up anything.  There
is also a ddb command to list the locked vnodes: "show lockedvnods".
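
Roughly, and assuming a custom kernel config that doesn't already
include the debugger, the additions would look something like this:

	options 	DDB
	options 	DEBUG_VFS_LOCKS

Then, after rebuilding and breaking into the debugger on the console
(Ctrl-Alt-Esc on a syscons console, if I remember correctly), at the
ddb prompt:

	db> ps
	db> show lockedvnods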

Are you using nullfs or unionfs?  Both are a bit fragile.

> The system has become almost unusable because of this, requiring
> frequent reboots or hardware resets.
> 
> Sometimes when I do something as simple as "ps" I see this ominous
> message on the console:
> 
>   sysctl_old_user() with the following non-sleepablelocks held:
>   exclusive sleep mutex process lock r = 0 (0xc50bc9e0) locked @ /usr/src/sys/kern/kern_proc.c:258
> 
> which gets into /var/log/messages as:
> 
>   Jun 16 08:33:48 PECTOPAH kernel: exclusive sleep mutex process lock r = 0 (0xc50c7618) locked @ /usr/src/sys/kern/kern_proc.c:258
> 
> There are a bunch of these.

I've been seeing this for about the last week, I think.  It seems to be
harmless and nothing bad has happened to my -current box.