Date:      Tue, 17 Jun 2003 20:20:45 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        chris@Shenton.Org
Cc:        current@FreeBSD.org
Subject:   Re: 5.1-CURRENT hangs on disk i/o? sysctl_old_user() non-sleepable locks
Message-ID:  <200306180320.h5I3KjM7053484@gw.catspoiler.org>
In-Reply-To: <87smq8jdj7.fsf@PECTOPAH.shenton.org>

On 17 Jun, Chris Shenton wrote:
> Don Lewis <truckman@FreeBSD.org> writes:
> 
>> If you have another machine and a null modem cable you can redirect the
>> system console of the machine to be debugged to a serial port and run
>> some comm software on the other machine so that you can capture all the
>> output from ddb.
> 
> OK, I'll give that a shot, probably tomorrow.
> 
> 
>> At the ddb prompt, you can do a "tr" command to get a stack trace,
>> which is likely to be very helpful in pointing out the offending
>> code.
> 
> Just saw it again, did a tr.  From chicken-scratch notes, the last
> bits are:
> 
>   VOP_GETVOBJECT(...)
>   do_sendfile(...)
>   sendfile(...)
>   syscall(...)
>   Xint0x80_syscall...
>   --- syscall( 393, FreeBSD ELF32, sendfile) ...
> 
> The next time it dropped into ddb, same "sendfile" thing.

Try the very untested patch below ...

> The main services I'm running are qmail, apache, and NFS.  Also 
> tftp, rarpd, lpd, sshd, bootparamd ...  oh, well, I guess I'm running
> a bunch of stuff here. :-(  Not sure which one, if any, this would be.
> 
> Unless sendfile() is something in the OS?

It's a system call, and I believe apache uses it.

> 
> I'll have to dig up a nullmodem and grab console output.  I realise
> I'm not giving enough detailed info to be very helpful here.

It's good enough to squash one bug.  I don't know if it will solve your
problem, though.


>> If you are running the NFS *client* code on this machine, there is one
>> lock assertion that is easy to trigger. 
> 
> In my kernel config I have this, because a diskless box uses the same
> kernel, but my /etc/fstab doesn't mount anyone else's NFS exports.

You won't trigger the lock violation in the NFS client code unless
you actually mount a file system from another machine using NFS and
do some I/O on it.

Here's the patch:

Index: uipc_syscalls.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.150
diff -u -r1.150 uipc_syscalls.c
--- uipc_syscalls.c	12 Jun 2003 05:52:09 -0000	1.150
+++ uipc_syscalls.c	18 Jun 2003 03:14:42 -0000
@@ -1775,10 +1775,13 @@
 	 */
 	if ((error = fgetvp_read(td, uap->fd, &vp)) != 0)
 		goto done;
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
 	if (vp->v_type != VREG || VOP_GETVOBJECT(vp, &obj) != 0) {
 		error = EINVAL;
+		VOP_UNLOCK(vp, 0, td);
 		goto done;
 	}
+	VOP_UNLOCK(vp, 0, td);
 	if ((error = fgetsock(td, uap->s, &so, NULL)) != 0)
 		goto done;
 	if (so->so_type != SOCK_STREAM) {
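
The shape of the change above is: take the vnode lock before the check
that calls VOP_GETVOBJECT(), and drop it again on both the error path
and the success path. A minimal userland sketch of that shape follows.
It is not kernel code: struct fake_vnode, fake_lock(), fake_unlock(),
fake_getobject() and check_vnode() are all hypothetical stand-ins, with
a pthread mutex playing the role of the vnode lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode: a lock plus the state the check reads. */
struct fake_vnode {
	pthread_mutex_t lock;
	bool locked;      /* tracked by hand so the callee can assert on it */
	bool is_regular;  /* plays the role of v_type == VREG */
	bool has_object;  /* plays the role of VOP_GETVOBJECT() succeeding */
};

static void
fake_lock(struct fake_vnode *vp)
{
	pthread_mutex_lock(&vp->lock);
	vp->locked = true;
}

static void
fake_unlock(struct fake_vnode *vp)
{
	vp->locked = false;
	pthread_mutex_unlock(&vp->lock);
}

/* A callee that insists on being entered with the lock held. */
static int
fake_getobject(struct fake_vnode *vp)
{
	assert(vp->locked);
	return (vp->has_object ? 0 : EINVAL);
}

/* Same structure as the patched code: lock, check, unlock on every path. */
static int
check_vnode(struct fake_vnode *vp)
{
	int error = 0;

	fake_lock(vp);
	if (!vp->is_regular || fake_getobject(vp) != 0) {
		error = EINVAL;
		fake_unlock(vp);	/* error path must not leak the lock */
		goto done;
	}
	fake_unlock(vp);		/* success path drops it as well */
	/* ... the rest of the work would continue here, unlocked ... */
done:
	return (error);
}
```

Calling check_vnode() on a struct initialized with
PTHREAD_MUTEX_INITIALIZER returns 0 or EINVAL and leaves the lock
released either way, which is the property the patch is after.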


