Date:      Fri, 20 Oct 2006 15:13:54 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        John E Hein <jhein@timing.com>
Cc:        Kostik Belousov <kostikbel@gmail.com>, stable@freebsd.org, davidxu@freebsd.org
Subject:   Re: locked vnode / nfs... requires kill -9 in ddb
Message-ID:  <200610201513.55539.jhb@freebsd.org>
In-Reply-To: <17720.62415.274270.378426@gromit.timing.com>
References:  <17718.20457.799395.602805@gromit.timing.com> <17719.56453.21278.746053@gromit.timing.com> <17720.62415.274270.378426@gromit.timing.com>

On Friday 20 October 2006 12:05, John E Hein wrote:
> John Baldwin wrote at 10:44 -0400 on Oct 19, 2006:
>  > On Thursday 19 October 2006 06:04, Kostik Belousov wrote:
>  > > On Wed, Oct 18, 2006 at 10:01:45AM -0600, John E Hein wrote:
>  > > > 6.2-PRERELEASE from 20061016 RELENG_6 sources.
>  > > > Locked vnodes
>  > > >  
>  > > > 0xc6b7bdd0: tag nfs, type VDIR
>  > > >     usecount 2, writecount 0, refcount 8 mountedhere 0
>  > > >     flags (VV_ROOT)
>  > > >     v_object 0xc9d84108 ref 0 pages 0
>  > > >      lock type nfs: EXCL (count 1) by thread 0xc8adac00 (pid 50746) with 5 pending
>  > > >         fileid 8 fsid 0x300ff06
>  > > > 
>  > > > 50746 50000 49999   600  T+                          sh
>  > > >  .
>  > > >  .
>  > > > db>db> trace 50746
>  > > > Tracing pid 50746 tid 100231 td 0xc8adac00
>  > > > sched_switch(c8adac00,0,2) at 0xc05ce0cb = sched_switch+0x173
>  > > > mi_switch(2,0) at 0xc05c2b0a = mi_switch+0x1ba
>  > > > thread_suspend_check(1,c079e04c,c8adac00,c9206b80,1,...) at 0xc05c722d = thread_suspend_check+0x191
>  > > > sleepq_catch_signals(c9206b80) at 0xc05db93f = sleepq_catch_signals+0x103
>  > > > sleepq_wait_sig(c9206b80) at 0xc05dbd96 = sleepq_wait_sig+0xe
>  > > > msleep(c9206b80,c08a6a40,153,c0813379,0) at 0xc05c2652 = msleep+0x25a
>  > > > nfs_reply(c9206b80,0,c8adac00,4,c7ea7100,...) at 0xc06c33ac = nfs_reply+0x244
>  > > > nfs_request(c6b7bdd0,c6ae2d00,1,c8adac00,c7815280,e8f3488c,e8f34890,e8f34894,c8adac00,e8f348a0) at 0xc06c40a5 = nfs_request+0x3c1
>  > > > nfs_getattr(e8f348dc) at 0xc06c912b = nfs_getattr+0x11f
>  > > > VOP_GETATTR_APV(c086c700,e8f348dc) at 0xc07b260c = VOP_GETATTR_APV+0x38
>  > > > nfsspec_access(e8f34a8c,c6bf7c94,0,e8f349a4,c060ca26,...) at 0xc06cebf1 = nfsspec_access+0x85
>  > > > nfs_access(e8f34a8c) at 0xc06c8b7a = nfs_access+0x122
>  > > > VOP_ACCESS_APV(c086c700,e8f34a8c) at 0xc07b25b0 = VOP_ACCESS_APV+0x38
>  > > > nfs_lookup(e8f34b18) at 0xc06c96ff = nfs_lookup+0xd3
>  > > > VOP_LOOKUP_APV(c086c700,e8f34b18) at 0xc07b22f7 = VOP_LOOKUP_APV+0x43
>  > > > lookup(e8f34c00) at 0xc060ee79 = lookup+0x4c1
>  > > > namei(e8f34c00) at 0xc060e71a = namei+0x39a
>  > > > kern_stat(c8adac00,806712c,0,e8f34c74) at 0xc061d3cd = kern_stat+0x35
>  > > > stat(c8adac00,e8f34d04) at 0xc061d37b = stat+0x1b
>  > > > syscall(3b,3b,3b,1,80670ec,...) at 0xc07a9363 = syscall+0x2bf
>  > > > Xint0x80_syscall() at 0xc079456f = Xint0x80_syscall+0x1f
>  > > > --- syscall (188, FreeBSD ELF32, stat), eip = 0x28196477, esp = 0xbfbfdc1c, ebp = 0xbfbfdcb8 ---
>  > > > db> kill 9 50746
>  > > > db> c
>  > > 
>  > > The nfs_reply is sleeping with the PCATCH set. The question is why SIGTSTP
>  > > does not cause msleep to return with EINTR.
>  > 
>  > The problem is in thread_suspend_check(), not the sleepq code.
> 
> 
> It happened again (triggered by ctrl-z).
> INVARIANTS & WITNESS provided no help.
> 
> Is the problem in thread_suspend_check() known?
> MFC-able from HEAD?
> 
> I see this diff.  I'm not sure it will help, but is there any reason
> not to try it in 6 (David Xu CC'd since he made this change)?
> 
> Index: kern_thread.c
> ===================================================================
> RCS file: /base/FreeBSD-CVS/src/sys/kern/kern_thread.c,v
> retrieving revision 1.216.2.6
> retrieving revision 1.235
> diff -u -p -r1.216.2.6 -r1.235
> --- kern_thread.c	2 Sep 2006 17:29:57 -0000	1.216.2.6
> +++ kern_thread.c	28 Aug 2006 04:24:51 -0000	1.235
> @@ -910,6 +926,10 @@ thread_suspend_check(int return_instead)
>  		    (p->p_flag & P_SINGLE_BOUNDARY) && return_instead)
>  			return (ERESTART);
>  
> +		/* If thread will exit, flush its pending signals */
> +		if ((p->p_flag & P_SINGLE_EXIT) && (p->p_singlethread != td))
> +			sigqueue_flush(&td->td_sigqueue);
> +
>  		mtx_lock_spin(&sched_lock);
>  		thread_stopped(p);
>  		/*

This change is not applicable to 6.x.  The bug is likely in both 6.x and HEAD 
in thread_suspend_check().

-- 
John Baldwin


