Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 8 Jan 2004 04:10:55 -0800 (PST)
From:      Don Lewis <truckman@FreeBSD.org>
To:        rwatson@FreeBSD.org
Cc:        current@FreeBSD.org
Subject:   Re: strace, holding sigacts lock over postsig(), et al.
Message-ID:  <200401081210.i08CAt7E019668@gw.catspoiler.org>
In-Reply-To: <Pine.NEB.3.96L.1040107233603.13166H-100000@fledge.watson.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On  7 Jan, Robert Watson wrote:
> 
> Got a bug report this evening that the strace package hangs on 5-CURRENT.
> I'm able to confirm this; for those that don't know, strace makes
> extensive use of procfs.  On attempting to reproduce it, I first got:
> 
> crash2# strace ls
> Sleeping on "stopevent" with the following non-sleepable locks held:
> exclusive sleep mutex sigacts r = 0 (0xc20e2aa8) locked @
> kern/subr_trap.c:260
> lock order reversal
>  1st 0xc20e2aa8 sigacts (sigacts) @ kern/subr_trap.c:260
>  2nd 0xc20f1224 process lock (process lock) @ kern/kern_synch.c:309
> Stack backtrace:
> backtrace(c0864c4a,c20f1224,c0860e7b,c0860e7b,c0861ee5) at backtrace+0x17
> witness_lock(c20f1224,8,c0861ee5,135,c20f1224) at witness_lock+0x672
> _mtx_lock_flags(c20f1224,0,c0861edc,135,ffffffff) at _mtx_lock_flags+0xba
> msleep(c20f12e8,c20f1224,5c,c0865441,0) at msleep+0x794
> stopevent(c20f11b8,2,13,823,c0922200) at stopevent+0x85
> issignal(c1f31bd0,2,c08619f7,bd,1) at issignal+0x168
> cursig(c1f31bd0,0,c0864399,104,0) at cursig+0xe8
> ast(c9520d48) at ast+0x4b0
> doreti_ast() at doreti_ast+0x17
> load: 0.21  cmd: strace 583 [iowait] 0.00u 0.91s 0% 724k
> [sent a serial break]
> 
> Cool, eh?  Second try:
> 
> crash2# strace ls
> execve(0xbfbfe890, [0xbfbfed54], [/* 0 vars */]PIOCWSTOP: Input/output
> error
> 
> Even better.
> 
> The first obvious observation is that holding mutexes other than the
> process mutex over calls to _STOPEVENT() is a bad idea.  It seems like the
> p_sig mutex is used to cover a fair amount of flag handling, signal entry
> changes, etc, etc.  I'm not familiar with the semantic requirements here,
> but presumably something needs to change.  Is it possible to release the
> locks after grabbing the value of 'action' (or even do a lock-free read),
> and then grab the sigact lock only later during actual delivery, yet
> maintain the right semantics?

In both issignal() and postsig() I think it would be safe to drop the
p_sig mutex before _STOPEVENT() and grab the mutex again afterwards.
About the only thing that can happen during the interim would be the
receipt of another signal and I don't think that would be a problem.
Dropping the mutex is how issignal() handles ptracestop() a bit further
down in the code.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200401081210.i08CAt7E019668>