From owner-freebsd-current@FreeBSD.ORG Wed Jan 30 13:45:09 2008 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4CE1816A417; Wed, 30 Jan 2008 13:45:09 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id DEE6D13C447; Wed, 30 Jan 2008 13:45:08 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 7D1D5472B2; Wed, 30 Jan 2008 08:45:08 -0500 (EST) Date: Wed, 30 Jan 2008 13:45:08 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Jos Backus In-Reply-To: <20080129015916.GA5413@lizzy.catnook.local> Message-ID: <20080130134307.P41201@fledge.watson.org> References: <20080128180840.GA9582@lizzy.catnook.local> <20080128204730.I56811@fledge.watson.org> <20080129015916.GA5413@lizzy.catnook.local> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: rrs@FreeBSD.org, freebsd-current@freebsd.org, bz@FreeBSD.org Subject: Re: panic: Lock so_rcv_sx not exclusively locked X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Jan 2008 13:45:09 -0000 On Mon, 28 Jan 2008, Jos Backus wrote: > On Mon, Jan 28, 2008 at 08:49:41PM +0000, Robert Watson wrote: >> On Mon, 28 Jan 2008, Jos Backus wrote: >> >>> Recent -current kernels can reliably be made to panic by killing >>> npviewer.bin. Right before killing, npviewer.bin shows up like this in >>> top: >>> >>> 6288 jos 1 45 5 62844K 36788K so_rcv 0:08 0.00% npviewer.bin >> >> Is this a multithreaded app? If so, could you look and see if you can find >> the other threads in npviewer.bin in the kernel (info thread, etc) and get >> stack traces for those threads as well? > > It's a multithreaded Linux app, installed through ports; see the attached > file. Please let me know if you'd like me to gather more information. Thanks, looks like that's enough for now. I've attached a patch, could you let me know if it works? There turned out to be a number of issues here, not least the possibility of an inter-thread deadlock if the blocked socket read was never going to return, so if you could watch out for processes failing to exit, not just panics, that would be good. I've CC'd Randall as this change touches SCTP in a few places. URL to the patch to address inevitable MUA manglement: http://www.watson.org/~robert/freebsd/20080130-sblock-sorflush.diff Proposed commit message: Correct two problems relating to sorflush(), which is called to flush read socket buffers in shutdown() and close(): - Call socantrcvmore() before sblock() to dislodge any threads that might be sleeping (potentially indefinitely) while holding sblock(), such as a thread blocked in recv(). - Flag the sblock() call as non-interruptible so that a signal delivered to the thread calling sorflush() doesn't cause sblock() to fail. The sblock() is required to ensure that all other socket consumer threads have, in fact, left, and do not enter, the socket buffer until we're done flushin it. To implement the latter, change the 'flags' argument to sblock() to accept two flags, SBL_WAIT and SBL_NOINTR, rather than one M_WAITOK flag. When SBL_NOINTR is set, it forces a non-interruptible sx acquisition, regardless of the setting of the disposition of SB_NOINTR on the socket buffer; without this change it would be possible for another thread to clear SB_NOINTR between when the socket buffer mutex is released and sblock() is invoked. Reported by: Jos Backus The patch: Index: sys/socketvar.h =================================================================== RCS file: /home/ncvs/src/sys/sys/socketvar.h,v retrieving revision 1.160 diff -u -r1.160 socketvar.h --- sys/socketvar.h 17 Dec 2007 10:02:01 -0000 1.160 +++ sys/socketvar.h 30 Jan 2008 06:13:43 -0000 @@ -273,6 +273,13 @@ */ /* + * Flags to sblock(). + */ +#define SBL_WAIT 0x00000001 /* Wait if not immediately available. */ +#define SBL_NOINTR 0x00000002 /* Force non-interruptible sleep. */ +#define SBL_VALID (SBL_WAIT | SBL_NOINTR) + +/* * Do we need to notify the other side when I/O is possible? */ #define sb_notify(sb) (((sb)->sb_flags & (SB_WAIT | SB_SEL | SB_ASYNC | \ Index: kern/uipc_sockbuf.c =================================================================== RCS file: /home/ncvs/src/sys/kern/uipc_sockbuf.c,v retrieving revision 1.174 diff -u -r1.174 uipc_sockbuf.c --- kern/uipc_sockbuf.c 17 Dec 2007 10:02:01 -0000 1.174 +++ kern/uipc_sockbuf.c 30 Jan 2008 13:21:40 -0000 @@ -137,8 +137,12 @@ sblock(struct sockbuf *sb, int flags) { - if (flags == M_WAITOK) { - if (sb->sb_flags & SB_NOINTR) { + KASSERT((flags & SBL_VALID) == flags, + ("sblock: flags invalid (0x%x)", flags)); + + if (flags & SBL_WAIT) { + if ((sb->sb_flags & SB_NOINTR) || + (flags & SBL_NOINTR)) { sx_xlock(&sb->sb_sx); return (0); } Index: kern/uipc_socket.c =================================================================== RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v retrieving revision 1.303 diff -u -r1.303 uipc_socket.c --- kern/uipc_socket.c 24 Oct 2007 19:03:55 -0000 1.303 +++ kern/uipc_socket.c 30 Jan 2008 05:40:06 -0000 @@ -916,7 +916,7 @@ } #endif /*ZERO_COPY_SOCKETS*/ -#define SBLOCKWAIT(f) (((f) & MSG_DONTWAIT) ? M_NOWAIT : M_WAITOK) +#define SBLOCKWAIT(f) (((f) & MSG_DONTWAIT) ? 0 : SBL_WAIT) int sosend_dgram(struct socket *so, struct sockaddr *addr, struct uio *uio, @@ -1884,10 +1884,16 @@ * however, we have to initialize and destroy the mutex in the copy * so that dom_dispose() and sbrelease() can lock t as needed. */ - (void) sblock(sb, M_WAITOK); - SOCKBUF_LOCK(sb); - sb->sb_flags |= SB_NOINTR; - socantrcvmore_locked(so); + + /* + * Dislodge threads currently blocked in receive and wait to acquire + * a lock against other simultaneous readers before clearing the + * socket buffer. Don't let our acquire be interrupted by a signal + * despite any existing socket disposition on interruptable waiting. + */ + socantrcvmore(so); + (void) sblock(sb, SBL_WAIT | SBL_NOINTR); + /* * Invalidate/clear most of the sockbuf structure, but leave selinfo * and mutex data unchanged. Index: kern/uipc_syscalls.c =================================================================== RCS file: /home/ncvs/src/sys/kern/uipc_syscalls.c,v retrieving revision 1.263 diff -u -r1.263 uipc_syscalls.c --- kern/uipc_syscalls.c 13 Jan 2008 14:44:09 -0000 1.263 +++ kern/uipc_syscalls.c 30 Jan 2008 05:41:03 -0000 @@ -1863,8 +1863,13 @@ } } - /* Protect against multiple writers to the socket. */ - (void) sblock(&so->so_snd, M_WAITOK); + /* + * Protect against multiple writers to the socket. + * + * XXXRW: Historically this has assumed non-interruptibility, so now + * we implement that, but possibly shouldn't. + */ + (void)sblock(&so->so_snd, SBL_WAIT | SBL_NOINTR); /* * Loop through the pages of the file, starting with the requested Index: netinet/sctp_input.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/sctp_input.c,v retrieving revision 1.66 diff -u -r1.66 sctp_input.c --- netinet/sctp_input.c 16 Oct 2007 14:05:51 -0000 1.66 +++ netinet/sctp_input.c 30 Jan 2008 05:41:58 -0000 @@ -2509,7 +2509,8 @@ atomic_add_int(&(*stcb)->asoc.refcnt, 1); SCTP_TCB_UNLOCK((*stcb)); - sctp_pull_off_control_to_new_inp((*inp_p), inp, *stcb, M_NOWAIT); + sctp_pull_off_control_to_new_inp((*inp_p), inp, *stcb, + 0); SCTP_TCB_LOCK((*stcb)); atomic_subtract_int(&(*stcb)->asoc.refcnt, 1); Index: netinet/sctp_peeloff.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/sctp_peeloff.c,v retrieving revision 1.16 diff -u -r1.16 sctp_peeloff.c --- netinet/sctp_peeloff.c 8 Sep 2007 11:35:10 -0000 1.16 +++ netinet/sctp_peeloff.c 30 Jan 2008 05:42:07 -0000 @@ -134,7 +134,7 @@ atomic_add_int(&stcb->asoc.refcnt, 1); SCTP_TCB_UNLOCK(stcb); - sctp_pull_off_control_to_new_inp(inp, n_inp, stcb, M_WAITOK); + sctp_pull_off_control_to_new_inp(inp, n_inp, stcb, SBL_WAIT); atomic_subtract_int(&stcb->asoc.refcnt, 1); return (0); @@ -230,7 +230,7 @@ * And now the final hack. We move data in the pending side i.e. * head to the new socket buffer. Let the GRUBBING begin :-0 */ - sctp_pull_off_control_to_new_inp(inp, n_inp, stcb, M_WAITOK); + sctp_pull_off_control_to_new_inp(inp, n_inp, stcb, SBL_WAIT); atomic_subtract_int(&stcb->asoc.refcnt, 1); return (newso); } Index: netinet/sctputil.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/sctputil.c,v retrieving revision 1.72 diff -u -r1.72 sctputil.c --- netinet/sctputil.c 7 Dec 2007 01:32:14 -0000 1.72 +++ netinet/sctputil.c 30 Jan 2008 05:41:42 -0000 @@ -4993,7 +4993,7 @@ sctp_misc_ints(SCTP_SORECV_ENTERPL, rwnd_req, block_allowed, so->so_rcv.sb_cc, uio->uio_resid); } - error = sblock(&so->so_rcv, (block_allowed ? M_WAITOK : 0)); + error = sblock(&so->so_rcv, (block_allowed ? SBL_WAIT : 0)); sockbuf_lock = 1; if (error) { goto release_unlocked;