Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 6 Oct 2004 17:51:34 -0400
From:      Brian Fundakowski Feldman <green@freebsd.org>
To:        Vlad <marchenko@gmail.com>
Cc:        Marc UBM Bocklet <ubm@u-boot-man.de>
Subject:   Re: [BETA7-panic] sodealloc(): so_count 1
Message-ID:  <20041006215134.GN47017@green.homeunix.org>
In-Reply-To: <cd70c681041006125527e69bcd@mail.gmail.com>
References:  <20041006015131.10116be7.ubm@u-boot-man.de> <cd70c68104100517074a5cebf2@mail.gmail.com> <20041006090104.06710d85.ubm@u-boot-man.de> <20041006154137.GJ47017@green.homeunix.org> <20041006203220.7f8e7b8a.ubm@u-boot-man.de> <20041006192518.GM47017@green.homeunix.org> <cd70c681041006125527e69bcd@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Oct 06, 2004 at 03:55:21PM -0400, Vlad wrote:
> Brian,
> 
> I've created ticket a while ago in regards to this problem:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/72126
> 
> also, attached are some additional debug info I could get during last
> two times it crashed.
> 
> unfortunately I don't have big enough stand-alone swap partition to
> get kernel crash dump, so those files are the best I could get.
> 
> here is week-old thread for the same problem:
> http://groups.google.com/groups?th=fc0d7d881f0713cc

Can you tell me where sorele() crashed when you did _not_ have INVARIANTS
enabled?  That will help because the INVARIANTS panic tells us one half
of where the problem is and the sorele() panic is the other half.
Want to try this (untested) patch for the problem that I'm guessing about?

Index: uipc_socket.c
===================================================================
RCS file: /usr/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.212
diff -u -r1.212 uipc_socket.c
--- uipc_socket.c	5 Sep 2004 14:33:21 -0000	1.212
+++ uipc_socket.c	6 Oct 2004 21:51:16 -0000
@@ -321,6 +321,7 @@
 	struct socket *so;
 {
 	struct socket *head;
+	int diedcomplete = 0;
 
 	KASSERT(so->so_count == 0, ("socket %p so_count not 0", so));
 	SOCK_LOCK_ASSERT(so);
@@ -342,30 +343,45 @@
 		    (so->so_qstate & SQ_INCOMP) == 0,
 		    ("sofree: so->so_qstate is SQ_COMP and also SQ_INCOMP"));
 		/*
-		 * accept(2) is responsible draining the completed
-		 * connection queue and freeing those sockets, so
-		 * we just return here if this socket is currently
-		 * on the completed connection queue.  Otherwise,
-		 * accept(2) may hang after select(2) has indicating
-		 * that a listening socket was ready.  If it's an
-		 * incomplete connection, we remove it from the queue
-		 * and free it; otherwise, it won't be released until
-		 * the listening socket is closed.
+		 * If we previously signalled the accept(2) queue for
+		 * completion, but this connection died in the meantime,
+		 * we still need to tell the accept(2) call so that
+		 * it can make progress based upon previously-seen state.
+		 * Note that this will screw up kqueue-applications that
+		 * count the filter-returned queue length as gospel and do
+		 * not use non-blocking-IO on the head.  A more complete
+		 * fix is to replace this socket on so_comp with a "bad
+		 * socket" marker and make the callers understand that.
+		 * A most complete fix would be to make this function
+		 * already have the accept lock on entry.
+		 *
+		 * This avoids some relatively-impossible-to-fix races
+		 * with the accept lock, the socket lock, and multiple
+		 * concurrent frees on a zero-reference-count socket.
+		 * It's bad form to have a reference count that bumps
+		 * up and down.
 		 */
 		if ((so->so_qstate & SQ_COMP) != 0) {
-			ACCEPT_UNLOCK();
-			return;
+			TAILQ_REMOVE(&head->so_comp, so, so_list);
+			head->so_qlen--;
+			so->so_qstate &= ~SQ_COMP;
+			so->so_head = NULL;
+			head->so_error = ECONNABORTED;
+			diedcomplete = 1;
+		} else {
+			TAILQ_REMOVE(&head->so_incomp, so, so_list);
+			head->so_incqlen--;
+			so->so_qstate &= ~SQ_INCOMP;
+			so->so_head = NULL;
 		}
-		TAILQ_REMOVE(&head->so_incomp, so, so_list);
-		head->so_incqlen--;
-		so->so_qstate &= ~SQ_INCOMP;
-		so->so_head = NULL;
 	}
 	KASSERT((so->so_qstate & SQ_COMP) == 0 &&
 	    (so->so_qstate & SQ_INCOMP) == 0,
 	    ("sofree: so_head == NULL, but still SQ_COMP(%d) or SQ_INCOMP(%d)",
 	    so->so_qstate & SQ_COMP, so->so_qstate & SQ_INCOMP));
 	ACCEPT_UNLOCK();
+	if (diedcomplete)
+		sorwakeup(head);
 	SOCKBUF_LOCK(&so->so_snd);
 	so->so_snd.sb_flags |= SB_NOINTR;
 	(void)sblock(&so->so_snd, M_WAITOK);

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green@FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20041006215134.GN47017>