From owner-svn-src-all@FreeBSD.ORG Sun Oct 9 10:18:18 2011 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 88ACD1065673 for ; Sun, 9 Oct 2011 10:18:18 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 00C2C8FC12 for ; Sun, 9 Oct 2011 10:18:17 +0000 (UTC) Received: (qmail 9193 invoked from network); 9 Oct 2011 08:33:28 -0000 Received: from localhost (HELO [127.0.0.1]) ([127.0.0.1]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 9 Oct 2011 08:33:28 -0000 Message-ID: <4E916EA6.3020300@freebsd.org> Date: Sun, 09 Oct 2011 11:51:34 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: Lawrence Stewart References: <201110071639.p97Gd3t4019128@svn.freebsd.org> <4E8F9FAD.7070103@freebsd.org> In-Reply-To: <4E8F9FAD.7070103@freebsd.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, FreeBSD Release Engineering Team , John Baldwin Subject: Re: svn commit: r226113 - head/sys/netinet X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 09 Oct 2011 10:18:18 -0000 Hi Lawrence, Sorry for jumping in here. There was some urgency felt at EuroBSDCon to get this issue fixed before the next RC. -- Andre On 08.10.2011 02:56, Lawrence Stewart wrote: > Hi Andre and RE team, > > I've had a patch sitting in re@'s inbox for this problem since 15th Sep and have been waiting for > their go-ahead to commit. The patch I submitted is at: > > http://people.freebsd.org/~lstewart/patches/misctcp/tcpreassstackfix_9.x.r225576.diff > > The proposed commit message was: > > ########## > Use a backup (stack allocated) struct tseg_qent when we are unable to allocate one from the TCP > reassembly UMA zone and the incoming segment is the one we've been waiting for (i.e. th_seq == > rcv_nxt). This avoids TCP connections stalling when the zone limit is reached. > > PR: kern/155407 > Reported by: Slawa Olhovchenkov and Steven Hartland > Tested by: Steven Hartland > Submitted by: andre > Reviewed by: jhb > Approved by: re (?) > MFC after: 1 week > ########## > > I feel the logging changes should have been committed separately to the fix, but other than that, > what you committed achieves the same thing as the patch I proposed. > > I should have updated the ML thread to say it was submitted and awaiting approval, so you weren't to > know. > > Anyhoo, I guess I'll leave it up to you and re@ to sort out how you want to proceed, but wanted to > make sure everyone was on the same page as RE would have gotten confused when you requested your > patch be MFCed. > > Cheers, > Lawrence > > On 10/08/11 03:39, Andre Oppermann wrote: >> Author: andre >> Date: Fri Oct 7 16:39:03 2011 >> New Revision: 226113 >> URL: http://svn.freebsd.org/changeset/base/226113 >> >> Log: >> Prevent TCP sessions from stalling indefinitely in reassembly >> when reaching the zone limit of reassembly queue entries. >> >> When the zone limit was reached not even the missing segment >> that would complete the sequence space could be processed >> preventing the TCP session forever from making any further >> progress. >> >> Solve this deadlock by using a temporary on-stack queue entry >> for the missing segment followed by an immediate dequeue again >> by delivering the contiguous sequence space to the socket. >> >> Add logging under net.inet.tcp.log_debug for reassembly queue >> issues. >> >> Reviewed by: lsteward (previous version) >> Tested by: Steven Hartland >> MFC after: 3 days >> >> Modified: >> head/sys/netinet/tcp_reass.c >> >> Modified: head/sys/netinet/tcp_reass.c >> ============================================================================== >> --- head/sys/netinet/tcp_reass.c Fri Oct 7 16:09:44 2011 (r226112) >> +++ head/sys/netinet/tcp_reass.c Fri Oct 7 16:39:03 2011 (r226113) >> @@ -177,7 +177,9 @@ tcp_reass(struct tcpcb *tp, struct tcphd >> struct tseg_qent *nq; >> struct tseg_qent *te = NULL; >> struct socket *so = tp->t_inpcb->inp_socket; >> + char *s = NULL; >> int flags; >> + struct tseg_qent tqs; >> >> INP_WLOCK_ASSERT(tp->t_inpcb); >> >> @@ -215,19 +217,40 @@ tcp_reass(struct tcpcb *tp, struct tcphd >> TCPSTAT_INC(tcps_rcvmemdrop); >> m_freem(m); >> *tlenp = 0; >> + if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) { >> + log(LOG_DEBUG, "%s; %s: queue limit reached, " >> + "segment dropped\n", s, __func__); >> + free(s, M_TCPLOG); >> + } >> return (0); >> } >> >> /* >> * Allocate a new queue entry. If we can't, or hit the zone limit >> * just drop the pkt. >> + * >> + * Use a temporary structure on the stack for the missing segment >> + * when the zone is exhausted. Otherwise we may get stuck. >> */ >> te = uma_zalloc(V_tcp_reass_zone, M_NOWAIT); >> - if (te == NULL) { >> + if (te == NULL&& th->th_seq != tp->rcv_nxt) { >> TCPSTAT_INC(tcps_rcvmemdrop); >> m_freem(m); >> *tlenp = 0; >> + if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) { >> + log(LOG_DEBUG, "%s; %s: global zone limit reached, " >> + "segment dropped\n", s, __func__); >> + free(s, M_TCPLOG); >> + } >> return (0); >> + } else if (th->th_seq == tp->rcv_nxt) { >> + bzero(&tqs, sizeof(struct tseg_qent)); >> + te =&tqs; >> + if ((s = tcp_log_addrs(&tp->t_inpcb->inp_inc, th, NULL, NULL))) { >> + log(LOG_DEBUG, "%s; %s: global zone limit reached, " >> + "using stack for missing segment\n", s, __func__); >> + free(s, M_TCPLOG); >> + } >> } >> tp->t_segqlen++; >> >> @@ -304,6 +327,8 @@ tcp_reass(struct tcpcb *tp, struct tcphd >> if (p == NULL) { >> LIST_INSERT_HEAD(&tp->t_segq, te, tqe_q); >> } else { >> + KASSERT(te !=&tqs, ("%s: temporary stack based entry not " >> + "first element in queue", __func__)); >> LIST_INSERT_AFTER(p, te, tqe_q); >> } >> >> @@ -327,7 +352,8 @@ present: >> m_freem(q->tqe_m); >> else >> sbappendstream_locked(&so->so_rcv, q->tqe_m); >> - uma_zfree(V_tcp_reass_zone, q); >> + if (q !=&tqs) >> + uma_zfree(V_tcp_reass_zone, q); >> tp->t_segqlen--; >> q = nq; >> } while (q&& q->tqe_th->th_seq == tp->rcv_nxt); > > >