From owner-freebsd-net@FreeBSD.ORG Thu Apr 11 19:45:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 94B04908 for ; Thu, 11 Apr 2013 19:45:57 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id F01D915F9 for ; Thu, 11 Apr 2013 19:45:56 +0000 (UTC) Received: (qmail 71924 invoked from network); 11 Apr 2013 20:52:59 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 11 Apr 2013 20:52:59 -0000 Message-ID: <516712EE.8010708@freebsd.org> Date: Thu, 11 Apr 2013 21:45:50 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130307 Thunderbird/17.0.4 MIME-Version: 1.0 To: Matt Miller Subject: Re: RFC 3042 Implementation References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Apr 2013 19:45:57 -0000 On 11.04.2013 20:59, Matt Miller wrote: > In some of our tests, we noticed some duplicate pure ACKs (not window > updates), most of which the duplicates were coming from this tcp_output() > call in tcp_do_segment() (line 2534): > > 2508 } else if (V_tcp_do_rfc3042) { > 2509 cc_ack_received(tp, th, > CC_DUPACK); > 2510 u_long oldcwnd = tp->snd_cwnd; > 2511 tcp_seq oldsndmax = > tp->snd_max; > 2512 u_int sent; > 2513 > 2514 KASSERT(tp->t_dupacks == 1 || > 2515 tp->t_dupacks == 2, > 2516 ("%s: dupacks not 1 or 2", > 2517 __func__)); > 2518 if (tp->t_dupacks == 1) > 2519 tp->snd_limited = 0; > 2520 tp->snd_cwnd = > 2521 (tp->snd_nxt - > tp->snd_una) + > 2522 (tp->t_dupacks - > tp->snd_limited) * > 2523 tp->t_maxseg; > 2524 if ((thflags & TH_FIN) && > 2525 > (TCPS_HAVERCVDFIN(tp->t_state) == 0)) { > 2526 /* > 2527 * If its a fin we > need to process > 2528 * it to avoid a race > where both > 2529 * sides enter > FIN-WAIT and send FIN|ACK > 2530 * at the same time. > 2531 */ > 2532 break; > 2533 } > 2534 (void) tcp_output(tp); > > I added some instrumentation here to count how many time the following is > zero prior to calling tcp_output(): > > so->so_snd.sb_cc - ((tp->snd_nxt - tp->snd_una) > > And, in our tests, it was like 97% of the time. > > It looks like the intent of the RFC is to only send one or two unsent data > segments here, not pure ACKs. And this subsequent standard seems to > clarify that new data should be available for transmission: > > http://tools.ietf.org/html/rfc5681 > > > On the first and second duplicate ACKs received at a sender, a > TCP SHOULD send a segment of previously unsent data per [RFC3042] > provided that the receiver's advertised window allows, the total > FlightSize would remain less than or equal to cwnd plus 2*SMSS, > and that new data is available for transmission. > > > So, this is a detailed way of asking: do we need a check here to make sure > there is new data to send prior to calling tcp_output()? Yes. Based on your input the attached patch should fix it (untested). -- Andre Index: tcp_input.c =================================================================== --- tcp_input.c (revision 249375) +++ tcp_input.c (working copy) @@ -2564,6 +2564,7 @@ u_long oldcwnd = tp->snd_cwnd; tcp_seq oldsndmax = tp->snd_max; u_int sent; + int avail; KASSERT(tp->t_dupacks == 1 || tp->t_dupacks == 2, @@ -2585,7 +2586,17 @@ */ break; } - (void) tcp_output(tp); + /* + * Only call tcp_output when there + * is new data available to be sent. + * Otherwise we send pure ACKs. + */ + SOCKBUF_LOCK(&so->so_snd); + avail = so->so_snd.sb_cc - + (tp->snd_nxt - tp->snd_una); + SOCKBUF_UNLOCK(&so->so_snd); + if (avail > 0) + (void) tcp_output(tp); sent = tp->snd_max - oldsndmax; if (sent > tp->t_maxseg) { KASSERT((tp->t_dupacks == 2 &&