From owner-freebsd-hackers  Wed May 28 16:02:01 1997
Return-Path: <owner-hackers>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id QAA09788
          for hackers-outgoing; Wed, 28 May 1997 16:02:01 -0700 (PDT)
Received: from mailhub.Stanford.EDU (mailhub.Stanford.EDU [36.21.0.128])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id QAA09783;
          Wed, 28 May 1997 16:01:56 -0700 (PDT)
Received: from tree1.Stanford.EDU (tree1.Stanford.EDU [36.83.0.36])
	by mailhub.Stanford.EDU (8.8.5/8.8.5/L) with SMTP id QAA18688;
	Wed, 28 May 1997 16:01:52 -0700 (PDT)
Newsgroups: comp.protocols.tcp-ip
Date: Wed, 28 May 1997 16:01:07 -0700 (PDT)
From: "Amr A. Awadallah" <aaa@stanford.edu>
To: freebsd-bugs@FreeBSD.org
cc: freebsd-hackers@FreeBSD.org, Chetan Rai <crai@CS.Stanford.EDU>,
        Nick W McKeown <nickm@ee.stanford.edu>
Subject: FreeBSD: Clarification for the false slow-start "bug".
In-Reply-To: <Pine.GSO.3.96.970527181755.20859A-100000@tree1.Stanford.EDU>
Message-ID: <Pine.GSO.3.96.970528143315.14472A-100000@tree1.Stanford.EDU>
References: <Pine.GSO.3.96.970527181755.20859A-100000@tree1.Stanford.EDU>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk


  It has been pointed out to us, so far by Mark Allman and Vern Paxson, that
the suspected bug we reported earlier is actually a feature of TCP!  In
fact it is part of the fast recovery algorithm as described in RFC 2001
(page 5, second paragraph): 

   "2.  Each time another duplicate ACK arrives, increment cwnd by the
        segment size.  This inflates the congestion window for the
        additional segment that has left the network.  Transmit a
        packet, if allowed by the new value of cwnd.

   3.  When the next ACK arrives that acknowledges new data, set cwnd
       to ssthresh (the value set in step 1).  This ACK should be the
       acknowledgment of the retransmission from step 1, one round-trip
       time after the retransmission.  Additionally, this ACK should
       acknowledge all the intermediate segments sent between the lost
       packet and the receipt of the first duplicate ACK.  This step is
       congestion avoidance, since TCP is down to one-half the rate it
       was at when the packet was lost."

   The reasoning behind this temporary inflation of cwnd is to be able
to send more segments out for each incoming duplicate ACK (which indicates
that another segment made it to the other side). This is necessary because
TCP's sliding window is stuck and will not slide until the first
non-duplicate ACK comes back. As soon as the first non-duplicate ACK comes
back, cwnd is set back to ssthresh and the window continues sliding in
normal congestion-avoidance mode.
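
   To make the inflate/deflate behaviour concrete, here is a minimal
sketch of the window arithmetic in C. It is not the FreeBSD code: the
struct and function names (fr_state, fr_on_dup_ack, fr_on_new_ack) are
made up for illustration, all sizes are in bytes, and step 1 (not quoted
above) is paraphrased from RFC 2001 as we understand it. Retransmission
and the actual sending of segments are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; NOT the FreeBSD tcpcb. */
struct fr_state {
    uint32_t cwnd;      /* congestion window, bytes */
    uint32_t ssthresh;  /* slow-start threshold, bytes */
    uint32_t mss;       /* segment size, bytes */
    int      dupacks;   /* consecutive duplicate ACKs seen so far */
};

/* Called for every duplicate ACK. */
static void fr_on_dup_ack(struct fr_state *s)
{
    assert(s->mss > 0);
    s->dupacks++;

    if (s->dupacks < 3)
        return;  /* below the fast-retransmit threshold: nothing changes */

    if (s->dupacks == 3) {
        /* Step 1 (paraphrased): halve the window, but keep at least two
         * segments, then allow for the three segments that have already
         * left the network. The lost segment would be retransmitted here. */
        uint32_t half = s->cwnd / 2;
        uint32_t two_segments = 2 * s->mss;
        s->ssthresh = half > two_segments ? half : two_segments;
        s->cwnd = s->ssthresh + 3 * s->mss;
    } else {
        /* Step 2: each further duplicate ACK inflates cwnd by one segment. */
        s->cwnd += s->mss;
    }
}

/* Called when an ACK for new data arrives. */
static void fr_on_new_ack(struct fr_state *s)
{
    /* Step 3: deflate the window back to ssthresh if we were recovering. */
    if (s->dupacks >= 3)
        s->cwnd = s->ssthresh;
    s->dupacks = 0;
}
```

   With cwnd = 16000 and a segment size of 1000, three duplicate ACKs
give ssthresh = 8000 and cwnd = 11000, two more duplicate ACKs inflate
cwnd to 13000, and the first ACK for new data brings cwnd back to 8000.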

   Vern Paxson noted that the missing check of cwnd against TCP_MAXWIN
appears to be a genuine bug. He also noted that it is not a particularly
worrisome one, since (on their system at least) TCP_MAXWIN is 2^30, so
it's unlikely to be triggered. 
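
   For illustration only, the kind of bound check being discussed might
look like the sketch below. The constant SKETCH_MAXWIN and the function
inflate_cwnd_clamped are hypothetical names; the value 2^30 is simply the
figure quoted above, and the real limit and the real kernel code may
differ from system to system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound on cwnd, using the 2^30 figure quoted above. */
#define SKETCH_MAXWIN (UINT32_C(1) << 30)

/* Inflate cwnd by one segment, but never past SKETCH_MAXWIN. */
static uint32_t inflate_cwnd_clamped(uint32_t cwnd, uint32_t mss)
{
    assert(mss > 0);

    /* Add in 64 bits so the sum cannot wrap around before the comparison. */
    uint64_t next = (uint64_t)cwnd + mss;

    return next > SKETCH_MAXWIN ? SKETCH_MAXWIN : (uint32_t)next;
}
```

   The addition is done in a wider type so that an already very large
cwnd cannot wrap around to a small value before it is compared with the
limit.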

  We are in the process of setting up a web page to show the plots
demonstrating the burst of back-to-back packets that occurs when
fast recovery is invoked. This burst of back-to-back packets is what led
us to suspect the presence of a bug. The burst can be observed easily from
tcpdump traces using Greg Minshall's tracelook program. We also provide
plots of the performance after our suggested modification, along with
further clarification of the problem. We welcome feedback on the results,
which were captured from FreeBSD's TCP kernel operating across a
transatlantic link.

   The web page can be reached at: 

               http://www.stanford.edu/~aaa/tcp 

   It is still under development but should be up by 7 PM Pacific Daylight
Time.

  Thanks are in order to Vern Paxson and Mark Allman for promptly pointing
out our mistake. Please accept our apologies if this mistake led to any
confusion.

Sincerely,

Amr A. Awadallah   <aaa@stanford.edu>
Chetan Rai         <crai@CS.Stanford.EDU>