From: "Anumita Biswas"
To: freebsd-net@freebsd.org
Cc: anumita@gmail.com
Date: Tue, 22 Jul 2008 16:30:29 -0700
Message-ID: <2ddbdfa20807221630j77001ddfvd83dcd0f8c279a7d@mail.gmail.com>
Subject: FreeBSD tcp backoff problem

Hi,

I work on a stack which is derived from FreeBSD. We have found a
problem in the stack which shows up on TCP connections that do not use
timestamps: TCP backs off its retransmissions exponentially even though
forward progress is being made.

    Appliance (our stack) sends data
    Client sends an ACK, but the appliance does not receive it
    Appliance times out and resends the packet
    Client sends an ACK, and the appliance receives this ACK

This sequence continues, but each time the timeout goes up: 16, 32, 64,
64, 64, etc. Since each retransmitted packet is acked, the appliance
should not continue to back off.

The problem seems to be that t_rxtshift is not being reset when the ACK
is received. Normally t_rxtshift is set to zero in tcp_xmit_timer(),
which is called from tcp_input() when a packet with a valid round trip
time is received. When timestamps are not being used, as is the case
with this connection, tcp_xmit_timer() is only called if t_rtttime is
non-zero. However, t_rtttime is set to zero when the retransmission
timeout happens. Thus, tcp_xmit_timer() is never called during the
sequence of packets shown above.

So, in this case:

    if (tlen == 0) {
        if (SEQ_GT(th->th_ack, tp->snd_una) &&
            SEQ_LEQ(th->th_ack, tp->snd_max) &&
            tp->snd_cwnd >= tp->snd_wnd &&
            ((!tcp_do_newreno && !tp->sack_enable &&
              tp->t_dupacks < tcprexmtthresh) ||
             ((tcp_do_newreno || tp->sack_enable) &&
        ... etc. ...
             */
            if ((to.to_flags & TOF_TS) != 0 &&
                to.to_tsecr) {
                tcp_xmit_timer(tp, ticks - to.to_tsecr + 1);
            } else if (tp->t_rtttime &&
                SEQ_GT(th->th_ack, tp->t_rtseq)) {
                tcp_xmit_timer(tp, ticks - tp->t_rtttime);
            }

Since timestamps are not in use, and tp->t_rtttime is 0 because we just
had a retransmission, neither branch runs and tp->t_rxtshift is not
brought back down to 0.

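The sequence described above can be sketched as a small user-space toy model. This is not kernel code: the `toy_` names, the two-field control block, and the multiplier table (capped at 64 to match the timeouts reported above) are all assumptions made for illustration, and the real tcp_xmit_timer() does far more than reset the shift.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAXRXTSHIFT 12

/* Assumed multiplier table, capped at 64 like the reported timeouts. */
static const int toy_backoff[TOY_MAXRXTSHIFT + 1] =
    { 1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64, 64 };

/* Hypothetical stand-in for the two tcpcb fields discussed here. */
struct toy_tcpcb {
    int t_rxtshift;   /* index into the backoff table */
    int t_rtttime;    /* non-zero while an RTT sample is being timed */
};

/* New data goes out: start timing it if nothing is being timed. */
void toy_send_new_data(struct toy_tcpcb *tp, int now)
{
    assert(now != 0);           /* 0 means "not timing" in this model */
    if (tp->t_rtttime == 0)
        tp->t_rtttime = now;
}

/* Retransmit timer fires: back off and abandon the RTT sample. */
int toy_rexmt_timeout(struct toy_tcpcb *tp)
{
    if (tp->t_rxtshift < TOY_MAXRXTSHIFT)
        tp->t_rxtshift++;
    tp->t_rtttime = 0;
    return toy_backoff[tp->t_rxtshift];
}

/* ACK of new data on a connection without timestamps.
 * Returns true if the (modelled) tcp_xmit_timer() path ran. */
bool toy_ack_without_timestamps(struct toy_tcpcb *tp)
{
    if (tp->t_rtttime != 0) {
        tp->t_rxtshift = 0;     /* stands in for tcp_xmit_timer() */
        tp->t_rtttime = 0;
        return true;
    }
    return false;               /* nothing resets t_rxtshift */
}

/* One round: send, first ACK lost, timeout, retransmit, ACK arrives.
 * Returns the multiplier applied when the timer fired. */
int toy_lossy_round(struct toy_tcpcb *tp, int now)
{
    int multiplier;

    toy_send_new_data(tp, now);
    multiplier = toy_rexmt_timeout(tp);
    (void)toy_ack_without_timestamps(tp);
    return multiplier;
}
```

Calling toy_lossy_round() repeatedly on a zeroed struct toy_tcpcb yields a multiplier that keeps growing until it reaches the cap, even though every round ends with a received ACK.
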
There is a comment later in the code,

    /*
     * If all outstanding data are acked, stop
     * retransmit timer, otherwise restart timer
     * using current (possibly backed-off) value.
     * If process is waiting for space,
     * wakeup/selwakeup/signal.  If data
     * are ready to send, let tcp_output
     * decide between more output or persist.

which seems to indicate that we should use the possibly backed-off
value when restarting the retransmit timer. But we don't do that when
timestamps are in use, so the comment is confusing. When timestamps are
not in use, on the other hand, t_rxtshift is not brought down to 0.

Would it make sense to correct the comment and introduce an else
condition here:

    if ((to.to_flags & TOF_TS) != 0 &&
        to.to_tsecr) {
        tcp_xmit_timer(tp, ticks - to.to_tsecr + 1);
    } else if (tp->t_rtttime &&
        SEQ_GT(th->th_ack, tp->t_rtseq)) {
        tcp_xmit_timer(tp, ticks - tp->t_rtttime);
    } else {
        tp->t_rxtshift = 0;
    }

We might need a similar change when we receive more than 3 dupacks in
tcp_input() and don't call tcp_xmit_timer(), though I don't know
whether tp->t_rtttime will be 0 in that case. I also don't know if we
should be initializing anything else besides tp->t_rxtshift in this
else part.

Any comments on this would be appreciated.

thanks,
Anumita.
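The effect of the proposed else branch can be compared against the current behaviour with a short user-space sketch. Everything here is an assumption for illustration: `toy_final_shift`, the `toy_tcpcb` struct, and the cap of 12 are made-up stand-ins, and the first branch only mimics the one side effect of tcp_xmit_timer() that matters for this question.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAXRXTSHIFT 12      /* assumed cap on the backoff shift */

/* Hypothetical stand-in for the two tcpcb fields involved. */
struct toy_tcpcb {
    int t_rxtshift;
    int t_rtttime;
};

/* Run `rounds` cycles of: send new data, lose the first ACK, time out,
 * retransmit, receive the ACK (no timestamps). Returns the final
 * t_rxtshift. `with_else_reset` selects the proposed else branch. */
int toy_final_shift(int rounds, bool with_else_reset)
{
    struct toy_tcpcb tp = { 0, 0 };

    assert(rounds >= 0);
    for (int now = 1; now <= rounds; now++) {
        /* New data sent: start timing it. */
        if (tp.t_rtttime == 0)
            tp.t_rtttime = now;

        /* ACK lost, retransmit timer fires: back off, drop the sample. */
        if (tp.t_rxtshift < TOY_MAXRXTSHIFT)
            tp.t_rxtshift++;
        tp.t_rtttime = 0;

        /* ACK for the retransmission arrives. */
        if (tp.t_rtttime != 0) {
            tp.t_rxtshift = 0;  /* modelled tcp_xmit_timer() path */
            tp.t_rtttime = 0;
        } else if (with_else_reset) {
            tp.t_rxtshift = 0;  /* the proposed else branch */
        }
    }
    return tp.t_rxtshift;
}
```

In this model the shift climbs by one per round without the else branch and returns to 0 after every acked retransmission with it. The sketch says nothing about the dupack path or about other state that might need resetting.
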