From owner-freebsd-bugs Thu Mar 19 15:00:05 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id PAA22091
	for freebsd-bugs-outgoing; Thu, 19 Mar 1998 15:00:05 -0800 (PST)
	(envelope-from owner-freebsd-bugs@FreeBSD.ORG)
Received: (from gnats@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id PAA22075;
	Thu, 19 Mar 1998 15:00:03 -0800 (PST)
	(envelope-from gnats)
Received: from roller.nas.nasa.gov (roller.nas.nasa.gov [129.99.223.26])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id OAA21946
	for ; Thu, 19 Mar 1998 14:59:29 -0800 (PST)
	(envelope-from kml@roller.nas.nasa.gov)
Received: (from kml@localhost)
	by roller.nas.nasa.gov (8.8.7/8.8.7) id OAA00289;
	Thu, 19 Mar 1998 14:59:26 -0800 (PST)
	(envelope-from kml)
Message-Id: <199803192259.OAA00289@roller.nas.nasa.gov>
Date: Thu, 19 Mar 1998 14:59:26 -0800 (PST)
From: Kevin Lahey
Reply-To: kml@roller.nas.nasa.gov
To: FreeBSD-gnats-submit@FreeBSD.ORG
X-Send-Pr-Version: 3.2
Subject: i386/6068: TCP retransmission bug
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number: 6068
>Category: i386
>Synopsis: TCP can time out of retransmission in 12 seconds
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Mar 19 15:00:02 PST 1998
>Last-Modified:
>Originator: Kevin Lahey
>Organization: NASA/Ames
>Release: FreeBSD 2.2.5-RELEASE i386
>Environment:

This is a fresh, patch-free installation of 2.2.5.

>Description:

In some circumstances, when the round-trip time is very low, it is
possible for TCP to time out in just 12 seconds, after sending 12 packets:

14:49:02.585137 roller.1026 > yakko-work.discard: . 2195169:2196617(1448) ack 1 win 17376 (DF) (ttl 64, id 4931)
14:49:02.586423 roller.1026 > yakko-work.discard: . 2196617:2198065(1448) ack 1 win 17376 (DF) (ttl 64, id 4932)
14:49:02.587676 roller.1026 > yakko-work.discard: P 2198065:2199513(1448) ack 1 win 17376 (DF) (ttl 64, id 4933)
14:49:04.202214 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5166)
14:49:05.202248 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5167)
14:49:06.202232 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5168)
14:49:07.202232 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5169)
14:49:08.202225 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5170)
14:49:09.202228 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5171)
14:49:10.202253 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5172)
14:49:11.202221 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5173)
14:49:12.202203 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5174)
14:49:13.202221 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5175)
14:49:14.202212 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5176)
14:49:15.202257 roller.1026 > yakko-work.discard: . 2518073:2519521(1448) ack 1 win 17376 (DF) (ttl 64, id 5177)
14:49:16.201046 roller.1026 > yakko-work.discard: R 2535449:2535449(0) ack 1 win 17376 (DF) (ttl 64, id 5178)

I just fixed this in NetBSD, and it looks like the problem is that
TCP_REXMTVAL can be 0 with the Brakmo-Peterson RTO estimator, whereas
with the Van Jacobson RTO estimator the lowest value it could return
was 3. When the value is 0, the exponential backoff product is also 0,
and so the timeout falls back to the minimum on every retransmission.
After 12 retransmissions, it just times out.

It looks like the persist code in tcp_timer.c has a fix for just this
problem, but the fix wasn't applied to the retransmission code...

>How-To-Repeat:

From the FreeBSD host:

	ttcp -t -s -p9 target

Unplug the target from the net and watch to see how long the
connection takes to time out. I found that this didn't fail every
time, but was pretty repeatable.

>Fix:

Apply some sort of check to TCP_REXMTVAL to ensure that it is at
least t_rttmin before multiplying it by the exponential backoff term,
as is currently done for the persist timer.

>Audit-Trail:
>Unformatted:
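
The effect described in this report, and the check suggested under Fix, can be illustrated with a small standalone sketch. This is not the kernel code: every name below (the SKETCH_ constants, sketch_backoff, sketch_rexmt_timeout, sketch_total_wait) is hypothetical, and the values are assumptions in whole seconds chosen to match the 1-second spacing seen in the trace, whereas the kernel works in timer ticks. The only point is to contrast "multiply, then clamp" with "raise to the minimum, then multiply" when the base value is 0.

```c
/* Illustrative values in whole seconds (assumptions, not kernel constants). */
enum {
    SKETCH_RTO_MIN   = 1,   /* assumed floor, matching the 1 s gaps in the trace */
    SKETCH_RTO_MAX   = 64,  /* assumed ceiling on a single timeout */
    SKETCH_MAX_REXMT = 12   /* retransmissions before giving up, per the report */
};

/* Assumed exponential backoff multipliers, capped at 64. */
static const int sketch_backoff[SKETCH_MAX_REXMT] = {
    1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64
};

/* Clamp v into the range [lo, hi]. */
static int clamp_range(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/*
 * Timeout armed for retransmission number `shift` (0-based).
 * floor_first == 0: multiply first, then clamp (the behaviour reported).
 * floor_first != 0: raise the base to the minimum before multiplying
 *                   (the check suggested in the report).
 */
static int sketch_rexmt_timeout(int base, int shift, int floor_first)
{
    if (floor_first && base < SKETCH_RTO_MIN)
        base = SKETCH_RTO_MIN;
    return clamp_range(base * sketch_backoff[shift],
                       SKETCH_RTO_MIN, SKETCH_RTO_MAX);
}

/* Sum of all timeouts armed before the connection is dropped. */
static int sketch_total_wait(int base, int floor_first)
{
    int total = 0;
    int shift;

    for (shift = 0; shift < SKETCH_MAX_REXMT; shift++)
        total += sketch_rexmt_timeout(base, shift, floor_first);
    return total;
}
```

Under these assumed values, sketch_total_wait(0, 0) adds up twelve 1-second timeouts, i.e. 12 seconds, in line with the trace. sketch_total_wait(0, 1) lets the backoff take effect and adds up to 447 seconds. The actual kernel numbers depend on the tick rate and on t_rttmin, which this sketch does not model.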