Date: Sat, 22 Feb 2003 11:47:31 -0500 (EST)
From: Robert Watson
To: Bosko Milekic
Cc: current@FreeBSD.org
Subject: Re: TCP connections timing out "real fast"
In-Reply-To: <20030222105903.A85320@unixdaemons.com>

On Sat, 22 Feb 2003, Bosko Milekic wrote:

> On Sat, Feb 22, 2003 at 10:57:05AM -0500, Robert Watson wrote:
> >
> > Don't yet have any quantitative evidence that this is the case, but I feel
> > like TCP sessions have been timing out on me a lot faster than they used
> > to.  For example, yesterday a machine got unplugged from the network for
> > about 15 seconds: in that time, the SSH sessions to the machine timed out
> > and disconnected.  This morning, a machine generated a lot of output to
> > the serial console keeping it substantially busy for about 20 seconds; in
> > that time, the SSH session to it timed out.  I'm going to see if I can't
> > generate some tcpdump traces later today to confirm my suspicions, but was
> > wondering if anyone else (anecdotally or not) has seen similar things?
>
>   I have (anecdotally) but I believe I'm seeing it on -STABLE too...
>   it's tough to tell... how recent are your -CURRENT machines, though,
>   and is it something that you think just started happening or has it
>   been happening for a while now?  FWIW, I can't say for sure that this
>   is related to TCP connection timeouts.

Here's a packet trace.  cboss.gw.tislabs.com is running the January 30
5.0-CURRENT.  crash2.gw.tislabs.com is running a -CURRENT from yesterday.
Here's the output from the ssh session:

crash2:~> sysctl -a | grep witness
Read from remote host crash2.gw.tislabs.com: Operation timed out
Connection to crash2.gw.tislabs.com closed.
cboss:/data/stock/src/sys/kern>

The sysctl -a takes a little while to run because it currently generates
a boatload of serial console output due to sleep warnings.  Running it on
the console takes about 35 seconds to complete.  The disconnect appears
to happen halfway through that time.  Here's the trace, as recorded on
cboss.gw.tislabs.com, starting about when I hit enter at the end of the
sysctl command line; it looks like it takes about 20 seconds to decide to
disconnect after a series of rapid retransmissions:

cboss# tcpdump -r /tmp/packets
11:40:36.826529 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 3470241365:3470241385(20) ack 49959986 win 33304 (DF) [tos 0x10]
11:40:36.845660 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: P 1:21(20) ack 20 win 33304 (DF) [tos 0x10]
11:40:36.940001 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: . ack 21 win 33304 (DF) [tos 0x10]
11:40:37.758432 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 20:40(20) ack 21 win 33304 (DF) [tos 0x10]
11:40:37.775625 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: P 21:41(20) ack 40 win 33304 (DF) [tos 0x10]
11:40:37.868677 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: . ack 41 win 33304 (DF) [tos 0x10]
11:40:40.780735 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:41.008779 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:41.268786 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:41.588797 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:42.028822 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:42.708951 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:43.868880 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:45.988960 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:48.109027 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:50.229094 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:52.349177 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:54.469236 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:56.589311 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: P 40:60(20) ack 41 win 33304 (DF) [tos 0x10]
11:40:58.709370 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 60:60(0) ack 41 win 33304 (DF) [tos 0x10]
11:41:15.784279 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304 (DF) [tos 0x10]
11:41:15.784337 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 3470241425:3470241425(0) win 0 (DF) [tos 0x10]
11:41:15.785617 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304 (DF) [tos 0x10]
11:41:15.785659 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 3470241425:3470241425(0) win 0 (DF) [tos 0x10]
11:41:15.785693 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304 (DF) [tos 0x10]
11:41:15.785737 cboss.gw.tislabs.com.49423 > crash2.gw.tislabs.com.ssh: R 3470241425:3470241425(0) win 0 (DF) [tos 0x10]
11:41:15.785770 crash2.gw.tislabs.com.ssh > cboss.gw.tislabs.com.49423: . ack 60 win 33304 (DF) [tos 0x10]

Maybe I'm just too impatient, but it strikes me that I used to get more
time before TCP gave up during a brief outage.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
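
To put numbers on "a series of rapid retransmissions", here is a minimal
Python sketch that computes the gap between each transmission of segment
40:60 and the total time until cboss sends the first RST.  The timestamps
are hand-copied from the trace above; the names SENDS, RST_AT, parse and
gaps are made up for this illustration and are not part of tcpdump or any
FreeBSD tool.  It only does the arithmetic and says nothing about the cause.

```python
from datetime import datetime

# Timestamps copied from the trace: the first transmission of segment
# 40:60 followed by every retransmission of it (assumption: no packets
# are missing from the capture).
SENDS = [
    "11:40:40.780735", "11:40:41.008779", "11:40:41.268786",
    "11:40:41.588797", "11:40:42.028822", "11:40:42.708951",
    "11:40:43.868880", "11:40:45.988960", "11:40:48.109027",
    "11:40:50.229094", "11:40:52.349177", "11:40:54.469236",
    "11:40:56.589311",
]
# Timestamp of the first RST sent by cboss.
RST_AT = "11:40:58.709370"


def parse(stamp):
    """Parse a tcpdump HH:MM:SS.microseconds timestamp."""
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def gaps(stamps):
    """Return the seconds elapsed between consecutive timestamps."""
    times = [parse(s) for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Gaps between sends, with the final gap being last retransmit -> RST.
intervals = gaps(SENDS + [RST_AT])
retransmissions = len(SENDS) - 1
total = (parse(RST_AT) - parse(SENDS[0])).total_seconds()

if __name__ == "__main__":
    for number, gap in enumerate(intervals, start=1):
        print(f"gap {number:2d}: {gap:.6f} s")
    print(f"retransmissions: {retransmissions}")
    print(f"first send to RST: {total:.6f} s")
```

On this trace the gaps grow from roughly a quarter of a second and then
stop growing at about 2.12 seconds, and the time from the first send to
the RST comes out a little under 18 seconds, which is in line with the
"about 20 seconds" estimate.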