From: Sten Daniel Soersdal
Date: Mon, 21 Apr 2008 02:13:03 -0400
To: Mark Hills
Cc: Peter Jeremy, freebsd-net@freebsd.org
Subject: Re: read() returns ETIMEDOUT on steady TCP connection
Message-ID: <480C306F.5000006@gmail.com>
References: <20080420025010.GJ73016@server.vk2pj.dyndns.org>

Mark Hills wrote:
> On Sun, 20 Apr 2008, Peter Jeremy wrote:
>
>> Can you give some more detail about your hardware (speed, CPU,
>> available RAM, UP or SMP) and the application (roughly what does the
>> core of the code look like and is it single-threaded/multi-threaded
>> and/or multi-process).
>
> The current test is a Dell 2650, 2Gb, Quad Xeon with onboard bge.
>
> The application is single threaded, non-blocking multiplexed I/O based
> on poll(). It's relatively simple at its core -- read() from an inbound
> connection and write() to outbound sockets.
>
>>> As the number of outbound connections increases, the 'output drops'
>>> increases to around 10% of the total packets sent and maintains that
>>> ratio.
>>> There's no problems with network capacity.
>>
>> 'output drops' (ips_odropped) means that the kernel is unable to
>> buffer the write (no mbufs or send queue full). Userland should see
>> ENOBUFS unless the error was triggered by a fragmentation request.
>
> The app definitely isn't seeing ENOBUFS; this would be treated as a
> fatal condition and reported.
>
>> I can't explain the problem but it definitely looks like a resource
>> starvation issue within the kernel.
>
> I've traced the source of the ETIMEDOUT within the kernel to
> tcp_timer_rexmt() in tcp_timer.c:
>
>     if (++tp->t_rxtshift > TCP_MAXRXTSHIFT) {
>             tp->t_rxtshift = TCP_MAXRXTSHIFT;
>             tcpstat.tcps_timeoutdrop++;
>             tp = tcp_drop(tp, tp->t_softerror ?
>                 tp->t_softerror : ETIMEDOUT);
>             goto out;
>     }
>
> I'm new to FreeBSD, but it seems to imply that it's reaching a limit
> of a number of retransmits of sending ACKs on the TCP connection
> receiving the inbound data? But I checked this using tcpdump on the
> server and could see no retransmissions.
>
> As a test, I ran a simulation with the necessary changes to increase
> TCP_MAXRXTSHIFT (including adding appropriate entries to
> tcp_sync_backoff[] and tcp_backoff[]) and it appeared I was able to
> reduce the frequency of the problem occurring, but not to a usable level.
>
> With ACKs in mind, I took the test back to stock kernel and
> configuration, and went ahead with disabling sack on the server and the
> client which supplies the data (FreeBSD 6.1, not 7). This greatly
> reduced the 'duplicate acks' metric, but didn't fix the problem. The
> next step was to switch off delayed_ack as well, and I didn't see the
> problem for some hours on the test system at 850mbit output. But hasn't
> eliminated it, as it happened again.
>
> Perhaps someone with a greater knowledge can help to join the dots of
> all these symptoms?

Verify that you are not experiencing connection loss due to MTU-related
issues. What is the path MTU? Is the MSS adjusted along the way?
Try turning off txcsum and rxcsum using ifconfig.

Just my $0.02

--
Sten Daniel Soersdal
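
One way to look at the MSS question from inside the application is to ask the stack what segment size it is using on a given socket. The following is only a minimal sketch under stated assumptions: connection_mss() is a hypothetical helper name, not something from the application discussed in this thread, and it relies on the TCP_MAXSEG socket option being readable through getsockopt() on the platform in use.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * connection_mss() is a hypothetical helper, not part of any library.
 * It returns the maximum segment size the stack reports for a TCP
 * socket, or -1 if getsockopt() fails (errno is left as set by
 * getsockopt(), e.g. EBADF for an invalid descriptor).
 */
int
connection_mss(int fd)
{
	int mss = 0;
	socklen_t len = sizeof(mss);

	if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) == -1)
		return -1;
	return mss;
}
```

Called on a connected socket, the result can be compared with the interface MTU minus 40 bytes (20 bytes of IPv4 header plus 20 bytes of TCP header, without options). A noticeably smaller value would suggest the MSS is being lowered somewhere, although the exact number reported may also depend on the TCP options in use.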