From: Andre Oppermann <andre@freebsd.org>
Date: Tue, 02 Aug 2011 23:25:09 +0200
To: Steven Hartland
Cc: freebsd-net@freebsd.org, lstewart@freebsd.org
Subject: Re: tcp failing to recover from a packet loss under 8.2-RELEASE?

On 02.08.2011 14:32, Steven Hartland wrote:
> ----- Original Message ----- From: "Steven Hartland"
>> Setting net.inet.tcp.reass.maxsegments=8148 and rerunning the
>> tests appears to result in a solid 14MB/s. It's still running a
>> full soak test, but it's looking very promising :)
>
> Ok, so the full test has completed with over 134GB of data
> transferred without a hitch.
>
> I had never been able to complete this test without error on this
> target machine before the change to maxsegments, so packets being
> dropped in the TCP reassembly code due to tcp_reass global zone
> exhaustion is almost certainly the cause of the stalls we're
> seeing -- which is good news :)

The zone exhaustion is part of the problem, but it is not the real
cause of the stall you're seeing. When the reassembly zone is
exhausted, the retransmit that would fill the hole between the
socket buffer and the out-of-order segments in the reassembly queue
can't be processed anymore. This brings every TCP session with data
in the reassembly queue to a standstill.

> I suspect there are a few contributing factors at play here:
>
> 1. There is known to be a dodgy fibre on the test route, which is
> scheduled for cleaning but is currently causing a small amount of
> packet loss between the sender and receiver.

Packet loss, and hence use of the reassembly queue.

> 2. The target machine has 24 cores and is running 8 queues on the
> igb interface, which could require packets to be reordered even if
> they arrived in order on the wire. I say this because disabling
> MSI-X on igb0, which results in just one queue, did reduce the
> occurrence rate of the problem.

Maybe this is a compounding factor.

> 3. Combining a low-latency (<1ms), high-throughput (~64MB/s)
> connection with a lower-throughput but still relatively
> high-bandwidth (~14MB/s) connection at 10ms latency.

What matters here is the socket buffer and receive window size.
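
To put rough numbers on that (a back-of-the-envelope sketch in C;
the bandwidth and latency come from your report, everything else
is illustrative):

    #include <stdio.h>

    int main(void)
    {
        double bw  = 14e6;      /* ~14MB/s path, from the report */
        double rtt = 0.010;     /* ~10ms latency, from the report */
        double mss = 1448;      /* typical TCP payload on Ethernet */

        /* The receive window has to cover the bandwidth-delay
         * product, and a single hole can leave that entire window
         * sitting in the reassembly queue. */
        double bdp = bw * rtt;
        printf("receive window needed: ~%.0f KB\n", bdp / 1024.0);
        printf("segments one hole can queue: ~%.0f\n", bdp / mss);
        return 0;
    }

With a larger socket buffer the window grows accordingly, so a
handful of lossy sessions can pin a large share of the shared
reassembly zone between them.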
> I look forward to hearing people's thoughts on what the actual
> fix should be: an increased default nmbclusters, a decreased
> nmbclusters => maxsegments divisor, or something else?

Those knobs may or may not need adjustment, but tuning them alone
means it will eventually wedge again. The real fix is that the
reassembly queue must be able to process the one missing segment
despite having exhausted the global zone limit. I had fixed that at
one point in time; it seems the fix got lost in the recent changes.
Please try this patch:

  http://people.freebsd.org/~andre/tcp_reass.c-logdebug+missingsegment-20110802.diff

Run it with the normal amount of nmbclusters; it should still
prevent the sessions from stalling.

-- 
Andre
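
P.S. For the curious, here is the idea behind the patch as a
user-space sketch (illustrative only -- not the actual diff, and
none of the identifiers are the kernel's):

    #include <stdio.h>
    #include <stdbool.h>

    #define ZONE_LIMIT 4      /* stand-in for the global zone limit */

    static int zone_used;     /* queue entries currently allocated */
    static unsigned rcv_nxt;  /* next in-order segment expected */

    static bool reass_input(unsigned seq)
    {
        if (seq == rcv_nxt) {
            /* The missing segment: deliver it and drain the queue
             * WITHOUT requiring a new zone entry.  This is the
             * case that must keep working under zone exhaustion. */
            rcv_nxt += 1 + zone_used; /* simplified: queue is contiguous */
            zone_used = 0;            /* queued entries freed again */
            return true;
        }
        if (zone_used >= ZONE_LIMIT)
            return false;     /* out of order and no room: drop */
        zone_used++;          /* queue the out-of-order segment */
        return true;
    }

    int main(void)
    {
        for (unsigned seq = 1; seq <= 4; seq++)  /* seq 0 was lost */
            reass_input(seq);                    /* zone is now full */
        if (reass_input(0))                      /* retransmit arrives */
            printf("hole filled, rcv_nxt=%u, queue drained\n", rcv_nxt);
        return 0;
    }

If instead the hole-filling retransmit also had to allocate a zone
entry, it would be dropped here and the session would sit on the
queued data forever -- which is exactly the stall observed.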