Date: Fri, 24 Jan 2014 00:06:49 -0500
In-Reply-To: <390483613.15499210.1390530437153.JavaMail.root@uoguelph.ca>
References: <390483613.15499210.1390530437153.JavaMail.root@uoguelph.ca>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
From: J David
To: Rick Macklem
Cc: freebsd-net@freebsd.org

On Thu, Jan 23, 2014 at 9:27 PM, Rick Macklem wrote:
> Well, my TCP is pretty rusty, but...
> Since your stats didn't show any jumbo frames, each IP
> datagram needs to fit in the MTU of 1500bytes. NFS hands an mbuf
> list of just over 64K (or 32K) to TCP in a single sosend(), then TCP
> will generate about 45 (or about 23 for 32K) TCP segments and put
> each in an IP datagram, then hand it to the network device driver
> for transmission.

This is *not* what happens with TSO/LRO.

With TSO, TCP generates IP datagrams of up to 64k which are passed
directly to the driver, which passes them directly to the hardware.

Furthermore, in this unique case (two virtual machines on the same
host and bridge with both TSO and LRO enabled end-to-end), the packet
is *never* fragmented. The host takes the 64k packet off of one
guest's output ring and puts it onto the other guest's input ring,
intact.

This is, as you might expect, a *massive* performance win.
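
As a rough check of the quoted estimate of about 45 (or about 23) segments, the count without TSO can be sketched as below. The 20-byte IPv4 and 20-byte TCP header sizes are assumptions (no options, no timestamps), the RPC overhead behind "just over 64K" is ignored, and the function name is only illustrative.

```python
import math

MTU = 1500        # bytes, standard Ethernet MTU (no jumbo frames)
IP_HEADER = 20    # assumption: IPv4 header without options
TCP_HEADER = 20   # assumption: TCP header without options (no timestamps)


def segments_needed(payload_bytes, mtu=MTU):
    """Number of MTU-sized TCP segments needed to carry payload_bytes."""
    mss = mtu - IP_HEADER - TCP_HEADER  # payload bytes carried per segment
    return math.ceil(payload_bytes / mss)


if __name__ == "__main__":
    for size in (64 * 1024, 32 * 1024):
        print(f"{size} bytes -> {segments_needed(size)} segments")
```

With TSO the stack would instead hand the driver a single datagram of up to 64k, so the per-segment work in this loop moves out of the TCP stack.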
With TSO & LRO:

$ time iperf -c 172.20.20.162 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  5] local 172.20.20.169 port 60889 connected with 172.20.20.162 port 5001
[  4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port 44101
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  17.0 GBytes  14.6 Gbits/sec
[  4]  0.0-10.0 sec  17.4 GBytes  14.9 Gbits/sec

real 0m10.061s
user 0m0.229s
sys 0m7.711s

Without TSO & LRO:

$ time iperf -c 172.20.20.162 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.20.20.162, TCP port 5001
TCP window size: 1.26 MByte (default)
------------------------------------------------------------
[  5] local 172.20.20.169 port 22088 connected with 172.20.20.162 port 5001
[  4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port 48615
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec   637 MBytes   534 Mbits/sec
[  4]  0.0-10.0 sec   767 MBytes   642 Mbits/sec

real 0m10.057s
user 0m0.231s
sys 0m3.935s

Look at the difference. In this bidirectional test, TSO is over 25x
faster using not even 2x the CPU. This shows how essential TSO/LRO is
if you plan to move data at real world speeds and still have enough
CPU left to operate on that data.

> I recall you saying you tried turning off TSO with no
> effect. You might also try turning off checksum offload. I doubt it will
> be where things are broken, but might be worth a try.

That was not me, that was someone else.

If there is a problem with NFS and TSO, the solution is *not* to
disable TSO. That is, at best, a workaround that produces much more
CPU load and much less throughput. The solution is to find the
problem and fix it.

More data to follow.

Thanks!
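
For anyone who wants to recheck the "over 25x" and "not even 2x the CPU" figures, here is a minimal sketch using only the numbers from the two iperf runs above. It assumes the two directions of each bidirectional test can simply be summed, and it uses the `sys` time reported by `time` on the client as a rough stand-in for CPU cost; the variable and function names are illustrative.

```python
# Bandwidth per direction, in Mbits/sec, copied from the iperf output.
WITH_OFFLOAD_MBPS = [14.6 * 1000, 14.9 * 1000]   # TSO & LRO enabled
WITHOUT_OFFLOAD_MBPS = [534, 642]                # TSO & LRO disabled

# System CPU seconds reported by `time` for the client-side iperf process.
SYS_WITH_OFFLOAD = 7.711
SYS_WITHOUT_OFFLOAD = 3.935


def throughput_ratio():
    """Aggregate (both directions) throughput with offload vs. without."""
    return sum(WITH_OFFLOAD_MBPS) / sum(WITHOUT_OFFLOAD_MBPS)


def cpu_ratio():
    """System CPU time with offload vs. without."""
    return SYS_WITH_OFFLOAD / SYS_WITHOUT_OFFLOAD


if __name__ == "__main__":
    print(f"throughput: {throughput_ratio():.1f}x")
    print(f"sys CPU:    {cpu_ratio():.2f}x")
```

This only covers the client guest's CPU time; host-side and server-side cost are not visible in these numbers.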