Date: Wed, 30 May 2012 20:18:05 -0400
From: Andrew Gallatin
To: Colin Percival
Cc: freebsd-net@freebsd.org
Subject: Re: [please review] TSO mbuf chain length limiting patch
Message-ID: <4FC6B8BD.3060108@cs.duke.edu>
In-Reply-To: <4FC6A0C1.3010307@freebsd.org>

On 05/30/12 18:35, Colin Percival wrote:
> On 05/30/12 08:30, Andrew Gallatin wrote:
>> On 05/30/12 10:59, Colin Percival wrote:
>>> The Xen virtual network interface has an issue (ok, really the issue is with
>>> the linux back-end, but that's what most people are using) where it can't
>>> handle scatter-gather writes with lots of pieces, aka. long mbuf chains.
>>> This currently bites us hard with TSO enabled, since it produces said long
>>> mbuf chains.
>>
>> I've never been clear about what the max TSO size supported by FreeBSD
>> is. The NIC I maintain (mxge) is limited to 64K - epsilon for both
>> IPv4 *AND* IPv6. Up until now, this has been enforced by the 16-bit
>> ip length limit of IPv4 and we have not had IPv6 TSO until this week.
>> With IPv6, I'm worried that FreeBSD may now send packets down larger
>> than I could handle. In my case, however, the problem is not s/g list
>> length, but rather it is internal limits in the NIC which limit us to
>> 64K - epsilon for IPv6 as well. I think there may be other NICs in
>> the same boat for IPv6 (and maybe even some which cannot handle the
>> full 64K for IPv4).
>>
>> Your approach would not work well for my size limit. For
>> example, I'd have to set the limit to 4 mbufs to stay under 64KB.
>> This would be assuming the worst case of 16KB jumbo mbufs, so
>> that would limit me to ~8KB per TSO if 2KB mbufs were used.
>
> Right, the problem you describe isn't the one I was trying to solve. :-)
>
>> I think a better approach would be to have a limit on the size of the
>> pre-segmented TCP payload size sent to the driver. I tend to think
>> that this would be more generically useful, and it is a better match
>> for the NDIS APIs, where a driver must specify the max TSO size. I
>> think the changes to the TCP stack might be simpler (eg, they
>> would seem to jive better with the existing "maxmtu" approach).
>>
>> I think this could work for you as well. You could set the Xen max
>> tso size to be 32K (derived from 18 pages/skb, multiplied by a typical
>> 2KB mbuf size, with some slack built in). If the chain was too large,
>> you could m_defrag it down to size.
>
> Sounds good -- I don't want to m_defrag too often, but I imagine in most
> cases when TSO is being invoked most of the mbufs will have 2 kB each.
> This should also make the patch simpler by avoiding the need to modify
> uipc_mbuf.c; if we just limit the TSO payload size then the TCP stack can
> figure things out by itself.
>
> Are you working on a patch, or should I put one together?
>

No, I'd like to, but I'm afraid that I just don't have the time right
now. I would very much appreciate it if you could put it together.
I'd be happy to review it.

Thanks,

Drew
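
The arithmetic behind the two kinds of limit discussed above (a cap on mbuf chain length versus a cap on TSO payload bytes) can be sketched as follows. This is only an illustration of the numbers quoted in the thread. The function and constant names are hypothetical and do not come from the FreeBSD tree or from any patch. The "epsilon" in "64K - epsilon" is ignored, and the 4 KB slack is an assumption chosen so that 18 segments of 2 KB come out at the 32K figure mentioned.

```python
# Illustrative arithmetic only; all names here are hypothetical.
KB = 1024
CLUSTER_2K = 2 * KB      # typical mbuf cluster size mentioned in the thread
JUMBO_16K = 16 * KB      # worst-case jumbo mbuf size mentioned in the thread
NIC_MAX_TSO = 64 * KB    # "64K - epsilon" hardware limit, epsilon ignored


def max_chain_len(max_bytes, worst_case_mbuf):
    """Longest chain that stays within max_bytes even if every mbuf
    is of the worst-case size."""
    return max_bytes // worst_case_mbuf


def chain_payload(n_mbufs, mbuf_size):
    """Bytes carried by a chain of n_mbufs full mbufs of mbuf_size."""
    return n_mbufs * mbuf_size


def byte_limit_from_segments(max_segs, typical_mbuf, slack):
    """Byte limit derived from a segment-count limit, minus some slack.
    The slack value is an assumption, not a figure from the thread."""
    return max_segs * typical_mbuf - slack


if __name__ == "__main__":
    n = max_chain_len(NIC_MAX_TSO, JUMBO_16K)
    print("chain-length limit for a 64K NIC:", n, "mbufs")
    print("payload with 2KB mbufs under that limit:",
          chain_payload(n, CLUSTER_2K) // KB, "KB")
    print("byte limit from 18 segments of 2KB with 4KB slack:",
          byte_limit_from_segments(18, CLUSTER_2K, 4 * KB) // KB, "KB")
```

A chain-length cap sized for the worst case wastes most of the NIC's capacity when small mbufs are used, which is why a byte-based limit suits both the 64K hardware case and the Xen segment-count case.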