Date:      Sun, 18 Aug 2013 06:48:58 -0700 (PDT)
From:      Barney Cordoba <barney_cordoba@yahoo.com>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        Lawrence Stewart <lstewart@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, FreeBSD Net <net@freebsd.org>
Subject:   Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)
Message-ID:  <1376833738.94737.YahooMailNeo@web121605.mail.ne1.yahoo.com>
In-Reply-To: <CAJ-VmonGeqn5qqbfvF9xWaFPYNMNSVb6VwMx%2BoEVSGXVid98ag@mail.gmail.com>
References:  <520A6D07.5080106@freebsd.org>	<520AFBE8.1090109@freebsd.org>	<520B24A0.4000706@freebsd.org>	<520B3056.1000804@freebsd.org>	<20130814102109.GA63246@onelab2.iet.unipi.it>	<1376745244.6575.YahooMailNeo@web121606.mail.ne1.yahoo.com>	<1376748170.66110.YahooMailNeo@web121601.mail.ne1.yahoo.com> <CAJ-VmonGeqn5qqbfvF9xWaFPYNMNSVb6VwMx%2BoEVSGXVid98ag@mail.gmail.com>

________________________________
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: Luigi Rizzo <rizzo@iet.unipi.it>; Lawrence Stewart <lstewart@freebsd.org>; FreeBSD Net <net@freebsd.org>
Sent: Saturday, August 17, 2013 11:59 AM
Subject: Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)

... we get perfectly good throughput without 400k ints a second on the ixgbe driver.

As in, I can easily saturate 2 x 10GE on ixgbe hardware with a handful of flows. That's not terribly difficult.

However, there's a few interesting problems that need addressing:

* There's lock contention between the transmit side from userland and the TCP timers, and the receive side with ACK processing. Under very high traffic load a lot of lock contention stalls things. We (the royal "we", I'm mostly just doing tooling at the moment) are working on that.
* There's lock contention on the ARP, routing table and PCB lookups. The latter will go away when we've finally implemented RSS for transmit and receive and then moved things over to using PCB groups on CPUs which have NIC driver threads bound to them.
* There's increasing cache thrashing from a larger workload, causing the expensive lookups to be even more expensive.
* All the list walks suck. We need to be batching things so we use CPU caches much more efficiently.

The idea of using TSO on the transmit side and generic LRO on the receive side is to make the per-packet overhead less. I think we can be much more efficient in general in packet processing, but that's a big task. :-) So, using at least TSO is a big benefit if purely to avoid decomposing things into smaller mbufs and contending on those locks in a very big way.

I'm working on PMC to make it easier to use to find these bottlenecks and make the code and data more efficient. Then, likely, I'll end up hacking on generic TSO/LRO, TX/RX RSS queue management and make the PCB group thing default on for SMP machines. I may even take a knife to some of the packet processing overhead.

-------------------------------

The ints/sec reference was based on Luigi's implication that turning off moderation was some sort of performance choice.

Again, you're talking "throughput" and not efficiency. I could fill a tx queue with 10gb of traffic with yesteryear's cpus. It's not an achievement. Being able to bridge real traffic at 10gb/s with 2 cores is.

BC
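
As a rough illustration of the per-packet overhead argument above, the sketch below estimates how many frames per second a 10 Gb/s Ethernet link carries, and how many sends the host stack would issue if each send were one large TSO chunk. None of these numbers come from the thread: the 1500-byte MTU, the standard Ethernet framing overhead, the 40 bytes of IPv4+TCP headers, and the 64 KiB TSO size are all assumptions, and the function names are made up for illustration. The TSO figure is approximate because it treats every frame on the wire as full-size.

```python
# Back-of-the-envelope packet rates for a 10 Gb/s Ethernet link.
# Assumptions (not from the thread): 1500-byte MTU, standard Ethernet
# framing overhead, 40 bytes of IPv4+TCP headers, 64 KiB TSO sends.

LINK_BPS = 10_000_000_000        # 10 Gb/s line rate
ETH_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble/SFD + inter-frame gap


def frames_per_second(payload_bytes: int, link_bps: int = LINK_BPS) -> float:
    """Frames per second at line rate for a given Ethernet payload size."""
    wire_bits = (payload_bytes + ETH_OVERHEAD) * 8
    return link_bps / wire_bits


def stack_sends_per_second(tso_bytes: int, mtu: int = 1500,
                           ip_tcp_headers: int = 40) -> float:
    """Approximate sends per second when each send is one TSO chunk.

    The NIC splits each chunk into MSS-sized segments, so the stack's
    per-packet work happens once per chunk rather than once per frame.
    """
    mss = mtu - ip_tcp_headers
    segments_per_send = -(-tso_bytes // mss)  # ceiling division
    return frames_per_second(mtu) / segments_per_send


if __name__ == "__main__":
    print(f"full-size frames/s : {frames_per_second(1500):,.0f}")
    print(f"min-size frames/s  : {frames_per_second(46):,.0f}")
    print(f"64 KiB TSO sends/s : {stack_sends_per_second(65536):,.0f}")
```

Under these assumptions the stack's per-send work drops by roughly the number of segments per TSO chunk. The same arithmetic shows why bridging or forwarding real traffic is the harder case: there the per-frame cost has to be paid for every frame that arrives.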

Date:      Sun, 18 Aug 2013 09:25:27 -0500
From:      Jim Thompson <jim@netgate.com>
To:        Barney Cordoba <barney_cordoba@yahoo.com>
Cc:        Lawrence Stewart <lstewart@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, FreeBSD Net <net@freebsd.org>
Subject:   Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux)
Message-ID:  <71EA3DFB-B410-432D-98E0-B6341556BE6D@netgate.com>
In-Reply-To: <1376833738.94737.YahooMailNeo@web121605.mail.ne1.yahoo.com>
References:  <520A6D07.5080106@freebsd.org> <520AFBE8.1090109@freebsd.org> <520B24A0.4000706@freebsd.org> <520B3056.1000804@freebsd.org> <20130814102109.GA63246@onelab2.iet.unipi.it> <1376745244.6575.YahooMailNeo@web121606.mail.ne1.yahoo.com> <1376748170.66110.YahooMailNeo@web121601.mail.ne1.yahoo.com> <CAJ-VmonGeqn5qqbfvF9xWaFPYNMNSVb6VwMx+oEVSGXVid98ag@mail.gmail.com> <1376833738.94737.YahooMailNeo@web121605.mail.ne1.yahoo.com>


On Aug 18, 2013, at 8:48 AM, Barney Cordoba <barney_cordoba@yahoo.com> wrote:

> I could fill a tx queue with 10gb of traffic with yesteryear's cpus. It's not an achievement. Being able to bridge
> real traffic at 10gb/s with 2 cores is

Or forward at layer 3.

Or filter packets.

Or IPSEC.

Or...


