Date:      Mon, 2 Sep 2013 05:47:17 -0700 (PDT)
From:      Barney Cordoba <barney_cordoba@yahoo.com>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        Andre Oppermann <andre@freebsd.org>, Alan Somers <asomers@freebsd.org>, "net@freebsd.org" <net@freebsd.org>, Jack F Vogel <jfv@freebsd.org>, "Justin T. Gibbs" <gibbs@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, "T.C. Gubatayao" <tgubatayao@barracuda.com>
Subject:   Re: Flow ID, LACP, and igb
Message-ID:  <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
In-Reply-To: <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com>
References:  <D01A0CB2-B1E3-4F4B-97FA-4C821C0E3FD2@FreeBSD.org> <521BBD21.4070304@freebsd.org> <CAOtMX2jvKGY==t9i-a_8RtMAPH2p1VDj950nMHHouryoz3nbsA@mail.gmail.com> <521EE8DA.3060107@freebsd.org> <BCC2C62D4FE171479E2F1C2593FE508B0BE24383@BN-SCL-MBX03.Cudanet.local> <CAOtMX2h5SGh5eYV50y+QB_s367V9iattGU862wwXcONDV+TG8g@mail.gmail.com> <CA+hQ2+hgTaK1ZCOLGVFjSPY8nyNPHK4waSecyRQxR1gQcyjztg@mail.gmail.com> <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com> <BCC2C62D4FE171479E2F1C2593FE508B0BE2440B@BN-SCL-MBX03.Cudanet.local> <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com> <CA+hQ2+j-DDuEX1KCDYioCactjL71p-d4AtusPUfePrswDyUpog@mail.gmail.com> <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com> <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com>

Are you using a pcie3 bus? Of course this is only an issue for 10g; what pct of
FreeBSD users have a load over 9.5Gb/s? It's completely unnecessary for igb
or em driver, so why is it used? because it's there.

Here's my argument against it. The handful of brains capable of doing driver
development become consumed with BS like LRO and the things that need to be
fixed, like buffer management and basic driver design flaws, never get fixed.
The offload code makes the driver code a virtual mess that can only be
maintained by Jack and 1 other guy in the entire world. And it takes 10 times
longer to make a simple change or to add support for a new NIC.

In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
"consumer buffer" problem, reduced locking by 40% and now the igb driver
uses 20% less cpu with a full gig load.

And the code is cleaner and more easily maintained.

BC


________________________________
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers <asomers@freebsd.org>; "net@freebsd.org" <net@freebsd.org>; Jack F Vogel <jfv@freebsd.org>; Justin T. Gibbs <gibbs@freebsd.org>; Luigi Rizzo <rizzo@iet.unipi.it>; T.C. Gubatayao <tgubatayao@barracuda.com>
Sent: Sunday, September 1, 2013 4:51 PM
Subject: Re: Flow ID, LACP, and igb


Yo,

LRO is an interesting hack that seems to do a good trick of hiding the
ridiculous locking and unfriendly cache behaviour that we do per-packet.

It helps with LAN test traffic where things are going out in batches from
the TCP layer so the RX layer "sees" these frames in-order and can do LRO.
When you disable it, I don't easily get 10GE LAN TCP performance. That has
to be fixed. Given how fast the CPU cores, bus interconnect and memory
interconnects are, I don't think there should be any reason why we can't
hit 10GE traffic on a LAN with LRO disabled (in both software and hardware.)

Now that I have the PMC sandy bridge stuff working right (but no PEBS, I
have to talk to Intel about that in a bit more detail before I think about
hacking that in) we can get actual live information about this stuff. But
the last time I looked, there's just too much per-packet latency going on.
The root cause looks like it's a toss up between scheduling, locking and
just lots of code running to completion per-frame. As I said, that all has
to die somehow.

2c,


-adrian


On 1 September 2013 08:45, Barney Cordoba <barney_cordoba@yahoo.com> wrote:

> Comcast sends packets OOO. With any decent number of internet hops you're
> likely to encounter a load balancer or packet shaper that sends packets
> OOO, so you just can't be worried about it. In fact, your designs MUST
> work with OOO packets.
>
> Getting balance on your load balanced lines is certainly a bigger upside
> than the additional CPU used. You can buy a faster processor for your
> "stack" for a lot less than you can buy bandwidth.
>
> Frankly my opinion of LRO is that it's a science project suitable for labs
> only. It's a trick to get more bandwidth than your bus capacity; the
> answer is to not run PCIe2 if you need pcie3. You can use it internally if
> you have control of all of the machines. When I modify a driver the first
> thing that I do is rip it out.
>
> BC
>
>
> ________________________________
> From: Luigi Rizzo <rizzo@iet.unipi.it>
> To: Barney Cordoba <barney_cordoba@yahoo.com>
> Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers <asomers@freebsd.org>;
> "net@freebsd.org" <net@freebsd.org>; Jack F Vogel <jfv@freebsd.org>;
> Justin T. Gibbs <gibbs@freebsd.org>; T.C. Gubatayao <tgubatayao@barracuda.com>
> Sent: Saturday, August 31, 2013 10:27 PM
> Subject: Re: Flow ID, LACP, and igb
>
>
> On Sun, Sep 1, 2013 at 4:15 AM, Barney Cordoba <barney_cordoba@yahoo.com> wrote:
>
> > ...
> >
>
> [your point on testing with realistic assumptions is surely a valid one]
>
> >
> > Of course there's nothing really wrong with OOO packets. We had this
> > discussion before; lots of people have round robin dual homing without
> > any ill effects. It's just not an issue.
> >
>
> It depends on where you are.
> It may not be an issue if the reordering is not large enough to
> trigger retransmissions, but even then it is annoying as it causes
> more work in the endpoint -- it prevents LRO from working, and even
> on the host stack it takes more work to sort where an out of order
> segment goes than appending an in-order one to the socket buffer.
>
> cheers
> luigi
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
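
The disagreement above turns on one mechanical point: LRO can only merge a
frame into the previous one when it continues the same flow exactly where the
last frame ended, so reordered arrivals force a flush and the per-packet cost
comes back. A minimal sketch of that behaviour follows. It is a toy model in
Python, not the igb driver code or FreeBSD's LRO implementation; the Segment
type, the flow key, and the coalesce() function are hypothetical names chosen
only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flow: tuple      # hypothetical flow key, e.g. (src, dst, sport, dport)
    seq: int         # TCP sequence number of the first payload byte
    payload: bytes


def coalesce(segments, max_len=65535):
    """Toy LRO: merge back-to-back in-order segments of the same flow.

    Returns the list of (possibly merged) segments handed to the stack.
    """
    delivered = []   # what the stack would see, in order of delivery
    pending = {}     # flow key -> segment currently being aggregated

    for seg in segments:
        cur = pending.get(seg.flow)
        if cur is not None:
            in_order = seg.seq == cur.seq + len(cur.payload)
            fits = len(cur.payload) + len(seg.payload) <= max_len
            if in_order and fits:
                # Cheap path: extend the aggregate, no extra delivery.
                cur.payload += seg.payload
                continue
            # Out of order (or too large): flush what we have so far.
            delivered.append(cur)
        # Start a new aggregate from a copy so the input is not mutated.
        pending[seg.flow] = Segment(seg.flow, seg.seq, seg.payload)

    # End of the batch (e.g. end of one receive pass): flush everything.
    delivered.extend(pending.values())
    return delivered
```

In this model, three 100-byte segments of one flow arriving at sequence
numbers 0, 100, 200 reach the stack as a single 300-byte segment, while the
same three arriving as 0, 200, 100 reach it as three separate segments. That
is the sense in which reordering "prevents LRO from working"; whether the
extra per-packet work matters in practice is the question the thread is
arguing about.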

Date:      Mon, 2 Sep 2013 15:01:21 +0200
From:      Olivier Cochard-Labbé <olivier@cochard.me>
To:        Barney Cordoba <barney_cordoba@yahoo.com>
Cc:        "net@freebsd.org" <net@freebsd.org>
Subject:   Re: Flow ID, LACP, and igb
Message-ID:  <CA+q+TcoxWLqQCh=MjB9UDkbCia0+dTkCQKnNY8K6c7HH_eqkpw@mail.gmail.com>
In-Reply-To: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
References:  <D01A0CB2-B1E3-4F4B-97FA-4C821C0E3FD2@FreeBSD.org> <521BBD21.4070304@freebsd.org> <CAOtMX2jvKGY==t9i-a_8RtMAPH2p1VDj950nMHHouryoz3nbsA@mail.gmail.com> <521EE8DA.3060107@freebsd.org> <BCC2C62D4FE171479E2F1C2593FE508B0BE24383@BN-SCL-MBX03.Cudanet.local> <CAOtMX2h5SGh5eYV50y+QB_s367V9iattGU862wwXcONDV+TG8g@mail.gmail.com> <CA+hQ2+hgTaK1ZCOLGVFjSPY8nyNPHK4waSecyRQxR1gQcyjztg@mail.gmail.com> <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com> <BCC2C62D4FE171479E2F1C2593FE508B0BE2440B@BN-SCL-MBX03.Cudanet.local> <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com> <CA+hQ2+j-DDuEX1KCDYioCactjL71p-d4AtusPUfePrswDyUpog@mail.gmail.com> <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com> <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com> <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>

On Mon, Sep 2, 2013 at 2:47 PM, Barney Cordoba <barney_cordoba@yahoo.com> wrote:
>
> In a week I ripped out the offload crap and the 9000 sysctls, eliminated the
> "consumer buffer" problem, reduced locking by 40% and now the igb driver
> uses 20% less cpu with a full gig load.
>

Wow!

Where is the patch? I would like to test it too.

Thanks,

Olivier


