Date:      Tue, 3 Sep 2013 01:04:53 +0100
From:      "Joe Holden" <lists@rewt.org.uk>
To:        "'Barney Cordoba'" <barney_cordoba@yahoo.com>, "'Adrian Chadd'" <adrian@freebsd.org>
Cc:        'Andre Oppermann' <andre@freebsd.org>, 'Alan Somers' <asomers@freebsd.org>, net@freebsd.org, 'Jack F Vogel' <jfv@freebsd.org>, "'Justin T. Gibbs'" <gibbs@freebsd.org>, 'Luigi Rizzo' <rizzo@iet.unipi.it>, "'T.C. Gubatayao'" <tgubatayao@barracuda.com>
Subject:   RE: Flow ID, LACP, and igb
Message-ID:  <25fd01cea839$39cbc1a0$ad6344e0$@rewt.org.uk>
In-Reply-To: <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>
References:  <D01A0CB2-B1E3-4F4B-97FA-4C821C0E3FD2@FreeBSD.org> <521BBD21.4070304@freebsd.org> <CAOtMX2jvKGY==t9i-a_8RtMAPH2p1VDj950nMHHouryoz3nbsA@mail.gmail.com> <521EE8DA.3060107@freebsd.org> <BCC2C62D4FE171479E2F1C2593FE508B0BE24383@BN-SCL-MBX03.Cudanet.local> <CAOtMX2h5SGh5eYV50y%2BQB_s367V9iattGU862wwXcONDV%2BTG8g@mail.gmail.com> <CA%2BhQ2%2BhgTaK1ZCOLGVFjSPY8nyNPHK4waSecyRQxR1gQcyjztg@mail.gmail.com> <1377952913.44129.YahooMailNeo@web121605.mail.ne1.yahoo.com> <BCC2C62D4FE171479E2F1C2593FE508B0BE2440B@BN-SCL-MBX03.Cudanet.local> <1378001733.36695.YahooMailNeo@web121606.mail.ne1.yahoo.com> <CA%2BhQ2%2Bj-DDuEX1KCDYioCactjL71p-d4AtusPUfePrswDyUpog@mail.gmail.com> <1378050319.62710.YahooMailNeo@web121601.mail.ne1.yahoo.com> <CAJ-VmomEKxJ5zz3Gw1b-HizDJ03_Mn=6uZVYR07QFTqwBzNsCg@mail.gmail.com> <1378126037.56348.YahooMailNeo@web121603.mail.ne1.yahoo.com>

Your argument is horseshit: many usable NICs on x86 and non-x86 (especially
MIPS) platforms will happily do line rate on stock FreeBSD without any
tuning whatsoever.  (I see you don't understand how network interfaces
actually work: pps and frame sizes are what matter, not throughput.)  Also,
a modern Realtek will do higher pps before becoming useless than a 1000 G/CT
that is 2 or 3 generations old.
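
To put numbers on the pps-versus-throughput point, here is a minimal Python
sketch. It assumes standard Ethernet framing with 20 bytes of per-frame wire
overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte
inter-frame gap). The function name line_rate_pps and the frame sizes chosen
are illustrative only and are not taken from any driver or tool.

```python
# Per-frame on-wire overhead for Ethernet (assumption stated above):
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Maximum frames per second at line rate.

    frame_bytes is the Ethernet frame size including the 4-byte FCS
    (64 is the minimum, 1518 the usual untagged maximum).
    """
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


if __name__ == "__main__":
    # Same link speed, very different packet rates depending on frame size.
    for label, link_bps in (("1G", 10**9), ("10G", 10**10)):
        for frame_bytes in (64, 512, 1518):
            pps = line_rate_pps(link_bps, frame_bytes)
            print(f"{label} {frame_bytes:>4}B frames: {pps:,.0f} pps")
```

At 1G this works out to roughly 1.49 Mpps for 64-byte frames against about
81 kpps for 1518-byte frames, so "line rate" says little about the
per-packet work unless the frame size is stated.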

This is *with* 64-bit PCI-X at 133MHz as well as PCIe gen2.  You should
also consider the people buying interfaces from vendors like Chelsio (who
support FreeBSD rather well considering their customer base includes
basically 0 FreeBSD users), who sell 20/80G PCIe interface cards.

In reality CPU load is entirely irrelevant, since 10G won't bother a decent
CPU even with the glaring inefficiencies of the FreeBSD stack.  As long as
it isn't livelocked, who cares?

Ultimately there are very few driver problems and some quite serious stack
design problems which driver behaviour exacerbates.


> -----Original Message-----
> From: owner-freebsd-net@freebsd.org [mailto:owner-freebsd-net@freebsd.org]
> On Behalf Of Barney Cordoba
> Sent: 02 September 2013 13:47
> To: Adrian Chadd
> Cc: Andre Oppermann; Alan Somers; net@freebsd.org; Jack F Vogel; Justin T.
> Gibbs; Luigi Rizzo; T.C. Gubatayao
> Subject: Re: Flow ID, LACP, and igb
>
> Are you using a pcie3 bus? Of course this is only an issue for 10g; what
> pct of FreeBSD users have a load over 9.5Gb/s? It's completely
> unnecessary for igb or em driver, so why is it used? because it's there.
>
> Here's my argument against it. The handful of brains capable of doing
> driver development become consumed with BS like LRO and the things that
> need to be fixed, like buffer management and basic driver design flaws,
> never get fixed. The offload code makes the driver code a virtual mess
> that can only be maintained by Jack and 1 other guy in the entire world.
> And it takes 10 times longer to make a simple change or to add support
> for a new NIC.
>
> In a week I ripped out the offload crap and the 9000 sysctls, eliminated
> the "consumer buffer" problem, reduced locking by 40% and now the igb
> driver uses 20% less cpu with a full gig load.
>
> And the code is cleaner and more easily maintained.
>
> BC
>
>
> ________________________________
>  From: Adrian Chadd <adrian@freebsd.org>
> To: Barney Cordoba <barney_cordoba@yahoo.com>
> Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers
> <asomers@freebsd.org>; "net@freebsd.org" <net@freebsd.org>; Jack F
> Vogel <jfv@freebsd.org>; Justin T. Gibbs <gibbs@freebsd.org>; Luigi Rizzo
> <rizzo@iet.unipi.it>; T.C. Gubatayao <tgubatayao@barracuda.com>
> Sent: Sunday, September 1, 2013 4:51 PM
> Subject: Re: Flow ID, LACP, and igb
>
>
> Yo,
>
> LRO is an interesting hack that seems to do a good trick of hiding the
> ridiculous locking and unfriendly cache behaviour that we do per-packet.
>
> It helps with LAN test traffic where things are going out in batches from
> the TCP layer so the RX layer "sees" these frames in-order and can do LRO.
> When you disable it, I don't easily get 10GE LAN TCP performance. That has
> to be fixed. Given how fast the CPU cores, bus interconnect and memory
> interconnects are, I don't think there should be any reason why we can't
> hit 10GE traffic on a LAN with LRO disabled (in both software and
> hardware.)
>
> Now that I have the PMC sandy bridge stuff working right (but no PEBS, I
> have to talk to Intel about that in a bit more detail before I think about
> hacking that in) we can get actual live information about this stuff. But
> the last time I looked, there's just too much per-packet latency going on.
> The root cause looks like it's a toss up between scheduling, locking and
> just lots of code running to completion per-frame. As I said, that all has
> to die somehow.
>
> 2c,
>
>
>
> -adrian
>
>
>
> On 1 September 2013 08:45, Barney Cordoba
> <barney_cordoba@yahoo.com> wrote:
>
> >
> >
> > Comcast sends packets OOO. With any decent number of internet hops
> > you're likely to encounter a load balancer or packet shaper that sends
> > packets OOO, so you just can't be worried about it. In fact, your
> > designs MUST work with OOO packets.
> >
> > Getting balance on your load balanced lines is certainly a bigger
> > upside than the additional CPU used.
> > You can buy a faster processor for your "stack" for a lot less than
> > you can buy bandwidth.
> >
> > Frankly my opinion of LRO is that it's a science project suitable for
> > labs only. It's a trick to get more bandwidth than your bus capacity;
> > the answer is to not run PCIe2 if you need pcie3.
> > You can use it internally if you have
> > control of all of the machines. When I modify a driver the first thing
> > that I do is rip it out.
> >
> > BC
> >
> >
> > ________________________________
> >  From: Luigi Rizzo <rizzo@iet.unipi.it>
> > To: Barney Cordoba <barney_cordoba@yahoo.com>
> > Cc: Andre Oppermann <andre@freebsd.org>; Alan Somers
> > <asomers@freebsd.org>; "net@freebsd.org" <net@freebsd.org>; Jack F
> > Vogel <jfv@freebsd.org>; Justin T. Gibbs <gibbs@freebsd.org>; T.C.
> > Gubatayao <tgubatayao@barracuda.com>
> > Sent: Saturday, August 31, 2013 10:27 PM
> > Subject: Re: Flow ID, LACP, and igb
> >
> >
> > On Sun, Sep 1, 2013 at 4:15 AM, Barney Cordoba
> > <barney_cordoba@yahoo.com> wrote:
> >
> > > ...
> > >
> >
> > [your point on testing with realistic assumptions is surely a valid
> > one]
> >
> >
> > >
> > > Of course there's nothing really wrong with OOO packets. We had this
> > > discussion before; lots of people have round robin dual homing
> > > without any ill effects. It's just not an issue.
> > >
> >
> > It depends on where you are.
> > It may not be an issue if the reordering is not large enough to
> > trigger retransmissions, but even then it is annoying as it causes
> > more work in the endpoint -- it prevents LRO from working, and even
> > on the host stack it takes more work to sort where an out of order
> > segment goes than appending an in-order one to the socket buffer.
> >
> > cheers
> > luigi
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >




