Date:      Fri, 9 Dec 2011 03:49:11 +0100
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        Lawrence Stewart <lstewart@freebsd.org>, Daniel Kalchev <daniel@digsys.bg>, Jack Vogel <jfvogel@gmail.com>, current@freebsd.org, np@freebsd.org
Subject:   Re: quick summary results with ixgbe (was Re: datapoints on 10G throughput with TCP ?
Message-ID:  <20111209024911.GB86313@onelab2.iet.unipi.it>
In-Reply-To: <4EE15740.9030505@freebsd.org>
References:  <4EDDF9F4.9070508@digsys.bg> <4EDE259B.4010502@digsys.bg> <CAFOYbcmVR_K0iZU_Z4TxDVzPzx6-GZuzfCxUZbf6KQn4siF2UA@mail.gmail.com> <F5BCA7E9-6A61-4492-9F18-423178E9C9B4@digsys.bg> <20111206210625.GB62605@onelab2.iet.unipi.it> <4EDF471F.1030202@freebsd.org> <20111207180807.GA71878@onelab2.iet.unipi.it> <4EE0B796.3050800@freebsd.org> <20111208153454.GA80979@onelab2.iet.unipi.it> <4EE15740.9030505@freebsd.org>

On Fri, Dec 09, 2011 at 01:33:04AM +0100, Andre Oppermann wrote:
> On 08.12.2011 16:34, Luigi Rizzo wrote:
> >On Fri, Dec 09, 2011 at 12:11:50AM +1100, Lawrence Stewart wrote:
...
> >>Jeff tested the WIP patch and it *doesn't* fix the issue. I don't have
> >>LRO capable hardware setup locally to figure out what I've missed. Most
> >>of the machines in my lab are running em(4) NICs which don't support
> >>LRO, but I'll see if I can find something which does and perhaps
> >>resurrect this patch.
> 
> LRO can always be done in software.  You can do it at driver, ether_input
> or ip_input level.

storing LRO state at the driver (as it is done now) is very convenient,
because it is trivial to flush the pending segments at the end of
an rx interrupt. If you want to do LRO in ether_input() or ip_input(),
you need to add another call to flush the LRO state stored there.
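
To illustrate the point, here is a minimal sketch of driver-level LRO state that is flushed at the end of each rx batch. It is a toy model in Python, not the ixgbe or FreeBSD LRO code; the names Segment, DriverLro and rx_interrupt are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    flow: tuple      # hypothetical flow key, e.g. (src, sport, dst, dport)
    seq: int         # TCP sequence number of the first payload byte
    length: int      # payload length in bytes


@dataclass
class DriverLro:
    """Toy per-queue LRO state, owned by the driver's rx routine."""
    pending: dict = field(default_factory=dict)    # flow -> merged Segment
    delivered: list = field(default_factory=list)  # stands in for the stack

    def enqueue(self, seg):
        cur = self.pending.get(seg.flow)
        if cur is not None and cur.seq + cur.length == seg.seq:
            # In-order continuation of the same flow: extend the pending segment.
            cur.length += seg.length
            return
        if cur is not None:
            # Gap or reordering: hand up what we have and start over.
            self.delivered.append(cur)
        self.pending[seg.flow] = Segment(seg.flow, seg.seq, seg.length)

    def flush(self):
        # Called once at the end of the rx interrupt: nothing stays queued.
        self.delivered.extend(self.pending.values())
        self.pending.clear()


def rx_interrupt(lro, ring):
    """Process one batch of received segments, then flush the LRO state."""
    for seg in ring:
        lro.enqueue(seg)
    lro.flush()
```

Because the state lives with the rx routine, the flush is a single call at a point the driver already controls. An implementation in ether_input() or ip_input() would need some other hook to learn that the batch has ended.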

> >a few comments:
> >1. i don't think it makes sense to send multiple acks on
> >    coalesced segments (and the 82599 does not seem to do that).
> >    First of all, the acks would get out with minimal spacing (ideally
> >    less than 100ns) so chances are that the remote end will see
> >    them in a single burst anyways. Secondly, the remote end can
> >    easily tell that a single ACK is reporting multiple MSS and
> >    behave as if an equivalent number of acks had arrived.
> 
> ABC (appropriate byte counting) gets in the way though.

right, during slow start the current ABC specification (RFC3465)
sets a pretty low limit on how much the window can be expanded
on each ACK. On the other hand...
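
As a rough illustration of that limit: RFC 3465 caps the slow-start window increase per ACK at L*SMSS bytes, with L=2 allowed in the normal case. The sketch below is Python; the function name and the 1448-byte SMSS are assumptions for the example, not taken from any stack.

```python
def abc_slow_start_increase(bytes_acked, smss, limit=2):
    """Bytes added to cwnd for one ACK during slow start under ABC.

    RFC 3465: cwnd grows by min(bytes newly acked, L * SMSS).
    """
    return min(bytes_acked, limit * smss)


smss = 1448  # assumed sender MSS in bytes

# One ACK covering 8 coalesced segments.
one_big_ack = abc_slow_start_increase(8 * smss, smss)

# Eight separate ACKs, one per segment.
eight_small_acks = sum(abc_slow_start_increase(smss, smss) for _ in range(8))
```

With these numbers the single coalesced ACK opens the window by 2 segments, while eight individual ACKs would open it by 8.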

> >2. i am a big fan of LRO (and similar solutions), because it can save
> >    a lot of repeated work when passing packets up the stack, and the
> >    mechanism becomes more and more effective as the system load increases,
> >    which is a wonderful property in terms of system stability.
> >
> >    For this reason, i think it would be useful to add support for software
> >    LRO in the generic code (sys/net/if.c) so that drivers can directly use
> >    the software implementation even without hardware support.
> 
> It hurts on higher RTT links in the general case.  For LAN RTT's
> it's good.

... on the other hand remember that LRO coalescing is limited to
the number of segments that arrive during a mitigation interval,
so even on a 10G interface it's only a handful of packets.
I'd better run some simulations to see how long it takes to
reach full rate on a 10..50ms path when using LRO.
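
A back-of-the-envelope version of both points, as a toy Python model rather than a real simulation. Everything here is an assumption made for illustration: the 10us mitigation interval, 1538 bytes on the wire per full-size frame (1500 MTU plus Ethernet header, FCS, preamble and inter-frame gap), one ACK per coalesced group, no delayed ACKs, no loss, and an ABC limit of 2 segments per ACK.

```python
def segments_per_interval(link_bps, interval_us, wire_bytes=1538):
    """Full-size frames that fit on the wire in one mitigation interval."""
    bits = link_bps * interval_us // 1_000_000
    return bits // (wire_bytes * 8)


def rtts_to_reach(target, iw, per_ack, limit=2):
    """Round trips of slow start until cwnd (in segments) reaches target.

    per_ack: segments covered by each ACK (1 = no coalescing).
    limit:   ABC cap on window growth per ACK, in segments.
    """
    cwnd, rtts = iw, 0
    while cwnd < target:
        full, rest = divmod(cwnd, per_ack)
        # Each ACK grows the window by at most `limit` segments.
        cwnd += full * min(per_ack, limit) + min(rest, limit)
        rtts += 1
    return rtts


# Upper bound on coalescing for a single flow at line rate.
burst = segments_per_interval(10_000_000_000, 10)

# Same target window, with and without coalesced ACKs.
plain = rtts_to_reach(target=80, iw=10, per_ack=1)
coalesced = rtts_to_reach(target=80, iw=10, per_ack=burst)
```

In this model the coalesced case needs more round trips than the uncoalesced one, and the gap depends directly on the assumed interval and ABC limit, so the numbers are only a starting point for a proper simulation.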

cheers
luigi


