Date:      Mon, 25 Apr 2016 01:18:30 -0500
From:      Dustin Marquess <dmarquess@gmail.com>
To:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: Issues with ixl(4)
Message-ID:  <CAJpsHY5vkajTbzs6_Y6_DUMYEHyJUaEuT1Fbh-WPKCMUsEM4tQ@mail.gmail.com>
In-Reply-To: <CAHM0Q_MahwEpU53Vn02Pzm73BKbnNtUF6=5kxbw7opC3MgS_PQ@mail.gmail.com>
References:  <CAJpsHY4e_ycce3AGMXw93Z4Pbbgsg2yZk65q3vO9KTscSL3MaA@mail.gmail.com> <2C78DBCF-26F2-44D0-A45E-6EE8918648EA@netapp.com> <CAHM0Q_OMaRVoDZfufyrbMK0GwrG=ZPNYPScLG7YrRvkWfFwRYA@mail.gmail.com> <CAJpsHY4UQZ%2BLS3C_17b%2B-QJq7ah%2BnuNBKhqyVs6zR0Yq5qxfuw@mail.gmail.com> <CAHM0Q_MahwEpU53Vn02Pzm73BKbnNtUF6=5kxbw7opC3MgS_PQ@mail.gmail.com>

So I've done some more testing, and it's definitely some kind of
interaction between ixl & lagg, and maybe even ix & lagg.

It doesn't matter whether lagg is using "lacp" or "loadbalance" (with
the switch set appropriately); it happens with both.  I did find out
that statically adding an ARP entry for the "bad hosts" fixes it, so
it's something to do with ARP replies (tcpdump on the lagg doesn't
always show the ARP replies arriving).  This is pretty much exactly
the problem described here:

https://lists.freebsd.org/pipermail/freebsd-net/2015-June/042593.html

Except that fix is already in the code.
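
For anyone wanting to check for the same symptom, here is a minimal
sketch that lists hosts whose ARP entry on a given interface never
resolved.  It assumes the usual shape of `arp -an` output on FreeBSD
(shown in the comment); the sample addresses, the SAMPLE text, and the
unresolved_hosts() helper are made up for illustration and are not from
the machines in this thread.

```python
import re

# Assumed line shape of FreeBSD `arp -an` output (an assumption, not
# copied from the affected machines):
#   ? (10.0.0.1) at 00:11:22:33:44:55 on lagg0 expires in 1200 seconds [ethernet]
#   ? (10.0.0.7) at (incomplete) on lagg0 expired [ethernet]
ARP_LINE = re.compile(r"\((?P<ip>[0-9.]+)\) at (?P<mac>\S+) on (?P<ifname>\S+)")


def unresolved_hosts(arp_output, ifname):
    """Return the IPs on `ifname` whose ARP entry is still incomplete."""
    hosts = []
    for line in arp_output.splitlines():
        m = ARP_LINE.search(line)
        if m and m.group("ifname") == ifname and m.group("mac") == "(incomplete)":
            hosts.append(m.group("ip"))
    return hosts


# Hypothetical sample text standing in for real `arp -an` output.
SAMPLE = """\
? (10.0.0.1) at 00:11:22:33:44:55 on lagg0 expires in 1200 seconds [ethernet]
? (10.0.0.7) at (incomplete) on lagg0 expired [ethernet]
? (10.0.0.9) at (incomplete) on ix0 expired [ethernet]
"""

print(unresolved_hosts(SAMPLE, "lagg0"))
```

Each address it returns is a candidate for a static entry added with
arp -s, which is the workaround described above.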

I tried going all the way back to r294499 of -CURRENT and that didn't change
it.  I also tried 10.3, but that immediately panics on the Intel-based
ixl machine.  I'll see if I can get the AMD-based ix machine to boot
10.3 for testing.

-Dustin

On Thu, Apr 21, 2016 at 4:52 PM, K. Macy <kmacy@freebsd.org> wrote:
>
>
> On Wednesday, April 20, 2016, Dustin Marquess <dmarquess@gmail.com> wrote:
>>
>> I tried backing out that change and everything worked for a few minutes
>> and then started acting up again.  Then I noticed Sean Bruno's "TCP Packets
>> Drop!!!" email about LACP.  I disabled LACP on the switch side and then
>> changed the lagg config from "lacp" to "roundrobin", and so far so good.  On
>> the switch side it looks like member ports were randomly bouncing in the
>> LACP bundle, and when I'd tcpdump an interface I wouldn't see anything until
>> another LACP&LLDP packet came in.
>>
>> So something seems to be broken with lagg's LACP support recently.  The
>> good news is I don't think the route caching is causing this problem.  I'll
>> put it back in and retest to make sure though.
>>
>
>
> Glad to hear I was in error.
> -M
>
>>
>> Thanks for the help!
>> -Dustin
>>
>> On Tue, Apr 19, 2016 at 6:15 PM, K. Macy <kmacy@freebsd.org> wrote:
>>>
>>> On Mon, Apr 18, 2016 at 10:45 PM, Eggert, Lars <lars@netapp.com> wrote:
>>> > I haven't played with lagg+vlan+bridge, but I briefly evaluated XL710
>>> > boards last year
>>> > (https://lists.freebsd.org/pipermail/freebsd-net/2015-October/043584.html)
>>> > and saw very poor throughputs and latencies even in very simple setups. As
>>> > far as I could figure it out, TSO/LRO wasn't being performed (although
>>> > enabled) and so I ran into packet-rate issues.
>>> >
>>> > I basically gave up and went with a different vendor. FWIW, the XL710
>>> > boards in the same machines booted into Linux performed fine.
>>> >
>>>
>>> FWIW, NFLX sees performance close to that of cxgbe (by far the best
>>> maintained, best performing FreeBSD 40G driver) with an iflib
>>> converted driver. The iflib updated driver will be imported by 11 but
>>> won't become the default driver until 11.1 for want of QA resources at
>>> Intel.
>>>
>>> -M
>>
>>
>


