Date:      Sun, 22 Jun 2014 21:16:19 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        araujo@freebsd.org
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
Message-ID:  <CAJ-Vmomt2QDXAVBVUk6m8oH4Pa5yErDdG6wWrP3X7%2BDW137xiA@mail.gmail.com>
In-Reply-To: <CAOfEmZjmb1bdvn0gR6vD1WeP8o8g7KwXod4TE0iJfa=nicyeng@mail.gmail.com>
References:  <CAOfEmZjmb1bdvn0gR6vD1WeP8o8g7KwXod4TE0iJfa=nicyeng@mail.gmail.com>


It's an interesting idea, but doing round robin like that may
introduce out-of-order packets.

What's the actual problem you're seeing? Are the transmit queues
filling up? Is the distribution with flowid/curcpu not good enough?

Scott saw this happen at Netflix. He added a lagg twiddle to set which
bits of the flowid to use when picking an interface. The ixgbe hashing
was being done on the low x bits, where x is related to how many CPUs
you have (2 CPUs? 1 bit. 8 CPUs? 3 bits. etc.), and lagg was doing the
same thing on the same low-order set of bits. He modified lagg so you
could pick a new starting point a few bits up in the flowid to pick a
lagg interface with. That fixed the distribution issue and also kept
packets in order.
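
To illustrate the point about overlapping bits, here is a minimal sketch
in C. It is not the actual lagg(4) or ixgbe(4) code; the names pick_port
and ports_used, and the idea of modelling the NIC queue as the low
queue_bits bits of the flowid, are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual lagg(4) code: choose a port index
 * from a flow ID after skipping the low `shift` bits.
 */
static unsigned
pick_port(uint32_t flowid, unsigned shift, unsigned nports)
{
	assert(nports > 0 && shift < 32);
	return ((flowid >> shift) % nports);
}

/*
 * Count how many distinct ports are hit by `nflows` flows that all share
 * the same low `queue_bits` bits, i.e. flows the NIC placed on one queue.
 */
static unsigned
ports_used(unsigned queue, unsigned queue_bits, unsigned nflows,
    unsigned shift, unsigned nports)
{
	uint32_t seen = 0;	/* bitmask of ports hit so far */
	unsigned count = 0;

	assert(nports <= 32);
	for (uint32_t hi = 0; hi < nflows; hi++) {
		/* Vary only the bits above the queue-selection bits. */
		uint32_t flowid = (hi << queue_bits) | queue;

		seen |= UINT32_C(1) << pick_port(flowid, shift, nports);
	}
	for (; seen != 0; seen >>= 1)
		count += seen & 1;
	return (count);
}
```

With 3 queue bits (8 CPUs) and 2 ports, every flow on a given queue lands
on the same port when shift is 0, while a shift of 3 lets those flows use
both ports. Each flow still maps to exactly one port, so per-flow packet
ordering is unaffected.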

2c,


-a

On 22 June 2014 19:27, Marcelo Araujo <araujobsdport@gmail.com> wrote:
> Hello guys,
>
> I made some changes to the roundrobin protocol so that you can now use
> sysctl(8) to get a better packet distribution among the interfaces that are
> part of the lagg(4) group.
>
> My motivation for this change was interfaces that use TSO, for example
> ixgbe(4). The performance is terrible: because we can't fill the TSO buffer
> at once, the throughput drops significantly and we see many more SACKs
> between hosts.
>
> So, with this patch we can set the number of packets that will be sent
> before switching to the next interface.
>
> In my testbed using ixgbe(4), I got very good performance, as you can see
> below:
>
> 1) Without patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.1.1 port 32808 connected with 192.168.1.2 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec   406 MBytes  3.40 Gbits/sec
> [  3]  1.0- 2.0 sec   391 MBytes  3.28 Gbits/sec
> [  3]  2.0- 3.0 sec   406 MBytes  3.41 Gbits/sec
> [  3]  3.0- 4.0 sec   585 MBytes  4.91 Gbits/sec
> [  3]  4.0- 5.0 sec   477 MBytes  4.00 Gbits/sec
> [  3]  5.0- 6.0 sec   429 MBytes  3.60 Gbits/sec
> [  3]  6.0- 7.0 sec   520 MBytes  4.36 Gbits/sec
> [  3]  7.0- 8.0 sec   385 MBytes  3.23 Gbits/sec
> [  3]  8.0- 9.0 sec   414 MBytes  3.48 Gbits/sec
> [  3]  9.0-10.0 sec   515 MBytes  4.32 Gbits/sec
> [  3]  0.0-10.0 sec  4.42 GBytes  3.80 Gbits/sec
>
> 2) With patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.1.1 port 10526 connected with 192.168.1.2 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec   694 MBytes  5.83 Gbits/sec
> [  3]  1.0- 2.0 sec   999 MBytes  8.38 Gbits/sec
> [  3]  2.0- 3.0 sec  1.17 GBytes  10.1 Gbits/sec
> [  3]  3.0- 4.0 sec  1.34 GBytes  11.5 Gbits/sec
> [  3]  4.0- 5.0 sec  1.15 GBytes  9.91 Gbits/sec
> [  3]  5.0- 6.0 sec  1.19 GBytes  10.2 Gbits/sec
> [  3]  6.0- 7.0 sec  1.08 GBytes  9.23 Gbits/sec
> [  3]  7.0- 8.0 sec  1.10 GBytes  9.45 Gbits/sec
> [  3]  8.0- 9.0 sec  1.27 GBytes  10.9 Gbits/sec
> [  3]  9.0-10.0 sec  1.39 GBytes  12.0 Gbits/sec
> [  3]  0.0-10.0 sec  11.3 GBytes  9.74 Gbits/sec
>
> So, basically we have a sysctl(8) called "net.link.lagg.rr_packets" where
> we can set the number of packets that will be sent before the roundrobin
> moves to the next interface.
>
> Any comments and reviews are much appreciated.
>
> Best Regards,
>
> --
> Marcelo Araujo
> araujo@FreeBSD.org
> http://www.FreeBSD.org
> Power To Server.
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
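
For reference, the batching described in the quoted message (send a fixed
number of packets on one port before moving to the next) could be
sketched roughly as follows. This is not the patch itself: struct
rr_state, its fields, and rr_next_port are hypothetical names, and the
sketch ignores locking and atomics, which a real kernel transmit path
would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for the example; not the fields used by lagg(4). */
struct rr_state {
	unsigned nports;	/* number of ports in the lagg group */
	unsigned rr_packets;	/* packets to send per port before switching */
	unsigned cur_port;	/* port currently in use */
	unsigned sent;		/* packets sent on cur_port in this burst */
};

/* Return the port for the next packet and advance the burst counter. */
static unsigned
rr_next_port(struct rr_state *st)
{
	unsigned port;

	assert(st->nports > 0 && st->rr_packets > 0);
	port = st->cur_port;
	if (++st->sent >= st->rr_packets) {
		/* Burst finished: move on to the next port, wrapping around. */
		st->sent = 0;
		st->cur_port = (st->cur_port + 1) % st->nports;
	}
	return (port);
}
```

With rr_packets set to 1 this degenerates to plain per-packet round
robin. With larger values a single flow is still spread across ports, so
reordering may still occur at burst boundaries, which is the concern
raised in the reply above.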


