Date: Sun, 22 Jun 2014 21:16:19 -0700
Subject: Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
From: Adrian Chadd
To: araujo@freebsd.org
Cc: FreeBSD Net

It's an interesting idea, but doing round robin like that may introduce
out-of-order packets.

What's the actual problem you're seeing? Are the transmit queues filling
up? Is the distribution with flowid/curcpu not good enough?

Scott saw this happen at Netflix. He added a lagg twiddle to select which
set of bits in the flowid is used when picking an interface.

The ixgbe hashing was being done on the low x bits, where x is related to
how many CPUs you have (2 CPUs? 1 bit. 8 CPUs? 3 bits. etc.) lagg was
doing the same thing on the same low-order set of bits. He modified lagg
so you could pick some new starting point a few bits up in the flowid to
pick a lagg interface with. That fixed the distribution issue and also
kept the in-orderness of it all.

2c,

-a

On 22 June 2014 19:27, Marcelo Araujo wrote:
> Hello guys,
>
> I made some changes to the roundrobin protocol so that you can now use
> sysctl(8) to set a better packet distribution among the interfaces that
> are part of the lagg(4) group.
>
> My motivation for this change was interfaces that use TSO, for example
> ixgbe(4). The performance is terrible: as we can't completely fill the
> TSO buffer at once, the throughput drops significantly and we see many
> more SACKs between hosts.
>
> So, with this patch we can set the number of packets that will be sent
> before switching to the next interface.
>
> In my testbed using ixgbe(4), I saw very good performance, as you can
> see below:
>
> 1) Without patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.1.1 port 32808 connected with 192.168.1.2 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec   406 MBytes  3.40 Gbits/sec
> [  3]  1.0- 2.0 sec   391 MBytes  3.28 Gbits/sec
> [  3]  2.0- 3.0 sec   406 MBytes  3.41 Gbits/sec
> [  3]  3.0- 4.0 sec   585 MBytes  4.91 Gbits/sec
> [  3]  4.0- 5.0 sec   477 MBytes  4.00 Gbits/sec
> [  3]  5.0- 6.0 sec   429 MBytes  3.60 Gbits/sec
> [  3]  6.0- 7.0 sec   520 MBytes  4.36 Gbits/sec
> [  3]  7.0- 8.0 sec   385 MBytes  3.23 Gbits/sec
> [  3]  8.0- 9.0 sec   414 MBytes  3.48 Gbits/sec
> [  3]  9.0-10.0 sec   515 MBytes  4.32 Gbits/sec
> [  3]  0.0-10.0 sec  4.42 GBytes  3.80 Gbits/sec
>
> 2) With patch:
> ------------------------------------------------------------
> Client connecting to 192.168.1.2, TCP port 5001
> TCP window size: 32.5 KByte (default)
> ------------------------------------------------------------
> [  3] local 192.168.1.1 port 10526 connected with 192.168.1.2 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec   694 MBytes  5.83 Gbits/sec
> [  3]  1.0- 2.0 sec   999 MBytes  8.38 Gbits/sec
> [  3]  2.0- 3.0 sec  1.17 GBytes  10.1 Gbits/sec
> [  3]  3.0- 4.0 sec  1.34 GBytes  11.5 Gbits/sec
> [  3]  4.0- 5.0 sec  1.15 GBytes  9.91 Gbits/sec
> [  3]  5.0- 6.0 sec  1.19 GBytes  10.2 Gbits/sec
> [  3]  6.0- 7.0 sec  1.08 GBytes  9.23 Gbits/sec
> [  3]  7.0- 8.0 sec  1.10 GBytes  9.45 Gbits/sec
> [  3]  8.0- 9.0 sec  1.27 GBytes  10.9 Gbits/sec
> [  3]  9.0-10.0 sec  1.39 GBytes  12.0 Gbits/sec
> [  3]  0.0-10.0 sec  11.3 GBytes  9.74 Gbits/sec
>
> So, basically we have a sysctl(8) called "net.link.lagg.rr_packets"
> where we can set the number of packets that will be sent before the
> roundrobin moves to the next interface.
>
> Any comments and reviews are much appreciated.
>
> Best Regards,
>
> --
> Marcelo Araujo            (__)
> araujo@FreeBSD.org     \\\'',)
> http://www.FreeBSD.org   \/  \ ^
> Power To Server.         .\. /_)
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
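
To make the flowid point in the reply concrete, here is a small Python model of one way the overlap can hurt. It is only a sketch under stated assumptions, not the lagg or ixgbe code: the function names, the `shift` parameter, and the choice of 2 ports with 8 queues are all made up for illustration, and the NIC is assumed to pick a transmit queue from the low bits of the flowid. When lagg picks its port from those same low bits, each port only ever sees half of the queues; starting a few bits higher makes the two choices independent.

```python
def nic_queue(flowid: int, num_queues: int) -> int:
    """Hypothetical NIC model: queue index taken from the low flowid bits.

    num_queues is assumed to be a power of two.
    """
    return flowid & (num_queues - 1)


def lagg_port(flowid: int, num_ports: int, shift: int = 0) -> int:
    """Hypothetical lagg model: skip `shift` low bits, then pick a port."""
    return (flowid >> shift) % num_ports


def used_pairs(flowids, num_ports: int, num_queues: int, shift: int):
    """Set of (port, queue) combinations that the given flows end up using."""
    return {
        (lagg_port(f, num_ports, shift), nic_queue(f, num_queues))
        for f in flowids
    }


if __name__ == "__main__":
    flows = range(64)  # stand-in for a spread of flowid values
    for shift in (0, 3):
        pairs = used_pairs(flows, num_ports=2, num_queues=8, shift=shift)
        print(f"shift={shift}: {len(pairs)} of 16 (port, queue) pairs used")
    # prints:
    # shift=0: 8 of 16 (port, queue) pairs used
    # shift=3: 16 of 16 (port, queue) pairs used
```

Because the port is still a pure function of the flowid, every packet of a given flow keeps using the same port, which is why this approach preserves ordering.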
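
The behavior described for "net.link.lagg.rr_packets" in the quoted message can be modeled in a few lines of Python. The patch itself is not included in this thread, so this is only a sketch of the description (send N packets on one interface, then move to the next); the class and attribute names are hypothetical.

```python
class RoundRobin:
    """Toy model of round robin that sends `rr_packets` packets per port."""

    def __init__(self, num_ports: int, rr_packets: int = 1):
        if num_ports < 1 or rr_packets < 1:
            raise ValueError("num_ports and rr_packets must be >= 1")
        self.num_ports = num_ports
        self.rr_packets = rr_packets
        self.port = 0                # port used for the next packet
        self.remaining = rr_packets  # packets left before switching

    def next_port(self) -> int:
        """Return the port for the next packet and advance the state."""
        port = self.port
        self.remaining -= 1
        if self.remaining == 0:
            self.port = (self.port + 1) % self.num_ports
            self.remaining = self.rr_packets
        return port


if __name__ == "__main__":
    for n in (1, 4):
        rr = RoundRobin(num_ports=2, rr_packets=n)
        print(n, [rr.next_port() for _ in range(8)])
    # prints:
    # 1 [0, 1, 0, 1, 0, 1, 0, 1]
    # 4 [0, 0, 0, 0, 1, 1, 1, 1]
```

Note that the port choice here depends only on packet count, not on the flow, so consecutive packets of one TCP connection can still end up on different ports. That is the out-of-order concern raised in the reply.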