From: Hans Petter Selasky
Date: Wed, 20 Aug 2014 15:29:19 +0200
To: Luigi Rizzo
Cc: "freebsd-net@freebsd.org", FreeBSD Current
Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]

Hi Luigi,

On 08/20/14 11:32, Luigi Rizzo wrote:
> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky
> wrote:
>
>> Hi,
>>
>> A month has passed since the last e-mail on this topic, and in the
>> meantime some new patches have been created and tested:
>>
>> Basically the approach has been changed a little bit:
>>
>> - The creation of hardware transmit rings has been made independent of the
>> TCP stack. This allows firewall applications to forward traffic into
>> hardware transmit rings as well, and not only native TCP applications. This
>> should be one more reason to get the feature into the kernel.
>>
>> - A hardware transmit ring basically can have two modes: FIXED-RATE or
>> AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes
>> per second rate. In the automatic mode you can configure a time after which
>> the TX queue must be empty. The hardware driver uses this to configure the
>> actual rate. In automatic mode you can also set an upper and lower transmit
>> rate limit.
>>
>> - The MBUF has got a new field in the packet header: "txringid"
>>
>> - IOCTLs for TCP v4 and v6 sockets have been updated to allow setting of
>> the "txringid" field in the mbuf.
>>
>> The current patch [see attachment] should be much simpler and less
>> intrusive than the previous one.
>>
>
> the patch seems to include only part of the generic code (ie no ioctls
> for manipulating the rates, no backend code). Am I missing something?

The IOCTLs for managing the rates are:

SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL

They go to the if_ioctl callback.

>
> I have a few comments/concerns:
>
> + looks like flowid and txringid are overlapped in scope,
> both will be used (in the backend) to select a specific
> tx queue. I don't have a solution but would like to know
> how do you plan to address this -- does one have priority
> over the other, etc.

Not 100%. In some cases the flowID is used differently than the
txringid, though it might be possible to join the two. I would need to
investigate the current users of the flow ID.

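As a rough illustration of the automatic-rate mode described in the quoted proposal, a driver could derive the rate from the queued byte count and the configured drain time, then clamp it to the lower and upper limits. This sketch is not taken from the patch: the structure, its field names and the txring_auto_rate() helper are hypothetical, and the fallback for a missing deadline is an assumption.

```c
#include <stdint.h>

/* Hypothetical per-ring settings; names are illustrative, not from the patch. */
struct txring_rate_cfg {
	uint64_t min_rate;	/* lower transmit rate limit, bytes per second */
	uint64_t max_rate;	/* upper transmit rate limit, bytes per second */
	uint32_t drain_ms;	/* time after which the TX queue must be empty */
};

/*
 * Pick the rate that empties "queued_bytes" within "drain_ms",
 * clamped to [min_rate, max_rate].
 */
static uint64_t
txring_auto_rate(const struct txring_rate_cfg *cfg, uint64_t queued_bytes)
{
	uint64_t rate;

	/* Assumed fallback: with no deadline configured, use the upper limit. */
	if (cfg->drain_ms == 0)
		return (cfg->max_rate);

	/* bytes/s = bytes * 1000 / ms, rounded up (ignores overflow for huge queues). */
	rate = (queued_bytes * 1000 + cfg->drain_ms - 1) / cfg->drain_ms;

	if (rate < cfg->min_rate)
		rate = cfg->min_rate;
	if (rate > cfg->max_rate)
		rate = cfg->max_rate;
	return (rate);
}
```

With limits of 1000 and 1000000 bytes per second and a 100 ms drain time, 50000 queued bytes would map to 500000 bytes per second, while very small or very large queues would be pinned to the lower or upper limit.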
> + related to the above, a (possibly unavoidable) side effect
> of this type of changes is that mbufs explode with custom fields,
> so if we could perhaps merge flowid and txringid into one,
> that would be useful.

Right, but rate control is a generally useful feature, especially for
high-throughput networks, or do you think otherwise?

>
> + is there a way to avoid the replicated code for SIOCSTXRINGID
> (the ioctl handler, i suppose)? Maybe make one function and
> call it from both ipv4 and ipv6, assuming there aren't other
> places like this.

Yes, could do that.

>
> + i am not particularly happy about the explosion of ioctls for
> setting and getting rates. Next we'll want to add scheduling,
> and intervals, and queue sizes and so on.
> For these commands outside the critical path a single command
> with an extensible structure would be preferable.
> Bikeshed material i am sure.

There is only one IOCTL in the critical path, and that is the IOCTL to
change or update the TX ring ID. The other IOCTLs are in the
non-critical path towards the if_ioctl() callback.

If we can merge the flowID and the txringid into one field, would it be
acceptable to add an IOCTL to read/write this value for all sockets?

--HPS
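For reference, the "single command with an extensible structure" idea from the quoted comment could look roughly like the userland sketch below. It is only an assumption about one possible shape: the ratectl_req structure, the RATECTL_* operation codes and ratectl_dispatch() are hypothetical names, and mapping the four existing IOCTLs to add/set/get/delete is a guess from their names, not something the patch states.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical operation codes, assumed to mirror the A/S/G/D IOCTLs. */
enum ratectl_op { RATECTL_ADD, RATECTL_SET, RATECTL_GET, RATECTL_DEL };

/* Hypothetical request; "size" lets later versions append fields. */
struct ratectl_req {
	uint32_t size;		/* sizeof the structure as the caller knows it */
	uint32_t op;		/* one of enum ratectl_op */
	uint32_t ring_id;	/* TX ring to operate on */
	uint64_t rate;		/* bytes per second, in or out */
};

#define	RATECTL_NRINGS	4

struct ratectl_ring {
	int in_use;
	uint64_t rate;
};

/* Stand-in for driver state; a real driver would keep this per interface. */
static struct ratectl_ring rings[RATECTL_NRINGS];

/* One entry point handles every operation; returns 0 or an errno value. */
static int
ratectl_dispatch(struct ratectl_req *req)
{
	struct ratectl_ring *r;

	/* Reject callers whose structure is too short for this version. */
	if (req->size < sizeof(*req) || req->ring_id >= RATECTL_NRINGS)
		return (EINVAL);
	r = &rings[req->ring_id];

	switch (req->op) {
	case RATECTL_ADD:
		if (r->in_use)
			return (EEXIST);
		r->in_use = 1;
		r->rate = req->rate;
		return (0);
	case RATECTL_SET:
		if (!r->in_use)
			return (ENOENT);
		r->rate = req->rate;
		return (0);
	case RATECTL_GET:
		if (!r->in_use)
			return (ENOENT);
		req->rate = r->rate;
		return (0);
	case RATECTL_DEL:
		if (!r->in_use)
			return (ENOENT);
		r->in_use = 0;
		return (0);
	default:
		return (EINVAL);
	}
}
```

New settings such as intervals or queue sizes would then become extra fields and operation codes rather than new IOCTL numbers, at the cost of a size check in the handler.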