From: "K. Macy"
To: Luigi Rizzo
Cc: Hans Petter Selasky, "freebsd-net@freebsd.org", FreeBSD Current
Date: Wed, 20 Aug 2014 09:21:39 -0700
Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]

On Wed, Aug 20, 2014 at 7:41 AM, Luigi Rizzo wrote:

> On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky wrote:
>
> > Hi Luigi,
> >
> > On 08/20/14 11:32, Luigi Rizzo wrote:
> >
> >> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky wrote:
> >>
> >>> Hi,
> >>>
> >>> A month has passed since the last e-mail on this topic, and in the
> >>> meanwhile some new patches have been created and tested:
> >>>
> >>> Basically the approach has been changed a little bit:
> >>>
> >>> - The creation of hardware transmit rings has been made independent
> >>> of the TCP stack. This allows firewall applications to forward
> >>> traffic into hardware transmit rings as well, and not only native
> >>> TCP applications. This should be one more reason to get the feature
> >>> into the kernel.
> >>> ...
> >>>
> >> the patch seems to include only part of the generic code (ie no ioctls
> >> for manipulating the rates, no backend code). Do i miss something ?
> >>
> >
> > The IOCTLs for managing the rates are:
> >
> > SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL
> >
> > And they go to the if_ioctl callback.
>
> i really think these new 'advanced' features should go
> through some ethtool-like API, not more ioctls.
> We have a strong need to design and implement such
> an API also to have a uniform mechanism to manipulate
> rss, queues and other NIC features.
>

There is no ethtool equivalent yet, but exposing them through a sysctl
is definitely the place to start before putting it straight into
ifconfig. The ifnet API is already a bit of a mess.

> ...
> >
> >> I have a few comments/concerns:
> >>
> >> + looks like flowid and txringid are overlapped in scope,
> >> both will be used (in the backend) to select a specific
> >> tx queue. I don't have a solution but would like to know
> >> how do you plan to address this -- does one have priority
> >> over the other, etc.
> >>
> >
> > Not 100%. In some cases the flowID is used differently than the
> > txringid, though it might be possible to join the two. Would need to
> > investigate current users of the flow ID.
>
> in some 10G drivers i have seen, at the driver
> level the flowid is used on the tx path to assign
> packets to a given tx queue, generally to improve
> cpu affinity. Of course some applications
> may want a true flow classifier so they do not
> have to re-do the classification multiple times.
> But then, we have a ton of different classifiers
> with the same need -- e.g. ipfw dynamic rules,
> dummynet pipe/queue id, divert ports...
> Pipes are stored in mtags, which are very expensive
> so i do see a point in embedding them in the mbufs,
> it's just that going this path there is no end
> to the list.
>

The purpose of the flowid was to enforce packet ordering on transmit
while being large enough to store an RSS hash, potentially allowing
input consumers to use it to semi-uniquely label (srcip, srcport,
dstip, dstport) tuples. It seems to me that the txringid would be
almost entirely redundant. Why not just let users set the flowid?
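
To illustrate the ordering point: a minimal sketch, assuming a driver
picks its tx queue as the flowid modulo the number of queues. The names
sketch_tuple, sketch_flowid_from_tuple and sketch_txq_for_flowid are
hypothetical and come neither from the patch nor from any FreeBSD
driver, and the FNV-1a hash only stands in for a hardware RSS hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-tuple; field names are invented for this sketch. */
struct sketch_tuple {
	uint32_t srcip, dstip;
	uint16_t srcport, dstport;
};

/*
 * Derive a 32-bit flow label from the tuple using FNV-1a.
 * Assumption: a real NIC would supply an RSS (Toeplitz) hash instead.
 */
static uint32_t
sketch_flowid_from_tuple(const struct sketch_tuple *t)
{
	const uint32_t words[3] = {
		t->srcip,
		t->dstip,
		((uint32_t)t->srcport << 16) | t->dstport
	};
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */

	for (size_t i = 0; i < 3; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xffu;
			h *= 16777619u;	/* FNV-1a 32-bit prime */
		}
	}
	return (h);
}

/*
 * Map a flowid to one of nqueues tx queues. The same flowid always
 * selects the same queue, so packets of one flow stay in order.
 */
static uint32_t
sketch_txq_for_flowid(uint32_t flowid, uint32_t nqueues)
{
	assert(nqueues > 0);	/* at least one tx queue must exist */
	return (flowid % nqueues);
}
```

Under this assumption every packet of a connection carries the same
flowid and therefore lands on the same queue, so a user-settable flowid
would steer a flow simply by changing which queue the mapping selects.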

> > If we can merge the flowID and the txringid into one field, would it be
> > acceptable to add an IOCTL to read/write this value for all sockets?
>

That sounds reasonable - although I have not thought through all the
implications.

-K