From owner-freebsd-net@FreeBSD.ORG Sun Aug 17 03:39:17 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 58560464; Sun, 17 Aug 2014 03:39:17 +0000 (UTC) Received: from mail-vc0-x232.google.com (mail-vc0-x232.google.com [IPv6:2607:f8b0:400c:c03::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 047F52E9C; Sun, 17 Aug 2014 03:39:16 +0000 (UTC) Received: by mail-vc0-f178.google.com with SMTP id la4so4480567vcb.37 for ; Sat, 16 Aug 2014 20:39:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ZzuHg6CoWWJHBZtXggd3jVCJF61mG1sUHvWeasmNHD0=; b=FwViokWXXjprP6jeMnKgKIV8Urq1CA+QLsypuIOwIDfG8Jb228F8CZxUYR2xkVkCro zxJAd4HMB2AfXcNHILnwtb1u4xWjaNLMayWECfEErc0fwee1HQB+UcnhsP3ku1xcz9nE qOojARzVg61p17WYe+6vFEZmn2ZokGHgF15RxRoKY4Ro/Mx2t5unhaONGzle2i/d2DdO Um2NUhQJVs0wi+6d2e0MAMqHkptRvlxxvZeCMFmOsk1IdSvA8jwTyU/RvE3JSOK42K5a g0D00NBIPGdvjRu17FrTfw3CF12+eZwdHw1MBj9NVfv8nZlUYccjHHzlnjPdQP0K3dVd QCnQ== MIME-Version: 1.0 X-Received: by 10.52.61.136 with SMTP id p8mr203565vdr.15.1408246755985; Sat, 16 Aug 2014 20:39:15 -0700 (PDT) Received: by 10.221.10.210 with HTTP; Sat, 16 Aug 2014 20:39:15 -0700 (PDT) In-Reply-To: References: <53EFAF0B.6060301@yandex-team.ru> Date: Sat, 16 Aug 2014 20:39:15 -0700 Message-ID: Subject: Re: ixgbe i2c interface From: Jack Vogel To: Eric Joyner Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "Alexander V. 
Chernikov" , Jack F Vogel , FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Aug 2014 03:39:17 -0000 Thanks Eric ! I'll commit it tomorrow since Eric approves :) Jack On Sat, Aug 16, 2014 at 3:27 PM, Eric Joyner wrote: > I unofficially approve of it! > > --- > Eric Joyner > On Aug 16, 2014 12:21 PM, "Alexander V. Chernikov" < > melifaro@yandex-team.ru> wrote: > >> Hello Jack! >> >> Can you please commit (or let me commit) the following one-liner? >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > From owner-freebsd-net@FreeBSD.ORG Mon Aug 18 08:00:13 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4ACC8399 for ; Mon, 18 Aug 2014 08:00:13 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 208203787 for ; Mon, 18 Aug 2014 08:00:13 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s7I80CqK088436 for ; Mon, 18 Aug 2014 08:00:12 GMT (envelope-from bugzilla-noreply@freebsd.org) Message-Id: <201408180800.s7I80CqK088436@kenobi.freebsd.org> From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bugzilla] Commit Needs MFC MIME-Version: 1.0 X-Bugzilla-Type: whine X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: 
auto-generated Date: Mon, 18 Aug 2014 08:00:12 +0000 Content-Type: text/plain X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Aug 2014 08:00:13 -0000

Hi,

You have a bug in the "Needs MFC" state which has not been touched in 7 or more days. This email serves as a reminder that you may want to MFC this bug or mark it as completed.

In the event you have a longer MFC timeout you may update this bug with a comment and I won't remind you again for 7 days.

This reminder is only sent on Mondays. Please file a bug about concerns you may have.

This search was scheduled by eadler@FreeBSD.org.

(1 bugs)

Bug 183659: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183659
Severity: Affects Only Me
Priority: Normal
Hardware: Any
Assignee: freebsd-net@FreeBSD.org
Status: Needs MFC
Resolution:
Summary: [tcp] TCP stack lock contention with short-lived connections

From owner-freebsd-net@FreeBSD.ORG Mon Aug 18 10:29:58 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E97185CC for ; Mon, 18 Aug 2014 10:29:58 +0000 (UTC) Received: from mail-qc0-x22b.google.com (mail-qc0-x22b.google.com [IPv6:2607:f8b0:400d:c01::22b]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A9EED3532 for ; Mon, 18 Aug 2014 10:29:58 +0000 (UTC) Received: by mail-qc0-f171.google.com with SMTP id r5so4531888qcx.16 for ; Mon, 18 Aug 2014 03:29:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=4yKo6KQV+arvPyD94e93raGtlvquEaeUzTHa9jcYRc4=; b=EWQ6/4GNPBkCT3W7RJlMPrG1nkHmF0j8Rl6VcbYtQIKO2bRdhHBJ+Kk7dqM/eeJRHY YSwMbfgZGsHAWoy7wexdiRyHTB73jaU9bSs2jJx1N7ZBnvvBdX/ZovArPuqLt2hi1+RA gliZ1eIE6JKY06lUfpGK6KDbh+Uf14QRyNgSFVblLDKjqSiPQl0oZsevSlSeD/qCNq0e iMx9sYd6Ap6chyv/yu+VdNPTwkhA2d41ed/ZAIOcVCFbeF1TXZvhFrB+PLKc+BqeVVGP kFGxV6ZkMgPo5BGE2jZECl9vcRUVUP/H4bJ2Zyk0ZjZpC6VetPj/PFupDEvdGjgQUWeX BjRQ== X-Received: by 10.140.105.37 with SMTP id b34mr51346120qgf.1.1408357797218; Mon, 18 Aug 2014 03:29:57 -0700 (PDT) MIME-Version: 1.0 Received: by 10.96.28.6 with HTTP; Mon, 18 Aug 2014 03:29:17 -0700 (PDT) In-Reply-To: <20140804095528.GA12625@onelab2.iet.unipi.it> References: <20140804095528.GA12625@onelab2.iet.unipi.it> From: Carlos Ferreira Date: Mon, 18 Aug 2014 11:29:17 +0100 Message-ID: Subject: Re: tutorial on Netmap in Mountain View - Aug.28 To: Luigi Rizzo Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Aug 2014 10:29:59 -0000 Hi Luigi. Do you have presentations or tutorial code from that tutorial, that you can share here? On 4 August 2014 10:55, Luigi Rizzo wrote: > In case someone (especially those in the bay area) is interested: > I will give a half day tutorial on netmap at Hot Interconnects, > in Mountain View on August 28, 2014 > > http://www.hoti.org/hoti22/tutorials/#tut4 > > This tutorial targets hardware vendors, network engineers, and > researchers looking for solutions to: OS support for high speed NICs; > efficient software packet processing techniques for SDN products; > high speed networking in VMs. We will show how to achieve these > results using netmap. 
> > cheers > luigi > > (P.S. I have no financial interest in the event. I am posting the info > because I think it might be useful to people on this list, and of course > having a larger audience at the tutorial will generate more interesting > feedback from participants) > > -----------------------------------------+------------------------------- > Prof. Luigi RIZZO, rizzo@iet.unipi.it . Dip. di Ing. dell'Informazione > http://www.iet.unipi.it/~luigi/ . Universita` di Pisa > TEL +39-050-2211611 . via Diotisalvi 2 > Mobile +39-338-6809875 . 56122 PISA (Italy) > -----------------------------------------+------------------------------- > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > -- Carlos Miguel Ferreira Researcher at Telecommunications Institute Aveiro - Portugal Work E-mail - cmf@av.it.pt Skype & GTalk -> carlosmf.pt@gmail.com LinkedIn -> http://www.linkedin.com/in/carlosmferreira From owner-freebsd-net@FreeBSD.ORG Mon Aug 18 13:05:11 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 90F9AE7B for ; Mon, 18 Aug 2014 13:05:11 +0000 (UTC) Received: from mx0.pp.com.pl (sol.pp.com.pl [195.20.3.30]) by mx1.freebsd.org (Postfix) with ESMTP id 506C43659 for ; Mon, 18 Aug 2014 13:05:10 +0000 (UTC) Received: from [192.168.3.17] (lan.pp.com.pl [195.20.3.242]) by mx0.pp.com.pl (Postfix) with ESMTPSA id 31C6D640075 for ; Mon, 18 Aug 2014 15:05:36 +0200 (CEST) Message-ID: <53F1F863.8000408@pp.com.pl> Date: Mon, 18 Aug 2014 14:58:11 +0200 From: Piotr Kubaj User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Icedove/24.6.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Sending data 
via MAC address Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Aug 2014 13:05:11 -0000

Hi. Please see
http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264204 and
http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264249 .

I know I can use the web interface or ssh, but WinBox is required. In short, using Linux and Wine, I can connect to my routers via MAC, provided they are in the same network. With FreeBSD it's not possible (I've checked various Wine versions, so it's not Wine's fault). Right now I have Debian running on my PC and have tested FreeBSD in a VM with a bridged NIC. When I run WinBox in Linux, I can connect to the RB; with FreeBSD in the VM it works only with IP (provided both the PC and the router are in the same network).

Is it possible in any way to connect using only MAC addresses, or when the PC and the router are in different networks (no network aliases, as there are times when it's not known what network the router is in)? Thanks for any answers.
From owner-freebsd-net@FreeBSD.ORG Mon Aug 18 18:05:08 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1405EC7D for ; Mon, 18 Aug 2014 18:05:08 +0000 (UTC) Received: from mail-oi0-x22a.google.com (mail-oi0-x22a.google.com [IPv6:2607:f8b0:4003:c06::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D43333337 for ; Mon, 18 Aug 2014 18:05:07 +0000 (UTC) Received: by mail-oi0-f42.google.com with SMTP id a3so3800905oib.15 for ; Mon, 18 Aug 2014 11:05:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=MRHEFpcSvc4aLD8y/ezx6Z1MzEbDuWDC2S7GaFdhgyo=; b=h1+XLzrJs01mcJFpBcqu4Aokl4u8FT3ggUn2XvefhKkpK84rPEImhEe/g4Og5GEbGb xD5HHQ4ew05u31+IFKRkEocwROAJZVPhLbxEUb2402OHf/90K7tN71y+iazLlhuL9uOa hZVCaD3oY2rAy1W4XRNk4iBa0ZDlvQpv0k7GMYxF42m+PfpjZnC6j3YV8aL19VFBjOZM vfdVlOycCrQiHwEw4tTi39nEXYVu3GgT/t5ItPYxGK42EKBCdMydik80eSdDLrhD0zjj Pi6bQw+j90fteHh9KnORHrkYQEsX3uVk0zKxh2kTkXnn7VkrGKPwKErpYay51ZxR3KmY nr5g== MIME-Version: 1.0 X-Received: by 10.182.241.200 with SMTP id wk8mr38631436obc.27.1408385107197; Mon, 18 Aug 2014 11:05:07 -0700 (PDT) Received: by 10.76.187.39 with HTTP; Mon, 18 Aug 2014 11:05:07 -0700 (PDT) In-Reply-To: <53F1F863.8000408@pp.com.pl> References: <53F1F863.8000408@pp.com.pl> Date: Mon, 18 Aug 2014 14:05:07 -0400 Message-ID: Subject: Re: Sending data via MAC address From: Ryan Stone To: Piotr Kubaj Content-Type: text/plain; charset=UTF-8 Cc: freebsd-net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Aug 2014 18:05:08 -0000

On Mon, Aug 18, 2014 at 8:58 AM, Piotr Kubaj wrote:
> Hi. Please see
> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264204 and
> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264249 .
> I know I can use web interface or ssh but WinBox is required. In short,
> using Linux and Wine, I can connect to my routers via MAC, provided they
> are in the same network. With FreeBSD it's not possible (I've checked
> various Wine versions, so it's not its fault). Right now I have Debian
> running on my PC and have tested FreeBSD in VM with bridged NIC. When I
> run Winbox in Linux, I can connect to RB, with FreeBSD in VM it works
> only with IP (provided both PC and the router are in the same network).
> Is it possible in any way to connect using only MAC addresses or when PC
> and the router are in different networks (no network aliases, as there
> are times when it's not known what network the router is in). Thanks for
> answers.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

So the problem, if I'm understanding you correctly, is that you have a router with an unknown IP address (but a known MAC address). You're unable to set the IP on the router and you want to use it to forward your traffic?

You could do something like this (assuming your NIC is on the 192.168.1.0/24 subnet):

route add default 192.168.1.1

The IP address that you use here is arbitrary. Pick an unused address on your subnet. If you only want to route certain subnets through this router, replace "default" with the subnet that you want to route.

arp -s 192.168.1.1 xx:xx:xx:xx:xx:xx pub

This will create a static arp entry for 192.168.1.1.
Now when you try to route traffic to 192.168.1.1 it will use the static MAC and things should just work. Note that you probably won't be able to do this to access the router at all (e.g. ping 192.168.1.1). The router's IP stack won't respond to packets that aren't addressed to the router's IP address. From owner-freebsd-net@FreeBSD.ORG Mon Aug 18 20:39:38 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 11ABAD9C for ; Mon, 18 Aug 2014 20:39:38 +0000 (UTC) Received: from mail-vc0-x235.google.com (mail-vc0-x235.google.com [IPv6:2607:f8b0:400c:c03::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C3C8A34AC for ; Mon, 18 Aug 2014 20:39:37 +0000 (UTC) Received: by mail-vc0-f181.google.com with SMTP id lf12so6463737vcb.12 for ; Mon, 18 Aug 2014 13:39:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=Y4wXyRFtAyijIhRgxlz4SCAQsAjvhmwSbdoQzvibu4Q=; b=mvgIJuNCY+mos1zq8hwQ5fvkQfvZzFBkeQs4r7vpJ4DMDaM4aKY8FdNdOkLOLWA8C8 xZvm+VJDXGMEqB0wP801F3Cr+Ezzko/ZWjMWwbwwccKhtZOq8D5xZKn5WbwiSUyDEG/o CxrPap8Qd5MNyX4PbjyYw7YkLMWaypLYrCskbjY+Oy4oG/Mc/s4IUZkRsoXLR6IM0fgC pyCsFRhzM+M86CI3IZcGvXfqnAptbHfG7p/jkEzWVlx6hpRWK4UHAFe772WeorsVZWhH /fpECCnkAxUmx2XV/zLg9uXMScf48ClUMj/KqS5WS3TZS1fT9CVm5LzpzSxxe0Rk0TVb gOKQ== MIME-Version: 1.0 X-Received: by 10.220.114.5 with SMTP id c5mr26681544vcq.28.1408394376794; Mon, 18 Aug 2014 13:39:36 -0700 (PDT) Sender: ndenev@gmail.com Received: by 10.221.46.133 with HTTP; Mon, 18 Aug 2014 13:39:36 -0700 (PDT) In-Reply-To: References: <53F1F863.8000408@pp.com.pl> 
Date: Mon, 18 Aug 2014 22:39:36 +0200 X-Google-Sender-Auth: iF5GFdfvFeRQ9nHXAlmtuDRE_AU Message-ID: Subject: Re: Sending data via MAC address From: Nikolay Denev To: Ryan Stone Content-Type: text/plain; charset=UTF-8 Cc: freebsd-net , Piotr Kubaj X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Aug 2014 20:39:38 -0000 On Mon, Aug 18, 2014 at 8:05 PM, Ryan Stone wrote: > On Mon, Aug 18, 2014 at 8:58 AM, Piotr Kubaj wrote: >> Hi. Please see >> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264204 and >> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264249 . >> I know I can use web interface or ssh but WinBox is required. In short, >> using Linux and Wine, I can connect to my routers via MAC, provided they >> are in the same network. With FreeBSD it's not possible (I've checked >> various Wine versions, so it's not its fault). Right now I have Debian >> running on my PC and have tested FreeBSD in VM with bridged NIC. When I >> run Winbox in Linux, I can connect to RB, with FreeBSD in VM it works >> only with IP (provided both PC and the router are in the same network). >> Is it possible in any way to connect using only MAC addresses or when PC >> and the router are in different networks (no network aliases, as there >> are times when it's not known what network the router is in). Thanks for >> answers. >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > So the problem, if I'm understanding you correctly, is that you have a > router with an unknown IP address (but a known MAC address). You're > unable to set the IP on the router and you want to use it to forward > your traffic? 
>
> You could do something like this (assuming your NIC is on the
> 192.168.1.0/24 subnet:
>
> route add default 192.168.1.1
>
> The IP address that you use here is arbitrary. Pick an unused address
> on your subnet. If you only want to route certain subnets through
> this router, replace "default" with the subnet that you want to route.
>
> arp -s 192.168.1.1 xx:xx:xx:xx:xx:xx pub
>
> This will create a static arp entry for 192.168.1.1. Now when you try
> to route traffic to 192.168.1.1 it will use the static MAC and things
> should just work.
>
> Note that you probably won't be able to do this to access the router
> at all (e.g. ping 192.168.1.1). The router's IP stack won't respond
> to packets that aren't addressed to the router's IP address.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

I think the OP is talking about MikroTik RouterOS based devices, which are usually configured via WinBox (a proprietary Windows-based GUI tool). WinBox can auto-discover and set up such devices either over IP or via some proprietary protocol operating at L2 if they are on the same Ethernet segment, even if they don't have an IP address configured.

For what it's worth, I was able to run WinBox in Wine under OS X and configure such devices, so I'm not sure what could be preventing that communication on FreeBSD.

I think some packet traces might show what's going on.
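As a starting point for such traces, here is a minimal sketch of a capture filter keyed on the router's MAC address. The helper name mac_filter, the interface em0 and the example MAC are placeholders that do not come from this thread. Broadcast frames are included on the assumption that the L2 discovery may rely on them. Only the standard tcpdump options -n, -e and -i and the pcap primitives "ether host" and "ether broadcast" are used.

```sh
#!/bin/sh
# Hypothetical helper (not part of any tool mentioned in this thread):
# print a pcap filter matching every frame to or from one MAC address,
# plus Ethernet broadcasts, so traffic without any IP setup is captured too.
mac_filter() {
    mac=$1
    # Accept only the six-octet, colon-separated form, e.g. 00:11:22:33:44:55.
    if printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        printf 'ether host %s or ether broadcast\n' "$mac"
    else
        echo "invalid MAC address: $mac" >&2
        return 1
    fi
}

# Example, run as root while starting WinBox (em0 and the MAC are placeholders):
#   tcpdump -n -e -i em0 "$(mac_filter 00:11:22:33:44:55)"
```

Running the same capture on the Linux host where WinBox works and on the FreeBSD host where it does not should show whether the FreeBSD side sends the L2 frames at all.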
--Nikolay From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 00:00:02 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3D72BC8F; Tue, 19 Aug 2014 00:00:02 +0000 (UTC) Received: from na01-bn1-obe.outbound.protection.outlook.com (mail-bn1lp0140.outbound.protection.outlook.com [207.46.163.140]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "MSIT Machine Auth CA 2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 50FA035F0; Tue, 19 Aug 2014 00:00:00 +0000 (UTC) Received: from CO2PR05MB730.namprd05.prod.outlook.com (10.141.228.15) by CO2PR05MB732.namprd05.prod.outlook.com (10.141.228.22) with Microsoft SMTP Server (TLS) id 15.0.1010.18; Mon, 18 Aug 2014 23:59:56 +0000 Received: from CO2PR05MB730.namprd05.prod.outlook.com ([10.141.228.15]) by CO2PR05MB730.namprd05.prod.outlook.com ([10.141.228.15]) with mapi id 15.00.1005.008; Mon, 18 Aug 2014 23:59:56 +0000 From: Anuranjan Shukla To: hiren panchasara , "freebsd-net@freebsd.org" Subject: Re: Regression test suite for TCP Thread-Topic: Regression test suite for TCP Thread-Index: AQHPtcFhghhoqNw8LkOCBFL2ZA3BHpvWoGcA Date: Mon, 18 Aug 2014 23:59:56 +0000 Message-ID: References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: user-agent: Microsoft-MacOutlook/14.4.3.140616 x-originating-ip: [66.129.239.11] x-microsoft-antispam: BCL:0;PCL:0;RULEID:;UriScan:; x-forefront-prvs: 03077579FF x-forefront-antispam-report: SFV:NSPM; 
SFS:(10019006)(6009001)(51704005)(189002)(199003)(479174003)(24454002)(377454003)(83322001)(19580405001)(19580395003)(86362001)(4396001)(99396002)(46102001)(101416001)(20776003)(54356999)(50986999)(76176999)(87936001)(83072002)(76482001)(77982001)(92726001)(92566001)(79102001)(74662001)(99286002)(85852003)(74502001)(36756003)(81342001)(81542001)(31966008)(2656002)(105586002)(21056001)(66066001)(107886001)(85306004)(83506001)(107046002)(64706001)(95666004)(15202345003)(80022001)(106116001)(106356001)(15975445006); DIR:OUT; SFP:1102; SCL:1; SRVR:CO2PR05MB732; H:CO2PR05MB730.namprd05.prod.outlook.com; FPR:; MLV:sfv; PTR:InfoNoRecords; MX:1; A:1; LANG:en; Content-Type: text/plain; charset="us-ascii" Content-ID: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: juniper.net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 00:00:02 -0000

If you're willing to shell out some $$, Ixia's ANVL is a fairly detailed test suite for TCP and other protocols. It's available as software you can install on Linux/Windows. I'd used it at Juniper while working with Robert for the connection groups work a couple years or so back.

Regards,
-Anu

On 8/11/14, 5:07 PM, "hiren panchasara" wrote:
>I was looking for one and found
>https://wiki.freebsd.org/SummerOfCode2008#TCP.2FIP_regression_test_suite_.28tcptest.29
>which is a good start but needs a lot of love (work).
>
>Please share if you are aware of any covering basic scenarios.
> >cheers, >Hiren >_______________________________________________ >freebsd-net@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-net >To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 00:47:39 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4564D21D for ; Tue, 19 Aug 2014 00:47:39 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2CADB39D8 for ; Tue, 19 Aug 2014 00:47:39 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s7J0ldsX056034 for ; Tue, 19 Aug 2014 00:47:39 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bug 191975] [ng_iface] [regression] in 10.0: cannot contact local services Date: Tue, 19 Aug 2014 00:47:39 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: dgilbert@eicat.ca X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-net@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 
Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 00:47:39 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191975

--- Comment #3 from dgilbert@eicat.ca ---
I continue to try to eke out what's happening here. I had an idea: why don't I create a firewall rule:

rdr on ng1 inet proto tcp from any to 66.96.16.3 port = 2222 -> 66.96.16.3 port 22

and then I can try this. Well...

[2:54:354]root@owl:~> pfctl -vs nat
No ALTQ support in kernel
ALTQ related functions disabled
rdr on ng1 inet proto tcp from any to 66.96.16.3 port = 2222 -> 66.96.16.3 port 22
  [ Evaluations: 118329 Packets: 7 Bytes: 356 States: 1 ]
  [ Inserted: uid 0 pid 43426 State Creations: 1 ]
[2:55:355]root@owl:~> netstat -an | grep 22
tcp4 0 0 66.96.16.3.22 66.96.16.11.53211 ESTABLISHED
tcp4 0 0 *.22 *.* LISTEN
tcp6 0 0 *.22 *.* LISTEN

so... PF sees the SYN packets, but the local TCP stack does not. Sigh. Help?

--
You are receiving this mail because:
You are the assignee for the bug.
From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 00:53:17 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 67FEB2E8 for ; Tue, 19 Aug 2014 00:53:17 +0000 (UTC) Received: from mail-vc0-x233.google.com (mail-vc0-x233.google.com [IPv6:2607:f8b0:400c:c03::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2965A3A7C for ; Tue, 19 Aug 2014 00:53:17 +0000 (UTC) Received: by mail-vc0-f179.google.com with SMTP id hq11so6732759vcb.10 for ; Mon, 18 Aug 2014 17:53:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=XGDpYYCLSjxwju3vP7VwoitAK152WEhWjumeGa4p1Tc=; b=mIz/9tE8X+BG5JG9dm3Jo8ehQbC4P2pT4cS2gPuS+J4KZYdS9mnYu2ccl5LcpxBlAo LheSClr13lUwf+mFSE+i/6XlQxj7amvTyBPzddbNTzO70ZcXttQ7X+hvjflemR/uwFX/ n6b3HptvkM3vZyMvLmvatnEazcuJ0CdMHBHub0NQKPuWpZR/YO6jqswDK50ORPJUog+j d0K6n5XGzCRQycFjkOScR79SWKnX8OSdhupCBPzSpD35mEZyD5Ui8ysxR3jbJCzZz+V3 ajBpygmfgykoYM4fw9Dem7p0tsvIPf50kR9S8FDFjOZEv95s1sHNKyPWi7Iu7iL3corS xu2A== MIME-Version: 1.0 X-Received: by 10.220.112.143 with SMTP id w15mr3145987vcp.41.1408409596000; Mon, 18 Aug 2014 17:53:16 -0700 (PDT) Received: by 10.221.12.135 with HTTP; Mon, 18 Aug 2014 17:53:15 -0700 (PDT) Date: Mon, 18 Aug 2014 20:53:15 -0400 Message-ID: Subject: bug 191975 From: Zaphod Beeblebrox To: FreeBSD Net Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 
00:53:17 -0000

It seems I'm being outclassed by bug 191975. Simply put:

1. packet arrives on ngX interface (ng_iface)
2. packet destination is local
3. AFAICT packet disappears.

This is not true if the packet destination is non-local. Routed packets work as advertised. Local services (say, ssh) are also working fine from hosts that connect other than via ngX. This also seems to be true whether the packets come directly from the ngX-connected hosts or from routed hosts beyond the ngX-connected host.

Can I draw anyone's attention to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191975 ?

From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 02:17:49 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 74123B6B for ; Tue, 19 Aug 2014 02:17:49 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5AA0E30C8 for ; Tue, 19 Aug 2014 02:17:49 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id s7J2Hnk3036534 for ; Tue, 19 Aug 2014 02:17:49 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-net@FreeBSD.org Subject: [Bug 191975] [ng_iface] [regression] in 10.0: cannot contact local services Date: Tue, 19 Aug 2014 02:17:49 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 10.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Some People X-Bugzilla-Who: dgilbert@eicat.ca X-Bugzilla-Status: Needs Triage X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To:
freebsd-net@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 02:17:49 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191975

--- Comment #4 from dgilbert@eicat.ca ---
This is to say that a host connecting through an ng_iface interface can access the rest of the network, but cannot access the host on which the ng_iface resides. _And_ this is a regression in 10.0.

OK. This is a _really_ interesting example. There are two MPD servers: A and B. A is .1 (both v4 and v6) and B is .3 (both v4 and v6). The only difference is that B has gif interfaces to give v6 services to mpd-connected clients.

If mpd client is connected to A:

WORKS: ssh -4 ...3
WORKS: ssh -6 ..:3
BROKE: ssh -4 ...1
WORKS: ssh -6 ..:1

If mpd client is connected to B:

BROKE: ssh -4 ...3
BROKE: ssh -6 ..:3
WORKS: ssh -4 ...1
WORKS: ssh -6 ..:1

i.e.: if the packet path includes ngX on the host in question, it fails.

mpd -> ngX -> gif -> ssh -> fail
mpd -> ngX -> gif -> otherhost -> ssh -> success
mpd -> ngX -> otherhost -> gif -> ssh -> success
mpd -> ngX -> otherhost -> gif -> otherhost -> ssh -> success

--
You are receiving this mail because:
You are the assignee for the bug.
From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 07:02:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 74BD2325 for ; Tue, 19 Aug 2014 07:02:13 +0000 (UTC) Received: from mx0.pp.com.pl (sol.pp.com.pl [195.20.3.30]) by mx1.freebsd.org (Postfix) with ESMTP id E595837EA for ; Tue, 19 Aug 2014 07:02:12 +0000 (UTC) Received: from [192.168.3.17] (lan.pp.com.pl [195.20.3.242]) by mx0.pp.com.pl (Postfix) with ESMTPSA id 539C564003F for ; Tue, 19 Aug 2014 09:09:13 +0200 (CEST) Message-ID: <53F2F669.1030101@pp.com.pl> Date: Tue, 19 Aug 2014 09:02:01 +0200 From: Piotr Kubaj User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Icedove/24.6.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: Sending data via MAC address References: <53F26851.2020309@pp.com.pl> In-Reply-To: <53F26851.2020309@pp.com.pl> X-Forwarded-Message-Id: <53F26851.2020309@pp.com.pl> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 07:02:13 -0000 On 08/18/2014 22:39, Nikolay Denev wrote: > On Mon, Aug 18, 2014 at 8:05 PM, Ryan Stone wrote: >> On Mon, Aug 18, 2014 at 8:58 AM, Piotr Kubaj wrote: >>> Hi. Please see >>> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264204 and >>> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264249 . >>> I know I can use web interface or ssh but WinBox is required. In short, >>> using Linux and Wine, I can connect to my routers via MAC, provided they >>> are in the same network. 
With FreeBSD it's not possible (I've checked
>>> various Wine versions, so it's not its fault). Right now I have Debian
>>> running on my PC and have tested FreeBSD in VM with bridged NIC. When I
>>> run Winbox in Linux, I can connect to RB, with FreeBSD in VM it works
>>> only with IP (provided both PC and the router are in the same network).
>>> Is it possible in any way to connect using only MAC addresses or when PC
>>> and the router are in different networks (no network aliases, as there
>>> are times when it's not known what network the router is in). Thanks for
>>> answers.
>> So the problem, if I'm understanding you correctly, is that you have a
>> router with an unknown IP address (but a known MAC address). You're
>> unable to set the IP on the router and you want to use it to forward
>> your traffic?
>>
>> You could do something like this (assuming your NIC is on the
>> 192.168.1.0/24 subnet):
>>
>> route add default 192.168.1.1
>>
>> The IP address that you use here is arbitrary. Pick an unused address
>> on your subnet. If you only want to route certain subnets through
>> this router, replace "default" with the subnet that you want to route.
>>
>> arp -s 192.168.1.1 xx:xx:xx:xx:xx:xx pub
>>
>> This will create a static arp entry for 192.168.1.1. Now when you try
>> to route traffic to 192.168.1.1 it will use the static MAC and things
>> should just work.
>>
>> Note that you probably won't be able to do this to access the router
>> at all (e.g. ping 192.168.1.1). The router's IP stack won't respond
>> to packets that aren't addressed to the router's IP address.
> I think the OP is talking about MikroTik RouterOS based devices that
> are usually configured via WinBox (a proprietary Windows-based GUI tool) that can
> auto-discover and set up such devices either based on IP,
> or via some proprietary protocol operating at L2 if they are on the same
> Ethernet segment, even if they don't have an IP configured.
>
> For what it's worth, I was able to run WinBox in Wine under OS X and
> configure such devices, so
> I'm not sure what could be the problem on FreeBSD preventing that communication.
> I think some packet traces might show what's going on.
>
> --Nikolay

Yes, I may have worded it poorly, but that's what I meant. I'll try Ryan's solution tomorrow, but it seems that FreeBSD is missing something and that's why some extra configuration is needed. WinBox on Linux works OOTB just like on OS X, but it's not like that on FreeBSD. I'll also try to get packet traces.
From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 08:26:06 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3D8D3678 for ; Tue, 19 Aug 2014 08:26:06 +0000 (UTC) Received: from mx0.pp.com.pl (sol.pp.com.pl [195.20.3.30]) by mx1.freebsd.org (Postfix) with ESMTP id ECDA73EF6 for ; Tue, 19 Aug 2014 08:26:05 +0000 (UTC) Received: from [192.168.3.17] (lan.pp.com.pl [195.20.3.242]) by mx0.pp.com.pl (Postfix) with ESMTPSA id 7FF8F640027; Tue, 19 Aug 2014 10:33:06 +0200 (CEST) Message-ID: <53F30A13.7050505@pp.com.pl> Date: Tue, 19 Aug 2014 10:25:55 +0200 From: Piotr Kubaj User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Icedove/24.6.0 MIME-Version: 1.0 To: Ryan Stone Subject: Re: Sending data via MAC address References: <53F1F863.8000408@pp.com.pl> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 08:26:06 -0000 On 18.08.2014 20:05, Ryan Stone wrote: > On Mon, Aug 18, 2014 at 8:58 AM, Piotr Kubaj wrote: >> Hi. Please see >> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264204 and >> http://forums.freebsd.org/viewtopic.php?f=15&t=45303#p264249 . >> I know I can use web interface or ssh but WinBox is required. In short, >> using Linux and Wine, I can connect to my routers via MAC, provided they >> are in the same network. With FreeBSD it's not possible (I've checked >> various Wine versions, so it's not its fault). Right now I have Debian >> running on my PC and have tested FreeBSD in VM with bridged NIC. 
When I
>> run Winbox in Linux, I can connect to RB, with FreeBSD in VM it works
>> only with IP (provided both PC and the router are in the same network).
>> Is it possible in any way to connect using only MAC addresses or when PC
>> and the router are in different networks (no network aliases, as there
>> are times when it's not known what network the router is in). Thanks for
>> answers.
> So the problem, if I'm understanding you correctly, is that you have a
> router with an unknown IP address (but a known MAC address). You're
> unable to set the IP on the router and you want to use it to forward
> your traffic?
>
> You could do something like this (assuming your NIC is on the
> 192.168.1.0/24 subnet):
>
> route add default 192.168.1.1
>
> The IP address that you use here is arbitrary. Pick an unused address
> on your subnet. If you only want to route certain subnets through
> this router, replace "default" with the subnet that you want to route.
>
> arp -s 192.168.1.1 xx:xx:xx:xx:xx:xx pub
>
> This will create a static arp entry for 192.168.1.1. Now when you try
> to route traffic to 192.168.1.1 it will use the static MAC and things
> should just work.
>
> Note that you probably won't be able to do this to access the router
> at all (e.g. ping 192.168.1.1). The router's IP stack won't respond
> to packets that aren't addressed to the router's IP address.

Thanks, that seems to work, although adding a default route is unnecessary as the whole subnet is already in the routing table, so I only need to create a new ARP entry. And when the entry is deleted, I can no longer connect via MAC. Anyway, something is clearly wrong, since there are no problems like that on Linux and OS X.
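
The effect of the static ARP entry discussed in this message can be pictured with a toy model. This is only a sketch under stated assumptions: the `ArpCache` class and its method names are hypothetical and are not a FreeBSD interface, and the MAC address is a placeholder. It shows only that a pinned IP-to-MAC mapping lets the host address frames to the router without ever receiving an ARP reply, and that deleting the entry removes that ability.

```python
class ArpCache:
    """Toy IP -> MAC table; a hypothetical illustration, not a FreeBSD API."""

    def __init__(self):
        self.entries = {}

    def add_static(self, ip, mac):
        # Comparable in spirit to `arp -s <ip> <mac>`: pin the mapping by hand.
        self.entries[ip] = mac

    def delete(self, ip):
        # Remove the mapping if present; a missing entry is not an error here.
        self.entries.pop(ip, None)

    def resolve(self, ip):
        # None means the host would have to send an ARP request, which a
        # router that does not own this IP address will not answer.
        return self.entries.get(ip)


cache = ArpCache()
ROUTER_MAC = "00:00:5e:00:53:01"  # placeholder, stands in for xx:xx:xx:xx:xx:xx

cache.add_static("192.168.1.1", ROUTER_MAC)
print(cache.resolve("192.168.1.1"))  # the pinned MAC

cache.delete("192.168.1.1")
print(cache.resolve("192.168.1.1"))  # nothing to resolve to any more
```

The IP address plays no role on the router side in this model; it is only a local key for finding the MAC, which matches the remark that the address chosen is arbitrary.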
From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 08:30:30 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8CA4691F for ; Tue, 19 Aug 2014 08:30:30 +0000 (UTC) Received: from mail.allbsd.org (gatekeeper.allbsd.org [IPv6:2001:2f0:104:e001::32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.allbsd.org", Issuer "RapidSSL CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 559F83F43 for ; Tue, 19 Aug 2014 08:30:29 +0000 (UTC) Received: from alph.d.allbsd.org ([IPv6:2001:2f0:104:e010:862b:2bff:febc:8956]) (authenticated bits=56) by mail.allbsd.org (8.14.9/8.14.8) with ESMTP id s7J8U3jU049748 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 19 Aug 2014 17:30:15 +0900 (JST) (envelope-from hrs@FreeBSD.org) Received: from localhost (localhost [IPv6:::1]) (authenticated bits=0) by alph.d.allbsd.org (8.14.8/8.14.8) with ESMTP id s7J8TxoR081294; Tue, 19 Aug 2014 17:30:03 +0900 (JST) (envelope-from hrs@FreeBSD.org) Date: Tue, 19 Aug 2014 17:29:53 +0900 (JST) Message-Id: <20140819.172953.436878206817123055.hrs@allbsd.org> To: fernando@gont.com.ar Subject: Re: Routing IPv6 packets towards oneself with routing sockets? 
From: Hiroki Sato In-Reply-To: <53E5B71D.2030500@gont.com.ar> References: <53E35DA7.4020800@gont.com.ar> <20140808.053757.1725805140861121363.hrs@allbsd.org> <53E5B71D.2030500@gont.com.ar> X-PGPkey-fingerprint: BDB3 443F A5DD B3D0 A530 FFD7 4F2C D3D8 2793 CF2D X-Mailer: Mew version 6.6 on Emacs 24.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Multipart/Signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="--Security_Multipart(Tue_Aug_19_17_29_53_2014_045)--" Content-Transfer-Encoding: 7bit X-Virus-Scanned: clamav-milter 0.97.4 at gatekeeper.allbsd.org X-Virus-Status: Clean X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (mail.allbsd.org [IPv6:2001:2f0:104:e001::32]); Tue, 19 Aug 2014 17:30:21 +0900 (JST) X-Spam-Status: No, score=-97.9 required=13.0 tests=CONTENT_TYPE_PRESENT, RDNS_NONE,SPF_SOFTFAIL,USER_IN_WHITELIST autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on gatekeeper.allbsd.org Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 08:30:30 -0000 ----Security_Multipart(Tue_Aug_19_17_29_53_2014_045)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Fernando Gont wrote in <53E5B71D.2030500@gont.com.ar>: fe> > Although your code assumes RTA_GATEWAY eventually returns the fe> > outgoing interface, it is not always true. RTA_IFP should be used if fe> > you want to look up it instead of looking up gateways until AF_LINK fe> > is obtained. Certainly RTA_GATEWAY returns AF_LINK and you can check fe> > sdl_index in it, but the index number is not always the same as the fe> > actual outgoing interface (one of the examples is a host route). fe> fe> Just curious: what's the meaning of the AF_LINK I was reading? Sorry for the delay. 
AF_LINK with (sdl_nlen == sdl_alen == sdl_slen == 0) in RTA_GATEWAY was used to create a clone route. Let's consider a situation in which there is an IPv4 node (node A) with 192.168.0.1/24 on its em0 and another node (node B) with 192.168.0.2/24 on the same link. If node A is running FreeBSD 8.0 or later, the output of "netstat -nrf inet" on node A will be something like this:

 Destination        Gateway            Flags    Netif Expire
 192.168.0.0/24     link#2             U          em0
 192.168.0.1        link#2             UHS        lo0

but it was the following on an older FreeBSD (and other 4.3BSD-derived implementations to which UNPv1 refers):

 Destination        Gateway            Flags    Refs      Use  Netif Expire
 192.168.0.0/24     link#2             UC          2        0    em0
 192.168.0.1        xx:xx:xx:xx:xx:xx  UHLW        0        0    lo0
 192.168.0.2        yy:yy:yy:yy:yy:yy  UHLW        0        0    em0

A primary difference is that FreeBSD 8.0 and later do not keep L2 address information directly in the routing table. Instead, FreeBSD now keeps the L2 address translation table and the routing table separately. In the old routing table, a host route on the same net (e.g. 192.168.0.2) was dynamically created and its MAC address was added to the routing table by issuing an ARP request. More specifically, when node A attempts to look up a route for 192.168.0.3, for example, an AF_LINK route with an empty L2 address in RTA_GATEWAY matches first via the entry 192.168.0.0/24 because it is the most specific at the moment. Then the sdl_index in RTA_GATEWAY is used for the ARP request, and a host route is added eventually. NDP works in the same way for IPv6.

In the new implementation, a route with an empty L2 address indicates which L2 address table (kept separately per interface and per address family) should be consulted. A host route is usually used only as a loopback route, and in its RTA_GATEWAY it just has an empty L2 address with the if_index of the interface where the address is configured.

So, in both cases checking AF_LINK in RTA_GATEWAY is not a reliable way to determine the actual outgoing interface.
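
The point about the gateway's link index not matching the outgoing interface can be seen in a small longest-prefix-match sketch over the FreeBSD 8.0+ example table from this message. It is a toy model, not routing-socket code: the `ROUTES` list, the `lookup` helper and the `IFNAME_BY_INDEX` map are hypothetical names, and reading link#2 as em0 is an assumption taken from the network route in that table.

```python
import ipaddress

# (destination prefix, link index carried in the gateway, outgoing interface)
# Mirrors the two-line FreeBSD 8.0+ table; link#2 == em0 is an assumption.
ROUTES = [
    (ipaddress.ip_network("192.168.0.0/24"), 2, "em0"),
    (ipaddress.ip_network("192.168.0.1/32"), 2, "lo0"),
]
IFNAME_BY_INDEX = {2: "em0"}


def lookup(dst):
    """Return the most specific route covering dst, or None if none matches."""
    addr = ipaddress.ip_address(dst)
    matches = [route for route in ROUTES if addr in route[0]]
    return max(matches, key=lambda route: route[0].prefixlen, default=None)


for dst in ("192.168.0.3", "192.168.0.1"):
    net, gw_index, netif = lookup(dst)
    print(dst, "matches", net,
          "| gateway link index names", IFNAME_BY_INDEX[gw_index],
          "| outgoing interface is", netif)
```

For 192.168.0.3 both columns agree on em0, but for the host route 192.168.0.1 the gateway's index still names em0 while the packet actually leaves via lo0. That mismatch is why an interface-specific attribute such as RTA_IFP is the safer thing to consult.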
-- Hiroki ----Security_Multipart(Tue_Aug_19_17_29_53_2014_045)-- Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlPzCwEACgkQTyzT2CeTzy0C/QCgwWZlA00Lk0fpCM71/VQeAOTI u1AAn0+QX3gF02MkgnhMfX0+xzTFhVPM =VxPV -----END PGP SIGNATURE----- ----Security_Multipart(Tue_Aug_19_17_29_53_2014_045)---- From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 13:51:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DE2F0C61; Tue, 19 Aug 2014 13:51:13 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 617F83154; Tue, 19 Aug 2014 13:51:13 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=ptichko.yndx.net) by mail.ipfw.ru with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1XJfqz-000PoC-9y; Tue, 19 Aug 2014 13:37:21 +0400 Message-ID: <53F3563D.6020107@FreeBSD.org> Date: Tue, 19 Aug 2014 17:50:53 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Dmitry Selivanov Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> <53EE16DE.9020209@rlan.ru> <53EE252D.10109@FreeBSD.org> In-Reply-To: <53EE252D.10109@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 13:51:14 -0000 On 15.08.2014 19:20, Alexander V. Chernikov wrote: > On 15.08.2014 18:19, Dmitry Selivanov wrote: >> 15.08.2014 17:25, Alexander V. Chernikov пишет: >>> On 08.08.2014 16:11, Dmitry Selivanov wrote: >>>> 04.08.2014 23:51, Alexander V. Chernikov пишет: >>>>> On 04.08.2014 15:58, Luigi Rizzo wrote: >>>>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov >>>>>> wrote: >>>>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov >>>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hello all. >>>>>>>>> >>>>>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>>>>> The most notable (and user-visible) change is named >>>>>>>>> table support. >>>>>>>>> The other one is support for different lookup algorithms >>>>>>>>> for different >>>>>>>>> key types. 
>>>>>>>>> >>>>>>>>> For example, new ipfw permits writing this: >>>>>>>>> >>>>>>>>> ipfw table tb1 create type cidr >>>>>>>>> ipfw add allow ip from table(tl1) to any >>>>>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>>>>> >>>>>>>>> ipfw table if1 create type iface >>>>>>>>> ipfw add skipto tablearg ip from any to any via table(if1) >>>>>>>>> >>>>>>>>> or even this: >>>>>>>>> ipfw table fl1 create type >>>>>>>>> flow:src-ip,proto,dst-ip,dst-port >>>>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>>>>> >>>>>>>>> all these changes fully preserve backward compatibility. >>>>>>>>> (actually tables needs now to be created before use and >>>>>>>>> their type needs >>>>>>>>> to match with opcode used, but new ipfw(8) performs >>>>>>>>> auto-creation >>>>>>>>> for cidr tables). >>>>>>>>> >>>>>>>>> There is another thing I'm going to change and I'm not >>>>>>>>> sure I can keep >>>>>>>>> the same compatibility level. >>>>>>>>> >>>>>>>>> Table values, from one point of view, can be classified >>>>>>>>> to the following >>>>>>>>> types: >>>>>>>>> >>>>>>>>> - skipto argument >>>>>>>>> - fwd argument (*) >>>>>>>>> - link to another object (nat, pipe, queue) >>>>>>>>> - plain u32 (not bound to any object) >>>>>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>>>>> >>>>>>>>> There are the following reasons why I think it is >>>>>>>>> necessary to implement >>>>>>>>> explicit table values typing (like tables): >>>>>>>>> - Implementing fwd tablearg for IPv6 hosts requires >>>>>>>>> indirection table >>>>>>>>> - Converting nat/pipe instance ids to names renders >>>>>>>>> values unusable >>>>>>>>> - retiring old hack with storing saved pointer of found >>>>>>>>> object/rule >>>>>>>>> inside rule w/o proper locking >>>>>>>>> - making faster skipto >>>>>>>>> >>>>>>>>> >>>>>>>>> ??????i don't buy the idea that you need typed arguments >>>>>>>>> for all the cases above. 
Maybe the case that >>>>>>>>> may make sense is the fwd argument (and in the future >>>>>>>>> something else). >>>>>>>>> We already discussed, i think, the fact that now it >>>>>>>>> is legal to have references to non existing things >>>>>>>>> (skipto, pipes etc.) implemented as u32. >>>>>>>>> Removing that would break configurations. >>>>>>>> It depends on actual implementation. This can be preserved by >>>>>>>> auto-creating necessary objects in kernel and/or in userspace, so >>>>>>>> we can (and should) avoid breaking in this particular way. >>>>>>> Can you please explain your vision on values another time? >>>>>>> As far as I understand, you're not against it in general, but the >>>>>>> details matter: >>>>>>> * IP address can be one of the types (it won't break much, and >>>>>>> we can >>>>>>> simply skip that one for MFC) >>>>>>> * what about typing for nat/pipes ? we're not going to convert >>>>>>> their ids >>>>>>> to names? (or maybe you can suggest other non-disruptive way?) >>>>>>> * everything else is type "u32" >>>>>> >>>>>> Correct, I am mostly concerned about the details, not on the >>>>>> general concept. >>>>>> >>>>>> To summarize the discussion Alexander and I had about converting >>>>>> identifiers from numbers to arbitrary strings (this is partly >>>>>> related >>>>>> to the values stored in tables, but I think we should have a >>>>>> coherent >>>>>> behaviour) >>>>>> >>>>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 >>>>>> bits or less) >>>>>> for rules, pipes, queues, tables, probably nat instances. >>>>>> >>>>>> 2. CURRENTLY, in all the above contexts, it is legal to reference a >>>>>> non existing object (rule, pipe, table names, etc.), >>>>>> and the kernel will do something reasonable, namely jump to the >>>>>> next rule, drop traffic for non existing pipes, and so on. >>>>>> >>>>>> 3. of course we want to preserve backward compatibility both for >>>>>> the ioctl interface, and for user configurations. >>>>>> >>>>>> 4. 
The in-kernel representation of identifiers is not visible to >>>>>> users, >>>>>> so we can use a numeric representation in the kernel for >>>>>> identifiers. >>>>>> Strings like "12345" are converted with atoi() or the like, >>>>>> whereas for other identifiers or numbers outside of the 2^16 >>>>>> range >>>>>> the kernel manages a translation table, allocating new numeric >>>>>> identifiers if a new string appears. >>>>>> This permits backward compatibility for old rulesets, and >>>>>> does not >>>>>> impact performance because the translation table is only >>>>>> used during rules additions or deletion. >>>>> Yes. However this requires either holding either (1) 2 pointers >>>>> (old&new >>>>> arrays), or (2) 65k+ index array, or (3) chained hash table. >>>>> (1) would require additional pointers for each subsystem (and some >>>>> additional management), >>>>> (2) will definitely upset embedded guys and >>>>> (3) is worse in terms of performance >>>>>> >>>>>> With this in mind, i think we should follow a similar approach for >>>>>> objects stored in tables, hence >>>>>> >>>>>> if an u32 value was available in the past, it must be >>>>>> available also in the new implementation. >>>>>> >>>>>> The issue with tables is that some convoluted configuration could >>>>>> use the same table to reference pipes _and_ rules _and_ perhaps >>>>>> other things represented as numbers (the former is not too strange, >>>>>> if i have a large configuration i might place sections at rules >>>>>> 12000, 13000, 14000... and associate pipes with the same numberic >>>>>> identifier to each block of rules). >>>>>> >>>>>> Typed table values would clearly disturb backward compatibility >>>>>> in the above configurations. However it should not be difficult >>>>>> to accept arbitrary strings as the values stored in tables, and >>>>>> then store multiple representations as appropriate, including: >>>>> Well, I've thought about thas one. 
It may be an option, but the >>>>> details >>>>> are not so promising (below) >>>>>> - the string representation, unconditionally >>>>>> - for names that can be resolved by DNS, the ipv6 and ipv4 >>>>>> address(es) >>>>>> associated with them. ipfw already translates hostnames in rules >>>>>> so this is POLA >>>>> I'm not happy what ipfw(8) is doing instead of translation. The >>>>> proper >>>>> way would be not simply using first AF_INET answer but saving ALL >>>>> IPv4+IPv6 records inside rule (and some more tracking should be done >>>>> afterwards, but that's totally different story). Additionally, I'm >>>>> unsure if we really need next-hop value expressed as hostname (how >>>>> can >>>>> we deal with multiple addresses and diffrent AFs?). We may store >>>>> strings >>>>> (and I think we should do it) but I'm unsure about this particular >>>>> option of interpreting them. >>>>>> - for other strings, a u32 from the translation table as previously >>>>>> indicated >>>>>> - and for numeric values, the u32 representation (truncated if >>>>>> needed, >>>>>> according to whatever is the existing behaviour) >>>>>> - >>>>>> If we cannot generate an u32 we will put some value (e.g. 0) >>>>>> that hopefully will not cause confusion. >>>>> As far as I understand, we accept some string "s" as table value >>>>> inside >>>>> the kernel, than, we have some logic that says: >>>>> oh, dummynet pipe has the same name "s"s, oh, nat entity with name >>>>> "s" >>>>> has just been created, let's save indices. >>>>> >>>>> That would require additional indirection table like: >>>>> >>>>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >>>>> ( so we will have 2-level indirection table for fwd if we do IPv6) >>>>> >>>>> We can optimize this if we use "same name -> same kidx" approach >>>>> regardless of kernel object we're refering to. That might require >>>>> some >>>>> more memory, but that's OK from my point of view. 
>>>>> >>>>> So we end up with >>>>> int [ skipto idx | fwd idx | obj idx ] >>>>> >>>>> idx "0" is special value which means the same as 2.CURRENT >>>>> >>>>> That looks better, but still way to complex. >>>>> I do care about compatibility, but it's hard to improve things >>>>> without >>>>> changing. >>>>> >>>>> I'd like to propose the following: >>>>> * Split values into 3 types ("ip|nexthop", "number", "object") >>>>> * Do not insist on object existence, use value "0" to mimic 2.CURRENT >>>>> behavior. >>>>> * Retain full compatibility by introducing special value type >>>>> "legacy" >>>>> which matches any type and is backed by given indirection table. >>>>> * Issue warning in ipfw(8) binary on all auto-created tables that >>>>> auto-creation is legacy and this behavior will be dropped in next >>>>> major >>>>> release (e.g. 11.0) >>>>> * Save this behavior in MFC but drop "legacy" tables in head after a >>>>> month after actual MFC. >>>>> >>>>> That do you think? >>>>>> >>>>>> If we do it this way, we should be able to preserve backward >>>>>> compatibility _and_ add features that people may need. >>>>>> >>>>>> cheers >>>>>> luigi >>>>>> >>>> Here is my idea: tablearg should contain more than one value. I >>>> think getting several values from one table lookup is faster than >>>> several table lookups with one value. >>>> Let tablearg be not just uint32, but array with different value >>>> types inside it. >>> There are some use cases where we might need 2-level value lookup >>> (e.g. algo returning index for index table where actual data reside) >>> and each data item can >>> really be up to 64-bytes long. The problem is in actual partitioning >>> and compatibility. >>>> >>>> For example I have many such rules: >>>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip >>>> 1.1.1.1 >>> Sorry, what task are you solving by using given rules? >> Small ISP, clients have static IP with MAC-authorization. 
Src iface >> must be checked to prevent IP-spoofing. Dst-IP sometimes is used for >> p2p-channels. >>>> >>>> These rules can be replaced with such construction: >>>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip >>>> tablearg[3] >>>> >>>> But I don't think indexing by value is a good idea. I think >>>> index==starting byte is a better way: >>>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip >>>> tablearg:32 >>>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is >>>> from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are >>>> from 32 to 35. >>> >>>> So we need to create table for it: >>>> table 1 set MAC:0 string:6:26 ip:32 >>>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>> >>>> String can be used both for iface and comment. >>>> Other possible value types: >>>> uint16 for nat, pipe, skipto and other 2-bytes actions >>>> IPv4 4 bytes >>>> CIDRv4 5 bytes >>>> IPv6 16 bytes >>>> CIDRv6 17 bytes >>>> table_id 2 bytes - link to another table >>> Well, it seems we have enough space to store most of these, however, >>> problems seem to remain the same: typing and compatibility. >>> When you're creating new table (or it is auto-created) which values >>> types should be assumed ? All of them? >> Default - as usually uint32. > I can't see "uint32" value in the list you have specified before. I'll > rephrase: > what value types (from the list above or similar) should ipfw(8) or > kernel fill in case of "default" table? > (And once again, what should we print as value) ? > Please think about > a) old ipfw binaries > b) new ipfw binaries using exactly the same ruleset they are already > using (with, for example, both "skipto tablearg" and "fwd tablearg " > tables). I've increased kernel<>userland 'struct tentry' value field to 64 bytes. It looks like we were talking about a bit different things. 
Let me try to explain the problem I'm stuck with:

We may take the road you've suggested, it looks OK:
* by default, tables are created with the "all-values" mask.
* for the default mask, ipfw(8) treats "ipfw table X add Y val" input where the value is a u32 number as input data for each type specified in all-values, without returning an error.
* for a non-default mask, value data should be validated.

e.g. if we have a table with valtype="skipto,nat,pipe,ip4,ip6" and "100" as input -> it turns to "100,100,0.0.0.0,::". If we have a value with valtype="skipto,ip6" and "100" as input -> error, while a valid one would be "100,2a01::1:111", for example.

I'm unsure how one should be able to update a _specific_ value (e.g. update nat id or skipto arg), but that's not the problem.

The problem arises if we start talking about using names for nat/pipe/queue ids instead of numbers. If we have nat instances "nat1", "11" and "23", and one specifies "44" as part of a value, the logic starts to get complex: we either require nat "44" to exist (and I'm unsure if we can auto-create it *) or start doing complex stuff like tracking all those non-existing objects: e.g. add some special record somewhere saying that we're waiting for nat instance "44" to be created, then auto-update the given value with its kernel index, then do something reasonable if nat instance "44" is destroyed (OK, a nat instance can't be destroyed, but a pipe can).
.. and we have to do the same for pipes/queues and any following kernel object.

Or we have to require the user to reference existing objects only (create explicitly before use). This one makes things easier in code, but requires users to change their scripts. It looks like there is no consensus on that point.

* Maybe auto-creation is not so tricky and we should try to evaluate it..

>
>
>>> What should `ipfw table X list` show as "value" field ?
>> I added table "header" in this line:
>> table 1 set MAC:0 string:6:26 ip:32
> I don't think that user should be able to set any offsets in userland.
> Exact offsets of variable of given type needs to be enforced by kernel, > so you may fill that you want "mac" and "ip" as values for given > table, but not lengths or offsets. >> So `ipfw table X list` should show something like this: >> ---table(0)--- >> 1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1 >> We can also add "header" description in output (with or without >> additional parameter - depends on compatibility needs) like this: >> ---table(0)--- addr MAC iface IPv4 >>> How should ipfw(8) treat "add 1.1.1.1 0" input? >> It should look at table "header" and return error message like "Value >> doesn't match table header" > >>> What will happen if we want to add another type field to this list? >>> (MAC address of Infiniband MAC address, for example). >> I don't think there is a sense to mix both MAC[6] and MAC[20] values >> in 1 table. It is easier to create 2 tables with different "headers". >> For Infiniband we can add another type: MAC20 (or something like >> this). Or we can use "MAC"-type like string type(see above): MAC:6:25 >> (1st and last bytes, or 1st and length). >>> >>>> >>>> Table value length can be set for example with loader tunable like >>>> net.inet.ip.fw.table_value_length. >>>> Even with default uint32 value length we can get 2 uint16 values or >>>> 4 uint8 values, this can help in some configurations. >>>> >>>> This way is more complex, but much more flexible. It's like >>>> netgraph subsystem. >>>> I think it suites both Alexander and Luigi requests. 
From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 16:06:56 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7157A417; Tue, 19 Aug 2014 16:06:56 +0000 (UTC) Received: from mail.rlan.ru (mail.rlan.ru [213.234.25.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A0E843FE6; Tue, 19 Aug 2014 16:06:54 +0000 (UTC) Message-ID: <53F3760E.9070206@rlan.ru> Date: Tue, 19 Aug 2014 20:06:38 +0400 From: Dmitry Selivanov User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.1.1 MIME-Version: 1.0 To: "Alexander V.
Chernikov" Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> <53EE16DE.9020209@rlan.ru> <53EE252D.10109@FreeBSD.org> <53F3563D.6020107@FreeBSD.org> In-Reply-To: <53F3563D.6020107@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 16:06:56 -0000 19.08.2014 17:50, Alexander V. Chernikov пишет: > On 15.08.2014 19:20, Alexander V. Chernikov wrote: >> On 15.08.2014 18:19, Dmitry Selivanov wrote: >>> 15.08.2014 17:25, Alexander V. Chernikov пишет: >>>> On 08.08.2014 16:11, Dmitry Selivanov wrote: >>>>> 04.08.2014 23:51, Alexander V. Chernikov пишет: >>>>>> On 04.08.2014 15:58, Luigi Rizzo wrote: >>>>>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote: >>>>>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov >>>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>> Hello all. >>>>>>>>>> >>>>>>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>>>>>> The most notable (and user-visible) change is named table support. >>>>>>>>>> The other one is support for different lookup algorithms for different >>>>>>>>>> key types. 
>>>>>>>>>> >>>>>>>>>> For example, new ipfw permits writing this: >>>>>>>>>> >>>>>>>>>> ipfw table tb1 create type cidr >>>>>>>>>> ipfw add allow ip from table(tl1) to any >>>>>>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>>>>>> >>>>>>>>>> ipfw table if1 create type iface >>>>>>>>>> ipfw add skipto tablearg ip from any to any via table(if1) >>>>>>>>>> >>>>>>>>>> or even this: >>>>>>>>>> ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port >>>>>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>>>>>> >>>>>>>>>> all these changes fully preserve backward compatibility. >>>>>>>>>> (actually tables needs now to be created before use and their type needs >>>>>>>>>> to match with opcode used, but new ipfw(8) performs auto-creation >>>>>>>>>> for cidr tables). >>>>>>>>>> >>>>>>>>>> There is another thing I'm going to change and I'm not sure I can keep >>>>>>>>>> the same compatibility level. >>>>>>>>>> >>>>>>>>>> Table values, from one point of view, can be classified to the following >>>>>>>>>> types: >>>>>>>>>> >>>>>>>>>> - skipto argument >>>>>>>>>> - fwd argument (*) >>>>>>>>>> - link to another object (nat, pipe, queue) >>>>>>>>>> - plain u32 (not bound to any object) >>>>>>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>>>>>> >>>>>>>>>> There are the following reasons why I think it is necessary to implement >>>>>>>>>> explicit table values typing (like tables): >>>>>>>>>> - Implementing fwd tablearg for IPv6 hosts requires indirection table >>>>>>>>>> - Converting nat/pipe instance ids to names renders values unusable >>>>>>>>>> - retiring old hack with storing saved pointer of found object/rule >>>>>>>>>> inside rule w/o proper locking >>>>>>>>>> - making faster skipto >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ??????i don't buy the idea that you need typed arguments >>>>>>>>>> for all the cases above. 
Maybe the case that >>>>>>>>>> may make sense is the fwd argument (and in the future >>>>>>>>>> something else). >>>>>>>>>> We already discussed, i think, the fact that now it >>>>>>>>>> is legal to have references to non existing things >>>>>>>>>> (skipto, pipes etc.) implemented as u32. >>>>>>>>>> Removing that would break configurations. >>>>>>>>> It depends on actual implementation. This can be preserved by >>>>>>>>> auto-creating necessary objects in kernel and/or in userspace, so >>>>>>>>> we can (and should) avoid breaking in this particular way. >>>>>>>> Can you please explain your vision on values another time? >>>>>>>> As far as I understand, you're not against it in general, but the >>>>>>>> details matter: >>>>>>>> * IP address can be one of the types (it won't break much, and we can >>>>>>>> simply skip that one for MFC) >>>>>>>> * what about typing for nat/pipes ? we're not going to convert their ids >>>>>>>> to names? (or maybe you can suggest other non-disruptive way?) >>>>>>>> * everything else is type "u32" >>>>>>> >>>>>>> Correct, I am mostly concerned about the details, not on the general concept. >>>>>>> >>>>>>> To summarize the discussion Alexander and I had about converting >>>>>>> identifiers from numbers to arbitrary strings (this is partly related >>>>>>> to the values stored in tables, but I think we should have a coherent >>>>>>> behaviour) >>>>>>> >>>>>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits or less) >>>>>>> for rules, pipes, queues, tables, probably nat instances. >>>>>>> >>>>>>> 2. CURRENTLY, in all the above contexts, it is legal to reference a >>>>>>> non existing object (rule, pipe, table names, etc.), >>>>>>> and the kernel will do something reasonable, namely jump to the >>>>>>> next rule, drop traffic for non existing pipes, and so on. >>>>>>> >>>>>>> 3. of course we want to preserve backward compatibility both for >>>>>>> the ioctl interface, and for user configurations. >>>>>>> >>>>>>> 4. 
The in-kernel representation of identifiers is not visible to users, >>>>>>> so we can use a numeric representation in the kernel for identifiers. >>>>>>> Strings like "12345" are converted with atoi() or the like, >>>>>>> whereas for other identifiers or numbers outside of the 2^16 range >>>>>>> the kernel manages a translation table, allocating new numeric >>>>>>> identifiers if a new string appears. >>>>>>> This permits backward compatibility for old rulesets, and does not >>>>>>> impact performance because the translation table is only >>>>>>> used during rules additions or deletion. >>>>>> Yes. However this requires either holding either (1) 2 pointers (old&new >>>>>> arrays), or (2) 65k+ index array, or (3) chained hash table. >>>>>> (1) would require additional pointers for each subsystem (and some >>>>>> additional management), >>>>>> (2) will definitely upset embedded guys and >>>>>> (3) is worse in terms of performance >>>>>>> >>>>>>> With this in mind, i think we should follow a similar approach for >>>>>>> objects stored in tables, hence >>>>>>> >>>>>>> if an u32 value was available in the past, it must be >>>>>>> available also in the new implementation. >>>>>>> >>>>>>> The issue with tables is that some convoluted configuration could >>>>>>> use the same table to reference pipes _and_ rules _and_ perhaps >>>>>>> other things represented as numbers (the former is not too strange, >>>>>>> if i have a large configuration i might place sections at rules >>>>>>> 12000, 13000, 14000... and associate pipes with the same numberic >>>>>>> identifier to each block of rules). >>>>>>> >>>>>>> Typed table values would clearly disturb backward compatibility >>>>>>> in the above configurations. However it should not be difficult >>>>>>> to accept arbitrary strings as the values stored in tables, and >>>>>>> then store multiple representations as appropriate, including: >>>>>> Well, I've thought about thas one. 
It may be an option, but the details >>>>>> are not so promising (below) >>>>>>> - the string representation, unconditionally >>>>>>> - for names that can be resolved by DNS, the ipv6 and ipv4 address(es) >>>>>>> associated with them. ipfw already translates hostnames in rules >>>>>>> so this is POLA >>>>>> I'm not happy what ipfw(8) is doing instead of translation. The proper >>>>>> way would be not simply using first AF_INET answer but saving ALL >>>>>> IPv4+IPv6 records inside rule (and some more tracking should be done >>>>>> afterwards, but that's totally different story). Additionally, I'm >>>>>> unsure if we really need next-hop value expressed as hostname (how can >>>>>> we deal with multiple addresses and diffrent AFs?). We may store strings >>>>>> (and I think we should do it) but I'm unsure about this particular >>>>>> option of interpreting them. >>>>>>> - for other strings, a u32 from the translation table as previously >>>>>>> indicated >>>>>>> - and for numeric values, the u32 representation (truncated if needed, >>>>>>> according to whatever is the existing behaviour) >>>>>>> - >>>>>>> If we cannot generate an u32 we will put some value (e.g. 0) >>>>>>> that hopefully will not cause confusion. >>>>>> As far as I understand, we accept some string "s" as table value inside >>>>>> the kernel, than, we have some logic that says: >>>>>> oh, dummynet pipe has the same name "s"s, oh, nat entity with name "s" >>>>>> has just been created, let's save indices. >>>>>> >>>>>> That would require additional indirection table like: >>>>>> >>>>>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >>>>>> ( so we will have 2-level indirection table for fwd if we do IPv6) >>>>>> >>>>>> We can optimize this if we use "same name -> same kidx" approach >>>>>> regardless of kernel object we're refering to. That might require some >>>>>> more memory, but that's OK from my point of view. 
>>>>>> >>>>>> So we end up with >>>>>> int [ skipto idx | fwd idx | obj idx ] >>>>>> >>>>>> idx "0" is special value which means the same as 2.CURRENT >>>>>> >>>>>> That looks better, but still way to complex. >>>>>> I do care about compatibility, but it's hard to improve things without >>>>>> changing. >>>>>> >>>>>> I'd like to propose the following: >>>>>> * Split values into 3 types ("ip|nexthop", "number", "object") >>>>>> * Do not insist on object existence, use value "0" to mimic 2.CURRENT >>>>>> behavior. >>>>>> * Retain full compatibility by introducing special value type "legacy" >>>>>> which matches any type and is backed by given indirection table. >>>>>> * Issue warning in ipfw(8) binary on all auto-created tables that >>>>>> auto-creation is legacy and this behavior will be dropped in next major >>>>>> release (e.g. 11.0) >>>>>> * Save this behavior in MFC but drop "legacy" tables in head after a >>>>>> month after actual MFC. >>>>>> >>>>>> That do you think? >>>>>>> >>>>>>> If we do it this way, we should be able to preserve backward >>>>>>> compatibility _and_ add features that people may need. >>>>>>> >>>>>>> cheers >>>>>>> luigi >>>>>>> >>>>> Here is my idea: tablearg should contain more than one value. I think getting several values from one table lookup is faster than several table lookups with one value. >>>>> Let tablearg be not just uint32, but array with different value types inside it. >>>> There are some use cases where we might need 2-level value lookup (e.g. algo returning index for index table where actual data reside) and each data item can >>>> really be up to 64-bytes long. The problem is in actual partitioning and compatibility. >>>>> >>>>> For example I have many such rules: >>>>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip 1.1.1.1 >>>> Sorry, what task are you solving by using given rules? >>> Small ISP, clients have static IP with MAC-authorization. Src iface must be checked to prevent IP-spoofing. 
Dst-IP sometimes is used for p2p-channels. >>>>> >>>>> These rules can be replaced with such construction: >>>>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip tablearg[3] >>>>> >>>>> But I don't think indexing by value is a good idea. I think index==starting byte is a better way: >>>>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip tablearg:32 >>>>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are from 32 to 35. >>>> >>>>> So we need to create table for it: >>>>> table 1 set MAC:0 string:6:26 ip:32 >>>>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>>> >>>>> String can be used both for iface and comment. >>>>> Other possible value types: >>>>> uint16 for nat, pipe, skipto and other 2-bytes actions >>>>> IPv4 4 bytes >>>>> CIDRv4 5 bytes >>>>> IPv6 16 bytes >>>>> CIDRv6 17 bytes >>>>> table_id 2 bytes - link to another table >>>> Well, it seems we have enough space to store most of these, however, problems seem to remain the same: typing and compatibility. >>>> When you're creating new table (or it is auto-created) which values types should be assumed ? All of them? >>> Default - as usually uint32. >> I can't see "uint32" value in the list you have specified before. I'll rephrase: >> what value types (from the list above or similar) should ipfw(8) or kernel fill in case of "default" table? >> (And once again, what should we print as value) ? >> Please think about >> a) old ipfw binaries >> b) new ipfw binaries using exactly the same ruleset they are already using (with, for example, both "skipto tablearg" and "fwd tablearg " tables). At that time I meant default table "header" is "ip:0" (in my context). It would be completely compatible with old ipfw tables. > I've increased kernel<>userland 'struct tentry' value field to 64 bytes. > It looks like we were talking about a bit different things. 
> Let me try to explain the problem I'm stuck with:
>
> We may take the road you've suggested, it looks OK:
>
> * by default, tables are created with an "all-values" mask.
> * ipfw(8) treats the default "ipfw table X add Y val" input, where the value is a u32 number, as input data for each type specified in all-values, without returning an error.
> * for a non-default mask, the value data should be validated.
>
> E.g. if we have a table with valtype="skipto,nat,pipe,ip4,ip6" and "100" as input -> it turns into "100,100,0.0.0.0,::".

I don't fully understand. One "100" value for all valtypes? Then "100" can't be equal to "0.0.0.0" and "::". Or did you mean "100,100,0,0" as input?

> If we have a value with valtype="skipto,ip6" and "100" as input -> error, while a valid one would be "100,2a01::1:111", for example.
>
> I'm unsure how one should be able to update a _specific_ value (e.g. update nat id or skipto arg), but that's not the problem.

Maybe a new command would help, like "ipfw table X set Y newval".

>
> The problem arises if we start talking about using names for nat/pipe/queue ids instead of numbers.
> If we have nat instances "nat1", "11" and "23", and one specifies "44" as part of a value, the logic starts to get complex:
>
> we either require nat "44" to exist (and I'm unsure if we can auto-create it *) or start doing complex stuff like tracking all those non-existing objects:
> e.g. add some special record somewhere saying that we're waiting for nat instance "44" to be created, then auto-update the given value with its kernel index,
> then do something reasonable if the nat "44" instance is destroyed (OK, a nat instance can't be destroyed, but a pipe can).
> .. and we have to do the same for pipes/queues and any following kernel object.
>
> Or we have to require the user to reference existing objects only (create explicitly before use). This one makes things easier in code, but requires users to change their scripts.
> It looks like there is no consensus on that point.

A user can destroy an object after the table is created.
I think it should work this way: "no object - no packet (explicitly deny)". No need to check object existence.

>
> * Maybe auto-creation is not so tricky and we should try to evaluate it..
>
>>
>>
>>>> What should `ipfw table X list` show as "value" field ?
>>> I added table "header" in this line:
>>> table 1 set MAC:0 string:6:26 ip:32
>> I don't think that user should be able to set any offsets in userland. Exact offsets of variables of a given type need to be enforced by the kernel,
>> so you may specify that you want "mac" and "ip" as values for a given table, but not lengths or offsets.

Does your way allow using strings (e.g. iface or comments)?

>>> So `ipfw table X list` should show something like this:
>>> ---table(0)---
>>> 1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1
>>> We can also add "header" description in output (with or without additional parameter - depends on compatibility needs) like this:
>>> ---table(0)--- addr MAC iface IPv4
>>>> How should ipfw(8) treat "add 1.1.1.1 0" input?
>>> It should look at table "header" and return error message like "Value doesn't match table header"
>>
>>>> What will happen if we want to add another type field to this list? (MAC address of Infiniband MAC address, for example).
>>> I don't think it makes sense to mix both MAC[6] and MAC[20] values in 1 table. It is easier to create 2 tables with different "headers".
>>> For Infiniband we can add another type: MAC20 (or something like this). Or we can use "MAC"-type like string type(see above): MAC:6:25 (1st and last bytes, or 1st and length).
>>>>
>>>>>
>>>>> Table value length can be set for example with loader tunable like net.inet.ip.fw.table_value_length.
>>>>> Even with default uint32 value length we can get 2 uint16 values or 4 uint8 values, this can help in some configurations.
>>>>>
>>>>> This way is more complex, but much more flexible. It's like netgraph subsystem.
>>>>> I think it suits both Alexander and Luigi requests.
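
The byte-offset "header" quoted above (`table 1 set MAC:0 string:6:26 ip:32`) can be pictured as a fixed-size value buffer. The following Python sketch is only an illustration of that proposal, not kernel or ipfw(8) code. The function names pack_value and unpack_value are hypothetical, the 64-byte length is borrowed from the 'struct tentry' value size mentioned in this thread, and reading "string:6:26" as "start at byte 6, at most 25 characters plus a terminating NUL" is an assumption.

```python
import ipaddress

VALUE_LEN = 64                       # assumed size of the table value field
MAC_OFF, STR_OFF, STR_MAX, IP_OFF = 0, 6, 26, 32   # MAC:0 string:6:26 ip:32


def pack_value(mac, iface, ip4):
    """Lay out MAC, interface name and IPv4 address at fixed byte offsets."""
    buf = bytearray(VALUE_LEN)       # zero-filled, so strings end in NUL

    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    buf[MAC_OFF:MAC_OFF + 6] = mac_bytes

    name = iface.encode("ascii")
    if len(name) >= STR_MAX:         # leave room for the terminating NUL
        raise ValueError("string must be shorter than 26 bytes")
    buf[STR_OFF:STR_OFF + len(name)] = name

    buf[IP_OFF:IP_OFF + 4] = ipaddress.IPv4Address(ip4).packed
    return bytes(buf)


def unpack_value(buf):
    """Read the three fields back from their offsets."""
    mac = ":".join(f"{b:02x}" for b in buf[MAC_OFF:MAC_OFF + 6])
    raw = buf[STR_OFF:STR_OFF + STR_MAX]
    iface = raw.split(b"\0", 1)[0].decode("ascii")
    ip4 = str(ipaddress.IPv4Address(buf[IP_OFF:IP_OFF + 4]))
    return mac, iface, ip4
```

For the entry from the thread, pack_value("11:22:33:44:55:66", "vlan1234", "1.1.1.1") returns 64 bytes with the MAC in bytes 0-5, the NUL-terminated name from byte 6, and the address in bytes 32-35, and unpack_value returns the same three strings. A rule using `tablearg:6` would then read the interface name from that offset.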
>>>>> >>>>> From owner-freebsd-net@FreeBSD.ORG Tue Aug 19 17:36:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 04A4ED55; Tue, 19 Aug 2014 17:36:47 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7AFBC3AB6; Tue, 19 Aug 2014 17:36:46 +0000 (UTC) Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f] (helo=ptichko.yndx.net) by mail.ipfw.ru with esmtpsa (TLSv1:DHE-RSA-AES128-SHA:128) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1XJjNF-0002IO-8Y; Tue, 19 Aug 2014 17:22:53 +0400 Message-ID: <53F38B18.60409@FreeBSD.org> Date: Tue, 19 Aug 2014 21:36:24 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Dmitry Selivanov Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> <53EE16DE.9020209@rlan.ru> <53EE252D.10109@FreeBSD.org> <53F3563D.6020107@FreeBSD.org> <53F3760E.9070206@rlan.ru> In-Reply-To: <53F3760E.9070206@rlan.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Aug 2014 17:36:47 -0000 On 19.08.2014 20:06, Dmitry Selivanov wrote: > 19.08.2014 
17:50, Alexander V. Chernikov пишет: >> On 15.08.2014 19:20, Alexander V. Chernikov wrote: >>> On 15.08.2014 18:19, Dmitry Selivanov wrote: >>>> 15.08.2014 17:25, Alexander V. Chernikov пишет: >>>>> On 08.08.2014 16:11, Dmitry Selivanov wrote: >>>>>> 04.08.2014 23:51, Alexander V. Chernikov пишет: >>>>>>> On 04.08.2014 15:58, Luigi Rizzo wrote: >>>>>>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. >>>>>>>> Chernikov wrote: >>>>>>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>>>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov >>>>>>>>>>> > wrote: >>>>>>>>>>> >>>>>>>>>>> Hello all. >>>>>>>>>>> >>>>>>>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>>>>>>> The most notable (and user-visible) change is named >>>>>>>>>>> table support. >>>>>>>>>>> The other one is support for different lookup >>>>>>>>>>> algorithms for different >>>>>>>>>>> key types. >>>>>>>>>>> >>>>>>>>>>> For example, new ipfw permits writing this: >>>>>>>>>>> >>>>>>>>>>> ipfw table tb1 create type cidr >>>>>>>>>>> ipfw add allow ip from table(tl1) to any >>>>>>>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>>>>>>> >>>>>>>>>>> ipfw table if1 create type iface >>>>>>>>>>> ipfw add skipto tablearg ip from any to any via >>>>>>>>>>> table(if1) >>>>>>>>>>> >>>>>>>>>>> or even this: >>>>>>>>>>> ipfw table fl1 create type >>>>>>>>>>> flow:src-ip,proto,dst-ip,dst-port >>>>>>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>>>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>>>>>>> >>>>>>>>>>> all these changes fully preserve backward compatibility. >>>>>>>>>>> (actually tables needs now to be created before use >>>>>>>>>>> and their type needs >>>>>>>>>>> to match with opcode used, but new ipfw(8) performs >>>>>>>>>>> auto-creation >>>>>>>>>>> for cidr tables). 
>>>>>>>>>>> >>>>>>>>>>> There is another thing I'm going to change and I'm not >>>>>>>>>>> sure I can keep >>>>>>>>>>> the same compatibility level. >>>>>>>>>>> >>>>>>>>>>> Table values, from one point of view, can be >>>>>>>>>>> classified to the following >>>>>>>>>>> types: >>>>>>>>>>> >>>>>>>>>>> - skipto argument >>>>>>>>>>> - fwd argument (*) >>>>>>>>>>> - link to another object (nat, pipe, queue) >>>>>>>>>>> - plain u32 (not bound to any object) >>>>>>>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>>>>>>> >>>>>>>>>>> There are the following reasons why I think it is >>>>>>>>>>> necessary to implement >>>>>>>>>>> explicit table values typing (like tables): >>>>>>>>>>> - Implementing fwd tablearg for IPv6 hosts requires >>>>>>>>>>> indirection table >>>>>>>>>>> - Converting nat/pipe instance ids to names renders >>>>>>>>>>> values unusable >>>>>>>>>>> - retiring old hack with storing saved pointer of >>>>>>>>>>> found object/rule >>>>>>>>>>> inside rule w/o proper locking >>>>>>>>>>> - making faster skipto >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ??????i don't buy the idea that you need typed arguments >>>>>>>>>>> for all the cases above. Maybe the case that >>>>>>>>>>> may make sense is the fwd argument (and in the future >>>>>>>>>>> something else). >>>>>>>>>>> We already discussed, i think, the fact that now it >>>>>>>>>>> is legal to have references to non existing things >>>>>>>>>>> (skipto, pipes etc.) implemented as u32. >>>>>>>>>>> Removing that would break configurations. >>>>>>>>>> It depends on actual implementation. This can be preserved by >>>>>>>>>> auto-creating necessary objects in kernel and/or in >>>>>>>>>> userspace, so >>>>>>>>>> we can (and should) avoid breaking in this particular way. >>>>>>>>> Can you please explain your vision on values another time? 
>>>>>>>>> As far as I understand, you're not against it in general, but the >>>>>>>>> details matter: >>>>>>>>> * IP address can be one of the types (it won't break much, and >>>>>>>>> we can >>>>>>>>> simply skip that one for MFC) >>>>>>>>> * what about typing for nat/pipes ? we're not going to convert >>>>>>>>> their ids >>>>>>>>> to names? (or maybe you can suggest other non-disruptive way?) >>>>>>>>> * everything else is type "u32" >>>>>>>> >>>>>>>> Correct, I am mostly concerned about the details, not on the >>>>>>>> general concept. >>>>>>>> >>>>>>>> To summarize the discussion Alexander and I had about converting >>>>>>>> identifiers from numbers to arbitrary strings (this is partly >>>>>>>> related >>>>>>>> to the values stored in tables, but I think we should have a >>>>>>>> coherent >>>>>>>> behaviour) >>>>>>>> >>>>>>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 >>>>>>>> bits or less) >>>>>>>> for rules, pipes, queues, tables, probably nat instances. >>>>>>>> >>>>>>>> 2. CURRENTLY, in all the above contexts, it is legal to >>>>>>>> reference a >>>>>>>> non existing object (rule, pipe, table names, etc.), >>>>>>>> and the kernel will do something reasonable, namely jump to >>>>>>>> the >>>>>>>> next rule, drop traffic for non existing pipes, and so on. >>>>>>>> >>>>>>>> 3. of course we want to preserve backward compatibility both for >>>>>>>> the ioctl interface, and for user configurations. >>>>>>>> >>>>>>>> 4. The in-kernel representation of identifiers is not visible >>>>>>>> to users, >>>>>>>> so we can use a numeric representation in the kernel for >>>>>>>> identifiers. >>>>>>>> Strings like "12345" are converted with atoi() or the like, >>>>>>>> whereas for other identifiers or numbers outside of the >>>>>>>> 2^16 range >>>>>>>> the kernel manages a translation table, allocating new numeric >>>>>>>> identifiers if a new string appears. 
>>>>>>>> This permits backward compatibility for old rulesets, and >>>>>>>> does not >>>>>>>> impact performance because the translation table is only >>>>>>>> used during rules additions or deletion. >>>>>>> Yes. However this requires either holding either (1) 2 pointers >>>>>>> (old&new >>>>>>> arrays), or (2) 65k+ index array, or (3) chained hash table. >>>>>>> (1) would require additional pointers for each subsystem (and some >>>>>>> additional management), >>>>>>> (2) will definitely upset embedded guys and >>>>>>> (3) is worse in terms of performance >>>>>>>> >>>>>>>> With this in mind, i think we should follow a similar approach for >>>>>>>> objects stored in tables, hence >>>>>>>> >>>>>>>> if an u32 value was available in the past, it must be >>>>>>>> available also in the new implementation. >>>>>>>> >>>>>>>> The issue with tables is that some convoluted configuration could >>>>>>>> use the same table to reference pipes _and_ rules _and_ perhaps >>>>>>>> other things represented as numbers (the former is not too >>>>>>>> strange, >>>>>>>> if i have a large configuration i might place sections at rules >>>>>>>> 12000, 13000, 14000... and associate pipes with the same numberic >>>>>>>> identifier to each block of rules). >>>>>>>> >>>>>>>> Typed table values would clearly disturb backward compatibility >>>>>>>> in the above configurations. However it should not be difficult >>>>>>>> to accept arbitrary strings as the values stored in tables, and >>>>>>>> then store multiple representations as appropriate, including: >>>>>>> Well, I've thought about thas one. It may be an option, but the >>>>>>> details >>>>>>> are not so promising (below) >>>>>>>> - the string representation, unconditionally >>>>>>>> - for names that can be resolved by DNS, the ipv6 and ipv4 >>>>>>>> address(es) >>>>>>>> associated with them. ipfw already translates hostnames in >>>>>>>> rules >>>>>>>> so this is POLA >>>>>>> I'm not happy what ipfw(8) is doing instead of translation. 
The >>>>>>> proper >>>>>>> way would be not simply using first AF_INET answer but saving ALL >>>>>>> IPv4+IPv6 records inside rule (and some more tracking should be >>>>>>> done >>>>>>> afterwards, but that's totally different story). Additionally, I'm >>>>>>> unsure if we really need next-hop value expressed as hostname >>>>>>> (how can >>>>>>> we deal with multiple addresses and diffrent AFs?). We may store >>>>>>> strings >>>>>>> (and I think we should do it) but I'm unsure about this particular >>>>>>> option of interpreting them. >>>>>>>> - for other strings, a u32 from the translation table as >>>>>>>> previously >>>>>>>> indicated >>>>>>>> - and for numeric values, the u32 representation (truncated if >>>>>>>> needed, >>>>>>>> according to whatever is the existing behaviour) >>>>>>>> - >>>>>>>> If we cannot generate an u32 we will put some value (e.g. 0) >>>>>>>> that hopefully will not cause confusion. >>>>>>> As far as I understand, we accept some string "s" as table value >>>>>>> inside >>>>>>> the kernel, than, we have some logic that says: >>>>>>> oh, dummynet pipe has the same name "s"s, oh, nat entity with >>>>>>> name "s" >>>>>>> has just been created, let's save indices. >>>>>>> >>>>>>> That would require additional indirection table like: >>>>>>> >>>>>>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >>>>>>> ( so we will have 2-level indirection table for fwd if we do IPv6) >>>>>>> >>>>>>> We can optimize this if we use "same name -> same kidx" approach >>>>>>> regardless of kernel object we're refering to. That might >>>>>>> require some >>>>>>> more memory, but that's OK from my point of view. >>>>>>> >>>>>>> So we end up with >>>>>>> int [ skipto idx | fwd idx | obj idx ] >>>>>>> >>>>>>> idx "0" is special value which means the same as 2.CURRENT >>>>>>> >>>>>>> That looks better, but still way to complex. >>>>>>> I do care about compatibility, but it's hard to improve things >>>>>>> without >>>>>>> changing. 
>>>>>>> >>>>>>> I'd like to propose the following: >>>>>>> * Split values into 3 types ("ip|nexthop", "number", "object") >>>>>>> * Do not insist on object existence, use value "0" to mimic >>>>>>> 2.CURRENT >>>>>>> behavior. >>>>>>> * Retain full compatibility by introducing special value type >>>>>>> "legacy" >>>>>>> which matches any type and is backed by given indirection table. >>>>>>> * Issue warning in ipfw(8) binary on all auto-created tables that >>>>>>> auto-creation is legacy and this behavior will be dropped in >>>>>>> next major >>>>>>> release (e.g. 11.0) >>>>>>> * Save this behavior in MFC but drop "legacy" tables in head >>>>>>> after a >>>>>>> month after actual MFC. >>>>>>> >>>>>>> That do you think? >>>>>>>> >>>>>>>> If we do it this way, we should be able to preserve backward >>>>>>>> compatibility _and_ add features that people may need. >>>>>>>> >>>>>>>> cheers >>>>>>>> luigi >>>>>>>> >>>>>> Here is my idea: tablearg should contain more than one value. I >>>>>> think getting several values from one table lookup is faster than >>>>>> several table lookups with one value. >>>>>> Let tablearg be not just uint32, but array with different value >>>>>> types inside it. >>>>> There are some use cases where we might need 2-level value lookup >>>>> (e.g. algo returning index for index table where actual data >>>>> reside) and each data item can >>>>> really be up to 64-bytes long. The problem is in actual >>>>> partitioning and compatibility. >>>>>> >>>>>> For example I have many such rules: >>>>>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 >>>>>> dst-ip 1.1.1.1 >>>>> Sorry, what task are you solving by using given rules? >>>> Small ISP, clients have static IP with MAC-authorization. Src iface >>>> must be checked to prevent IP-spoofing. Dst-IP sometimes is used >>>> for p2p-channels. 
>>>>>> >>>>>> These rules can be replaced with such construction: >>>>>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip >>>>>> tablearg[3] >>>>>> >>>>>> But I don't think indexing by value is a good idea. I think >>>>>> index==starting byte is a better way: >>>>>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip >>>>>> tablearg:32 >>>>>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is >>>>>> from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes >>>>>> are from 32 to 35. >>>>> >>>>>> So we need to create table for it: >>>>>> table 1 set MAC:0 string:6:26 ip:32 >>>>>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>>>> >>>>>> String can be used both for iface and comment. >>>>>> Other possible value types: >>>>>> uint16 for nat, pipe, skipto and other 2-bytes actions >>>>>> IPv4 4 bytes >>>>>> CIDRv4 5 bytes >>>>>> IPv6 16 bytes >>>>>> CIDRv6 17 bytes >>>>>> table_id 2 bytes - link to another table >>>>> Well, it seems we have enough space to store most of these, >>>>> however, problems seem to remain the same: typing and compatibility. >>>>> When you're creating new table (or it is auto-created) which >>>>> values types should be assumed ? All of them? >>>> Default - as usually uint32. >>> I can't see "uint32" value in the list you have specified before. >>> I'll rephrase: >>> what value types (from the list above or similar) should ipfw(8) or >>> kernel fill in case of "default" table? >>> (And once again, what should we print as value) ? >>> Please think about >>> a) old ipfw binaries >>> b) new ipfw binaries using exactly the same ruleset they are already >>> using (with, for example, both "skipto tablearg" and "fwd tablearg " >>> tables). > At that time I meant default table "header" is "ip:0" (in my context). > It would be completely compatible with old ipfw tables. > >> I've increased kernel<>userland 'struct tentry' value field to 64 bytes. 
>> It looks like we were talking about a bit different things. >> Let me try to explain the problem I'm stuck with: >> >> We may take the road you've suggested, it looks OK: >> >> * by default tables are created with "all-values" mask. >> * ipfw(8) value treats default "ipfw table X add Y val" input where >> value is u32 number as input data for each type specified in >> all-values without returning error >> * for non-default mask value data should be validated. >> >> e.g. if we have table with valtype="skipto,nat,pipe,ip4,ip6" and >> "100" as input -> it turns to "100,100,0.0.0.0,::". > I don't fully understand. One "100" value for all valtypes? Then "100" > can't be equal "0.0.0.0" and "::". Or you meant "100,100,0,0" as input? We have to handle the case when user with _unmodified_ scripts tries to use new ipfw (either with new binary or the old one). The goal is not to throw error and break everything, of course. > >> If we have value with valtype="skipto,ip6" and "100" as input -> >> error while the valid one would be "100,2a01::1:111", for example. >> >> I'm unsure how should one be able to update _specific_ value (e.g. >> update nat id or skipto arg), but that's not the problem. > Maybe new command would help, like "ipfw table X set Y newval". > >> >> The problem arises if we start talking about using names for >> nat/pipe/queue ids instead of numbers. >> If we have nat instances "nat1", "11" and "23", and one specifies >> "44" as part of value, logic starts to be complex: >> >> we either require nat "44" to exists (and I'm unsure if we can >> auto-create it *) or start doing complex stuff like tracking all >> those non-existing objects: >> e.g. add some special record somewhere that we're wating for nat >> instance "44" to be created, than auto-update given value with its >> kernel index, >> than, do something reasonable if nat "44" instance is destroyed (OK, >> nat instance can't be destroyed, but pipe can). >> .. 
and we have to do the same for pipes/queues and any following kernel object.
>>
>> Or we have to require user to reference existing objects only (create explicitly before use). This one makes things easier in code, but require user to change their scripts.
>> It looks like there is no consensus on that point.
> User can destroy object after table creating. I think this way: "no object - no packet (explicitly deny)". No need to check object existence.

Yes, but even this behavior has to be supported by the kernel.
Let me explain in more detail:
the user calls -> ipfw nat "23" iface ...
The kernel sees the string "23", which is not the name of any existing nat instance, so it creates one and allocates a new kernel index for it (let it be 1).
The same for "nat1" -> 2 and "11" -> 3. Kernel indexes are purely internal and cannot be referenced by userland.

So, when you enter "44" inside a new value, the following happens:
1) some special object binding the name "44" to the value of record X is created
2) the nat instance list is searched to see if "44" is an existing name. If an entry is found, its kernel index is saved to "value"; 0 is saved otherwise.
3) If the nat entry is destroyed, we have to walk all entries and set their appropriate parts back to 0 (otherwise some other entry may use this index later, leading to packets being aliased to another nat instance. The "show" command would print incorrect values, too).

This can be done (and we have to write code for each type of kernel object, e.g. one for nat, one for pipe/queue, etc.), but it requires a lot of code which we would have to support forever.
I like the idea of enforcing hard bindings (with, maybe, some intermediate period of compatible behavior for MFC).
>>
>> * Maybe auto-creation is not so tricky and we should try to evaluate it..
>>
>>>
>>>
>>>>> What should `ipfw table X list` show as "value" field ?
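To make the bookkeeping in steps 1) to 3) above more concrete, here is a minimal userland sketch. It is only an illustration under assumptions, not the ipfw kernel code: the names nat_create, nat_destroy, value_add and the fixed-size arrays are hypothetical, and kernel indexes are simply handed out in increasing order.

```c
#include <stdint.h>
#include <string.h>

#define MAX_NAT  8   /* hypothetical limits, for the sketch only */
#define MAX_VAL  8
#define NAME_LEN 16

struct nat_inst { char name[NAME_LEN]; uint16_t kidx; };           /* kidx 0 = free slot */
struct val_bind { char name[NAME_LEN]; uint16_t kidx; int used; }; /* kidx 0 = unbound */

static struct nat_inst nats[MAX_NAT];
static struct val_bind vals[MAX_VAL];
static uint16_t next_kidx = 1;

/* Return the kernel index of the named instance, or 0 if it does not exist. */
static uint16_t
nat_lookup(const char *name)
{
	for (int i = 0; i < MAX_NAT; i++)
		if (nats[i].kidx != 0 && strcmp(nats[i].name, name) == 0)
			return (nats[i].kidx);
	return (0);
}

/* Steps 1 and 2: remember the name and cache the index (0 if no such object yet). */
static int
value_add(const char *name)
{
	for (int i = 0; i < MAX_VAL; i++) {
		if (vals[i].used)
			continue;
		vals[i].used = 1;
		strncpy(vals[i].name, name, NAME_LEN - 1);
		vals[i].name[NAME_LEN - 1] = '\0';
		vals[i].kidx = nat_lookup(name);
		return (i);
	}
	return (-1);
}

/* Create an instance and bind every value that was waiting for this name. */
static uint16_t
nat_create(const char *name)
{
	if (nat_lookup(name) != 0)
		return (0);		/* name already in use */
	for (int i = 0; i < MAX_NAT; i++) {
		if (nats[i].kidx != 0)
			continue;
		strncpy(nats[i].name, name, NAME_LEN - 1);
		nats[i].name[NAME_LEN - 1] = '\0';
		nats[i].kidx = next_kidx++;
		for (int j = 0; j < MAX_VAL; j++)
			if (vals[j].used && strcmp(vals[j].name, name) == 0)
				vals[j].kidx = nats[i].kidx;
		return (nats[i].kidx);
	}
	return (0);			/* no free slot */
}

/* Step 3: on destroy, walk all values and reset stale indexes to 0. */
static void
nat_destroy(const char *name)
{
	uint16_t kidx = nat_lookup(name);

	if (kidx == 0)
		return;
	for (int i = 0; i < MAX_NAT; i++)
		if (nats[i].kidx == kidx)
			nats[i].kidx = 0;
	for (int j = 0; j < MAX_VAL; j++)
		if (vals[j].used && vals[j].kidx == kidx)
			vals[j].kidx = 0;
}
```

With the instances from the example ("23", "nat1", "11"), a value naming "44" stays at index 0 until a nat instance with that name is created. Every kind of kernel object (nat, pipe, queue) would need its own copy of this create/destroy walk, which is the maintenance cost mentioned above.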
>>>> I added table "header" in this line: >>>> table 1 set MAC:0 string:6:26 ip:32 >>> I don't think that user should be able to set any offsets in >>> userland. Exact offsets of variable of given type needs to be >>> enforced by kernel, >>> so you may fill that you want "mac" and "ip" as values for given >>> table, but not lengths or offsets. > Does your way allow to use strings (e.g. iface or comments)? I'm not sure on what you're going to do with interfaces as values. Comments - per value or per table entry? I can think of it, but probably not all algorithms will support that functionality. > >>>> So `ipfw table X list` should show something like this: >>>> ---table(0)--- >>>> 1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>> We can also add "header" description in output (with or without >>>> additional parameter - depends on compatibility needs) like this: >>>> ---table(0)--- addr MAC iface IPv4 >>>>> How should ipfw(8) treat "add 1.1.1.1 0" input? >>>> It should look at table "header" and return error message like >>>> "Value doesn't match table header" >>> >>>>> What will happen if we want to add another type field to this >>>>> list? (MAC address of Infiniband MAC address, for example). >>>> I don't think there is a sense to mix both MAC[6] and MAC[20] >>>> values in 1 table. It is easier to create 2 tables with different >>>> "headers". >>>> For Infiniband we can add another type: MAC20 (or something like >>>> this). Or we can use "MAC"-type like string type(see above): >>>> MAC:6:25 (1st and last bytes, or 1st and length). >>>>> >>>>>> >>>>>> Table value length can be set for example with loader tunable >>>>>> like net.inet.ip.fw.table_value_length. >>>>>> Even with default uint32 value length we can get 2 uint16 values >>>>>> or 4 uint8 values, this can help in some configurations. >>>>>> >>>>>> This way is more complex, but much more flexible. It's like >>>>>> netgraph subsystem. >>>>>> I think it suites both Alexander and Luigi requests. 
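For reference, the byte layout behind the quoted header "MAC:0 string:6:26 ip:32" can be sketched as below. This is a hypothetical illustration of the proposal only, assuming the 64-byte value field mentioned earlier; the macro and function names are made up for the sketch and do not exist in ipfw.

```c
#include <stdint.h>
#include <string.h>

#define VAL_LEN   64	/* assumed size of the table value field */
#define OFF_MAC   0	/* MAC:0        -> bytes 0..5 */
#define OFF_IFACE 6	/* string:6:26  -> bytes 6..31, NUL-terminated */
#define LEN_IFACE 26
#define OFF_IP4   32	/* ip:32        -> bytes 32..35 */

/* Fill one table value from its three typed parts. */
static void
value_pack(uint8_t val[VAL_LEN], const uint8_t mac[6], const char *iface,
    const uint8_t ip4[4])
{
	memset(val, 0, VAL_LEN);
	memcpy(val + OFF_MAC, mac, 6);
	/* copy at most 25 characters so the string always ends in NUL */
	strncpy((char *)val + OFF_IFACE, iface, LEN_IFACE - 1);
	memcpy(val + OFF_IP4, ip4, 4);
}

/* "recv tablearg:6" would compare the packet's interface name at offset 6. */
static int
value_match_iface(const uint8_t val[VAL_LEN], const char *iface)
{
	return (strncmp((const char *)val + OFF_IFACE, iface, LEN_IFACE) == 0);
}

/* "MAC any tablearg:0" would compare the source MAC at offset 0. */
static int
value_match_mac(const uint8_t val[VAL_LEN], const uint8_t mac[6])
{
	return (memcmp(val + OFF_MAC, mac, 6) == 0);
}
```

Packing the entry "11:22:33:44:55:66 vlan1234 1.1.1.1" this way uses bytes 0 to 35 and leaves the remainder zeroed. Whether userland should be allowed to choose these offsets at all is exactly the point under discussion in this thread.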
>>>>>> >>>>>> > > From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 07:34:28 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6F250392; Wed, 20 Aug 2014 07:34:28 +0000 (UTC) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 17C67390A; Wed, 20 Aug 2014 07:34:28 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 373B51FE027; Wed, 20 Aug 2014 09:34:26 +0200 (CEST) Message-ID: <53F44F91.2060006@selasky.org> Date: Wed, 20 Aug 2014 09:34:41 +0200 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org, FreeBSD Current Subject: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> In-Reply-To: <20140709163146.GA21731@ox> Content-Type: multipart/mixed; boundary="------------080406030702000505070708" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 07:34:28 -0000 This is a multi-part message in MIME format. 
--------------080406030702000505070708 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit

Hi,

A month has passed since the last e-mail on this topic, and in the meantime some new patches have been created and tested.

Basically the approach has been changed a little bit:

- The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings as well, and not only native TCP applications. This should be one more reason to get the feature into the kernel.

- A hardware transmit ring can basically have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit.

- The MBUF has a new field in the packet header: "txringid"

- IOCTLs for TCP v4 and v6 sockets have been updated to allow setting of the "txringid" field in the mbuf.

The current patch [see attachment] should be much simpler and less intrusive than the previous one.

Any comments ?
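To make the two ring modes a bit more concrete, here is a rough sketch of how a driver might pick the amount of data to send per interval. It is not part of the attached patch. It assumes that micros_per_interval is the time within which the TX queue should be empty, and the struct and helper names (ratectl_params, ratectl_pick_bytes, ratectl_bytes_per_sec) are hypothetical; only the field names and mode values mirror struct in_ratectlreq.

```c
#include <assert.h>
#include <stdint.h>

/* Mode values as defined for struct in_ratectlreq in the patch. */
#define IN_RATECTLREQ_MODE_FIXED	0	/* min rate = max rate */
#define IN_RATECTLREQ_MODE_AUTOMATIC	1	/* bounded by min/max */

/* Hypothetical, trimmed-down copy of the request (no ifname, no ring ID). */
struct ratectl_params {
	uint32_t min_bytes_per_interval;
	uint32_t max_bytes_per_interval;
	uint32_t micros_per_interval;
	uint32_t mode;
};

/*
 * Pick how many bytes to send in the next interval. In automatic mode the
 * whole queue should drain within one interval, clamped to [min, max].
 */
static uint32_t
ratectl_pick_bytes(const struct ratectl_params *p, uint64_t queued_bytes)
{
	assert(p->min_bytes_per_interval <= p->max_bytes_per_interval);

	if (p->mode == IN_RATECTLREQ_MODE_FIXED)
		return (p->max_bytes_per_interval);
	if (queued_bytes < p->min_bytes_per_interval)
		return (p->min_bytes_per_interval);
	if (queued_bytes > p->max_bytes_per_interval)
		return (p->max_bytes_per_interval);
	return ((uint32_t)queued_bytes);
}

/* Convert a per-interval budget into bytes per second. */
static uint64_t
ratectl_bytes_per_sec(const struct ratectl_params *p, uint32_t bytes_per_interval)
{
	if (p->micros_per_interval == 0)
		return (0);
	return ((uint64_t)bytes_per_interval * 1000000 / p->micros_per_interval);
}
```

For example, with a 1000 microsecond interval and limits of 1000 and 10000 bytes, a queue of 5000 bytes would be sent at 5000 bytes per interval, which is 5000000 bytes per second. How a real driver maps this onto its hardware is left open here.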
--HPS --------------080406030702000505070708 Content-Type: text/x-diff; name="net_ratectl.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="net_ratectl.diff" === sys/net/if.h ================================================================== --- sys/net/if.h (revision 270138) +++ sys/net/if.h (local) @@ -239,6 +239,7 @@ #define IFCAP_RXCSUM_IPV6 0x200000 /* can offload checksum on IPv6 RX */ #define IFCAP_TXCSUM_IPV6 0x400000 /* can offload checksum on IPv6 TX */ #define IFCAP_HWSTATS 0x800000 /* manages counters internally */ +#define IFCAP_HWTXRINGS 0x1000000 /* hardware supports TX rings */ #define IFCAP_HWCSUM_IPV6 (IFCAP_RXCSUM_IPV6 | IFCAP_TXCSUM_IPV6) === sys/netinet/in.c ================================================================== --- sys/netinet/in.c (revision 270138) +++ sys/netinet/in.c (local) @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -201,9 +202,23 @@ struct in_ifaddr *ia; int error; - if (ifp == NULL) - return (EADDRNOTAVAIL); + if (ifp == NULL) { + struct inpcb *inp; + switch (cmd) { + case SIOCSTXRINGID: + inp = sotoinpcb(so); + if (inp == NULL) + return (EINVAL); + INP_WLOCK(inp); + inp->inp_txringid = *(unsigned *)data; + INP_WUNLOCK(inp); + return (0); + default: + return (EADDRNOTAVAIL); + } + } + /* * Filter out 4 ioctls we implement directly. Forward the rest * to specific functions and ifp->if_ioctl(). 
=== sys/netinet/in_pcb.h ================================================================== --- sys/netinet/in_pcb.h (revision 270138) +++ sys/netinet/in_pcb.h (local) @@ -46,6 +46,7 @@ #ifdef _KERNEL #include #include +#include #include #include #endif @@ -177,7 +178,8 @@ u_char inp_ip_ttl; /* (i) time to live proto */ u_char inp_ip_p; /* (c) protocol proto */ u_char inp_ip_minttl; /* (i) minimum TTL or drop */ - uint32_t inp_flowid; /* (x) flow id / queue id */ + m_flowid_t inp_flowid; /* (x) flow ID */ + m_txringid_t inp_txringid; /* (x) transmit ring ID */ u_int inp_refcount; /* (i) refcount */ void *inp_pspare[5]; /* (x) route caching / general use */ uint32_t inp_flowtype; /* (x) M_HASHTYPE value */ === sys/netinet/in_var.h ================================================================== --- sys/netinet/in_var.h (revision 270138) +++ sys/netinet/in_var.h (local) @@ -33,6 +33,7 @@ #ifndef _NETINET_IN_VAR_H_ #define _NETINET_IN_VAR_H_ +#include #include #include #include @@ -81,6 +82,18 @@ struct sockaddr_in ifra_mask; int ifra_vhid; }; + +struct in_ratectlreq { + char ifreq_name[IFNAMSIZ]; + m_txringid_t tx_ring_id; + uint32_t min_bytes_per_interval; + uint32_t max_bytes_per_interval; + uint32_t micros_per_interval; + uint32_t mode; +#define IN_RATECTLREQ_MODE_FIXED 0 /* min rate = max rate */ +#define IN_RATECTLREQ_MODE_AUTOMATIC 1 /* bounded by min/max */ +}; + /* * Given a pointer to an in_ifaddr (ifaddr), * return a pointer to the addr as a sockaddr_in. 
=== sys/netinet/ip_output.c ================================================================== --- sys/netinet/ip_output.c (revision 270138) +++ sys/netinet/ip_output.c (local) @@ -145,6 +145,7 @@ if (inp != NULL) { INP_LOCK_ASSERT(inp); M_SETFIB(m, inp->inp_inc.inc_fibnum); + m->m_pkthdr.txringid = inp->inp_txringid; if (inp->inp_flags & (INP_HW_FLOWID|INP_SW_FLOWID)) { m->m_pkthdr.flowid = inp->inp_flowid; M_HASHTYPE_SET(m, inp->inp_flowtype); === sys/netinet6/in6.c ================================================================== --- sys/netinet6/in6.c (revision 270138) +++ sys/netinet6/in6.c (local) @@ -235,6 +235,23 @@ int error; u_long ocmd = cmd; + if (ifp == NULL) { + struct inpcb *inp; + + switch (cmd) { + case SIOCSTXRINGID: + inp = sotoinpcb(so); + if (inp == NULL) + return (EINVAL); + INP_WLOCK(inp); + inp->inp_txringid = *(unsigned *)data; + INP_WUNLOCK(inp); + return (0); + default: + break; + } + } + /* * Compat to make pre-10.x ifconfig(8) operable. */ === sys/sys/mbuf.h ================================================================== --- sys/sys/mbuf.h (revision 270138) +++ sys/sys/mbuf.h (local) @@ -114,6 +114,10 @@ void (*m_tag_free)(struct m_tag *); }; +typedef uint32_t m_flowid_t; +typedef uint32_t m_txringid_t; +#define M_TXRINGID_UNDEFINED 0 + /* * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set. * Size ILP32: 48 @@ -125,7 +129,8 @@ int32_t len; /* total packet length */ /* Layer crossing persistent information. 
*/ - uint32_t flowid; /* packet's 4-tuple system */ + m_flowid_t flowid; /* packet's 4-tuple system */ + m_txringid_t txringid; /* transmit ring ID */ uint64_t csum_flags; /* checksum and offload features */ uint16_t fibnum; /* this packet should use this fib */ uint8_t cosqos; /* class/quality of service */ === sys/sys/sockio.h ================================================================== --- sys/sys/sockio.h (revision 270138) +++ sys/sys/sockio.h (local) @@ -43,6 +43,7 @@ #define SIOCATMARK _IOR('s', 7, int) /* at oob mark? */ #define SIOCSPGRP _IOW('s', 8, int) /* set process group */ #define SIOCGPGRP _IOR('s', 9, int) /* get process group */ +#define SIOCSTXRINGID _IOW('s', 10, unsigned) /* set transmit ring ID */ /* SIOCADDRT _IOW('r', 10, struct ortentry) 4.3BSD */ /* SIOCDELRT _IOW('r', 11, struct ortentry) 4.3BSD */ @@ -128,4 +129,9 @@ #define SIOCDIFGROUP _IOW('i', 137, struct ifgroupreq) /* delete ifgroup */ #define SIOCGIFGMEMB _IOWR('i', 138, struct ifgroupreq) /* get members */ +#define SIOCARATECTL _IOWR('i', 139, struct in_ratectlreq) /* add new new rate control HW ring */ +#define SIOCSRATECTL _IOWR('i', 140, struct in_ratectlreq) /* set parameters for existing HW ring */ +#define SIOCGRATECTL _IOWR('i', 141, struct in_ratectlreq) /* get parameters for existing HW ring */ +#define SIOCDRATECTL _IOW('i', 142, struct in_ratectlreq) /* delete existing HW ring */ + #endif /* !_SYS_SOCKIO_H_ */ --------------080406030702000505070708-- From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 08:39:44 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6D84DF9D; Wed, 20 Aug 2014 08:39:44 +0000 (UTC) Received: from mail.rlan.ru (mail.rlan.ru [213.234.25.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a 
certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 806163EE4; Wed, 20 Aug 2014 08:39:42 +0000 (UTC) Message-ID: <53F45EC8.7030004@rlan.ru> Date: Wed, 20 Aug 2014 12:39:36 +0400 From: Dmitry Selivanov User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Thunderbird/24.1.1 MIME-Version: 1.0 To: "Alexander V. Chernikov" Subject: Re: ipfw named objejcts, table values and syntax change References: <53DC01DE.3000000@FreeBSD.org> <53DCA25C.1000108@FreeBSD.org> <53DF55FA.8010303@FreeBSD.org> <20140804115817.GA13814@onelab2.iet.unipi.it> <53DFE438.5050209@FreeBSD.org> <53E4BE62.4050303@rlan.ru> <53EE0A30.4020800@FreeBSD.org> <53EE16DE.9020209@rlan.ru> <53EE252D.10109@FreeBSD.org> <53F3563D.6020107@FreeBSD.org> <53F3760E.9070206@rlan.ru> <53F38B18.60409@FreeBSD.org> In-Reply-To: <53F38B18.60409@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-ipfw , "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 08:39:44 -0000 19.08.2014 21:36, Alexander V. Chernikov пишет: > On 19.08.2014 20:06, Dmitry Selivanov wrote: >> 19.08.2014 17:50, Alexander V. Chernikov пишет: >>> On 15.08.2014 19:20, Alexander V. Chernikov wrote: >>>> On 15.08.2014 18:19, Dmitry Selivanov wrote: >>>>> 15.08.2014 17:25, Alexander V. Chernikov пишет: >>>>>> On 08.08.2014 16:11, Dmitry Selivanov wrote: >>>>>>> 04.08.2014 23:51, Alexander V. Chernikov пишет: >>>>>>>> On 04.08.2014 15:58, Luigi Rizzo wrote: >>>>>>>>> On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote: >>>>>>>>>> On 02.08.2014 12:33, Alexander V. Chernikov wrote: >>>>>>>>>>> On 02.08.2014 10:33, Luigi Rizzo wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. 
Chernikov >>>>>>>>>>>> > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hello all. >>>>>>>>>>>> >>>>>>>>>>>> I'm currently working on to enhance ipfw in some areas. >>>>>>>>>>>> The most notable (and user-visible) change is named table support. >>>>>>>>>>>> The other one is support for different lookup algorithms for different >>>>>>>>>>>> key types. >>>>>>>>>>>> >>>>>>>>>>>> For example, new ipfw permits writing this: >>>>>>>>>>>> >>>>>>>>>>>> ipfw table tb1 create type cidr >>>>>>>>>>>> ipfw add allow ip from table(tl1) to any >>>>>>>>>>>> ipfw add allow ip from any lookup dst-ip tb1 >>>>>>>>>>>> >>>>>>>>>>>> ipfw table if1 create type iface >>>>>>>>>>>> ipfw add skipto tablearg ip from any to any via table(if1) >>>>>>>>>>>> >>>>>>>>>>>> or even this: >>>>>>>>>>>> ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port >>>>>>>>>>>> ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444 >>>>>>>>>>>> ipfw add allow ip from any to any flow table(fl1) >>>>>>>>>>>> >>>>>>>>>>>> all these changes fully preserve backward compatibility. >>>>>>>>>>>> (actually tables needs now to be created before use and their type needs >>>>>>>>>>>> to match with opcode used, but new ipfw(8) performs auto-creation >>>>>>>>>>>> for cidr tables). >>>>>>>>>>>> >>>>>>>>>>>> There is another thing I'm going to change and I'm not sure I can keep >>>>>>>>>>>> the same compatibility level. 
>>>>>>>>>>>> >>>>>>>>>>>> Table values, from one point of view, can be classified to the following >>>>>>>>>>>> types: >>>>>>>>>>>> >>>>>>>>>>>> - skipto argument >>>>>>>>>>>> - fwd argument (*) >>>>>>>>>>>> - link to another object (nat, pipe, queue) >>>>>>>>>>>> - plain u32 (not bound to any object) >>>>>>>>>>>> (divert/tee,netgraph,tag/utag,limit) >>>>>>>>>>>> >>>>>>>>>>>> There are the following reasons why I think it is necessary to implement >>>>>>>>>>>> explicit table values typing (like tables): >>>>>>>>>>>> - Implementing fwd tablearg for IPv6 hosts requires indirection table >>>>>>>>>>>> - Converting nat/pipe instance ids to names renders values unusable >>>>>>>>>>>> - retiring old hack with storing saved pointer of found object/rule >>>>>>>>>>>> inside rule w/o proper locking >>>>>>>>>>>> - making faster skipto >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ??????i don't buy the idea that you need typed arguments >>>>>>>>>>>> for all the cases above. Maybe the case that >>>>>>>>>>>> may make sense is the fwd argument (and in the future >>>>>>>>>>>> something else). >>>>>>>>>>>> We already discussed, i think, the fact that now it >>>>>>>>>>>> is legal to have references to non existing things >>>>>>>>>>>> (skipto, pipes etc.) implemented as u32. >>>>>>>>>>>> Removing that would break configurations. >>>>>>>>>>> It depends on actual implementation. This can be preserved by >>>>>>>>>>> auto-creating necessary objects in kernel and/or in userspace, so >>>>>>>>>>> we can (and should) avoid breaking in this particular way. >>>>>>>>>> Can you please explain your vision on values another time? >>>>>>>>>> As far as I understand, you're not against it in general, but the >>>>>>>>>> details matter: >>>>>>>>>> * IP address can be one of the types (it won't break much, and we can >>>>>>>>>> simply skip that one for MFC) >>>>>>>>>> * what about typing for nat/pipes ? we're not going to convert their ids >>>>>>>>>> to names? (or maybe you can suggest other non-disruptive way?) 
>>>>>>>>>> * everything else is type "u32" >>>>>>>>> >>>>>>>>> Correct, I am mostly concerned about the details, not on the general concept. >>>>>>>>> >>>>>>>>> To summarize the discussion Alexander and I had about converting >>>>>>>>> identifiers from numbers to arbitrary strings (this is partly related >>>>>>>>> to the values stored in tables, but I think we should have a coherent >>>>>>>>> behaviour) >>>>>>>>> >>>>>>>>> 1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits or less) >>>>>>>>> for rules, pipes, queues, tables, probably nat instances. >>>>>>>>> >>>>>>>>> 2. CURRENTLY, in all the above contexts, it is legal to reference a >>>>>>>>> non existing object (rule, pipe, table names, etc.), >>>>>>>>> and the kernel will do something reasonable, namely jump to the >>>>>>>>> next rule, drop traffic for non existing pipes, and so on. >>>>>>>>> >>>>>>>>> 3. of course we want to preserve backward compatibility both for >>>>>>>>> the ioctl interface, and for user configurations. >>>>>>>>> >>>>>>>>> 4. The in-kernel representation of identifiers is not visible to users, >>>>>>>>> so we can use a numeric representation in the kernel for identifiers. >>>>>>>>> Strings like "12345" are converted with atoi() or the like, >>>>>>>>> whereas for other identifiers or numbers outside of the 2^16 range >>>>>>>>> the kernel manages a translation table, allocating new numeric >>>>>>>>> identifiers if a new string appears. >>>>>>>>> This permits backward compatibility for old rulesets, and does not >>>>>>>>> impact performance because the translation table is only >>>>>>>>> used during rules additions or deletion. >>>>>>>> Yes. However this requires either holding either (1) 2 pointers (old&new >>>>>>>> arrays), or (2) 65k+ index array, or (3) chained hash table. 
>>>>>>>> (1) would require additional pointers for each subsystem (and some >>>>>>>> additional management), >>>>>>>> (2) will definitely upset embedded guys and >>>>>>>> (3) is worse in terms of performance >>>>>>>>> >>>>>>>>> With this in mind, i think we should follow a similar approach for >>>>>>>>> objects stored in tables, hence >>>>>>>>> >>>>>>>>> if an u32 value was available in the past, it must be >>>>>>>>> available also in the new implementation. >>>>>>>>> >>>>>>>>> The issue with tables is that some convoluted configuration could >>>>>>>>> use the same table to reference pipes _and_ rules _and_ perhaps >>>>>>>>> other things represented as numbers (the former is not too strange, >>>>>>>>> if i have a large configuration i might place sections at rules >>>>>>>>> 12000, 13000, 14000... and associate pipes with the same numberic >>>>>>>>> identifier to each block of rules). >>>>>>>>> >>>>>>>>> Typed table values would clearly disturb backward compatibility >>>>>>>>> in the above configurations. However it should not be difficult >>>>>>>>> to accept arbitrary strings as the values stored in tables, and >>>>>>>>> then store multiple representations as appropriate, including: >>>>>>>> Well, I've thought about thas one. It may be an option, but the details >>>>>>>> are not so promising (below) >>>>>>>>> - the string representation, unconditionally >>>>>>>>> - for names that can be resolved by DNS, the ipv6 and ipv4 address(es) >>>>>>>>> associated with them. ipfw already translates hostnames in rules >>>>>>>>> so this is POLA >>>>>>>> I'm not happy what ipfw(8) is doing instead of translation. The proper >>>>>>>> way would be not simply using first AF_INET answer but saving ALL >>>>>>>> IPv4+IPv6 records inside rule (and some more tracking should be done >>>>>>>> afterwards, but that's totally different story). 
Additionally, I'm >>>>>>>> unsure if we really need next-hop value expressed as hostname (how can >>>>>>>> we deal with multiple addresses and diffrent AFs?). We may store strings >>>>>>>> (and I think we should do it) but I'm unsure about this particular >>>>>>>> option of interpreting them. >>>>>>>>> - for other strings, a u32 from the translation table as previously >>>>>>>>> indicated >>>>>>>>> - and for numeric values, the u32 representation (truncated if needed, >>>>>>>>> according to whatever is the existing behaviour) >>>>>>>>> - >>>>>>>>> If we cannot generate an u32 we will put some value (e.g. 0) >>>>>>>>> that hopefully will not cause confusion. >>>>>>>> As far as I understand, we accept some string "s" as table value inside >>>>>>>> the kernel, than, we have some logic that says: >>>>>>>> oh, dummynet pipe has the same name "s"s, oh, nat entity with name "s" >>>>>>>> has just been created, let's save indices. >>>>>>>> >>>>>>>> That would require additional indirection table like: >>>>>>>> >>>>>>>> index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ] >>>>>>>> ( so we will have 2-level indirection table for fwd if we do IPv6) >>>>>>>> >>>>>>>> We can optimize this if we use "same name -> same kidx" approach >>>>>>>> regardless of kernel object we're refering to. That might require some >>>>>>>> more memory, but that's OK from my point of view. >>>>>>>> >>>>>>>> So we end up with >>>>>>>> int [ skipto idx | fwd idx | obj idx ] >>>>>>>> >>>>>>>> idx "0" is special value which means the same as 2.CURRENT >>>>>>>> >>>>>>>> That looks better, but still way to complex. >>>>>>>> I do care about compatibility, but it's hard to improve things without >>>>>>>> changing. >>>>>>>> >>>>>>>> I'd like to propose the following: >>>>>>>> * Split values into 3 types ("ip|nexthop", "number", "object") >>>>>>>> * Do not insist on object existence, use value "0" to mimic 2.CURRENT >>>>>>>> behavior. 
>>>>>>>> * Retain full compatibility by introducing special value type "legacy" >>>>>>>> which matches any type and is backed by given indirection table. >>>>>>>> * Issue warning in ipfw(8) binary on all auto-created tables that >>>>>>>> auto-creation is legacy and this behavior will be dropped in next major >>>>>>>> release (e.g. 11.0) >>>>>>>> * Save this behavior in MFC but drop "legacy" tables in head after a >>>>>>>> month after actual MFC. >>>>>>>> >>>>>>>> That do you think? >>>>>>>>> >>>>>>>>> If we do it this way, we should be able to preserve backward >>>>>>>>> compatibility _and_ add features that people may need. >>>>>>>>> >>>>>>>>> cheers >>>>>>>>> luigi >>>>>>>>> >>>>>>> Here is my idea: tablearg should contain more than one value. I think getting several values from one table lookup is faster than several table lookups with one value. >>>>>>> Let tablearg be not just uint32, but array with different value types inside it. >>>>>> There are some use cases where we might need 2-level value lookup (e.g. algo returning index for index table where actual data reside) and each data item can >>>>>> really be up to 64-bytes long. The problem is in actual partitioning and compatibility. >>>>>>> >>>>>>> For example I have many such rules: >>>>>>> allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip 1.1.1.1 >>>>>> Sorry, what task are you solving by using given rules? >>>>> Small ISP, clients have static IP with MAC-authorization. Src iface must be checked to prevent IP-spoofing. Dst-IP sometimes is used for p2p-channels. >>>>>>> >>>>>>> These rules can be replaced with such construction: >>>>>>> allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip tablearg[3] >>>>>>> >>>>>>> But I don't think indexing by value is a good idea. 
I think index==starting byte is a better way: >>>>>>> allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip tablearg:32 >>>>>>> where MAC's 6 bytes are from 0 to 5 in tablearg; iface string is from 6 and till \0, but less than 26 bytes; and IPv4's 4 bytes are from 32 to 35. >>>>>> >>>>>>> So we need to create table for it: >>>>>>> table 1 set MAC:0 string:6:26 ip:32 >>>>>>> table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>>>>> >>>>>>> String can be used both for iface and comment. >>>>>>> Other possible value types: >>>>>>> uint16 for nat, pipe, skipto and other 2-bytes actions >>>>>>> IPv4 4 bytes >>>>>>> CIDRv4 5 bytes >>>>>>> IPv6 16 bytes >>>>>>> CIDRv6 17 bytes >>>>>>> table_id 2 bytes - link to another table >>>>>> Well, it seems we have enough space to store most of these, however, problems seem to remain the same: typing and compatibility. >>>>>> When you're creating new table (or it is auto-created) which values types should be assumed ? All of them? >>>>> Default - as usually uint32. >>>> I can't see "uint32" value in the list you have specified before. I'll rephrase: >>>> what value types (from the list above or similar) should ipfw(8) or kernel fill in case of "default" table? >>>> (And once again, what should we print as value) ? >>>> Please think about >>>> a) old ipfw binaries >>>> b) new ipfw binaries using exactly the same ruleset they are already using (with, for example, both "skipto tablearg" and "fwd tablearg " tables). >> At that time I meant default table "header" is "ip:0" (in my context). It would be completely compatible with old ipfw tables. >> >>> I've increased kernel<>userland 'struct tentry' value field to 64 bytes. >>> It looks like we were talking about a bit different things. >>> Let me try to explain the problem I'm stuck with: >>> >>> We may take the road you've suggested, it looks OK: >>> >>> * by default tables are created with "all-values" mask. 
>>> * ipfw(8) value treats default "ipfw table X add Y val" input where value is u32 number as input data for each type specified in all-values without returning error >>> * for non-default mask value data should be validated. >>> >>> e.g. if we have table with valtype="skipto,nat,pipe,ip4,ip6" and "100" as input -> it turns to "100,100,0.0.0.0,::". >> I don't fully understand. One "100" value for all valtypes? Then "100" can't be equal "0.0.0.0" and "::". Or you meant "100,100,0,0" as input? > We have to handle the case when user with _unmodified_ scripts tries to use new ipfw (either with new binary or the old one). > The goal is not to throw error and break everything, of course. > How can valtype="skipto,nat,pipe,ip4,ip6" appear in _unmodified_ scripts? I thought default valtype was uint32. >> >>> If we have value with valtype="skipto,ip6" and "100" as input -> error while the valid one would be "100,2a01::1:111", for example. >>> >>> I'm unsure how should one be able to update _specific_ value (e.g. update nat id or skipto arg), but that's not the problem. >> Maybe new command would help, like "ipfw table X set Y newval". >> >>> >>> The problem arises if we start talking about using names for nat/pipe/queue ids instead of numbers. >>> If we have nat instances "nat1", "11" and "23", and one specifies "44" as part of value, logic starts to be complex: >>> >>> we either require nat "44" to exists (and I'm unsure if we can auto-create it *) or start doing complex stuff like tracking all those non-existing objects: >>> e.g. add some special record somewhere that we're wating for nat instance "44" to be created, than auto-update given value with its kernel index, >>> than, do something reasonable if nat "44" instance is destroyed (OK, nat instance can't be destroyed, but pipe can). >>> .. and we have to do the same for pipes/queues and any following kernel object. >>> >>> Or we have to require user to reference existing objects only (create explicitly before use). 
This one makes things easier in code, but requires users to change their scripts. >>> It looks like there is no consensus on that point. >> User can destroy object after table creation. I think this way: "no object - no packet (explicitly deny)". No need to check object existence. > Yes, but even this behavior has to be supported by kernel: > Let me explain in more detail: > user calls -> ipfw nat "23" iface ... > Kernel sees string "23" which is not the name of any existing nat instance, so it creates one and allocates new kernel index for that (let it be 1). > The same for "nat1" -> 2 and "11" -> 3. Kernel indexes are purely internal and cannot be referenced by userland. > > So, when you enter "44" inside new value, the following happens: > 1) some special object binding name "44" and value of record X is created > 2) nat instance list is searched to see if "44" is an existing name. If entry is found, its kernel index is saved to "value", 0 is saved otherwise. > 3) If nat entry is destroyed, we have to walk all entries and set their appropriate parts back to 0 (otherwise some other entry may use this index later leading to packets being aliased to another nat > instance. "show" command would print incorrect values, too). > > This can be done (and we have to write code for each type of kernel object, e.g. one for nat, one for pipe/queue, etc..), but requires a lot of code which we would have to support forever. > I'd prefer to enforce hard bindings (with, maybe, some intermediate period of compatible behavior for MFC). > In fact, I haven't ever needed strings instead of IDs. I always use named vars in shell scripts for table/nat/pipe IDs. And I can't advise you on this question. >>> >>> * Maybe auto-creation is not so tricky and we should try to evaluate it.. >>> >>>> >>>> >>>>>> What should `ipfw table X list` show as "value" field?
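The bookkeeping in steps 1) to 3) above can be sketched in userland C. This is only an illustration under stated assumptions: `nat_inst`, `tbl_value`, `value_bind()`, `nat_create()` and `nat_destroy()` are hypothetical names, not ipfw code, and a kernel index is simply the slot number plus one so that 0 can mean "not bound".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAT  16
#define NAME_LEN 16

/* Hypothetical named kernel object (a nat instance in the example above). */
struct nat_inst {
	char name[NAME_LEN];
	int  used;
};

/* Hypothetical value part: the name the user typed plus the cached kernel
 * index; 0 means "no such object right now". */
struct tbl_value {
	char     nat_name[NAME_LEN];
	uint16_t nat_kidx;
};

/* Step 2: return the kernel index for a name, or 0 if it does not exist. */
static uint16_t
nat_lookup(const struct nat_inst *nats, const char *name)
{
	for (int i = 0; i < MAX_NAT; i++)
		if (nats[i].used && strcmp(nats[i].name, name) == 0)
			return ((uint16_t)(i + 1));
	return (0);
}

/* Steps 1 and 2: remember the name in the value, cache the current index. */
static void
value_bind(struct tbl_value *v, const struct nat_inst *nats, const char *name)
{
	snprintf(v->nat_name, sizeof(v->nat_name), "%s", name);
	v->nat_kidx = nat_lookup(nats, name);
}

/* Re-resolve every value; called whenever the set of instances changes. */
static void
values_rebind(struct tbl_value *vals, size_t n, const struct nat_inst *nats)
{
	for (size_t i = 0; i < n; i++)
		vals[i].nat_kidx = nat_lookup(nats, vals[i].nat_name);
}

/* Create an instance and update values that were waiting for this name. */
static uint16_t
nat_create(struct nat_inst *nats, const char *name,
    struct tbl_value *vals, size_t n)
{
	uint16_t kidx = nat_lookup(nats, name);

	if (kidx != 0)
		return (kidx);		/* already exists */
	for (int i = 0; i < MAX_NAT; i++) {
		if (nats[i].used)
			continue;
		snprintf(nats[i].name, sizeof(nats[i].name), "%s", name);
		nats[i].used = 1;
		values_rebind(vals, n, nats);
		return ((uint16_t)(i + 1));
	}
	return (0);			/* no free slot */
}

/* Step 3: destroy an instance and reset every value that pointed at it. */
static void
nat_destroy(struct nat_inst *nats, const char *name,
    struct tbl_value *vals, size_t n)
{
	uint16_t kidx = nat_lookup(nats, name);

	if (kidx == 0)
		return;
	nats[kidx - 1].used = 0;
	values_rebind(vals, n, nats);
}
```

Even in this toy form every kind of named object (nat, pipe, queue) would need its own copy of the walk in `values_rebind()`, which is the maintenance cost the message is worried about.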
>>>>> I added table "header" in this line: >>>>> table 1 set MAC:0 string:6:26 ip:32 >>>> I don't think that user should be able to set any offsets in userland. Exact offsets of variable of given type needs to be enforced by kernel, >>>> so you may fill that you want "mac" and "ip" as values for given table, but not lengths or offsets. >> Does your way allow to use strings (e.g. iface or comments)? > I'm not sure on what you're going to do with interfaces as values. For rules flexibility. My idea was to extend using of tablearg with additional rule options (dst-ip/src-ip, MAC, recv/xmit/via; maybe dst-port/src-port, fib and other options with number/string parameters). As for iface, I explained above this rule: allow src-ip table(1) MAC any tablearg[1] recv tablearg[2], where MAC is checked for authorization and iface for antispoof. > Comments - per value or per table entry? I can think of it, but probably not all algorithms will support that functionality. Comment per table entry initially. But it would be worth to be implemented only if iface valtype was implemented (I think both of them can be string). Even more, we can use such valtype (in your form): skipto,nat,comment,pipe,ip4,ip6,comment,ip4 - I think nothing prevents use more than one value with the same type. I understand that my example is very unreal, but it's example only. >> >>>>> So `ipfw table X list` should show something like this: >>>>> ---table(0)--- >>>>> 1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1 >>>>> We can also add "header" description in output (with or without additional parameter - depends on compatibility needs) like this: >>>>> ---table(0)--- addr MAC iface IPv4 >>>>>> How should ipfw(8) treat "add 1.1.1.1 0" input? >>>>> It should look at table "header" and return error message like "Value doesn't match table header" >>>> >>>>>> What will happen if we want to add another type field to this list? (MAC address of Infiniband MAC address, for example). 
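To make the "header" idea concrete, here is a minimal userland sketch of packing one table value according to "MAC:0 string:6:26 ip:32". It assumes the 64-byte value field mentioned earlier in the thread; `tv_field`, `tv_header` and `tv_pack()` are hypothetical names, not part of ipfw, and only three of the proposed value types are covered.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TVALUE_LEN 64	/* assumed size of the value field */

enum tv_type { TV_MAC, TV_STRING, TV_IP4 };

struct tv_field {
	enum tv_type type;
	uint8_t      off;	/* starting byte inside the value */
	uint8_t      len;	/* bytes reserved for the field */
};

/* "table 1 set MAC:0 string:6:26 ip:32" expressed as data. */
static const struct tv_field tv_header[] = {
	{ TV_MAC,     0,  6 },
	{ TV_STRING,  6, 26 },
	{ TV_IP4,    32,  4 },
};

/* Parse one textual field into its slot; return 0 on success, -1 when the
 * text does not match the type the header asks for. */
static int
tv_pack(const struct tv_field *f, const char *text, uint8_t *value)
{
	uint8_t mac[6];
	int end = 0;

	if (f->off + f->len > TVALUE_LEN)
		return (-1);
	switch (f->type) {
	case TV_MAC:
		if (sscanf(text, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx%n", &mac[0],
		    &mac[1], &mac[2], &mac[3], &mac[4], &mac[5], &end) != 6 ||
		    text[end] != '\0')
			return (-1);
		memcpy(value + f->off, mac, sizeof(mac));
		return (0);
	case TV_STRING:
		/* Must fit together with the terminating \0. */
		if (strlen(text) >= f->len)
			return (-1);
		memset(value + f->off, 0, f->len);
		memcpy(value + f->off, text, strlen(text));
		return (0);
	case TV_IP4:
		/* inet_pton() stores the address in network byte order. */
		if (inet_pton(AF_INET, text, value + f->off) != 1)
			return (-1);
		return (0);
	}
	return (-1);
}
```

With this layout "11:22:33:44:55:66 vlan1234 1.1.1.1" fills bytes 0-5, 6-31 and 32-35, while a bare "0" fails for every field, which is where the "Value doesn't match table header" error would come from.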
>>>>> I don't think there is a sense to mix both MAC[6] and MAC[20] values in 1 table. It is easier to create 2 tables with different "headers". >>>>> For Infiniband we can add another type: MAC20 (or something like this). Or we can use "MAC"-type like string type(see above): MAC:6:25 (1st and last bytes, or 1st and length). >>>>>> >>>>>>> >>>>>>> Table value length can be set for example with loader tunable like net.inet.ip.fw.table_value_length. >>>>>>> Even with default uint32 value length we can get 2 uint16 values or 4 uint8 values, this can help in some configurations. >>>>>>> >>>>>>> This way is more complex, but much more flexible. It's like netgraph subsystem. >>>>>>> I think it suites both Alexander and Luigi requests. >>>>>>> From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 09:32:28 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E69F88B5; Wed, 20 Aug 2014 09:32:28 +0000 (UTC) Received: from mail-lb0-x233.google.com (mail-lb0-x233.google.com [IPv6:2a00:1450:4010:c04::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 305B8344F; Wed, 20 Aug 2014 09:32:28 +0000 (UTC) Received: by mail-lb0-f179.google.com with SMTP id v6so6483974lbi.38 for ; Wed, 20 Aug 2014 02:32:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=5eT3lTm3TgjugthKLkDN25vxrfbSC4ZBgOuL9WpTP8o=; b=bWwUfIg3cCXlDeaTNK17k2iiIufLXDD2O5jzcZaBvEOWukrRbP6Ay2mnTWqHCpgcRh fe2RcLjjsXFSEPRcL1BVQ4aCdeKLr5fHhjYvuDxlnzQlsmiU6gNtLWyDhCoSQyLf9/qr I4vcvlahaTuVRPFHg2RqjQOIkh6MdFsSAD4VkO7/plUhmH+hTyhm83YjVu6+gcPfL6tu 
QtOI+6W+EL16fXYNN/Ss69XfJ21rJZUbT4oR2NkhuHVmpO/C4YRaJlg13hpi5PK61Tqd w98pXNfD7z0GkVLSiiww3w2UwQosKZxSsVlkB2tjr+juJzIiq/nQVewQ2AkfIFKpRdq2 Kiew== MIME-Version: 1.0 X-Received: by 10.112.35.44 with SMTP id e12mr38763111lbj.13.1408527146067; Wed, 20 Aug 2014 02:32:26 -0700 (PDT) Sender: rizzo.unipi@gmail.com Received: by 10.114.244.2 with HTTP; Wed, 20 Aug 2014 02:32:26 -0700 (PDT) In-Reply-To: <53F44F91.2060006@selasky.org> References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> Date: Wed, 20 Aug 2014 11:32:26 +0200 X-Google-Sender-Auth: KXSBRY2Umn7HUstkEjYX2X6Xqnk Message-ID: Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] From: Luigi Rizzo To: Hans Petter Selasky Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , FreeBSD Current X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 09:32:29 -0000 On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky wrote: > Hi, > > A month has passed since the last e-mail on this topic, and in the > meanwhile some new patches have been created and tested: > > Basically the approach has been changed a little bit: > > - The creation of hardware transmit rings has been made independent of the > TCP stack. This allows firewall applications to forward traffic into > hardware transmit rings aswell, and not only native TCP applications. This > should be one more reason to get the feature into the kernel. > > - A hardware transmit ring basically can have two modes: FIXED-RATE or > AUTOMATIC-RATE.
In the fixed rate mode all traffic is sent at a fixed bytes > per second rate. In the automatic mode you can configure a time after which > the TX queue must be empty. The hardware driver uses this to configure the > actual rate. In automatic mode you can also set an upper and lower transmit > rate limit. > > - The MBUF has got a new field in the packet header: "txringid" > > - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of > the "txringid" field in the mbuf. > > The current patch [see attachment] should be much simpler and less > intrusive than the previous one. > the patch seems to include only part of the generic code (ie no ioctls for manipulating the rates, no backend code). Do i miss something ? I have a few comments/concerns: + looks like flowid and txringid are overlapped in scope, both will be used (in the backend) to select a specific tx queue. I don't have a solution but would like to know how do you plan to address this -- does one have priority over the other, etc. + related to the above, a (possibly unavoidable) side effect of this type of changes is that mbufs explode with custom fields, so if we could perhaps make one between flowid and txringid, that would be useful. + is there a way to avoid the replicated code for SIOCSTXRINGID (the ioctl handler, i suppose). Maybe make one function and call it from both ipv4 and ipv6, assuming there aren't other places like this. + i am not particularly happy about the explosion of ioctls for setting and getting rates. Next we'll want to add scheduling, and intervals, and queue sizes and so on. For these commands outside the critical path it would be preferable a single command with an extensible structure. Bikeshed material i am sure.
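As a rough illustration of the last point, a single command with an extensible structure could look something like the sketch below. Everything here is an assumption made for the example: `txq_req`, `txq_ctl()`, the `TXQ_*` sub-commands and the fixed number of rings are hypothetical and do not come from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TXQ_MAX_RINGS 8		/* arbitrary for the example */

/* Hypothetical sub-commands; new ones are added here, not as new ioctls. */
enum txq_cmd { TXQ_SET_RATE = 1, TXQ_GET_RATE = 2 };

/* Hypothetical request: a fixed prefix plus fields that may grow over time.
 * 'len' is the size of the structure as the caller knows it. */
struct txq_req {
	uint32_t cmd;
	uint32_t len;
	uint32_t ring;
	uint32_t flags;
	uint64_t bytes_per_sec;
	/* future fields (scheduling, intervals, queue sizes) go here */
};

/* Smallest request this handler understands. */
#define TXQ_REQ_MINLEN \
	(offsetof(struct txq_req, bytes_per_sec) + sizeof(uint64_t))

struct txq_state {
	uint64_t rate[TXQ_MAX_RINGS];
};

/* One entry point for every sub-command; returns 0 or an errno value. */
static int
txq_ctl(struct txq_state *st, struct txq_req *req)
{
	if (req->len < TXQ_REQ_MINLEN || req->ring >= TXQ_MAX_RINGS)
		return (EINVAL);
	switch (req->cmd) {
	case TXQ_SET_RATE:
		st->rate[req->ring] = req->bytes_per_sec;
		return (0);
	case TXQ_GET_RATE:
		req->bytes_per_sec = st->rate[req->ring];
		return (0);
	default:
		return (EOPNOTSUPP);
	}
}
```

A caller built against a newer, longer structure still works because only a minimum length is checked, and the same `txq_ctl()` could be called from both the IPv4 and IPv6 paths.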
cheers luigi From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 13:29:10 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C7D73C0B; Wed, 20 Aug 2014 13:29:10 +0000 (UTC) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 59A743EF1; Wed, 20 Aug 2014 13:29:10 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 2A3F01FE027; Wed, 20 Aug 2014 15:29:04 +0200 (CEST) Message-ID: <53F4A2AF.6080102@selasky.org> Date: Wed, 20 Aug 2014 15:29:19 +0200 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: "freebsd-net@freebsd.org" , FreeBSD Current X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 13:29:11 -0000 Hi Luigi, On 08/20/14 11:32, Luigi Rizzo wrote: > On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky > wrote: > >> Hi, >> >> A 
month has passed since the last e-mail on this topic, and in the >> meanwhile some new patches have been created and tested: >> >> Basically the approach has been changed a little bit: >> >> - The creation of hardware transmit rings has been made independent of the >> TCP stack. This allows firewall applications to forward traffic into >> hardware transmit rings aswell, and not only native TCP applications. This >> should be one more reason to get the feature into the kernel. >> >> - A hardware transmit ring basically can have two modes: FIXED-RATE or >> AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes >> per second rate. In the automatic mode you can configure a time after which >> the TX queue must be empty. The hardware driver uses this to configure the >> actual rate. In automatic mode you can also set an upper and lower transmit >> rate limit. >> >> - The MBUF has got a new field in the packet header: "txringid" >> >> - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of >> the "txringid" field in the mbuf. >> >> The current patch [see attachment] should be much simpler and less >> intrusive than the previous one. >> > > ​the patch seems to include only part of the generic code (ie no ioctls > for manipulating the rates, no backend code). Do i miss something ? The IOCTLs for managing the rates are: SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL And they go to the if_ioctl callback. > > I have a few comments/concerns: > > + looks like flowid and txringid are overlapped in scope, > both will be used (in the backend) to select a specific > tx queue. I don't have a solution but would like to know > how do you plan to address this -- does one have priority > over the other, etc. Not 100% . In some cases the flowID is used differently than the txringid, though it might be possible to join the two. Would need to investigate current users of the flow ID. 
> + related to the above, a (possibly unavoidable) side effect > of this type of changes is that mbufs explode with custom fields, > so if we could perhaps make one between flowid and txringid, > that would be useful. Right, but ratecontrol is an in-general useful feature, especially for high throughput networks, or do you think otherwise? > > + is there a way to ​avoid the replicated code for SIOCSTXRINGID > (the ioctl handler, i suppose). Maybe make one function and > call it from both ipv4 and ipv6, assuming there aren't other > places like this. Yes, could do that. > > + i am not particularly happy about the explosion of ioctls for > setting and getting rates. Next we'll want to add scheduling, > and intervals, and queue sizes and so on. > For these commands outside the critical path it would be > preferable a single command with an extensible structure. > Bikeshed material i am sure. There is only one IOCTL in the critical path and that is the IOCTL to change or update the TX ring ID. The other IOCTLs are in the non-critical path towards the if_ioctl() callback. If we can merge the flowID and the txringid into one field, would it be acceptable to add an IOCTL to read/write this value for all sockets? 
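For reference, the two ring modes from the original proposal can be read as the small calculation below. This is one possible interpretation, not the patch itself: `txring_cfg` and `txring_rate()` are hypothetical names, and the automatic rate is assumed to be "bytes currently queued divided by the configured drain time", clamped to the lower and upper limits.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

enum txring_mode { TXRING_FIXED_RATE, TXRING_AUTOMATIC_RATE };

/* Hypothetical per-ring configuration. Rates are in bytes per second. */
struct txring_cfg {
	enum txring_mode mode;
	uint64_t fixed_rate;	/* used in FIXED-RATE mode */
	uint64_t drain_ns;	/* AUTOMATIC-RATE: queue must be empty after this */
	uint64_t min_rate;	/* AUTOMATIC-RATE: lower limit */
	uint64_t max_rate;	/* AUTOMATIC-RATE: upper limit */
};

/* Rate a driver might program for a ring holding 'queued_bytes'.
 * Assumes queued_bytes * 1e9 fits in 64 bits (below roughly 18 GB). */
static uint64_t
txring_rate(const struct txring_cfg *c, uint64_t queued_bytes)
{
	uint64_t rate;

	if (c->mode == TXRING_FIXED_RATE)
		return (c->fixed_rate);
	if (c->drain_ns == 0)
		return (c->max_rate);
	/* Round up so the queue really drains within drain_ns. */
	rate = (queued_bytes * NSEC_PER_SEC + c->drain_ns - 1) / c->drain_ns;
	if (rate < c->min_rate)
		rate = c->min_rate;
	if (rate > c->max_rate)
		rate = c->max_rate;
	return (rate);
}
```

For example, 125000 bytes queued with a 10 ms drain time asks for 12.5 MB/s, which is then used as is or clamped depending on the configured limits.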
--HPS From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 14:41:29 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 61FD589F; Wed, 20 Aug 2014 14:41:29 +0000 (UTC) Received: from mail-la0-x229.google.com (mail-la0-x229.google.com [IPv6:2a00:1450:4010:c03::229]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 96ADD36C4; Wed, 20 Aug 2014 14:41:28 +0000 (UTC) Received: by mail-la0-f41.google.com with SMTP id s18so7390085lam.14 for ; Wed, 20 Aug 2014 07:41:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=/cEvyJg2v1esY+eJ4fpvO3kbpP3t7VTjrLqwzxVHAyQ=; b=iVQ4PHBNtiS+6uN752S4ynKUyHkm4Q8Bk0SAzD77TXFUr/E9G7u7aICWZ7gZdszeF1 w+3pi92gXn5j/uRcf9z2uqxGAbPUe7BEbKg/9/ueRijWKMbG5vrvqT6ZkE8BC9kCxxOV amzXZdRHGxaN71Zqde7JPXdiz/voM5N3MiAtI11WfccKuByMXjUsKJ+R0rdxrycLJkyj e/zOda44X0wirwTira38HFuepLVXZXufxFKpygyza8gcm1BeZCJ17gqRBAlfWCbycePR p93o2O0g6LhuU3T1MOfJ0vhohui5+7gudlHlGah/4oAvwgZOAnCELylyQ8b5wXqkcjyH qMzQ== MIME-Version: 1.0 X-Received: by 10.152.243.43 with SMTP id wv11mr43884437lac.52.1408545686343; Wed, 20 Aug 2014 07:41:26 -0700 (PDT) Sender: rizzo.unipi@gmail.com Received: by 10.114.244.2 with HTTP; Wed, 20 Aug 2014 07:41:26 -0700 (PDT) In-Reply-To: <53F4A2AF.6080102@selasky.org> References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <53F4A2AF.6080102@selasky.org> Date: Wed, 20 Aug 2014 16:41:26 +0200 X-Google-Sender-Auth: nsckhSo8zl9qPRFJY6BwzoD4CF4 Message-ID: Subject: Re: [RFC] Add support for 
hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] From: Luigi Rizzo To: Hans Petter Selasky Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , FreeBSD Current X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 14:41:29 -0000 On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky wrote: > Hi Luigi, > > > On 08/20/14 11:32, Luigi Rizzo wrote: > >> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky >> wrote: >> >> Hi, >>> >>> A month has passed since the last e-mail on this topic, and in the >>> meanwhile some new patches have been created and tested: >>> >>> Basically the approach has been changed a little bit: >>> >>> - The creation of hardware transmit rings has been made independent of >>> the >>> TCP stack. This allows firewall applications to forward traffic into >>> hardware transmit rings aswell, and not only native TCP applications. >>> This >>> should be one more reason to get the feature into the kernel. >>> ... >>> >> the patch seems to include only part of the generic code (ie no ioctls >> for manipulating the rates, no backend code). Do i miss something ? >> > > The IOCTLs for managing the rates are: > > SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL > > And they go to the if_ioctl callback. i really think these new 'advanced' features should go through some ethtool-like API, not more ioctls. We have a strong need to design and implement such an API also to have a uniform mechanism to manipulate rss, queues and other NIC features.
... > > > >> I have a few comments/concerns: >> >> + looks like flowid and txringid are overlapped in scope, >> both will be used (in the backend) to select a specific >> tx queue. I don't have a solution but would like to know >> how do you plan to address this -- does one have priority >> over the other, etc. >> > > Not 100% . In some cases the flowID is used differently than the txringid, > though it might be possible to join the two. Would need to investigate > current users of the flow ID. in some 10G drivers i have seen, at the driver level the flowid is used on the tx path to assign packets to a given tx queue, generally to improve cpu affinity. Of course some applications may want a true flow classifier so they do not have to re-do the classification multiple times. But then, we have a ton of different classifiers with the same need -- e.g. ipfw dynamic rules, dummynet pipe/queue id, divert ports... Pipes are stored in mtags, which are very expensive so i do see a point in embedding them in the mbufs, it's just that going this path there is no end to the list. > > + related to the above, a (possibly unavoidable) side effect >> of this type of changes is that mbufs explode with custom fields, >> so if we could perhaps make one between flowid and txringid, >> that would be useful. >> > > Right, but ratecontrol is an in-general useful feature, especially for > high throughput networks, or do you think otherwise? of course i think the feature is useful, but see the previous point. We should find a way to manage it (and others) that does not pollute or require continuous changes to the struct mbuf. > > > >> + i am not particularly happy about the explosion of ioctls for >> setting and getting rates. Next we'll want to add scheduling, >> and intervals, and queue sizes and so on. >> For these commands outside the critical path it would be >> preferable a single command with an extensible structure.
>> Bikeshed material i am sure. >> > > There is only one IOCTL in the critical path and that is the IOCTL to > change or update the TX ring ID. The other IOCTLs are in the non-critical > path towards the if_ioctl() callback. > > If we can merge the flowID and the txringid into one field, would it be > acceptable to add an IOCTL to read/write this value for all sockets? see above. i'd prefer an ethtool-like solution. cheers luigi From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 16:21:40 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ECC7618B; Wed, 20 Aug 2014 16:21:40 +0000 (UTC) Received: from mail-qc0-x234.google.com (mail-qc0-x234.google.com [IPv6:2607:f8b0:400d:c01::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9627D342F; Wed, 20 Aug 2014 16:21:40 +0000 (UTC) Received: by mail-qc0-f180.google.com with SMTP id l6so8130128qcy.11 for ; Wed, 20 Aug 2014 09:21:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=MEDBPn7oIvwUjsxh+CZpz1yVxFwbna3Jd9wQC8xQs50=; b=yxWqkXBFW4xL9BZUzLpcOaxil13JY3meOvcOzFIUqJPS0PqGqaCHYDNbqcT8sUC6w8 tshddY1BfrZksKwNvzGpUrez9ye1r05R+PDVCXXrAYfxFGg3tJxRn0zNYY8++HtfLPZG PhZr7NocDnofnh1/AP5Rygy9RiuOmAN9nLUkP2kS0aX9+NrGeTwzmJHRVn7TacGeSSAT K4TMrfrnA842ZogAdOujJq2mYap/tUGKpq/15VPSXw1ePbJq70nIHc03YqbNdy+WZfKS PSNwYbtUZfBYDNRa5K4yXWS7k3doNkRLqKh9vHs/76dIEkUFS4/LPQOQAR4eRddjsbDN BLQQ== MIME-Version: 1.0 X-Received: by 10.224.12.134 with SMTP id x6mr39794633qax.1.1408551699682; Wed, 20 Aug 2014 09:21:39 -0700 (PDT) Sender:
kmacybsd@gmail.com Received: by 10.224.17.129 with HTTP; Wed, 20 Aug 2014 09:21:39 -0700 (PDT) In-Reply-To: References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <53F4A2AF.6080102@selasky.org> Date: Wed, 20 Aug 2014 09:21:39 -0700 X-Google-Sender-Auth: LcH8QDyKws6ZXEPdmZq2F5rVOkI Message-ID: Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] From: "K. Macy" To: Luigi Rizzo Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Hans Petter Selasky , "freebsd-net@freebsd.org" , FreeBSD Current X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 16:21:41 -0000 On Wed, Aug 20, 2014 at 7:41 AM, Luigi Rizzo wrote: > On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky > wrote: > > > Hi Luigi, > > > > > > On 08/20/14 11:32, Luigi Rizzo wrote: > > > >> On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky > >> wrote: > >> > >> Hi, > >>> > >>> A month has passed since the last e-mail on this topic, and in the > >>> meanwhile some new patches have been created and tested: > >>> > >>> Basically the approach has been changed a little bit: > >>> > >>> - The creation of hardware transmit rings has been made independent of > >>> the > >>> TCP stack. This allows firewall applications to forward traffic into > >>> hardware transmit rings aswell, and not only native TCP applications. > >>> This > >>> should be one more reason to get the feature into the kernel. > >>> ... > >>> > >> the patch seems to include only part of the generic code (ie no ioctls > >> for manipulating the rates, no backend code).
Do i miss something ? > >> > > > > The IOCTLs for managing the rates are: > > > > SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL > > > > And they go to the if_ioctl callback. > > > i really think these new 'advanced' features should go > through some ethtool-like API, not more ioctls. > We have a strong need to design and implement such > an API also to have a uniform mechanism to manipulate > rss, queues and other NIC features. > > > There is no ethtool equivalent yet, but exposing them through a sysctl is definitely the place to start before putting it straight in to ifconfig. The ifnet API is already a bit of a mess. > ... > > > > > > > >> I have a few comments/concerns: > >> > >> + looks like flowid and txringid are overlapped in scope, > >> both will be used (in the backend) to select a specific > >> tx queue. I don't have a solution but would like to know > >> how do you plan to address this -- does one have priority > >> over the other, etc. > >> > > > > Not 100% . In some cases the flowID is used differently than the > txringid, > > though it might be possible to join the two. Would need to investigate > > current users of the flow ID. > > > in some 10G drivers i have seen, at the driver > level the flowid is used on the tx path to assign > packets to a given tx queue, generally to improve > cpu affinity. Of course some applications > may want a true flow classifier so they do not > have to re-do the classification multiple times. > But then, we have a ton of different classifiers > with the same need -- e.g. ipfw dynamic rules, > dummynet pipe/queue id, divert ports... > Pipes are stored in mtags, which are very expensive > so i do see a point in embedding them in the mbufs, > it's just that going this path there is no end > to the list.
> > > The purpose of the flowid was to enforce packet ordering on transmit while being large enough to store a RSS hash, potentially allowing input consumers to use it to semi-uniquely label (srcip, srcport, dstip, dstport) tuples. It seems to that the txringid would be almost entirely redundant. Why not just let users set the flowid? > If we can merge the flowID and the txringid into one field, would it be > > acceptable to add an IOCTL to read/write this value for all sockets? > > That sounds reasonable - although I have not thought through all the implications. -K From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 17:40:45 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 500FC44F for ; Wed, 20 Aug 2014 17:40:45 +0000 (UTC) Received: from mail-qc0-x236.google.com (mail-qc0-x236.google.com [IPv6:2607:f8b0:400d:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 112033C72 for ; Wed, 20 Aug 2014 17:40:45 +0000 (UTC) Received: by mail-qc0-f182.google.com with SMTP id i8so8145053qcq.27 for ; Wed, 20 Aug 2014 10:40:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=acPauPzGLyXBzvx+rEELT34xA7A+QMJczD4TviVGVRs=; b=o9GEoM11bdsVJ5xXhTuWZCB7Rht6GznGcFjgmtynTIKxQGMBwsKevAZ9+ivU3XTyYo WxK9KkxGWT9yZ+rGf0c4fa+LvDzHZxPM+K34iJ0Z8WuEwIihUaxyyh+btycGFix88MnJ qCGyPUlyLm8/+BXhiCwT+Tb3thdnXgk1B62mQcJv8nFQU4uYfCm2qx0lXO1uPhSxYGdO ntPyLy4D1eOKaL66RRsj+EWumz1vRggNMeCfO5oLJhPeivB9Uea788QXs0twSGct2kKm czKRyz4ufMQ1Tp4mg2cdHqjIZpO02RTvtnhxm829Fe7tkPIdUJvQ50qgHWRCWGoDbWwY /kiw== MIME-Version: 1.0 
X-Received: by 10.224.36.130 with SMTP id t2mr25488330qad.45.1408556442446; Wed, 20 Aug 2014 10:40:42 -0700 (PDT) Sender: hiren.panchasara@gmail.com Received: by 10.96.170.230 with HTTP; Wed, 20 Aug 2014 10:40:42 -0700 (PDT) In-Reply-To: References: Date: Wed, 20 Aug 2014 10:40:42 -0700 X-Google-Sender-Auth: c74y_-U9G1uBoGShsM436Gmz0NA Message-ID: Subject: Re: Regression test suite for TCP From: hiren panchasara To: Anuranjan Shukla Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 17:40:45 -0000 On Mon, Aug 18, 2014 at 4:59 PM, Anuranjan Shukla wrote: > If you're willing to shell out some $$, Ixia's ANVL is a fairly detailed > test suite for TCP and other protocols. It's available as a software you > can install on lnx/windows. I'd used it at Juniper while working with > Robert for the connection groups work a couple years or so back. Thanks Anu. It looks pretty good but I am looking for something more on open source side. cheers, Hiren > > Regards, > -Anu > > On 8/11/14, 5:07 PM, "hiren panchasara" wrote: > >>I was looking for one and found >>https://wiki.freebsd.org/SummerOfCode2008#TCP.2FIP_regression_test_suite_. >>28tcptest.29 >>which is a good start but needs a lot of love (work). >> >>Please share if you are aware of any covering basic scenarios. 
>> >>cheers, >>Hiren >>_______________________________________________ >>freebsd-net@freebsd.org mailing list >>http://lists.freebsd.org/mailman/listinfo/freebsd-net >>To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 18:44:31 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A224B90D; Wed, 20 Aug 2014 18:44:31 +0000 (UTC) Received: from mail-pa0-x22a.google.com (mail-pa0-x22a.google.com [IPv6:2607:f8b0:400e:c03::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6E6A938B3; Wed, 20 Aug 2014 18:44:31 +0000 (UTC) Received: by mail-pa0-f42.google.com with SMTP id lf10so12789518pab.15 for ; Wed, 20 Aug 2014 11:44:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=7GC/NyvsdtfbrqnjCHst0OjcrxRKe5VSynfwWwyIYtw=; b=w5A5IBHMUlaA4Rp2nOON2UvyzvPFLy0CWG7H0xhhbkq/gQqVQu996OVrkyH9UvlYcC Xz2NZaXDv2Rpkt6jdsoRlFbe2PRC/WPAl+juf5he0wSIXauvZALj/z90zqzIu2LHjOHi 4WDLtcDYatOnTbe84VtKplpok2DlmB4Z5AJNzKIMcvmccbYa5c6Vh03kk2T6gif9TvYA ziEuJdKhYbWLINQZOrKPkK5sgVnHFn3d3IondpHIAs7WA2rIhsis/DJZdukdJfgRXrIv HOiALxG1lEeVVIb9KblvoOYVkfYo1DmxC8BYLw/GUzGBAzisMxrC9Yq1d/s5dEBRB+Jx LMEw== X-Received: by 10.68.69.46 with SMTP id b14mr54338257pbu.70.1408560270443; Wed, 20 Aug 2014 11:44:30 -0700 (PDT) Received: from [10.192.166.0] (stargate.chelsio.com. 
[67.207.112.58]) by mx.google.com with ESMTPSA id ou8sm11171547pbc.84.2014.08.20.11.44.29 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 Aug 2014 11:44:29 -0700 (PDT) Sender: Navdeep Parhar Message-ID: <53F4EC8A.9090804@FreeBSD.org> Date: Wed, 20 Aug 2014 11:44:26 -0700 From: Navdeep Parhar User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Hans Petter Selasky , freebsd-net@freebsd.org, FreeBSD Current Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> In-Reply-To: <53F44F91.2060006@selasky.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 18:44:31 -0000 On 08/20/14 00:34, Hans Petter Selasky wrote: > Hi, > > A month has passed since the last e-mail on this topic, and in the > meanwhile some new patches have been created and tested: > > Basically the approach has been changed a little bit: > > - The creation of hardware transmit rings has been made independent of > the TCP stack. This allows firewall applications to forward traffic into > hardware transmit rings aswell, and not only native TCP applications. > This should be one more reason to get the feature into the kernel. > > - A hardware transmit ring basically can have two modes: FIXED-RATE or > AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed > bytes per second rate. In the automatic mode you can configure a time > after which the TX queue must be empty. 
The hardware driver uses this to > configure the actual rate. In automatic mode you can also set an upper > and lower transmit rate limit. > > - The MBUF has got a new field in the packet header: "txringid" > > - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of > the "txringid" field in the mbuf. > > The current patch [see attachment] should be much simpler and less > intrusive than the previous one. > > Any comments ? Here are some thoughts. The first two bullets cover relatively minor issues, the rest are more important. - All of the mbuf pkthdr fields today have the same meaning no matter what the context. It is not clear what txringid's global meaning is. Is it even possible for driver foo to interpret it the same way as driver bar? What if the number of rings are different, or if the ring at the particular index for foo is setup differently than the ring at that same index for bar? You are attempting to influence the driver's txq selection and traditionally the mbuf's flowid has been used for this purpose. Have you considered allowing the user to set the flowid directly? And mark it as such via a new rsstype so the kernel will leave it alone. - uint32_t -> m_flowid_t is plain gratuitous. Now we need to include mbuf.h in more places just to get this definition. What's the advantage of this? style(9) isn't too fond of typedefs either. Also, drivers *do* need to know the width of the flowid. At least lagg(4) looks at the high bits of the flowid (see flowid_shift in lagg). How high it can go depends on the width of the flowid. - Interfaces can come and go, routes can change, and so the relationship between an inpcb and txringid is not stable at all. What happens when the outbound route for an inpcb changes? - The in_ratectlreq structure that you propose is inadequate in its current form. For example, cxgbe's hardware can do rate limiting on a per-ring as well as per-connection basis, and it allows for pps, bandwidth, or min-max limits. 
I think this is the critical piece that we NIC maintainers must agree on before any code hits the core kernel: how to express a rate-limit policy in a standard way and allow for hardware assistance opportunistically. ipfw(4)'s dummynet is probably interested in this part too, so it's great that Luigi is paying attention to this thread. - The RATECTL ioctls deal with in_ratectlreq so we need to standardize the ratectlreq structure before these ioctls can be considered generic ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a private ioctl to frob cxgbe's per-queue rate-limiters. I did not want to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c ioctls. Now we have multiple drivers with i2c and melifaro@ is doing the right thing by promoting these private ioctls to a standard ifnet ioctl. Have you considered a private mlxtool as a stop gap measure? To summarize my take on all of this: we need a standard ratectlreq structure, a standard way to associate an inpcb with one, and a standard way to pass on this info to if_transmit. After all this is in place we could even have a dummynet-ish software layer that implements rate limiters when the underlying hardware offers no assistance. 
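To make the summary above a little more concrete, here is a minimal sketch of one shape a driver-independent rate-limit request could take. Every name in it (ratectl_policy, ratectl_caps, ratectl_hw_can_offload and the enum values) is hypothetical; none of it comes from the proposed patch, from cxgbe, or from any existing kernel interface. It only illustrates the scope (per-ring or per-connection) and limit kinds (bandwidth, pps, min-max) mentioned above, plus a check that a software layer could use to decide whether the hardware can assist.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical and exist only for illustration. */

enum ratectl_scope { RATECTL_SCOPE_RING = 0, RATECTL_SCOPE_CONN = 1 };
enum ratectl_kind {
	RATECTL_KIND_BPS = 0,		/* fixed bandwidth, bytes per second */
	RATECTL_KIND_PPS = 1,		/* fixed packet rate, packets per second */
	RATECTL_KIND_MINMAX = 2		/* bandwidth kept between min and max */
};

/* A rate-limit policy expressed without reference to any one driver. */
struct ratectl_policy {
	enum ratectl_scope scope;
	enum ratectl_kind kind;
	uint64_t rate;			/* BPS or PPS value; unused for MINMAX */
	uint64_t min_rate;		/* bytes per second, MINMAX only */
	uint64_t max_rate;		/* bytes per second, MINMAX only */
};

/* What a driver says its hardware can do: bit (1u << value) per enum value. */
struct ratectl_caps {
	uint32_t scopes;
	uint32_t kinds;
};

/* Reject policies that make no sense regardless of hardware. */
static bool
ratectl_policy_valid(const struct ratectl_policy *p)
{
	switch (p->kind) {
	case RATECTL_KIND_BPS:
	case RATECTL_KIND_PPS:
		return (p->rate > 0);
	case RATECTL_KIND_MINMAX:
		return (p->max_rate > 0 && p->min_rate <= p->max_rate);
	}
	return (false);
}

/* True if the hardware can enforce the policy; otherwise fall back to software. */
static bool
ratectl_hw_can_offload(const struct ratectl_caps *caps,
    const struct ratectl_policy *p)
{
	if (!ratectl_policy_valid(p))
		return (false);
	return ((caps->scopes & (1u << p->scope)) != 0 &&
	    (caps->kinds & (1u << p->kind)) != 0);
}
```

With something along these lines, a valid policy that the NIC cannot offload would be the case handed to a dummynet-ish software limiter, while an invalid policy would be rejected before it reaches any driver.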
Regards, Navdeep From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 19:25:13 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8A34D865; Wed, 20 Aug 2014 19:25:13 +0000 (UTC) Received: from mail.turbocat.net (mail.turbocat.net [IPv6:2a01:4f8:d16:4514::2]) (using TLSv1.1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 322913C74; Wed, 20 Aug 2014 19:25:13 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 572C91FE027; Wed, 20 Aug 2014 21:25:11 +0200 (CEST) Message-ID: <53F4F626.9030806@selasky.org> Date: Wed, 20 Aug 2014 21:25:26 +0200 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: Navdeep Parhar , freebsd-net@freebsd.org, FreeBSD Current Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <53F4EC8A.9090804@FreeBSD.org> In-Reply-To: <53F4EC8A.9090804@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 19:25:13 -0000 On 08/20/14 20:44, Navdeep Parhar wrote: > On 08/20/14 
00:34, Hans Petter Selasky wrote: >> Hi, >> >> A month has passed since the last e-mail on this topic, and in the >> meanwhile some new patches have been created and tested: >> >> Basically the approach has been changed a little bit: >> >> - The creation of hardware transmit rings has been made independent of >> the TCP stack. This allows firewall applications to forward traffic into >> hardware transmit rings aswell, and not only native TCP applications. >> This should be one more reason to get the feature into the kernel. >> >> - A hardware transmit ring basically can have two modes: FIXED-RATE or >> AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed >> bytes per second rate. In the automatic mode you can configure a time >> after which the TX queue must be empty. The hardware driver uses this to >> configure the actual rate. In automatic mode you can also set an upper >> and lower transmit rate limit. >> >> - The MBUF has got a new field in the packet header: "txringid" >> >> - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of >> the "txringid" field in the mbuf. >> >> The current patch [see attachment] should be much simpler and less >> intrusive than the previous one. >> >> Any comments ? > > > Here are some thoughts. The first two bullets cover relatively > minor issues, the rest are more important. > > - All of the mbuf pkthdr fields today have the same meaning no matter > what the context. It is not clear what txringid's global meaning is. > Is it even possible for driver foo to interpret it the same way as > driver bar? What if the number of rings are different, or if the ring > at the particular index for foo is setup differently than the ring at > that same index for bar? You are attempting to influence the driver's > txq selection and traditionally the mbuf's flowid has been used for > this purpose. Have you considered allowing the user to set the flowid > directly? 
And mark it as such via a new rsstype so the kernel will > leave it alone. Hi, At work so to speak, we have tried to make a simple approach that will not break existing code, without trying to optimise the possibilities and reduce memory footprint. > > - uint32_t -> m_flowid_t is plain gratuitous. Now we need to include > mbuf.h in more places just to get this definition. What's the > advantage of this? style(9) isn't too fond of typedefs either. Also, > drivers *do* need to know the width of the flowid. At least lagg(4) > looks at the high bits of the flowid (see flowid_shift in lagg). How > high it can go depends on the width of the flowid. The flowid should be typedef'ed. Else how can you know its type passing flowid along function arguments and so on? > > - Interfaces can come and go, routes can change, and so the relationship > between an inpcb and txringid is not stable at all. What happens when > the outbound route for an inpcb changes? This is managed separately by a daemon or such. The problem about using the "inpcb" approach which you are suggesting, is that you limit the rate control feature to traffic which is bound by sockets. Can your way of doing rate control be useful to non-socket based firewall applications, for example? You also assume a 1:1 mapping between "inpcb" and the flowID, right. What about M:N mappings, where multiple streams should share the same flowID, because it makes more sense? > > - The in_ratectlreq structure that you propose is inadequate in its > current form. For example, cxgbe's hardware can do rate limiting on a > per-ring as well as per-connection basis, and it allows for pps, > bandwidth, or min-max limits. I think this is the critical piece that > we NIC maintainers must agree on before any code hits the core kernel: > how to express a rate-limit policy in a standard way and allow for > hardware assistance opportunistically. 
ipfw(4)'s dummynet is probably > interested in this part too, so it's great that Luigi is paying > attention to this thread. My "in_ratectlreq" is a work in progress. > > - The RATECTL ioctls deal with in_ratectlreq so we need to standardize > the ratectlreq structure before these ioctls can be considered generic > ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a > private ioctl to frob cxgbe's per-queue rate-limiters. I did not want > to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c > ioctls. Now we have multiple drivers with i2c and melifaro@ is doing > the right thing by promoting these private ioctls to a standard ifnet > ioctl. Have you considered a private mlxtool as a stop gap measure? It might end up that we need to create our own tool for this, with vendor-specific IOCTLs, if we cannot agree on how to do this in a general way. > > To summarize my take on all of this: we need a standard ratectlreq > structure, Agree. > a standard way to associate an inpcb with one, Maybe. > and a standard > way to pass on this info to if_transmit. Agree. > After all this is in place we > could even have a dummynet-ish software layer that implements rate > limiters when the underlying hardware offers no assistance. Right. 
--HPS From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 20:15:07 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A4D80F3B; Wed, 20 Aug 2014 20:15:07 +0000 (UTC) Received: from mail-pa0-x230.google.com (mail-pa0-x230.google.com [IPv6:2607:f8b0:400e:c03::230]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 751743170; Wed, 20 Aug 2014 20:15:07 +0000 (UTC) Received: by mail-pa0-f48.google.com with SMTP id et14so12793981pad.7 for ; Wed, 20 Aug 2014 13:15:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:subject:message-id:importance:from:to:cc:mime-version :content-type; bh=8nH7prsQCetICA/RViAvBXttIJLivJu4Bbg5l6K/ZPE=; b=o2XRlRThqEXzoQCInjp1YaKWxxj3TYyMe2+GgneenMoaOXTehPdO238LPj+UDB8u3/ TTKiWBbN+v+0eyY8p3V8R2b8sni3IekRda328hDgMkrS1lFGW/YMN4Aa1w1PDItD6mLL 5esyYrQdPkToKTwPDbqPJiYVI8VY7tDpYyOL6Us+Dxhc7UX6QW8dETFxey1t2i6aumHS 2tbBGojMcev+cryZt8kRmpydpNBC6cKT9oAhrQTZXLaSWC4/x67tazStniCu/NiLCNjr f28HVXYbfaTviPfIxY5dm8RI+0gzCyIMIJh+j6qQdjI43Cg1YTOBkGxAqjmQMKCcY6P/ sNIQ== X-Received: by 10.66.66.225 with SMTP id i1mr55249648pat.56.1408565704210; Wed, 20 Aug 2014 13:15:04 -0700 (PDT) Received: from [10.55.52.214] ([198.95.226.236]) by mx.google.com with ESMTPSA id fw9sm17402103pdb.45.2014.08.20.13.15.02 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 20 Aug 2014 13:15:03 -0700 (PDT) Date: Wed, 20 Aug 2014 13:14:57 -0700 Subject: Re: Regression test suite for TCP Message-ID: <7evw8ss04fyvhld15h9fdx7n.1408565695689@email.android.com> Importance: normal From: "vijju.singh" To: hiren panchasara , Anuranjan Shukla MIME-Version: 1.0 Content-Type: text/plain; 
charset=utf-8 Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 20:15:07 -0000 Have you looked at packetdrill from Google? Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone -------- Original message -------- From: hiren panchasara <hiren@FreeBSD.org> Date: 08/20/2014 10:40 AM (GMT-08:00) To: Anuranjan Shukla <anshukla@juniper.net> Cc: freebsd-net@freebsd.org Subject: Re: Regression test suite for TCP On Mon, Aug 18, 2014 at 4:59 PM, Anuranjan Shukla <anshukla@juniper.net> wrote: > If you're willing to shell out some $$, Ixia's ANVL is a fairly detailed > test suite for TCP and other protocols. It's available as a software you > can install on lnx/windows. I'd used it at Juniper while working with > Robert for the connection groups work a couple years or so back. Thanks Anu. It looks pretty good but I am looking for something more on open source side. cheers, Hiren > > Regards, > -Anu > > On 8/11/14, 5:07 PM, "hiren panchasara" <hiren@FreeBSD.org> wrote: > >>I was looking for one and found >>https://wiki.freebsd.org/SummerOfCode2008#TCP.2FIP_regression_test_suite_.28tcptest.29 >>which is a good start but needs a lot of love (work). >> >>Please share if you are aware of any covering basic scenarios. >> >>cheers, >>Hiren >>_______________________________________________ >>freebsd-net@freebsd.org mailing list >>http://lists.freebsd.org/mailman/listinfo/freebsd-net >>To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 21:19:23 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 46996A59; Wed, 20 Aug 2014 21:19:23 +0000 (UTC) Received: from mail-pa0-x22e.google.com (mail-pa0-x22e.google.com [IPv6:2607:f8b0:400e:c03::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 11AC73776; Wed, 20 Aug 2014 21:19:23 +0000 (UTC) Received: by mail-pa0-f46.google.com with SMTP id lj1so12898170pab.5 for ; Wed, 20 Aug 2014 14:19:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:user-agent:mime-version:to:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=wC3I/mNwfLeWhtE91XDSY9x3swf3q0V2anpuPnmDWHM=; b=lFrm3J9i9EARyasiC+gsrR20gCmEMn5sogrZ6fNK5KGB2E12nQtfmuQhzQpR+so0fj KzmKWfis5/7S/6nzq00uTEZsBx4e2viSjoltXGTjpx01aidtRvpkGigz/kiOwei+djqx SDywElYLdRbXJLzcbViQ/zd19q5io5pJp3xlDYBSydhaAnHbLJdS7GtUk2JJuM0rD1eR 
4ZftptVXdo15s1nWYSJhmca/gKuAWGPqlECPfea6Ba94QDO77wwGuF6yW6jtHhL2/p0s 1YZKGwZg9Jzhy+lIUa/194WG50P8G+34+rGT3HsYoimLEkbAqsK9acMr6lS5VxE9j2HN ZYCg== X-Received: by 10.70.128.17 with SMTP id nk17mr6082686pdb.89.1408569562649; Wed, 20 Aug 2014 14:19:22 -0700 (PDT) Received: from [10.192.166.0] (stargate.chelsio.com. [67.207.112.58]) by mx.google.com with ESMTPSA id a5sm35523341pdp.38.2014.08.20.14.19.21 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 Aug 2014 14:19:21 -0700 (PDT) Sender: Navdeep Parhar Message-ID: <53F510D8.4060801@FreeBSD.org> Date: Wed, 20 Aug 2014 14:19:20 -0700 From: Navdeep Parhar User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Hans Petter Selasky , freebsd-net@freebsd.org, FreeBSD Current Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> <53F44F91.2060006@selasky.org> <53F4EC8A.9090804@FreeBSD.org> <53F4F626.9030806@selasky.org> In-Reply-To: <53F4F626.9030806@selasky.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 21:19:23 -0000 On 08/20/14 12:25, Hans Petter Selasky wrote: > On 08/20/14 20:44, Navdeep Parhar wrote: >> On 08/20/14 00:34, Hans Petter Selasky wrote: >>> Hi, >>> >>> A month has passed since the last e-mail on this topic, and in the >>> meanwhile some new patches have been created and tested: >>> >>> Basically the approach has been changed a little bit: >>> >>> - The creation of hardware transmit rings has been made independent of >>> the TCP 
stack. This allows firewall applications to forward traffic into >>> hardware transmit rings aswell, and not only native TCP applications. >>> This should be one more reason to get the feature into the kernel. >>> >>> - A hardware transmit ring basically can have two modes: FIXED-RATE or >>> AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed >>> bytes per second rate. In the automatic mode you can configure a time >>> after which the TX queue must be empty. The hardware driver uses this to >>> configure the actual rate. In automatic mode you can also set an upper >>> and lower transmit rate limit. >>> >>> - The MBUF has got a new field in the packet header: "txringid" >>> >>> - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of >>> the "txringid" field in the mbuf. >>> >>> The current patch [see attachment] should be much simpler and less >>> intrusive than the previous one. >>> >>> Any comments ? >> >> >> Here are some thoughts. The first two bullets cover relatively >> minor issues, the rest are more important. >> >> - All of the mbuf pkthdr fields today have the same meaning no matter >> what the context. It is not clear what txringid's global meaning is. >> Is it even possible for driver foo to interpret it the same way as >> driver bar? What if the number of rings are different, or if the ring >> at the particular index for foo is setup differently than the ring at >> that same index for bar? You are attempting to influence the driver's >> txq selection and traditionally the mbuf's flowid has been used for >> this purpose. Have you considered allowing the user to set the flowid >> directly? And mark it as such via a new rsstype so the kernel will >> leave it alone. > > Hi, > > At work so to speak, we have tried to make a simple approach that will > not break existing code, without trying to optimise the possibilities > and reduce memory footprint. > >> >> - uint32_t -> m_flowid_t is plain gratuitous. 
Now we need to include >> mbuf.h in more places just to get this definition. What's the >> advantage of this? style(9) isn't too fond of typedefs either. Also, >> drivers *do* need to know the width of the flowid. At least lagg(4) >> looks at the high bits of the flowid (see flowid_shift in lagg). How >> high it can go depends on the width of the flowid. > > The flowid should be typedef'ed. Else how can you know its type passing > flowid along function arguments and so on? It's just a simple 32 bit unsigned int and all drivers know exactly what it is. I don't think we need type checking for trivial stuff like this. We trust code to do the right thing and that's the correct tradeoff here, in my opinion. Or else we'd end up with errno_t, fd_t, etc. and programming in C would not be fun anymore. Here's a hyperbolic example: errno_t socket(domain_t domain, socktype_t type, protocol_t protocol); (oops, it returns an int -1 or 0 so errno_t is not strictly correct, but you get my point). > >> >> - Interfaces can come and go, routes can change, and so the relationship >> between an inpcb and txringid is not stable at all. What happens when >> the outbound route for an inpcb changes? > > This is managed separately by a daemon or such. The problem about using > the "inpcb" approach which you are suggesting, is that you limit the > rate control feature to traffic which is bound by sockets. Can your way > of doing rate control be useful to non-socket based firewall > applications, for example? > > You also assume a 1:1 mapping between "inpcb" and the flowID, right. > What about M:N mappings, where multiple streams should share the same > flowID, because it makes more sense? You're right that an inpcb based scheme won't work for non-socket based firewall. inpcb represents an endpoint, almost always with an associated socket, and it mostly has a 1:1 relation with an n-tuple (SO_LISTEN and UDP sockets with no default destination are notable exceptions). 
If you're talking about non-socket based firewalls, then where is the inpcb coming from? Firewalls typically keep their own state for the n-tuples that they are interested in. It almost seems like you need an n-tuple -> rate_limit mapping scheme instead of inpcb -> rate_limit. Regards, Navdeep > >> >> - The in_ratectlreq structure that you propose is inadequate in its >> current form. For example, cxgbe's hardware can do rate limiting on a >> per-ring as well as per-connection basis, and it allows for pps, >> bandwidth, or min-max limits. I think this is the critical piece that >> we NIC maintainers must agree on before any code hits the core kernel: >> how to express a rate-limit policy in a standard way and allow for >> hardware assistance opportunistically. ipfw(4)'s dummynet is probably >> interested in this part too, so it's great that Luigi is paying >> attention to this thread. > > My "in_ratectlreq" is a work in progress. > >> >> - The RATECTL ioctls deal with in_ratectlreq so we need to standardize >> the ratectlreq structure before these ioctls can be considered generic >> ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a >> private ioctl to frob cxgbe's per-queue rate-limiters. I did not want >> to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c >> ioctls. Now we have multiple drivers with i2c and melifaro@ is doing >> the right thing by promoting these private ioctls to a standard ifnet >> ioctl. Have you considered a private mlxtool as a stop gap measure? > > It might end that we need to create our own tool for this, having vendor > specific IOCTLs, if we cannot agree how to do this in a general way. > >> >> To summarize my take on all of this: we need a standard ratectlreq >> structure, > > Agree. > >> a standard way to associate an inpcb with one, > > Maybe. > >> and a standard >> way to pass on this info to if_transmit. > > Agree. 
> >> After all this is in place we >> could even have a dummynet-ish software layer that implements rate >> limiters when the underlying hardware offers no assistance. > > Right. > > --HPS From owner-freebsd-net@FreeBSD.ORG Wed Aug 20 21:40:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 418F1E3B; Wed, 20 Aug 2014 21:40:20 +0000 (UTC) Received: from mail-qa0-x22d.google.com (mail-qa0-x22d.google.com [IPv6:2607:f8b0:400d:c00::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CF4923923; Wed, 20 Aug 2014 21:40:19 +0000 (UTC) Received: by mail-qa0-f45.google.com with SMTP id cm18so7413005qab.4 for ; Wed, 20 Aug 2014 14:40:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=vgQmpE2gTpgOy1yKj0ZSu2AdiXnnzDCKthQUTvZMsOA=; b=EDnqwKk1m3WmtWsG/7ary8bvMhjjVG+XtaC89DnLJjfCIHs4MCVpJ1yvkWIPx6ac91 blmneV3dac7MBWqMaIZJqeSegrUKRXVbCu9koDWqiR5ujZ0o7RB3qDduMrKU1S/22AiL y3N7hGpYbTG9BdpXfhFRrbbPAJX0B4YXR9N0O9+SLz13YbKSBpe/BoDWVLG51+LKopWC Hg3uJxuQ0910+1YQRM/NXR0AVvVRTvZdMVPksKXcMjSL2dGWM1z6ociY3Ipa1xQBFQF6 OYJt0ZikNgDk2SXWh6cmnPGdDgFm546E1HuOUDhsnWfOGBNxysTD4q/JK30SyncTlhwf QU+A== MIME-Version: 1.0 X-Received: by 10.224.75.130 with SMTP id y2mr81650108qaj.72.1408570818873; Wed, 20 Aug 2014 14:40:18 -0700 (PDT) Sender: kmacybsd@gmail.com Received: by 10.224.17.129 with HTTP; Wed, 20 Aug 2014 14:40:18 -0700 (PDT) In-Reply-To: <53F4F626.9030806@selasky.org> References: <53BC2E73.6090700@selasky.org> <53BC43AE.3040409@FreeBSD.org> <53BD5385.4090208@selasky.org> <20140709163146.GA21731@ox> 
<53F44F91.2060006@selasky.org> <53F4EC8A.9090804@FreeBSD.org> <53F4F626.9030806@selasky.org> Date: Wed, 20 Aug 2014 14:40:18 -0700 X-Google-Sender-Auth: U8es1S4X4Uk_24FVyfgq3Ls0hIY Message-ID: Subject: Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections] From: "K. Macy" To: Hans Petter Selasky Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: "freebsd-net@freebsd.org" , FreeBSD Current , Navdeep Parhar X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Aug 2014 21:40:20 -0000 > > > >> - uint32_t -> m_flowid_t is plain gratuitous. Now we need to include >> mbuf.h in more places just to get this definition. What's the >> advantage of this? style(9) isn't too fond of typedefs either. Also, >> drivers *do* need to know the width of the flowid. At least lagg(4) >> looks at the high bits of the flowid (see flowid_shift in lagg). How >> high it can go depends on the width of the flowid. >> > > The flowid should be typedef'ed. Else how can you know its type passing > flowid along function arguments and so on? I agree with Navdeep. It's usage should be obvious from context. This just pollutes the namespace. > > > >> - Interfaces can come and go, routes can change, and so the relationship >> between an inpcb and txringid is not stable at all. What happens when >> the outbound route for an inpcb changes? >> > > This is managed separately by a daemon or such. No it's not. Currently, unless you're using flowtables, the route and llentry are looked up for every single outbound packet. Most users are lightly enough loaded that they don't see the potential 8x reduction in pps from the added overhead. 
> The problem about using the "inpcb" approach which you are suggesting, is > that you limit the rate control feature to traffic which is bound by > sockets. Can your way of doing rate control be useful to non-socket based > firewall applications, for example? > > You also assume a 1:1 mapping between "inpcb" and the flowID, right. What > about M:N mappings, where multiple streams should share the same flowID, > because it makes more sense? That doesn't make any sense to me. FlowIDs are not a limited resource like 8-bit ASIDs where clever resource management was required. An M:N mapping would permit arbitrary interleaving of multiple streams which simply doesn't seem useful unless there is some critical case where it is a huge performance win. In general it is adding complexity for a gratuitous generalization. > > > >> - The in_ratectlreq structure that you propose is inadequate in its >> current form. For example, cxgbe's hardware can do rate limiting on a >> per-ring as well as per-connection basis, and it allows for pps, >> bandwidth, or min-max limits. I think this is the critical piece that >> we NIC maintainers must agree on before any code hits the core kernel: >> how to express a rate-limit policy in a standard way and allow for >> hardware assistance opportunistically. ipfw(4)'s dummynet is probably >> interested in this part too, so it's great that Luigi is paying >> attention to this thread. >> > > My "in_ratectlreq" is a work in progress. Which means that it probably makes sense to not impinge upon the core system until it is more refined. > - The RATECTL ioctls deal with in_ratectlreq so we need to standardize >> the ratectlreq structure before these ioctls can be considered generic >> ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a >> private ioctl to frob cxgbe's per-queue rate-limiters. I did not want >> to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c >> ioctls. 
Now we have multiple drivers with i2c and melifaro@ is doing >> the right thing by promoting these private ioctls to a standard ifnet >> ioctl. Have you considered a private mlxtool as a stop gap measure? >> > > It might end that we need to create our own tool for this, having vendor > specific IOCTLs, if we cannot agree how to do this in a general way. > > > >> To summarize my take on all of this: we need a standard ratectlreq >> structure, >> > > Agree. > > > a standard way to associate an inpcb with one, >> > > Maybe. Associating it with an inpcb doesn't exclude adding a mechanism for supporting it in firewalls. -K From owner-freebsd-net@FreeBSD.ORG Thu Aug 21 03:23:21 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AF1B760B for ; Thu, 21 Aug 2014 03:23:21 +0000 (UTC) Received: from mail-pd0-x230.google.com (mail-pd0-x230.google.com [IPv6:2607:f8b0:400e:c02::230]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7FA903AC5 for ; Thu, 21 Aug 2014 03:23:21 +0000 (UTC) Received: by mail-pd0-f176.google.com with SMTP id y10so12818409pdj.35 for ; Wed, 20 Aug 2014 20:23:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:message-id:date:from:reply-to:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=ACSizgfFi5uri57SR/zkROO2LavMZHNsm1urBg6ycCo=; b=MyTLq/7yi/STM34zezhzsUL5Ufgyd01kT+5G0Qi6/UhWKRNEE6AKau6UNEd4rmuQP+ RXz/Ysc+9kXpUyJILugBDUEe9R2XAMbyBxvlPK9VrgkTU3yqUMI485givqTrKiZpKYfd HKn37onadmGFDp9gRiQqwh3H+DwkBMqVqYbIJPNHqV7pak79zSbmzy5zCS3+oTFqs8I3 t4Y3cGQDu5Ickjt1TVfSfVAMj0Uz1JTQLVUWeug3RiyRPgTt1cxHXj0PbWoFGCZIMjhT 
Ki+vN8dt7uUQlHbMuYK01ETEAxGkWCWKnYid85teRWq8v3gB2NASjuQpZG0GfYKDiT5Q TG9A== X-Received: by 10.68.143.100 with SMTP id sd4mr57487250pbb.76.1408591401045; Wed, 20 Aug 2014 20:23:21 -0700 (PDT) Received: from ?IPv6:2001:44b8:31ae:7b00:8080:feed:d356:95c0? (2001-44b8-31ae-7b00-8080-feed-d356-95c0.static.ipv6.internode.on.net. [2001:44b8:31ae:7b00:8080:feed:d356:95c0]) by mx.google.com with ESMTPSA id da14sm86160668pac.24.2014.08.20.20.23.19 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 Aug 2014 20:23:20 -0700 (PDT) Sender: Kubilay Kocak Message-ID: <53F56621.6030708@FreeBSD.org> Date: Thu, 21 Aug 2014 13:23:13 +1000 From: Kubilay Kocak Reply-To: koobs@FreeBSD.org User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Thunderbird/32.0 MIME-Version: 1.0 To: Carlos Ferreira , Luigi Rizzo Subject: Re: tutorial on Netmap in Mountain View - Aug.28 References: <20140804095528.GA12625@onelab2.iet.unipi.it> In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Aug 2014 03:23:21 -0000 On 18/08/2014 8:29 PM, Carlos Ferreira wrote: > Hi Luigi. > Do you have presentations or tutorial code from that tutorial, that you can > share here? > > > On 4 August 2014 10:55, Luigi Rizzo wrote: > >> In case someone (especially those in the bay area) is interested: >> I will give a half day tutorial on netmap at Hot Interconnects, >> in Mountain View on August 28, 2014 >> >> http://www.hoti.org/hoti22/tutorials/#tut4 >> >> This tutorial targets hardware vendors, network engineers, and >> researchers looking for solutions to: OS support for high speed NICs; >> efficient software packet processing techniques for SDN products; >> high speed networking in VMs. 
We will show how to achieve these >> results using netmap. >> >> cheers >> luigi >> >> (P.S. I have no financial interest in the event. I am posting the info >> because I think it might be useful to people on this list, and of course >> having a larger audience at the tutorial will generate more interesting >> feedback from participants) >> >> -----------------------------------------+------------------------------- >> Prof. Luigi RIZZO, rizzo@iet.unipi.it . Dip. di Ing. dell'Informazione >> http://www.iet.unipi.it/~luigi/ . Universita` di Pisa >> TEL +39-050-2211611 . via Diotisalvi 2 >> Mobile +39-338-6809875 . 56122 PISA (Italy) >> -----------------------------------------+------------------------------- >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > > > Before I forget and if it hasn't already been organised, Video and Mic (on speaker) too please if possible! 
:) From owner-freebsd-net@FreeBSD.ORG Thu Aug 21 06:50:38 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AFBE1C9E; Thu, 21 Aug 2014 06:50:38 +0000 (UTC) Received: from mx12.netapp.com (mx12.netapp.com [216.240.18.77]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mx12.netapp.com", Issuer "VeriSign Class 3 International Server CA - G3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 8548D3D09; Thu, 21 Aug 2014 06:50:38 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.01,907,1400050800"; d="asc'?scan'208";a="183251166" Received: from vmwexceht02-prd.hq.netapp.com ([10.106.76.240]) by mx12-out.netapp.com with ESMTP; 20 Aug 2014 23:50:31 -0700 Received: from HIOEXCMBX06-PRD.hq.netapp.com (10.122.105.39) by vmwexceht02-prd.hq.netapp.com (10.106.76.240) with Microsoft SMTP Server (TLS) id 14.3.123.3; Wed, 20 Aug 2014 23:50:30 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com (10.122.105.40) by hioexcmbx06-prd.hq.netapp.com (10.122.105.39) with Microsoft SMTP Server (TLS) id 15.0.913.22; Wed, 20 Aug 2014 23:49:29 -0700 Received: from HIOEXCMBX07-PRD.hq.netapp.com ([::1]) by hioexcmbx07-prd.hq.netapp.com ([fe80::55e3:a7dc:11bd:462%21]) with mapi id 15.00.0913.011; Wed, 20 Aug 2014 23:50:29 -0700 From: "Eggert, Lars" To: Vijay Singh Subject: Re: Regression test suite for TCP Thread-Topic: Regression test suite for TCP Thread-Index: AQHPvLNpx+I3Vd3NC0a31aFpKh9Q7ZvbFFQA Date: Thu, 21 Aug 2014 06:50:29 +0000 Message-ID: <22085087-A4CA-4EAB-A08C-E62B43E68BBA@netapp.com> References: <7evw8ss04fyvhld15h9fdx7n.1408565695689@email.android.com> In-Reply-To: <7evw8ss04fyvhld15h9fdx7n.1408565695689@email.android.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-mailer: Apple Mail (2.1878.6) 
x-originating-ip: [10.122.56.79] Content-Type: multipart/signed; boundary="Apple-Mail=_037FD8CD-0BC6-46C4-9BFF-A45CDBD879DC"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: "freebsd-net@freebsd.org" , Anuranjan Shukla , hiren panchasara X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Aug 2014 06:50:38 -0000 --Apple-Mail=_037FD8CD-0BC6-46C4-9BFF-A45CDBD879DC Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii On 2014-8-20, at 22:14, vijju.singh wrote: > Have you looked at packetdrill from Google? packetdrill is great, but Google keeps their (extensive) library of regression tests private. Lars --Apple-Mail=_037FD8CD-0BC6-46C4-9BFF-A45CDBD879DC Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBU/WWtNZcnpRveo1xAQJ/iAP+OmgAVqirINRHm5VzQ3WdUV0FY5pn/t2g Yb99/CaN1+juGl2EXTdL4gYYgVMjnZbDvPD9r9E41CM10IQedehh/mTwTAi7/Q/X 7BhrW0oaXQU42SvFeqpaHco6DHJETOcCmZMVZQOIpVoo1cYD8mErireXQw6NATGc SiY2xAOEAZw= =NWe0 -----END PGP SIGNATURE----- --Apple-Mail=_037FD8CD-0BC6-46C4-9BFF-A45CDBD879DC-- From owner-freebsd-net@FreeBSD.ORG Thu Aug 21 20:20:50 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1D589ECB; Thu, 21 Aug 2014 20:20:50 +0000 (UTC) Received: from mail-lb0-x22a.google.com (mail-lb0-x22a.google.com [IPv6:2a00:1450:4010:c04::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google 
Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 5741B330F; Thu, 21 Aug 2014 20:20:49 +0000 (UTC) Received: by mail-lb0-f170.google.com with SMTP id l4so8619262lbv.1 for ; Thu, 21 Aug 2014 13:20:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Vr282I9ZHx9vjXNJAWizN/xtRltoJGrHELCiTsdK+sQ=; b=nclo+DfA2gQ7E4RDd9AMm/V2O9v8LokYgAe3FXHpYyqDdNS/k4pxfNVXquUEDP7AXv PhncA2T9ZMAKH9bRcGCqSsOt2yqwCU0slNs7yVPOVCTk+PKFM2QzD+g9cQWR22+pfCVb SK2HChiyvWJdYG/1ulAVM82XUiMIVY6ZUFSSMkTt2Q8IZIT5vVIoRvMmHS7S4H6No0rh Xm1Y0TgFfru+dt4wNJL1vF7p2uc+NhWzEatf6M2Ss7hW+9FQvbGUAZ/2O4oSwh35Sesb j/oSN5pgFvrz+gph0W4cb1XnZLjj4XYHSya/ehp5cCpPf7XhzEBeaLQItIlncz9QUwkR +5pQ== MIME-Version: 1.0 X-Received: by 10.152.43.201 with SMTP id y9mr778929lal.54.1408652446770; Thu, 21 Aug 2014 13:20:46 -0700 (PDT) Received: by 10.25.19.139 with HTTP; Thu, 21 Aug 2014 13:20:46 -0700 (PDT) In-Reply-To: <53F56621.6030708@FreeBSD.org> References: <20140804095528.GA12625@onelab2.iet.unipi.it> <53F56621.6030708@FreeBSD.org> Date: Thu, 21 Aug 2014 23:20:46 +0300 Message-ID: Subject: Re: tutorial on Netmap in Mountain View - Aug.28 From: =?UTF-8?B?w5Z6a2FuIEtJUklL?= To: koobs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: FreeBSD Net , Luigi Rizzo , Carlos Ferreira X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Aug 2014 20:20:50 -0000 +1 On Thu, Aug 21, 2014 at 6:23 AM, Kubilay Kocak wrote: > On 18/08/2014 8:29 PM, Carlos Ferreira wrote: > > Hi Luigi. > > Do you have presentations or tutorial code from that tutorial, that you > can > > share here? 
> > > > > > On 4 August 2014 10:55, Luigi Rizzo wrote: > > > >> In case someone (especially those in the bay area) is interested: > >> I will give a half day tutorial on netmap at Hot Interconnects, > >> in Mountain View on August 28, 2014 > >> > >> http://www.hoti.org/hoti22/tutorials/#tut4 > >> > >> This tutorial targets hardware vendors, network engineers, and > >> researchers looking for solutions to: OS support for high speed NICs; > >> efficient software packet processing techniques for SDN products; > >> high speed networking in VMs. We will show how to achieve these > >> results using netmap. > >> > >> cheers > >> luigi > >> > >> (P.S. I have no financial interest in the event. I am posting the info > >> because I think it might be useful to people on this list, and of course > >> having a larger audience at the tutorial will generate more interesting > >> feedback from participants) > >> > >> > -----------------------------------------+------------------------------- > >> Prof. Luigi RIZZO, rizzo@iet.unipi.it . Dip. di Ing. > dell'Informazione > >> http://www.iet.unipi.it/~luigi/ . Universita` di Pisa > >> TEL +39-050-2211611 . via Diotisalvi 2 > >> Mobile +39-338-6809875 . 56122 PISA (Italy) > >> > -----------------------------------------+------------------------------- > >> _______________________________________________ > >> freebsd-net@freebsd.org mailing list > >> http://lists.freebsd.org/mailman/listinfo/freebsd-net > >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > >> > > > > > > > > Before I forget and if it hasn't already been organised, Video and Mic > (on speaker) too please if possible! 
:) > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Fri Aug 22 11:34:40 2014 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 378EBB7D; Fri, 22 Aug 2014 11:34:40 +0000 (UTC) Received: from mx0.gentlemail.de (mx0.gentlemail.de [IPv6:2a00:e10:2800::a130]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B5A5135B1; Fri, 22 Aug 2014 11:34:39 +0000 (UTC) Received: from mh0.gentlemail.de (ezra.dcm1.omnilan.net [78.138.80.135]) by mx0.gentlemail.de (8.14.5/8.14.5) with ESMTP id s7MBYYtI037780; Fri, 22 Aug 2014 13:34:34 +0200 (CEST) (envelope-from h.schmalzbauer@omnilan.de) Received: from titan.inop.mo1.omnilan.net (titan.inop.mo1.omnilan.net [IPv6:2001:a60:f0bb:1::3:1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mh0.gentlemail.de (Postfix) with ESMTPSA id 636B93E17; Fri, 22 Aug 2014 13:34:34 +0200 (CEST) Message-ID: <53F72AC4.7040108@omnilan.de> Date: Fri, 22 Aug 2014 13:34:28 +0200 From: Harald Schmalzbauer Organization: OmniLAN User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; de-DE; rv:1.9.2.8) Gecko/20100906 Lightning/1.0b2 Thunderbird/3.1.2 MIME-Version: 1.0 To: Yuri Subject: Re: [PATCH] Packet loss when 'control' messages are present with large data (sendmsg(2)) References: <522300E3.6050303@rawbw.com> <522419F4.5010605@rawbw.com> In-Reply-To: <522419F4.5010605@rawbw.com> X-Enigmail-Version: 1.1.2 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; 
boundary="------------enigBDCF6DF9B7F08E8A477FFBB7" X-Greylist: ACL 119 matched, not delayed by milter-greylist-4.2.7 (mx0.gentlemail.de [78.138.80.130]); Fri, 22 Aug 2014 13:34:35 +0200 (CEST) X-Milter: Spamilter (Reciever: mx0.gentlemail.de; Sender-ip: 78.138.80.135; Sender-helo: mh0.gentlemail.de; ) Cc: current@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Aug 2014 11:34:40 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigBDCF6DF9B7F08E8A477FFBB7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Regarding Yuri's message of 02.09.2013 06:54 (localtime): > Please check in this patch: > http://www.freebsd.org/cgi/query-pr.cgi?pr=181741 > Please MFC into 9.X > > Description of the problem is within PR. > > Thanks, > Yuri Hello, I guess this fix should make it into 10.1. Can someone check, please? 
Thanks, -Harry --------------enigBDCF6DF9B7F08E8A477FFBB7 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAlP3KskACgkQLDqVQ9VXb8hkZACeMRKTFHVdEG/8uqycKIwJg1j4 howAoMJ1D9CS492eJD8ikl4NIJUfWhu7 =PhYh -----END PGP SIGNATURE----- --------------enigBDCF6DF9B7F08E8A477FFBB7-- From owner-freebsd-net@FreeBSD.ORG Fri Aug 22 14:55:22 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 01EFA61B for ; Fri, 22 Aug 2014 14:55:22 +0000 (UTC) Received: from mail.as41113.net (mail.as41113.net [91.208.177.22]) by mx1.freebsd.org (Postfix) with ESMTP id C336B3B42 for ; Fri, 22 Aug 2014 14:55:20 +0000 (UTC) Received: from [172.21.87.41] (193.98.9.212.in-addr.arpa [212.9.98.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: lists@rewt.org.uk) by mail.as41113.net (Postfix) with ESMTPSA id 3hflty5tpxz1NFXw for ; Fri, 22 Aug 2014 15:47:22 +0100 (BST) Message-ID: <53F757FC.3030006@rewt.org.uk> Date: Fri, 22 Aug 2014 15:47:24 +0100 From: Joe Holden User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: relayd and tls Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Aug 2014 14:55:22 -0000 Hi chaps, I'm playing with relayd on 10.0-R but it appears that when I make a tls 
connection to an https listener, the process handling the connection consumes 100% CPU and stalls. Is this a known issue? I'm not doing any pf redirection or anything, just a simple listener that forwards to http. Cheers, Joe From owner-freebsd-net@FreeBSD.ORG Fri Aug 22 23:34:09 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4C5C9F05 for ; Fri, 22 Aug 2014 23:34:09 +0000 (UTC) Received: from sender1.zohomail.com (sender1.zohomail.com [72.5.230.103]) by mx1.freebsd.org (Postfix) with ESMTP id 21F0030A1 for ; Fri, 22 Aug 2014 23:34:08 +0000 (UTC) Received: from [192.168.152.175] (216.239.55.138 [216.239.55.138]) by mx.zohomail.com with SMTPS id 1408747149134586.1548152864194; Fri, 22 Aug 2014 15:39:09 -0700 (PDT) Date: Fri, 22 Aug 2014 15:39:06 -0700 Subject: Set arbitrary protocol for route? Message-ID: From: Josh Moore To: freebsd-net@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: base64 X-ZohoMailClient: External X-Zoho-Virus-Status: 2 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Aug 2014 23:34:09 -0000 I am trying to add a local route with an arbitrary protocol number. This is done with iproute2 in Linux by: ip route add to local $ip/32 dev eth0 proto $num How can I do this in FreeBSD 10? Josh From owner-freebsd-net@FreeBSD.ORG Sat Aug 23 01:20:03 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by 
hub.freebsd.org (Postfix) with ESMTPS id 1D36FF90 for ; Sat, 23 Aug 2014 01:20:03 +0000 (UTC) Received: from mail-qg0-x22f.google.com (mail-qg0-x22f.google.com [IPv6:2607:f8b0:400d:c04::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D329F396D for ; Sat, 23 Aug 2014 01:20:02 +0000 (UTC) Received: by mail-qg0-f47.google.com with SMTP id i50so10813173qgf.20 for ; Fri, 22 Aug 2014 18:20:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=A5nomO7AQGlCjDV+SPrTMPyFSlrU3BRTq8fzAqlWvZY=; b=h466hnHRxH1j8e4qCcwVn6oAaM0hh8YAEHRUSc4bfa2dt78uKJ/SzJBhNCyugSFDg1 j8ercEpEHDiFrdomyDgoV1LOKYFCE6QsDStgK/GWgDuzzEfx5b1eMBaxZ8JocxyE/D7n tZ0IYOsRYTb0egqJfWbfBtgEC6vcVIzeYP9PNV8be3/W5vB44KAWef2hvKa6/IC0T8ws cxCH+Ej3tpb8KXEr4p6OmBwGpYk1U57c3CN9V7r86THGJus079OoAXkro+Z/Qj9QFtzz eQZJZR7t2ZmZseGRGktlXyc43REaomt4ng6B369CWT9h/RCA4gpMB4sJBQECmS6usUow H+Nw== MIME-Version: 1.0 X-Received: by 10.140.22.19 with SMTP id 19mr8701569qgm.18.1408756801854; Fri, 22 Aug 2014 18:20:01 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.39.139 with HTTP; Fri, 22 Aug 2014 18:20:01 -0700 (PDT) In-Reply-To: References: Date: Fri, 22 Aug 2014 18:20:01 -0700 X-Google-Sender-Auth: GUd-UK_QylDlS7v4YBLd9aQ3Cp0 Message-ID: Subject: Re: Set arbitrary protocol for route? From: Adrian Chadd To: Josh Moore Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 01:20:03 -0000 On 22 August 2014 15:39, Josh Moore wrote: > I am trying to add a local route with an arbitrary protocol number. 
This is done with iproute2 in Linux by: > > ip route add to local $ip/32 dev eth0 proto $num > > How can I do this in FreeBSD 10? > What's that supposed to do? -a From owner-freebsd-net@FreeBSD.ORG Sat Aug 23 05:54:04 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BB08489D; Sat, 23 Aug 2014 05:54:04 +0000 (UTC) Received: from mail-vc0-x235.google.com (mail-vc0-x235.google.com [IPv6:2607:f8b0:400c:c03::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 675643248; Sat, 23 Aug 2014 05:54:04 +0000 (UTC) Received: by mail-vc0-f181.google.com with SMTP id lf12so13124445vcb.40 for ; Fri, 22 Aug 2014 22:54:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=vcGGJbdhAvHyh7EmmSYCtHF1+ni7x0lyh5jpfonj7AA=; b=PPx8WBi8834A/psyJ2qR+gxU/4MeiJtqydoIB3jEhuubtDhncWXHcT06UwlUB7Wo9K w0A+PEw/CyAwTDrGegytqbO0otEDS4IOOuYaNAAqrMB8+4xasSuqPabt/hafLQ8JyQb3 kiINEmrR0r/hbtVZOTQ+ZI1PcQZ+gW1TcGbAc2pUePfBu5KH3RDo4GXo36PV2PwPeGV7 bcmHL36IBOI4vn1w6DuGKckBnog1r1+cFn5IT+1TgFSjToFKMzZserrqbSvhqwkDEnqP liBQvqwhZas0flh1ISVdlxxZZon9ao1byxZtqJKlUcalBrwYs/9+0Szds71jU15bmjZ2 ykVg== MIME-Version: 1.0 X-Received: by 10.220.116.196 with SMTP id n4mr7213606vcq.6.1408773243388; Fri, 22 Aug 2014 22:54:03 -0700 (PDT) Sender: ndenev@gmail.com Received: by 10.221.46.133 with HTTP; Fri, 22 Aug 2014 22:54:03 -0700 (PDT) In-Reply-To: References: Date: Sat, 23 Aug 2014 07:54:03 +0200 X-Google-Sender-Auth: TQR9d-6AYyUZO1atOrNk5ZBUcs0 Message-ID: Subject: Re: Set arbitrary protocol for route? 
From: Nikolay Denev To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net , Josh Moore X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 05:54:04 -0000 On Sat, Aug 23, 2014 at 3:20 AM, Adrian Chadd wrote: > On 22 August 2014 15:39, Josh Moore wrote: >> I am trying to add a local route with an arbitrary protocol number. This is done with iproute2 in Linux by: >> >> ip route add to local $ip/32 dev eth0 proto $num >> >> How can I do this in FreeBSD 10? >> > > What's that supposed to do? > > > > -a > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" Unfortunately, I don't think there is a direct equivalent for this on FreeBSD. On Linux, the route protocol ID lets, say, different routing daemons each install routes with their own protocol ID, which can be used to select which one has priority. I think similar proposals have sparked discussions in the past about whether the kernel routing table is a FIB or a RIB. 
--Nikolay From owner-freebsd-net@FreeBSD.ORG Sat Aug 23 06:49:50 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 461AE37E for ; Sat, 23 Aug 2014 06:49:50 +0000 (UTC) Received: from mail-qg0-x22b.google.com (mail-qg0-x22b.google.com [IPv6:2607:f8b0:400d:c04::22b]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 05F153609 for ; Sat, 23 Aug 2014 06:49:49 +0000 (UTC) Received: by mail-qg0-f43.google.com with SMTP id a108so11172674qge.2 for ; Fri, 22 Aug 2014 23:49:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=jiTaWQRe37W/faGlQsF3QtgfshhsxepbVrezhK3vdmE=; b=bw6kslL4bE8aSaURjq73TSASqLy4p2gPjk37y3kAf8fuVjwYudFxP10hyMWjNPmPx2 aY80cQO4UUyjg71OrYteD3jw0Ae7Wuw+VCEmzV+BVxvWW1OSKikxnsGfwmbuXdx0Xa/h DTy/XQ+StyRFz6v8QwT1YsWL+rqBInwfaOTR/0y8sgZ/RuDHRESkwO0T1X5dz6qjtS9L WcceA/v+haE5aXgmYJR7hPLaxnpiibsAafujt7AzCWr5lwnt9KDSgzW+RHHOaD0CsrKE WW4h/ar5sTROoRdItQWKtXH9DrDVmFLZRr7Ia+2RUt2CVeFb7DfeiDxqKGD5SwF8Z1tL YIcA== MIME-Version: 1.0 X-Received: by 10.224.36.4 with SMTP id r4mr8595884qad.69.1408776588551; Fri, 22 Aug 2014 23:49:48 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.39.139 with HTTP; Fri, 22 Aug 2014 23:49:48 -0700 (PDT) In-Reply-To: References: Date: Fri, 22 Aug 2014 23:49:48 -0700 X-Google-Sender-Auth: PQ2tnymbZeoIvSohksPqQb4g8VA Message-ID: Subject: Re: Set arbitrary protocol for route? 
From: Adrian Chadd To: Nikolay Denev Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net , Josh Moore X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 06:49:50 -0000 Ok, so how does the whole protocol thing implement priority? -a From owner-freebsd-net@FreeBSD.ORG Sat Aug 23 17:33:06 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 465A241E; Sat, 23 Aug 2014 17:33:06 +0000 (UTC) Received: from mail-vc0-x22e.google.com (mail-vc0-x22e.google.com [IPv6:2607:f8b0:400c:c03::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E177F3A28; Sat, 23 Aug 2014 17:33:05 +0000 (UTC) Received: by mail-vc0-f174.google.com with SMTP id la4so13403050vcb.5 for ; Sat, 23 Aug 2014 10:33:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=4drpU7MktphkOArM/ZQMsHb5PW0dZLt9GIJdIpcNrcA=; b=BNAAID74qFyfgGKdfrWjDseDZKQVeJrElnQ0rkM+EklBKhbdiW4fqT2gH0as2yRz+W bBaiYrH7dz3ZvcrA8rUjPvQhfc4MjVDKZt+vBg2SCI/Wa+9E/qWbZ2AcnGWDRgUs3jar uxqvgid3jF2ZTwQoiRPHaNzIN4g9OzKReOyTMEH1gc/2IkMJTMYw1YOCVMyDIyxy+kRL DVVjHFOCEuHT0xLdGgWUME3TI/7BR0H/gU/Y1KtRHVK8shBMLseWpFEUfGFox63EXMq1 NqyTNxuwzAeON6pf648GECU5S5gVNPnfTxUzVO285ullcuisrO9+2C0oyCTrmeJ3ytpr bEeg== MIME-Version: 1.0 X-Received: by 10.52.119.229 with SMTP id kx5mr378276vdb.40.1408815185018; Sat, 23 Aug 2014 10:33:05 -0700 (PDT) Sender: ndenev@gmail.com Received: by 10.221.46.133 with HTTP; Sat, 
23 Aug 2014 10:33:04 -0700 (PDT) In-Reply-To: References: Date: Sat, 23 Aug 2014 19:33:04 +0200 X-Google-Sender-Auth: nytzHuUOZGwLHpoPz1PwQFoDbw4 Message-ID: Subject: Re: Set arbitrary protocol for route? From: Nikolay Denev To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net , Josh Moore X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 17:33:06 -0000 On Sat, Aug 23, 2014 at 8:49 AM, Adrian Chadd wrote: > Ok, so how does the whole protocol thing implement priority? > > > -a Ah, sorry, reading it again, I don't think it does that. For some reason I was under the impression that it did. So it looks like it's just an 8-bit tag applied to each route: it is not involved in the actual routing, but it lets you filter routes when displaying them, etc. From the Linux ip-route(8) man page: protocol RTPROTO the routing protocol identifier of this route. RTPROTO may be a number or a string from the file /etc/iproute2/rt_protos. If the routing protocol ID is not given, ip assumes protocol boot (i.e. it assumes the route was added by someone who doesn't understand what they are doing). Several protocol values have a fixed interpretation. Namely: redirect - the route was installed due to an ICMP redirect. kernel - the route was installed by the kernel during autoconfiguration. boot - the route was installed during the bootup sequence. If a routing daemon starts, it will purge all of them. static - the route was installed by the administrator to override dynamic routing. Routing daemon will respect them and, probably, even advertise them to its peers. ra - the route was installed by Router Discovery protocol. The rest of the values are not reserved and the administrator is free to assign (or not to assign) protocol tags. 
--Nikolay From owner-freebsd-net@FreeBSD.ORG Sat Aug 23 23:02:33 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5F05EBD0; Sat, 23 Aug 2014 23:02:33 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "funkthat.com", Issuer "funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 397FF37B6; Sat, 23 Aug 2014 23:02:32 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s7NN2NL3066023 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 23 Aug 2014 16:02:24 -0700 (PDT) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s7NN2N6d066022; Sat, 23 Aug 2014 16:02:23 -0700 (PDT) (envelope-from jmg) Date: Sat, 23 Aug 2014 16:02:23 -0700 From: John-Mark Gurney To: Nikolay Denev Subject: Re: Set arbitrary protocol for route? Message-ID: <20140823230223.GI71691@funkthat.com> Mail-Followup-To: Nikolay Denev , Adrian Chadd , FreeBSD Net , Josh Moore References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sat, 23 Aug 2014 16:02:24 -0700 (PDT) Cc: FreeBSD Net , Adrian Chadd , Josh Moore X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 23:02:33 -0000 Nikolay Denev wrote this message on Sat, Aug 23, 2014 at 19:33 +0200: > On Sat, Aug 23, 2014 at 8:49 AM, Adrian Chadd wrote: > > Ok, so how does the whole protocol thing implement priority? > > Ah, sorry, reading again I don't think it does that. For some reason I > was under the impression it does. > So, it looks like it's just a 8 bit tag applied to each route, not > involved in the actual routing, but allows you > to filter when displaying etc. > >From linux ip-route(8) man page : > > protocol RTPROTO > the routing protocol identifier of this route. RTPROTO may be a > number or a string from the file /etc/iproute2/rt_protos. If > the routing protocol ID is not given, ip assumes protocol boot > (i.e. it assumes the route was added by someone who doesn't > understand what they are doing). Several protocol values have a > fixed interpretation. Namely: > > redirect - the route was installed due to an ICMP > redirect. > > kernel - the route was installed by the kernel during > autoconfiguration. > > boot - the route was installed during the bootup > sequence. If a routing daemon starts, it will purge all > of them. > > static - the route was installed by the administrator to > override dynamic routing. Routing daemon will respect > them and, probably, even advertise them to its peers. > > ra - the route was installed by Router Discovery > protocol. > > The rest of the values are not reserved and the administrator is > free to assign (or not to assign) protocol tags. 
If that's the case, a simple man route would have found the answer:

-proto1 RTF_PROTO1 - set protocol specific routing flag #1
-proto2 RTF_PROTO2 - set protocol specific routing flag #2
-proto3 RTF_PROTO3 - set protocol specific routing flag #3

Not as many as Linux, but I do believe some of the routing daemons use these flags to know which routes they can or cannot delete...

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
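
For anyone wanting to try the -proto1/-proto2/-proto3 suggestion above, here is a minimal Python sketch of how a script might tag routes and later recognise its own. It only builds the route(8) argument list; it does not run it. The flag values are as I recall them from FreeBSD's net/route.h and should be checked against your tree. The helper names (route_add_argv, proto_tags) and the example addresses are made up for illustration and are not part of any FreeBSD API.

```python
# Flag bits as recalled from FreeBSD's <net/route.h>; verify on your release.
RTF_PROTO2 = 0x4000
RTF_PROTO1 = 0x8000
RTF_PROTO3 = 0x40000

# route(8) modifier -> flag bit it sets on the route.
PROTO_FLAGS = {
    "-proto1": RTF_PROTO1,
    "-proto2": RTF_PROTO2,
    "-proto3": RTF_PROTO3,
}


def route_add_argv(destination, gateway, modifier):
    """Build (but do not run) a route(8) command that tags a host route.

    Hypothetical helper: only the three -protoN modifiers are accepted,
    since FreeBSD offers three flag bits rather than Linux's 8-bit ID.
    """
    if modifier not in PROTO_FLAGS:
        raise ValueError("expected one of: " + ", ".join(sorted(PROTO_FLAGS)))
    return ["route", "add", "-host", destination, gateway, modifier]


def proto_tags(rtm_flags):
    """Return the names of the RTF_PROTOn bits set in a route's flag word."""
    return sorted(
        name.lstrip("-") for name, bit in PROTO_FLAGS.items() if rtm_flags & bit
    )


if __name__ == "__main__":
    # Documentation-range addresses, purely illustrative.
    print(" ".join(route_add_argv("192.0.2.10", "198.51.100.1", "-proto1")))
    print(proto_tags(RTF_PROTO1 | RTF_PROTO3))
```

Note that this is a set of three independent bits, not a numeric protocol ID, so at most a handful of "owners" can be told apart. A route tagged this way should show a 1, 2 or 3 in the Flags column of netstat -rn.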