From owner-freebsd-pf@FreeBSD.ORG Mon Aug 17 11:07:00 2009 Return-Path: Delivered-To: freebsd-pf@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D3FCE106568E for ; Mon, 17 Aug 2009 11:07:00 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C1FBE8FC60 for ; Mon, 17 Aug 2009 11:07:00 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n7HB70qX075895 for ; Mon, 17 Aug 2009 11:07:00 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n7HB70Vh075891 for freebsd-pf@FreeBSD.org; Mon, 17 Aug 2009 11:07:00 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 17 Aug 2009 11:07:00 GMT Message-Id: <200908171107.n7HB70Vh075891@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-pf@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-pf@FreeBSD.org X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Aug 2009 11:07:00 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description
--------------------------------------------------------------------------------
o kern/136781  pf   [pf] Packets appear to drop with pf scrub and if_bridg
o kern/135948  pf   [pf] [gre] pf not natting gre protocol
o kern/135162  pf   [pfsync] pfsync(4) not usable with GENERIC kernel
o kern/134996  pf   [pf] Anchor tables not included when pfctl(8) is run w
o kern/133732  pf   [pf] max-src-conn issue
o kern/132769  pf   [pf] [lor] 2 LOR's with pf task mtx / ifnet and rtent
f kern/132176  pf   [pf] pf stalls connection when using route-to [regress
o conf/130381  pf   [rc.d] [pf] [ip6] ipv6 not fully configured when pf st
o kern/129861  pf   [pf] [patch] Argument names reversed in pf_table.c:_co
o kern/127920  pf   [pf] ipv6 and synproxy don't play well together
o conf/127814  pf   [pf] The flush in pf_reload in /etc/rc.d/pf does not w
o kern/127439  pf   [pf] deadlock in pf
f kern/127345  pf   [pf] Problem with PF on FreeBSD7.0 [regression]
o kern/127121  pf   [pf] [patch] pf incorrect log priority
o kern/127042  pf   [pf] [patch] pf recursion panic if interface group is
o kern/125467  pf   [pf] pf keep state bug while handling sessions between
s kern/124933  pf   [pf] [ip6] pf does not support (drops) IPv6 fragmented
o kern/124364  pf   [pf] [panic] Kernel panic with pf + bridge
o kern/122773  pf   [pf] pf doesn't log uid or pid when configured to
o kern/122014  pf   [pf] [panic] FreeBSD 6.2 panic in pf
o kern/121704  pf   [pf] PF mangles loopback packets
o kern/120281  pf   [pf] [request] lost returning packets to PF for a rdr
o kern/120057  pf   [pf] [patch] Allow proper settings of ALTQ_HFSC. The c
o bin/118355   pf   [pf] [patch] pfctl(8) help message options order false
o kern/114567  pf   [pf] [lor] pf_ioctl.c + if.c
o kern/114095  pf   [carp] carp+pf delay with high state limit
o kern/111220  pf   [pf] repeatable hangs while manipulating pf tables
s conf/110838  pf   [pf] tagged parameter on nat not working on FreeBSD 5.
o kern/103283  pf   pfsync fails to sucessfully transfer some sessions
o kern/103281  pf   pfsync reports bulk update failures
o kern/93825   pf   [pf] pf reply-to doesn't work
o sparc/93530  pf   [pf] Incorrect checksums when using pf's route-to on s
o kern/92949   pf   [pf] PF + ALTQ problems with latency
o bin/86635    pf   [patch] pfctl(8): allow new page character (^L) in pf.
o kern/82271   pf   [pf] cbq scheduler cause bad latency

35 problems total.

From owner-freebsd-pf@FreeBSD.ORG Tue Aug 18 11:31:49 2009 Return-Path: Delivered-To: pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFF0D106564A for ; Tue, 18 Aug 2009 11:31:49 +0000 (UTC) (envelope-from ianf@clue.co.za) Received: from inbound01.jnb1.gp-online.net (inbound01.jnb1.gp-online.net [41.161.16.135]) by mx1.freebsd.org (Postfix) with ESMTP id A852E8FC52 for ; Tue, 18 Aug 2009 11:31:48 +0000 (UTC) Received: from [196.7.162.28] (helo=clue.co.za) by inbound01.jnb1.gp-online.net with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.63) (envelope-from ) id 1MdMNx-0000Pw-I9; Tue, 18 Aug 2009 12:57:49 +0200 Received: from localhost ([127.0.0.1] helo=clue.co.za) by clue.co.za with esmtp (Exim 4.69 (FreeBSD)) (envelope-from ) id 1MdMO7-0000zJ-I5; Tue, 18 Aug 2009 12:57:59 +0200 To: Robert Watson From: Ian FREISLICH In-Reply-To: References: <4A8484E4.6090504@uffner.com> X-Attribution: BOFH Date: Tue, 18 Aug 2009 12:57:59 +0200 Message-Id: Cc: pf@freebsd.org, current@freebsd.org Subject: Re: packet forwarding/firewall performance question X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Aug 2009 11:31:49 -0000

Robert Watson wrote:
>
> On Thu, 13 Aug 2009, Tom Uffner wrote:
>
> > i'm hoping a few people will give me estimates on what kind of
> > throughput i should theoretically expect before i provide any actual > > test data. also, any suggestions on tuning would be welcome. > > > > so far in preliminary tests, enabling polling on the network > > interfaces reduces my performance slightly both to/from and through > > the box. net.inet.ip.fastforwarding doesn't seem to make much > > difference either way but i haven't done very thorough testing of > > it. increasing net.inet.tcp.sendbuf_max & recvbuf_max may have > > helped, but again, not sufficiently tested. > > I can't speak to absolute numbers, but I wouldn't expect > net.inet.tcp.* changes to make any difference, as they should affect > only locally terminated sockets on the firewall host, not forwarded > packets. > > You might want to try experimenting with net.isr.direct -- try setting > it to 0, as this changes the kernel dispatch model for the network > stack. On a UP box, I would probably anticipate a performance loss > for making that change, or similar configuration changes for multiple > netisr threads using net.isr.maxthreads. > > If you're using firewall code, fast forwarding is unlikely > to make a difference. Depending on the cache/memory/CPU > trade-off, you might find turning off flowtable support helps -- > net.inet.flowtable.enable=0. I found that forwarding made a fantastic difference to the forwarding rate in the past. Even with firewalling - was the difference between 38kpps and 500kpps using RTL8110 gigE interfaces. Perhaps I need to retest the effect on a modern FreeBSD. As to the OP, on a VIA Epia LN - C7-1GHz with vr interfaces maxed out at 100Mbit/s. Putting gigE interfaces in the PCI slot made no difference. The bottle-neck appeared to be the number of interrupts the cards generated and the amount of time servicing interrupts, which was not affected by polling(4). 
Ian -- Ian Freislich From owner-freebsd-pf@FreeBSD.ORG Tue Aug 18 18:03:09 2009 Return-Path: Delivered-To: pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1597B106568B; Tue, 18 Aug 2009 18:03:09 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-ew0-f209.google.com (mail-ew0-f209.google.com [209.85.219.209]) by mx1.freebsd.org (Postfix) with ESMTP id 29C168FC3D; Tue, 18 Aug 2009 18:03:07 +0000 (UTC) Received: by ewy5 with SMTP id 5so296878ewy.36 for ; Tue, 18 Aug 2009 11:03:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=s0TGJjVsxz6rlnOOgwIKTukJLddAbKOYZsNaNpgZbdU=; b=Z/sp760d4yy3FrLQS1+CZSgcQk3S7tujCAh6jkUC+5ObOnXgMzVlouFfUtRE1znMYk OLHFEP0rP4gzXQyXygiysyeeFAJYvJbbZRMClEYmbERIFJAy3xwQIxnB3RT++6h0jfVV 6p3yYdzpmdDPZTaGL5Uj+ByOCfkYNkL0wbyKU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=EL38b5D/ymoj6n994EaCs/dlDYj3oi2afBzeRtCuQuKdUdbz356T0B5qgYMZ0dnoyZ ha1FAwnKeUFthneEtyqA/NmoAN/6mAXYTgqwHYHTaRXQ1/C8Ooal1jphOMt9HGIMPj7U IKwinCSUQsq4KO9ROnHYxO4jsk2/Tywmox4ww= MIME-Version: 1.0 Received: by 10.216.88.209 with SMTP id a59mr1301012wef.50.1250616765863; Tue, 18 Aug 2009 10:32:45 -0700 (PDT) In-Reply-To: References: <4A8484E4.6090504@uffner.com> Date: Tue, 18 Aug 2009 13:32:45 -0400 Message-ID: <5f67a8c40908181032x5b23de27jc01dc1147281a1a6@mail.gmail.com> From: Zaphod Beeblebrox To: Ian FREISLICH Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: pf@freebsd.org, Robert Watson , current@freebsd.org Subject: Re: packet forwarding/firewall performance question X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Aug 2009 18:03:09 -0000

On Tue, Aug 18, 2009 at 6:57 AM, Ian FREISLICH wrote:

> Robert Watson wrote:
> >
> > On Thu, 13 Aug 2009, Tom Uffner wrote:
> >
> > > i'm hoping a few people will give me estimates on what kind of
> > > throughput i should theoretically expect before i provide any actual
> > > test data. also, any suggestions on tuning would be welcome.
>

I haven't tried 800MHz hardware, but I have extensive experience with the 266MHz c3 in the WRAP board for comparison. The 266MHz part has no heatsink, but I've found it to be flaky in "hot" environments if a heatsink is not rigged to it.

So... the WRAP boards have 3 "sis" interfaces. With various combinations of usage, one can expect 30 megabit of average sized packets if the kernel is doing the passing. This makes the board an adequate wireless bridge.

If userland is doing the packet passing (pppd vs. mpd as a DSL router), I've had trouble getting more than about 10 megabit through the unit. This makes it an adequate DSL router but poor for aggregating multiple links.
From owner-freebsd-pf@FreeBSD.ORG Thu Aug 20 04:16:53 2009 Return-Path: Delivered-To: freebsd-pf@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 73AEC106564A; Thu, 20 Aug 2009 04:16:53 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 4C50D8FC16; Thu, 20 Aug 2009 04:16:53 +0000 (UTC) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n7K4GrqX081787; Thu, 20 Aug 2009 04:16:53 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n7K4Grbs081783; Thu, 20 Aug 2009 04:16:53 GMT (envelope-from linimon) Date: Thu, 20 Aug 2009 04:16:53 GMT Message-Id: <200908200416.n7K4Grbs081783@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-pf@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/137982: [pf] when pf can hit state limits, random IP failures and no debugging info is provided X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2009 04:16:53 -0000 Old Synopsis: when pf can hit state limits, random IP failures and no debugging info is provided New Synopsis: [pf] when pf can hit state limits, random IP failures and no debugging info is provided Responsible-Changed-From-To: freebsd-bugs->freebsd-pf Responsible-Changed-By: linimon Responsible-Changed-When: Thu Aug 20 04:16:16 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=137982 From owner-freebsd-pf@FreeBSD.ORG Thu Aug 20 09:21:15 2009 Return-Path: Delivered-To: freebsd-pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6BBB3106568E for ; Thu, 20 Aug 2009 09:21:15 +0000 (UTC) (envelope-from max@love2party.net) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.17.9]) by mx1.freebsd.org (Postfix) with ESMTP id 0188B8FC6C for ; Thu, 20 Aug 2009 09:21:14 +0000 (UTC) Received: from vampire.homelinux.org (dslb-088-066-032-025.pools.arcor-ip.net [88.66.32.25]) by mrelayeu.kundenserver.de (node=mrbap2) with ESMTP (Nemesis) id 0MKt72-1Me3dP30qv-000CHP; Thu, 20 Aug 2009 11:08:39 +0200 Received: (qmail 16825 invoked from network); 20 Aug 2009 09:08:39 -0000 Received: from kvm.laiers.local (HELO kvm.localnet) (192.168.4.200) by mx.laiers.local with SMTP; 20 Aug 2009 09:08:39 -0000 From: Max Laier Organization: FreeBSD To: freebsd-net@freebsd.org, d@delphij.net Date: Thu, 20 Aug 2009 11:08:38 +0200 User-Agent: KMail/1.12.0 (Linux/2.6.30-ARCH; KDE/4.3.0; x86_64; ; ) References: <4A8CFDAF.1000309@delphij.net> In-Reply-To: <4A8CFDAF.1000309@delphij.net> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <200908201108.39177.max@love2party.net> X-Provags-ID: V01U2FsdGVkX19NvRadw8WqpIs9L6155pWbO8vvRP7ssinHBfO /2UQlyjvJSfTvmGi8T1/J2EsFOIlTCEyx/RGWyk7yXBfImS+Cu kUdWDbi6w3jmQTntHZs8A== Cc: freebsd-pf@freebsd.org Subject: Re: (just for fun) port of OpenBSD pf's sloppy mode X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2009 09:21:15 -0000 Nice Work! Thanks a lot! 
On Thursday 20 August 2009 09:39:27 Xin LI wrote:
> Since there is effort undergoing to port a newer pf version to FreeBSD,
> I think this work would not be useful for inclusion in -CURRENT.
> However, I'd like to share it here as someone may find it useful before
> the new pf code hits the tree. The patch can also be downloaded from my

I disagree about the usefulness of this. As your patch doesn't affect ABI this could make it into 8.1 (which the all new pf won't). With SVN it is also much simpler to manage the vendor branch differences, now.

> website:
>
> http://www.delphij.net/pf-sloppy.diff

freebsd-pf@ test and provide feedback - I know people have asked about this in the past.

> About this patch:
>
> When pf(4) is operating in a manner that not all packet would went
> through it, specifically, when being used in a DSR ("Direct Server
> Return") network, the strict TCP state tracking would prevent some
> packets from being able to pass through. This can exhibit as, when you
> upload files, the connection would stall at ~60KB (may differ if you
> have special TCP setting), or stalled connections.
>
> With this change, pf.conf would support a new syntax, i.e. "(sloppy)" as
> state flag, e.g.:
>
> pass in quick on em0 route-to { (em1 $server1), (em1 $server2) }
> round-robin proto tcp from any to $ext_ip port 80 keep state (sloppy)
>
> When enabled, the "sloppy" TCP FSM would be activated, which loosens the
> state check. When using this option, the backend server has to use its
> own mechanism to prevent ICMP teardown attack and/or insertion attacks,
> so please use caution and limit the use in cases where pf(4) won't see
> some packets in the connection.
-- /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News From owner-freebsd-pf@FreeBSD.ORG Thu Aug 20 16:31:55 2009 Return-Path: Delivered-To: freebsd-pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 816A010656B3 for ; Thu, 20 Aug 2009 16:31:55 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outB.internet-mail-service.net (outb.internet-mail-service.net [216.240.47.225]) by mx1.freebsd.org (Postfix) with ESMTP id 1E6EB8FC43 for ; Thu, 20 Aug 2009 16:31:54 +0000 (UTC) Received: from idiom.com (mx0.idiom.com [216.240.32.160]) by out.internet-mail-service.net (Postfix) with ESMTP id 86386A1EA7; Thu, 20 Aug 2009 09:17:04 -0700 (PDT) X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e Received: from julian-mac.elischer.org (home.elischer.org [216.240.48.38]) by idiom.com (Postfix) with ESMTP id 5974E2D6011; Thu, 20 Aug 2009 09:17:03 -0700 (PDT) Message-ID: <4A8D76FE.7040302@elischer.org> Date: Thu, 20 Aug 2009 09:17:02 -0700 From: Julian Elischer User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605) MIME-Version: 1.0 To: Max Laier References: <4A8CFDAF.1000309@delphij.net> <200908201108.39177.max@love2party.net> In-Reply-To: <200908201108.39177.max@love2party.net> Content-Type: multipart/mixed; boundary="------------010104040109010004090600" Cc: freebsd-net@freebsd.org, d@delphij.net, freebsd-pf@freebsd.org Subject: pf and vimage X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Aug 2009 16:31:55 -0000 This is a multi-part message in MIME format. 
--------------010104040109010004090600
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

There were some people looking at adding vnet support to pf. Since we discussed it last, the rules of the game have significantly changed for the better. With the addition of some new facilities in FreeBSD, the work needed to virtualize a module has significantly decreased. The following doc gives the new rules.

--------------010104040109010004090600
Content-Type: text/plain; name="porting_to_vimage.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="porting_to_vimage.txt"

August 17 2009
Julian Elischer

===================
Vimage: what is it?
===================

Vimage is a framework in the BSD kernel which allows a co-operating module to operate on multiple independent instances of its state so that it can participate in a virtual machine / virtual environment scenario. It refers to a part of the Jail infrastructure in FreeBSD. For historical reasons "Virtual network stack enabled jails"(1) are also known as "vimage enabled jails"(2) or "vnet enabled jails"(3). The currently correct term is the last of these, which is a contraction of the first. In the future other parts of the system may be virtualized using the same technology, and the term to cover all such components would be VIMAGE enhanced modules.

The implementation approach taken by the vimage framework is a redefinition of selected global state variables to evaluate to constructs that allow for the virtualized state to be stored and resolved in appropriate instances of 'jail' specific container storage regions. The code operating on virtualized state has to conform to a set of rules described further below, among other things so that all the changes are conditionally compilable, i.e. so that the virtualized code can fall back to operating on global state.
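
As a rough userland illustration of that "conditionally compilable" idea, the sketch below shows one accessor name that resolves either to a plain global or to a field in a per-instance container, depending on a build flag. It is not the kernel's implementation: `VIMAGE_SKETCH`, `struct vnet_sketch`, `foo_bar_slot()` and `bump_foo_bar()` are names invented for this example, and only the `V_foo_bar` / `some_base_pointer->_foo_bar` shape is taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance container; the real layout is not shown here. */
struct vnet_sketch {
    int _foo_bar;
};

#ifdef VIMAGE_SKETCH
/* Virtualized build: the name resolves through the current instance. */
static struct vnet_sketch first_instance = { 3 };
static struct vnet_sketch *some_base_pointer = &first_instance;

static int *
foo_bar_slot(void)
{
    /* Without a valid instance there is nothing to resolve against. */
    assert(some_base_pointer != NULL);
    return (&some_base_pointer->_foo_bar);
}
#define V_foo_bar   (*foo_bar_slot())
#else
/* Default build: the name is just an alias for the one global. */
static int foo_bar = 3;
#define V_foo_bar   (foo_bar)
#endif

/* Code written against V_foo_bar compiles unchanged in both builds. */
int
bump_foo_bar(void)
{
    V_foo_bar += 1;
    return (V_foo_bar);
}
```

Compiling with or without `-DVIMAGE_SKETCH` changes where the value lives, but not the code that uses it, which is the property the rules below are meant to preserve.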

The rest of this document will discuss NETWORK virtualization, though the concepts may be true in the future for other parts of the system.

The most visible change throughout the existing code is typically replacement of direct references to global variables with macros; foo_bar thus becomes V_foo_bar. V_foo_bar macros will resolve back to the foo_bar global in default kernel builds, and alternatively to the logical equivalent of some_base_pointer->_foo_bar for "options VIMAGE" kernel configs.

Prepending of "V_" prefixes to variable references helps in visual discrimination between global and virtualized state. It is also possible to use an alternative syntax, VNET(foo_bar), to achieve the same thing. The developers felt that V_foo_bar was less visually distracting while still providing enough clues to the reader that the variable is virtualized. In fact the V_foo_bar macro is locally defined near the definition of foo_bar to be an alias for VNET(foo_bar), so the two are not only equivalent, they are the same.

The framework also extends the sysctl infrastructure to support access to virtualized state through introduction of the SYSCTL_VNET family of macros; those also automatically fall back to their standard SYSCTL counterparts in default kernel builds.

Transparent libkvm(3) lookups are provided to virtualized variables, which permits userland binaries such as netstat to operate unmodified on "options VIMAGE" kernels, though this may have some security implications.

Vnets are associated with jails. In 8.0, every process is associated with a jail, usually the default (null) jail, and jails currently hang off of a process's ucred. This relationship defines a process's administrative affinity to a vnet and thus indirectly to all of its state. All network interfaces and sockets hold pointers back to their associated vnets. This relationship is obviously entirely independent from proc->ucred->jail bindings.
Hence, when a process opens a socket, the socket will get bound to a vnet instance hanging off of proc->ucred->jail->vnet, but once such a socket->vnet binding gets established, it cannot be changed for the entire socket lifetime.

The mapping from a thread to a vnet should always be done via the TD_TO_VNET macro, as the path may change in the future as we get more experience with using the system.

Certain classes of network interfaces (Ethernet in particular) can be reassigned from one vnet to another at any time. By definition all vnets are independent and can communicate only if they are explicitly provided with communication paths. Currently mainly netgraph is used to establish inter-vnet datapaths, though other paths are being explored such as the 'epair' back-to-back virtual interface pair, in which the different sides may exist in different jails.

In network traffic processing the vnet affinity is defined either by the inbound interface or by the socket / pcb -> vnet binding. However, there are many functions in the network stack that cannot implicitly fetch the vnet context from their standard arguments. Instead of explicitly extending argument lists of such functions with a struct vnet *, the concept of a "current vnet", a per-thread variable, was introduced, which can be fetched efficiently via the curvnet macro. The correct network context has to be set on entry to the network stack (socket operations, packet reception, or timer-driven functions) and cleared on exit. This must be done via the provided CURVNET_SET() / CURVNET_RESTORE() family of macros, which allow for "stacking" of curvnet context setting and provide additional debugging info in INVARIANTS kernel configs. In most cases however a developer writing virtualized code will not have to set / restore the curvnet context unless the code would include timer-driven events, given that those are inherently vnet-contextless on entry.
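
The "stacking" behaviour of the set / restore pair can be pictured with a small userland sketch. It only assumes that each "set" remembers the previous context in a local variable and that "restore" puts it back. The names `SKETCH_CURVNET_SET`, `SKETCH_CURVNET_RESTORE`, `struct vnet_sketch` and `curvnet_sketch` are invented for the example and are not the kernel macros themselves.

```c
#include <assert.h>
#include <stddef.h>

struct vnet_sketch {
    int id;
};

/* Stand-in for the per-thread "current vnet"; NULL means no context. */
static struct vnet_sketch *curvnet_sketch = NULL;

/* Save the previous context in a local, then switch to the new one. */
#define SKETCH_CURVNET_SET(v)                               \
    struct vnet_sketch *saved_vnet = curvnet_sketch;        \
    curvnet_sketch = (v)

/* Put back whatever context was active before the matching SET. */
#define SKETCH_CURVNET_RESTORE()                            \
    curvnet_sketch = saved_vnet

/* An entry point that sets its own context for the duration of the call. */
int
inner_id(struct vnet_sketch *v)
{
    SKETCH_CURVNET_SET(v);
    int id = curvnet_sketch->id;
    SKETCH_CURVNET_RESTORE();
    return (id);
}

/* Nested use: the inner set/restore must leave the outer context intact. */
int
nested_ok(struct vnet_sketch *outer, struct vnet_sketch *inner)
{
    SKETCH_CURVNET_SET(outer);
    int seen_inner = inner_id(inner);
    int back_to_outer = (curvnet_sketch == outer);
    SKETCH_CURVNET_RESTORE();
    return (seen_inner == inner->id && back_to_outer);
}
```

Because every function keeps its own saved pointer, nested entry points unwind in the right order, and the context is back to "none" once the outermost caller has restored.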

The current rule is that when not in networking code, the 'curvnet' macro will evaluate to NULL, and evaluating a V_xxx (or VNET(xxx)) macro will result in a kernel page-fault error. While this is not strictly necessary, it aids in debugging and assurance of program correctness. Note this does NOT mean that TD_TO_VNET(curthread) is invalid. A thread is always associated with a vnet; only the efficient "curvnet" access method is disabled, along with the ability to resolve virtualized symbols.

Converting / virtualizing existing code
=======================================

Several steps are needed in virtualisation.

1/ Decide whether the module needs to be virtualised. If the module is a driver for specific hardware, it makes sense that there be only one instance of the driver as there is only one piece of physical hardware. There are changes in the networking code to allow physical (or virtual) interfaces to be moved between vnets. This generally requires NO changes to the network drivers of the classes covered (e.g. ethernet). Currently, if your module does not have any networking facet, the answer is "no" by default.

2/ If the module is to be virtualised, decide which attributes of the module should be virtualised. For example, it may make sense that there be a single central pool of "struct foo" and a single uma zone for them to come from, with a single lock guarding it. It might also make sense if the "foo_debug" sysctl controls all the instances at once, while on the other hand, the "foo_mode" sysctl might make better sense if it were controllable on a virtual system by virtual system basis.

3/ Work out what global variables and structures are to be virtualised to achieve the behaviour required for part #2.

4/ Work out for all the code paths through the module, how the thread entering the module can divine which virtual environment it is on.
Some examples:

* Since interfaces are all assigned to one vnet or another, an incoming packet has a pointer to the receive interface, which in turn has a pointer back to the vnet. Often "curvnet" will already have been set by the time your code is called anyhow.

* Similarly, on any request from outside the kernel (direct or indirect), the current thread has a way to get to the current virtual environment instance via TD_TO_VNET(curthread). For existing sockets the vnet context must be obtained via so->so_vnet, since the thread's vnet might change after socket creation.

* Timer initiated actions usually have a (void *) argument which points to some private structure for the module. It should be possible to add a pointer to the appropriate module instance into whatever structure that points to.

* Sometimes an action (timer triggered, or triggered by module load or unload) simply has to check all the vimage or module instances. There are macro (pairs) for this which will iterate through all the VNET instances. (see sample code below).

This covers most of the cases, however in some cases it may still be required for the module to stash away the virtual environment instance somewhere, and make associated changes in the code.

5/ Decide which parts of the initialization and teardown are per jail and which parts are global, and separate out the code accordingly. Global initialization is done using the SYSINIT facility. Per jail initialization is done using VNET_SYSINIT(). Per jail teardown is done using VNET_SYSUNINIT(). Global teardown is done using SYSUNINIT(). In addition, the modevent handler is called with various event types before any of these are called. The modevent handler may veto load or teardown. On shutdown, only the modevent handler is called, so it may have to simulate the calling of the other handlers if clean shutdown is a requirement of your module. (see sample code below). Don't forget to unregister event handlers, and destroy locks and condition variables.

6/ Add the code described below to the files that make up the module.

Details: (VNET implementation details)

Firstly the file must be included. Depending on what code you use you may find you also need one or more of: , and . These requirements may change slightly as the ABI settles.

Having decided which variables need to be virtualized, the definition of those variables needs to be modified to use the VNET_DEFINE() macro. For example:

    static int foo = 3;
    struct bar thebar = { 1,2,3 };

would become:

    static VNET_DEFINE(int, foo) = 3;
    VNET_DEFINE(struct bar, thebar) = { 1,2,3 };

    extern int foo;

in an include file might become:

    VNET_DECLARE(int, foo);

Normal rules regarding 'static/extern' apply. The initial values that you give in this way will be stored and used as the initial values for EACH NEW INSTANCE of these variables as new jails/vnets are created.

As mentioned above, accesses to virtualized symbols are achieved via macros, which generally are of the same name as the original symbol but with a "V_" prepended; thus the head of the interface list, called 'ifnet', is replaced wherever used with "V_ifnet". We do this by adding the following lines after the definitions above:

    #define V_foo VNET(foo)
    #define V_thebar VNET(thebar)

--- side-note ---
In SCTP, because the code is shared with other OS's, they are replaced with a macro MODULE_GLOBAL(modulename, symbol). (this may simplify in light of recent changes).
--------------

In addition, should any of your values need to be changed or viewed via sysctl, the following SYSCTL definitions would be needed:

    SYSCTL_VNET_PROC(_net_inet, OID_AUTO, thebar, CTLTYPE_??
        | CTLFLAG_RW | CTLFLAG_SECURE3, &VNET_NAME(thebar), 0,
        thebar, "?", "the bar is open");
    {[XXX] robert fix this if possible ^^^}

    SYSCTL_VNET_INT(_net_inet, OID_AUTO, foo, CTLFLAG_RW,
        &VNET_NAME(foo), 0, "size of foo");

In the current version of vimage, when VIMAGE is not compiled into the kernel, the macros evaluate to a direct reference to the one and only symbol/variable, so that there is no speed penalty for those not using vnets.

When VIMAGE is compiled in, the macro will evaluate to an access to an offset into a data structure that is accessed on a per-vnet basis. The vnet used for this is always curvnet. For this reason an attempt to access such a variable while curvnet is not valid will result in an exception.

To ensure that curvnet has a valid value when needed, one needs to add the following code on all entry code paths into the networking code:

    int
    my_func(int arg)
    {
        CURVNET_SET(TD_TO_VNET(curthread));

        do_my_network_stuff(arg);

        CURVNET_RESTORE();
        return (0);
    }

The initial value is usually something like "TD_TO_VNET(curthread)", which in turn is a macro that derives the vnet affinity from the current thread. It could also be (m->m_ifp->if_vnet) if we were receiving an mbuf, or so->so_vnet if we had a socket involved.

Usually, when a packet enters the system it is carried through the processing path via a single thread, and that thread will set its virtual environment reference to that indicated by the packet on picking up that new packet. This means that in the normal inbound processing path, as well as the outgoing process path, the current thread can be used to indicate the current virtual environment, and curvnet will always be valid once most user supplied code is reached. In timer events, it is sometimes necessary to add an "outer loop" to iterate through all the possible vnets if there is just one timer for all instances.

When a new loadable module is virtualised, the module definitions and initializers need to be examined.
The following example illustrates what is needed in the case that you are not loading a new protocol, or domain. (for that see later)

============= sample skeleton code ==========

/* init on boot or module load */
static int
mymod_init(void)
{
    int error = 0;    /* set non-zero if global setup fails */

    return (error);
}

/****************
 * Stuff that must be initialized for every instance
 * (including the first of course).
 */
static int
mymod_vnet_init(const void *unused)
{
    return (0);
}

/**********************
 * Called for the removal of the last instance only on module unload.
 */
static void
mymod_uninit(void)
{
}

/***********************
 * Called for the removal of each instance.
 */
static int
mymod_vnet_uninit(const void *unused)
{
    return (0);
}

static int
mymod_modevent(module_t mod, int type, void *unused)
{
    int err = 0;

    switch (type) {
    case MOD_LOAD:
        /* check that loading is ok */
        break;
    case MOD_UNLOAD:
        /* check that unloading is ok */
        break;
    case MOD_QUIESCE:
        /* warning: try stop processing */
        /* maybe sleep 1 mSec or something to let threads get out */
        break;
    case MOD_SHUTDOWN:
        /*
         * this is called once but you may want to shut down
         * things in each jail, or something global.
         * In that case it's up to us to simulate the SYSUNINIT()
         * or the VNET_SYSUNINIT()
         */
        {
            VNET_ITERATOR_DECL(vnet_iter);

            VNET_LIST_RLOCK();
            VNET_FOREACH(vnet_iter) {
                CURVNET_SET(vnet_iter);
                mymod_vnet_uninit(NULL);
                CURVNET_RESTORE();
            }
            VNET_LIST_RUNLOCK();
        }
        /* you may need to shutdown something global.
*/ mymod_uninit(); break; default: err = EOPNOTSUPP; break; } return err; } static moduledata_t mymodmod = { "mymod", mymod_modevent, 0 }; /* define execution order using constants from /sys/sys/kernel.h */ #define MYMOD_MAJOR_ORDER SI_SUB_PROTO_BEGIN /* for example */ #define MYMOD_MODULE_ORDER (SI_ORDER_ANY + 64) /* not fussy */ #define MYMOD_SYSINIT_ORDER (MYMOD_MODULE_ORDER + 1) /* a bit later */ #define MYMOD_VNET_ORDER (MYMOD_MODULE_ORDER + 2) /* later still */ DECLARE_MODULE(mymod, mymodmod, MYMOD_MAJOR_ORDER, MYMOD_MODULE_ORDER); MODULE_DEPEND(mymod, ipfw, 2, 2, 2); /* depend on ipfw version (exactly) 2 */ MODULE_VERSION(mymod, 1); SYSINIT(mymod_init, MYMOD_MAJOR_ORDER, MYMOD_SYSINIT_ORDER, mymod_init, NULL); SYSUNINIT(mymod_uninit, MYMOD_MAJOR_ORDER, MYMOD_SYSINIT_ORDER, mymod_uninit, NULL); VNET_SYSINIT(mymod_vnet_init, MYMOD_MAJOR_ORDER, MYMOD_VNET_ORDER, mymod_vnet_init, NULL); VNET_SYSUNINIT(mymod_vnet_uninit, MYMOD_MAJOR_ORDER, MYMOD_VNET_ORDER, mymod_vnet_uninit, NULL); ========== end sample code ======= On BOOT, the order of evaluation will be: In a NON-VIMAGE kernel where the module is compiled: MODEVENT, SYSINIT and VNET_SYSINIT both runm with order defined by their order declarations. {good foot shooting material if you get it wrong!} In a VIMAGE kernel where the module is compiled in: MODEVNET, SYSINIT and VNET_SYSINIT all run with order defined by their order declarations. AND in addition, the VNET_SYSINIT is repeated once for every existing or new jail/vnet. On loading a vnet enabled kernel module after boot: MODEVENT("event = load"); SYSINIT() VNET_SYSINIT() for every existing jail AND in addition, VNET_SYSINIT being called for each new jail created. 

On unloading of module:
    MODEVENT("event = MOD_QUIESCE")
    MODEVENT("event = MOD_UNLOAD")
    VNET_SYSUNINIT called for every jail/vnet
    SYSUNINIT

On system shutdown:
    MODEVENT(shutdown)

NOTICE that while the order of the SYSINIT and VNET_SYSINIT is reversed from that of SYSUNINIT and VNET_SYSUNINIT, MODEVENTs do not follow this rule, and thus it is dangerous to initialise and uninitialise things which are order dependent using MODEVENTs.

Or, put another way: since MODEVENT is called first during module load, it would be easy to assume, on the premise that everything is reversed, that MODEVENT is called AFTER the SYSINITS during unload. This is in fact not the case (and I have the scars to prove it).

It might make some sense if the "QUIESCE" was called before the SYSINIT/SYSUNINIT and the UNLOAD called after, with a millisecond sleep between them, but this is not the case either.

Since initial values are copied into the virtualized variables on each new instantiation, it is quite possible to have modules for which some of the above methods are not needed, and they may be left out (but not the modevent).

Sometimes there is a need to iterate through the vnets. See the modevent shutdown handler (above) for an example of how to do this. Don't forget the locks.

In the case where you are loading a new protocol, or domain (protocol family), there are some "shortcuts" in place to allow you to maintain a bit more source compatibility with older revisions of FreeBSD. It must be added that the sample code above works just fine for protocols; however, protocols also have an additional initialization vector via the protocol structure, which has a pr_init() entry.

When a protocol is registered using pf_proto_register(), the pr_init() for the protocol is called once for every existing vnet. In addition, it will be called for each new vnet. The pr_destroy() method will be called as well on vnet teardown.
The pf_proto_register() function can be called either from a modevent handler or from the SYSINIT() if you have one, and pf_proto_unregister() called from the SYSUNINIT or the unload modevent handler.

If you are adding a whole new protocol domain (protocol family), then you should add the VNET_DOMAIN_SET(domainname) (e.g. inet, inet6) macro. These use VNET_SYSINIT internally to indirectly call the dom_init() and pr_init() functions for each vnet (and the equivalent for teardown). In this case one needs to be absolutely sure that both your domain and protocol initializers can be called multiple times, once for each vnet. One can still add SYSINITs for once-only initialization, or use the modevent handler. I prefer to do as much as possible explicitly in the SYSINITs and VNET_SYSINITs, as then you have no surprises.

Finally:
The command to make a new jail with a new vnet:

    jail -c host.hostname=test path=/ vnet command=/bin/tcsh
    jail -c host.hostname=test path=/ children.max=4 vnet command=/bin/tcsh

(children.max allows hierarchical jail creation). Note that the command must come last.
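
As an illustration only, the save/set/restore discipline and the "iterate over every instance" loop described above can be tried out in user space. The sketch below is NOT kernel code and does not use the real vimage macros: struct vnet_model, MODEL_CURVNET_SET(), MODEL_CURVNET_RESTORE(), model_v_foo() and the other model_* names are all hypothetical stand-ins, assumed here purely to show the shape of the pattern. The assert() in the accessor plays the role of the exception you would get from touching a virtualized variable while the current context is not valid.

```c
/*
 * User-space model of the per-vnet access pattern (hypothetical names,
 * not the kernel implementation).
 */
#include <assert.h>
#include <stddef.h>

#define MODEL_NVNETS 3

struct vnet_model {
        int id;
        int foo;        /* stands in for one virtualized variable */
};

struct vnet_model model_vnets[MODEL_NVNETS] = {
        { .id = 0 }, { .id = 1 }, { .id = 2 }
};

/* The "current" instance; NULL means no valid context is set. */
struct vnet_model *model_curvnet = NULL;

/* Remember the previous context, then switch to the given one. */
#define MODEL_CURVNET_SET(vp) \
        struct vnet_model *model_saved_vnet = model_curvnet; \
        model_curvnet = (vp)

/* Put back whatever context was active before the matching SET. */
#define MODEL_CURVNET_RESTORE() \
        model_curvnet = model_saved_vnet

/* Access the per-instance variable; traps if no context is set. */
int *
model_v_foo(void)
{
        assert(model_curvnet != NULL);
        return (&model_curvnet->foo);
}

/* Per-instance initializer: runs once for each instance. */
void
model_vnet_init(void)
{
        *model_v_foo() = 10 * (model_curvnet->id + 1);
}

/* "Outer loop" over all instances, as a shared timer or unload might need. */
void
model_init_all(void)
{
        for (size_t i = 0; i < MODEL_NVNETS; i++) {
                MODEL_CURVNET_SET(&model_vnets[i]);
                model_vnet_init();
                MODEL_CURVNET_RESTORE();
        }
}

/* An entry point: set the context, do the work, restore, return. */
int
model_entry(struct vnet_model *vp, int delta)
{
        MODEL_CURVNET_SET(vp);
        *model_v_foo() += delta;
        int result = *model_v_foo();
        MODEL_CURVNET_RESTORE();
        return (result);
}
```

Calling model_init_all() and then model_entry(&model_vnets[1], 5) changes only the second instance's copy of foo, and the current context is back to NULL afterwards. The real list walk additionally needs the list lock, which this model leaves out.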
--------------010104040109010004090600--

From owner-freebsd-pf@FreeBSD.ORG Fri Aug 21 21:36:35 2009
Message-ID: <4A8EFD59.6070106@telenet.net>
Date: Fri, 21 Aug 2009 16:02:33 -0400
From: barowc@telenet.net
To: freebsd-pf@freebsd.org
Subject: ALTQ and bandwidth limiting

I am about to install a machine at a co-location facility and I would like to limit the bandwidth on my interface to match the bandwidth I have contracted for. This is necessary because the connection I will have is not limited, but my usage is. I have added the following two lines to my pf.conf file, but this does not appear to be working.

altq on $External cbq bandwidth 1Mb queue { std }
queue std bandwidth 1Mb cbq(default)

I assume all rules on $External will now default to this queue, and should be limited to 1Mb. I have even specified the queue on the External rules, and there is still no limiting.

Any help would be appreciated,
Chris

From owner-freebsd-pf@FreeBSD.ORG Sat Aug 22 23:53:22 2009
Message-Id: <200908230132343.SM01728@W500.Go2France.com>
Date: Sat, 22 Aug 2009 18:33:22 -0500
From: Len Conrad
To: freebsd-pf@freebsd.org
Subject: something like bruteblock for pf?

I've used bruteblock, which manages ipfw, for blocking SMTP attackers and reducing SMTP connects by tens of thousands per day.

But bruteblock, which hasn't been updated in 3 years, logged a lot of errors like "failed to ...", which didn't seem to affect its effectiveness but were concerning, and ugly.

Does anybody know of anything similar for pf?

thanks
Len