From owner-freebsd-pf@freebsd.org  Tue Dec 13 15:18:15 2016
Subject: Re: Poor PF performance with 2.5k rdr's
From: Ian FREISLICH <ian.freislich@capeaugusta.com>
To: freebsd-pf@freebsd.org
Date: Tue, 13 Dec 2016 10:18:09 -0500
Message-ID: <5d8f9f65-bb3a-ef25-0fbe-bfc28b9025df@capeaugusta.com>
List-Id: "Technical discussion and general questions about packet filter (pf)"

Chris,

It's been a fairly long time since I last ran a FreeBSD router in a
production environment (10-CURRENT at the time).  tcp.sendspace and
tcp.recvspace will have no effect on forwarding performance.
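For what it's worth, those two sysctls only size the socket buffers of TCP
connections terminated on the box itself (the iperf endpoints, for example);
forwarded packets never touch them.  A quick way to confirm what a router is
actually running with, shown here with the values Chris lists further down
in this thread, purely for illustration:

  # sysctl net.inet.tcp.sendspace net.inet.tcp.recvspace
  net.inet.tcp.sendspace: 65536
  net.inet.tcp.recvspace: 65536
  # sysctl net.inet.ip.forwarding net.inet.ip.fastforwarding
  net.inet.ip.forwarding: 1
  net.inet.ip.fastforwarding: 1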
Digging around in my email I've found some of my config:

--- /etc/pf.conf ---
# Options
# ~~~~~~~
set timeout { \
    adaptive.start 900000, \
    adaptive.end 1800000 \
}
set block-policy return
set state-policy if-bound
set optimization normal
set ruleset-optimization basic
set limit states 1500000
set limit frags 40000
set limit src-nodes 150000
---

--- /etc/sysctl.conf ---
net.inet.ip.fastforwarding=1
net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1
net.inet.carp.preempt=1
net.inet.icmp.icmplim_output=0
net.inet.icmp.icmplim=0
kern.random.sys.harvest.interrupt=0
kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.point_to_point=0
net.route.netisr_maxqlen=8192
---

--- /boot/loader.conf ---
console="comconsole"
net.isr.maxthreads="8"
net.isr.defaultqlimit="4096"
net.isr.maxqlimit="81920"
net.isr.direct="0"
net.isr.direct_force="0"
kern.ipc.nmbclusters="262144"
kern.maxusers="1024"
hw.bce.rx_pages="8"
hw.bce.tx_pages="8"
---

Our pfctl -s info at the time:

State Table                          Total             Rate
  current entries                   330022
  searches                       516720212        91910.4/s
  inserts                         24545254         4365.9/s
  removals                        24215232         4307.2/s
Counters
  match                           66166232        11769.2/s

We were using a different NIC from yours and eventually moved to ixgb(4)
and bxe(4) NICs to handle the traffic, but the principle is the same:
tune the queues (a sketch of the corresponding igb(4) tunables for the
I350 is appended at the end of this message).  We didn't have as many rdr
rules as you do, but the rule set is only searched linearly when there is
no matching state in the state table, which means the rules are walked
once for the first packet of each flow.

In my testing, the other large contributor to forwarding rate is L1 cache
size.  Intel CPUs have traditionally had very small L1 caches, ranging
from 12 KB to 32 KB, and the size is almost never quoted in marketing or
comparison material.  Your CPU has 32 KB of L1 data and 32 KB of L1
instruction cache per core.  You may want to try disabling HT, if that's
still possible these days, to reduce L1 contention between the two
hardware threads on each core.  I may be talking total rubbish regarding
HT and cache architecture, but I think it's worth a try.

Ian

--
Ian Freislich
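A sketch of one way to act on that last point about the linear rule scan:
split the redirects across translation anchors with coarse match criteria,
so the first packet of a flow only evaluates the subset whose anchor
matches instead of all ~2500 rdr rules.  This is untested against either
configuration in this thread; the anchor names, networks and file paths
below are purely illustrative:

--- /etc/pf.conf (sketch) ---
# Translation rules inside an anchor are evaluated only when the
# anchor's own criteria match, so each new flow walks one small
# subset of the redirects instead of the whole list.
rdr-anchor "rdr-web"  inet proto tcp from any to 1.2.3.0/24
rdr-anchor "rdr-mail" inet proto tcp from any to 1.2.4.0/24
load anchor "rdr-web"  from "/etc/pf.rdr-web.conf"
load anchor "rdr-mail" from "/etc/pf.rdr-mail.conf"
---

--- /etc/pf.rdr-web.conf (sketch) ---
rdr pass inet proto tcp from any to 1.2.3.4 port 1235 -> 192.168.0.100 port 1235
# ...the rest of the redirects for that network...
---

After loading, the rules in each anchor can be inspected with
"pfctl -a rdr-web -sn" to verify the split.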
On 12/11/16 11:22, chris g wrote:
> Hello,
>
> I've decided to write here, as we had no luck troubleshooting PF's
> poor performance on a 1GE interface.
>
> Network scheme, given as simply as possible:
>
> ISP <-> BGP ROUTER <-> PF ROUTER with many rdr rules <-> LAN
>
> The problem is reproducible on any of the PF ROUTER's connections - to
> the LAN and to the BGP ROUTER.
>
> Both BGP and PF routers' OS versions, tunables and hardware:
>
> Hardware: E3-1230 V2 with HT on, 8GB RAM, ASUS P8B-E, NICs: Intel I350
> on PCIe
>
> FreeBSD versions tested: 9.2-RELEASE amd64 with custom kernel,
> 10.3-STABLE (compiled 4th Dec 2016) amd64 with GENERIC kernel.
>
> Basic tunables (for 9.2-RELEASE):
> net.inet.ip.forwarding=1
> net.inet.ip.fastforwarding=1
> kern.ipc.somaxconn=65535
> net.inet.tcp.sendspace=65536
> net.inet.tcp.recvspace=65536
> net.inet.udp.recvspace=65536
> kern.random.sys.harvest.ethernet=0
> kern.random.sys.harvest.point_to_point=0
> kern.random.sys.harvest.interrupt=0
> kern.polling.idle_poll=1
>
> The BGP router doesn't have any firewall.
>
> PF options on the PF router are:
> set state-policy floating
> set limit { states 2048000, frags 2000, src-nodes 384000 }
> set optimization normal
>
> Problem description:
> We are experiencing low throughput when PF is enabled with all the
> rdr's.  If 'skip' is set on the benchmarked interface, or the rdr rules
> are commented out (not present), the bandwidth is flawless.  In PF there
> is no scrubbing done; most of the roughly 2500 rdr rules look like this
> (please note that no interface is specified, and that is intentional):
>
> rdr pass inet proto tcp from any to 1.2.3.4 port 1235 -> 192.168.0.100 port 1235
>
> All measurements were taken using iperf 2.0.5 with options "-c <server>"
> or "-c <server> -m -t 60 -P 8" on the client side and "-s" on the server
> side.  We changed directions too.
> Please note that this is a production environment and there was some
> other traffic on the benchmarked interfaces (let's say 20-100Mbps)
> during both tests, so iperf won't show a full Gigabit.  There is no
> networking equipment between 'client' and 'server' - just two NICs
> directly connected with a Cat6 cable.
>
> Without further ado, here are the benchmark results:
>
> Server's PF enabled with fw rules but without rdr rules:
> root@client:~ # iperf -c server
> ------------------------------------------------------------
> Client connecting to server, TCP port 5001
> TCP window size: 65.0 KByte (default)
> ------------------------------------------------------------
> [  3] local clients_ip port 51361 connected with server port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
>
> Server's PF enabled with fw rules and around 2500 redirects present:
> root@client:~ # iperf -c server
> ------------------------------------------------------------
> Client connecting to server, TCP port 5001
> TCP window size: 65.0 KByte (default)
> ------------------------------------------------------------
> [  3] local clients_ip port 45671 connected with server port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.0 sec   402 MBytes   337 Mbits/sec
>
> That much of a difference is 100% reproducible on the production
> environment.  Performance depends on the time of day and night: the
> result is 160-400Mbps with the rdr rules present and always above
> 900Mbps with the rdr rules disabled.
>
> Some additional information:
>
> # pfctl -s info
> Status: Enabled for 267 days 10:25:22           Debug: Urgent
>
> State Table                          Total             Rate
>   current entries                   132810
>   searches                      5863318875          253.8/s
>   inserts                        140051669            6.1/s
>   removals                       139918859            6.1/s
> Counters
>   match                         1777051606           76.9/s
>   bad-offset                             0            0.0/s
>   fragment                             191            0.0/s
>   short                                518            0.0/s
>   normalize                              0            0.0/s
>   memory                                 0            0.0/s
>   bad-timestamp                          0            0.0/s
>   congestion                             0            0.0/s
>   ip-option                           4383            0.0/s
>   proto-cksum                            0            0.0/s
>   state-mismatch                     52574            0.0/s
>   state-insert                         172            0.0/s
>   state-limit                            0            0.0/s
>   src-limit                              0            0.0/s
>   synproxy                               0            0.0/s
>
> # pfctl -s states | wc -l
> 113705
>
> # pfctl -s memory
> states        hard limit  2048000
> src-nodes     hard limit   384000
> frags         hard limit     2000
> tables        hard limit     1000
> table-entries hard limit   200000
>
> # pfctl -s Interfaces | wc -l
> 75
>
> # pfctl -s rules | wc -l
> 1226
>
> In our opinion the hardware is not too weak, as we see only 10-30% CPU
> usage, and during the benchmark it doesn't go to 100%.  Not even a
> single vcore is filled up to 100%.
>
> I would be really grateful if someone could point me in the right
> direction.
>
> Thank you,
> Chris
> _______________________________________________
> freebsd-pf@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-pf
> To unsubscribe, send any mail to "freebsd-pf-unsubscribe@freebsd.org"

--
Cape Augusta Digital Properties, LLC, a Cape Augusta Company
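Following up on the queue-tuning point above for the hardware Chris
describes (Intel I350, so igb(4) rather than the bce(4)/bxe(4) settings
shown earlier): the equivalent knobs live under hw.igb.  A rough sketch
for /boot/loader.conf, assuming a FreeBSD 9.x/10.x igb(4) system; the
values are untested starting points, not recommendations, and
machdep.hyperthreading_allowed is only relevant if you want to try
running without HT and the tunable is still available on your release:

--- /boot/loader.conf (sketch) ---
hw.igb.num_queues="4"               # one RX/TX queue pair per physical core (0 = autodetect)
hw.igb.rxd="4096"                   # receive descriptors per queue
hw.igb.txd="4096"                   # transmit descriptors per queue
hw.igb.max_interrupt_rate="16000"   # per-queue interrupt moderation ceiling
hw.igb.rx_process_limit="-1"        # don't cap packets processed per interrupt
machdep.hyperthreading_allowed="0"  # optional: disable HT to ease L1 cache contention
---

Change one knob at a time and re-run the iperf tests between changes so
the effect of each setting is visible on its own.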