Date:      Mon, 12 Dec 2016 13:23:52 +0100
From:      Jan Bramkamp <crest@rlwinm.de>
To:        freebsd-pf@freebsd.org
Subject:   Re: Poor PF performance with 2.5k rdr's
Message-ID:  <5c14294a-66e8-1f50-0ef0-a5d77a9b656b@rlwinm.de>
In-Reply-To: <CANFey=-4bEtrBatWAQdUWQofTHUy2XseKyokpc3UmpxiUu-GMA@mail.gmail.com>
References:  <CANFey=-4bEtrBatWAQdUWQofTHUy2XseKyokpc3UmpxiUu-GMA@mail.gmail.com>



On 11/12/2016 17:22, chris g wrote:
> Hello,
>
> I've decided to write here, as we have had no luck troubleshooting PF's
> poor performance on a 1 GbE interface.
>
> The network scheme, given as simply as possible, is:
>
> ISP <-> BGP ROUTER <-> PF ROUTER with many rdr rules <-> LAN
>
> The problem is reproducible on any of the PF ROUTER's connections - both to the LAN and to the BGP ROUTER.
>
>
> OS versions, tunables and hardware of both the BGP and PF routers:
>
> Hardware: E3-1230 V2 with HT on, 8GB RAM, ASUS P8B-E, NICs: Intel I350 on PCIe
>
> FreeBSD versions tested: 9.2-RELEASE amd64 with a custom kernel,
> 10.3-STABLE (compiled 4th Dec 2016) amd64 with the GENERIC kernel.
>
> Basic tunables (for 9.2-RELEASE):
> net.inet.ip.forwarding=1
> net.inet.ip.fastforwarding=1
> kern.ipc.somaxconn=65535
> net.inet.tcp.sendspace=65536
> net.inet.tcp.recvspace=65536
> net.inet.udp.recvspace=65536
> kern.random.sys.harvest.ethernet=0
> kern.random.sys.harvest.point_to_point=0
> kern.random.sys.harvest.interrupt=0
> kern.polling.idle_poll=1
>
> The BGP router doesn't have any firewall.
>
> The PF options on the PF router are:
> set state-policy floating
> set limit { states 2048000, frags 2000, src-nodes 384000 }
> set optimization normal
>
>
> Problem description:
> We are experiencing low throughput when PF is enabled with all the
> rdr rules. If 'skip' is set on the benchmarked interface, or the rdr
> rules are commented out (not present), the bandwidth is flawless. No
> scrubbing is done in PF; most of the roughly 2500 rdr rules look like
> this (note that no interface is specified, which is intentional):
>
> rdr pass inet proto tcp from any to 1.2.3.4 port 1235 -> 192.168.0.100 port 1235
>
> All measurements were taken using iperf 2.0.5 with the options "-c <IP>"
> or "-c <IP> -m -t 60 -P 8" on the client side and "-s" on the server
> side. We tested both directions as well.
> Please note that this is a production environment and there was some
> other traffic (say 20-100 Mbps) on the benchmarked interfaces during
> both tests, so iperf won't show the full gigabit. There is no networking
> equipment between 'client' and 'server' - just 2 NICs directly
> connected with a Cat6 cable.
>
> Without further ado, here are benchmark results:
>
> server's PF enabled with fw rules but without rdr rules:
>   root@client:~ # iperf -c server
> ------------------------------------------------------------
> Client connecting to server, TCP port 5001
> TCP window size: 65.0 KByte (default)
> ------------------------------------------------------------
> [  3] local clients_ip port 51361 connected with server port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
>
>
>
> server's PF enabled with fw rules and around 2500 redirects present:
> root@client:~ # iperf -c server
> ------------------------------------------------------------
> Client connecting to server, TCP port 5001
> TCP window size: 65.0 KByte (default)
> ------------------------------------------------------------
> [  3] local clients_ip port 45671 connected with server port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.0 sec   402 MBytes   337 Mbits/sec
>
>
> That much of a difference is 100% reproducible in the production environment.
>
> Performance depends on the time of day: the result is 160-400 Mbps
> with rdr rules present and always above 900 Mbps with rdr rules
> disabled.
>
>
> Some additional information:
>
> # pfctl -s info
> Status: Enabled for 267 days 10:25:22         Debug: Urgent
>
> State Table                          Total             Rate
>   current entries                   132810
>   searches                      5863318875          253.8/s
>   inserts                        140051669            6.1/s
>   removals                       139918859            6.1/s
> Counters
>   match                         1777051606           76.9/s
>   bad-offset                             0            0.0/s
>   fragment                             191            0.0/s
>   short                                518            0.0/s
>   normalize                              0            0.0/s
>   memory                                 0            0.0/s
>   bad-timestamp                          0            0.0/s
>   congestion                             0            0.0/s
>   ip-option                           4383            0.0/s
>   proto-cksum                            0            0.0/s
>   state-mismatch                     52574            0.0/s
>   state-insert                         172            0.0/s
>   state-limit                            0            0.0/s
>   src-limit                              0            0.0/s
>   synproxy                               0            0.0/s
>
> # pfctl -s states | wc -l
>   113705
>
> # pfctl  -s memory
> states        hard limit  2048000
> src-nodes     hard limit   384000
> frags         hard limit     2000
> tables        hard limit     1000
> table-entries hard limit   200000
>
> # pfctl -s Interfaces|wc -l
>       75
>
> # pfctl -s rules | wc -l
>     1226
>
>
> In our opinion the hardware is not too weak, as we see only 10-30% CPU
> usage overall and it doesn't reach 100% during the benchmark. Not even
> a single vcore is loaded to 100%.
>
>
> I would be really grateful if someone could point me in the right direction.

PF uses a linear search (with some optimizations to skip over rules
which can't match) to establish new flows. If your PF config is really
that simple, give IPFW a try. While PF has a much nicer syntax, IPFW
supports more powerful tables. IPFW tables are key-value maps, and the
value can be used as an argument to most actions. That may reduce your
2500 rule lookups to a single table lookup. If you can afford to lose
the source IP and port, you could also use a userspace TCP proxy.
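
Something along these lines could work. This is untested and from
memory, the table name, nat instance numbers and addresses are only
placeholders, and the actual rewriting would still go through ipfw's
in-kernel libalias NAT, so treat it as a sketch to benchmark rather
than a drop-in ruleset:

# One nat instance per internal target, carrying its redirect_port
# entries (internal address/port <- external address/port).
ipfw nat 100 config redirect_port tcp 192.168.0.100:1235 1.2.3.4:1235
ipfw nat 101 config redirect_port tcp 192.168.0.101:2222 1.2.3.5:2222

# Key a flow table on the external destination IP and port; the value
# is the nat instance to hand matching packets to.
ipfw table rdrmap create type flow:dst-ip,dst-port valtype nat
ipfw table rdrmap add 1.2.3.4,1235 100
ipfw table rdrmap add 1.2.3.5,2222 101

# A single rule replaces the ~2500 linearly evaluated rdr rules: the
# table lookup selects the nat instance via tablearg.
ipfw add 1000 nat tablearg tcp from any to any in flow 'table(rdrmap)'

Return traffic has to pass through the same nat instance to get
translated back, e.g. via a second flow table keyed on the internal
source address and port, so the full ruleset needs a bit more thought
than this.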


