From: chris g <cgodspd@gmail.com>
Date: Sun, 11 Dec 2016 17:22:37 +0100
Subject: Poor PF performance with 2.5k rdr's
To: freebsd-pf@freebsd.org

Hello,

I've decided to write here, as we had no luck troubleshooting PF's poor
performance on a 1GE interface.

The network scheme, given as simply as possible, is:

ISP <-> BGP ROUTER <-> PF ROUTER with many rdr rules <-> LAN

The problem is reproducible on any of the PF ROUTER's connections - to the
LAN and to the BGP ROUTER.

OS versions, tunables and hardware of both the BGP and PF routers:

Hardware: E3-1230 V2 with HT on, 8 GB RAM, ASUS P8B-E, NICs: Intel I350 on PCIe
FreeBSD versions tested: 9.2-RELEASE amd64 with a custom kernel, and
10.3-STABLE (compiled 4 Dec 2016) amd64 with the GENERIC kernel.
Basic tunables (for 9.2-RELEASE):

net.inet.ip.forwarding=1
net.inet.ip.fastforwarding=1
kern.ipc.somaxconn=65535
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
net.inet.udp.recvspace=65536
kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.point_to_point=0
kern.random.sys.harvest.interrupt=0
kern.polling.idle_poll=1

The BGP router doesn't have any firewall. The PF options on the PF router are:

set state-policy floating
set limit { states 2048000, frags 2000, src-nodes 384000 }
set optimization normal

Problem description:

We are experiencing low throughput when PF is enabled with all the rdr's. If
'skip' is set on the benchmarked interface, or the rdr rules are commented out
(not present), the bandwidth is flawless. No scrubbing is done in PF. Most of
the roughly 2500 rdr rules look like the one below; please note that no
interface is specified, and that is intentional:

rdr pass inet proto tcp from any to 1.2.3.4 port 1235 -> 192.168.0.100 port 1235

All measurements were taken using iperf 2.0.5 with options "-c server" or
"-c server -m -t 60 -P 8" on the client side and "-s" on the server side. We
tested both directions as well. Please note that this is a production
environment and there was some other traffic (roughly 20-100 Mbps) on the
benchmarked interfaces during both tests, so iperf won't show a full Gigabit.
There is no networking equipment between 'client' and 'server' - just two NICs
connected directly with a Cat6 cable.

Without further ado, here are the benchmark results.

Server's PF enabled with fw rules but without rdr rules:

root@client:~ # iperf -c server
------------------------------------------------------------
Client connecting to server, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[  3] local clients_ip port 51361 connected with server port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec

Server's PF enabled with fw rules and around 2500 redirects present:

root@client:~ # iperf -c server
------------------------------------------------------------
Client connecting to server, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[  3] local clients_ip port 45671 connected with server port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   402 MBytes   337 Mbits/sec

That much of a difference is 100% reproducible in the production environment.
Performance varies with the time of day and night: the result is 160-400 Mbps
with the rdr rules present and always above 900 Mbps with the rdr rules
disabled.

Some additional information:

# pfctl -s info
Status: Enabled for 267 days 10:25:22          Debug: Urgent

State Table                          Total             Rate
  current entries                   132810
  searches                      5863318875          253.8/s
  inserts                        140051669            6.1/s
  removals                       139918859            6.1/s
Counters
  match                         1777051606           76.9/s
  bad-offset                             0            0.0/s
  fragment                             191            0.0/s
  short                                518            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           4383            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                     52574            0.0/s
  state-insert                         172            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

# pfctl -s states | wc -l
  113705

# pfctl -s memory
states        hard limit  2048000
src-nodes     hard limit   384000
frags         hard limit     2000
tables        hard limit     1000
table-entries hard limit   200000

# pfctl -s Interfaces | wc -l
      75

# pfctl -s rules | wc -l
    1226

In our opinion the hardware is not too weak, as CPU usage sits at only 10-30%
and does not reach 100% during the benchmark. Not even a single vcore gets
anywhere near 100% usage.
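For reference, here is a condensed sketch of the two pf.conf variants we are
comparing. It is illustration only: the addresses, ports and the em1 interface
name are made-up placeholders, not our real configuration; the real ruleset
simply repeats the rdr line, one per forwarded address/port pair, about 2500
times.

set state-policy floating
set limit { states 2048000, frags 2000, src-nodes 384000 }
set optimization normal

# "fast" benchmark case: bypass pf entirely on the benchmarked interface
# set skip on em1

# "slow" benchmark case: ~2500 one-port redirects of this shape,
# intentionally with no interface specified
rdr pass inet proto tcp from any to 1.2.3.4 port 1235 -> 192.168.0.100 port 1235
rdr pass inet proto tcp from any to 1.2.3.4 port 1236 -> 192.168.0.100 port 1236
rdr pass inet proto tcp from any to 1.2.3.5 port 8080 -> 192.168.0.101 port 8080
# ... and so on for the remaining redirects

# plus the ordinary filter rules (about 1226 in total, per pfctl -s rules)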
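And, to be explicit about the measurement itself, these are the iperf 2.0.5
invocations described above ('server' here stands for the box at the other end
of the cable):

# on the server side
iperf -s

# on the client side: single stream, then 8 parallel streams for 60 s
iperf -c server
iperf -c server -m -t 60 -P 8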
I would be really grateful if someone could point me in the right direction.

Thank you,
Chris