To: Ermal Luçi
From: Ian FREISLICH
Cc: pf@freebsd.org
Date: Fri, 07 Sep 2012 14:05:11 +0200
Subject: Re: [HEADS UP] merging projects/pf into head

Ermal Luçi wrote:
> > - the "pf: state key linking mismatch" which affects pf as far back
> >   as we've been prepared to test (FreeBSD-8.0). Although it only
> >   became visible in the logs in -CURRENT before 9-RELEASE with the
> >   pf import then. It manifests as connections stalling randomly.
> >
> This has been an issue since new pf(4) import.

My contention is that this issue is also present in earlier pf. It's
just not logged verbosely:

[firewall1.jnb1] ~ # uname -a
FreeBSD firewall1.jnb1.gp-online.net 8.1-RELEASE FreeBSD 8.1-RELEASE #23: Tue Aug 7 20:21:54 SAST 2012 ianf@firewall1.jnb1.gp-online.net:/usr/obj/usr/src/sys/FIREWALL amd64
[firewall1.jnb1] ~ # pfctl -s inf
Status: Enabled for 30 days 16:27:26          Debug: Urgent

State Table                          Total             Rate
  current entries                   377102
  searches                    126189706387        47596.4/s
  inserts                       6358571792         2398.3/s
  removals                      6358194690         2398.2/s
Counters
  match                        23798723897         8976.4/s
  bad-offset                             0            0.0/s
  fragment                           29807            0.0/s
  short                              76362            0.0/s
  normalize                            234            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                          78290            0.0/s
  proto-cksum                     11023818            4.2/s
  state-mismatch                   4799367            1.8/s
  state-insert                       75295            0.0/s
  state-limit                           22            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

Every time the state-mismatch counter increments, the connection
stalls. This manifests as web pages sometimes needing to be reloaded
in order to complete downloading, or ssh connections being reset.
While 4799367 is a small fraction of the total searches, the chance
of your flow being bitten is multiplied by each hop through a FreeBSD
router running pf. While composing this email, the state-mismatch
counter increased by 11589.

We don't see this issue at all with Gleb's patches applied, and
forwarding performance is greatly improved. Whatever happens, I'd like
a way forward to be found, because pf deployed at the scale we're using
it is unusable post 2011-06-28 (and not ideal before).

> > There's not been a fix since it was first reported. We're seeing
> > 0.08% of our connections dropped on the floor or about 4 per second.
> > As a result, we've been seriously considering replacing our FreeBSD
> > routers.
>
> I have missed the report of this, can you point to details?

http://www.freebsd.org/cgi/query-pr.cgi?pr=163208

Comes to mind. I'm sure there were some earlier reports, but I can't
find them in a hurry. I'm also pretty sure there have been reports
on current@. I posted to current@

http://www.freebsd.org/cgi/getmsg.cgi?fetch=164206+169604+/usr/local/www/db/text/2012/freebsd-current/20120812.freebsd-current

Which is how I came to this list on mail from Gleb.

I can tell you that this is not peculiar to 9 and later. pf pre-9
was just silent about dropping the flows although the problem occurs
less frequently.

Ian

--
Ian Freislich
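
The arithmetic behind the counters quoted in this message can be
sketched as follows. This is only a rough illustration, not the
author's own calculation: it assumes that state-mismatch divided by
inserts approximates the per-flow stall rate on one router, and that
each pf hop stalls flows independently at that same rate. Both are
assumptions made here, and the function names are hypothetical.

```python
# Counters copied from the `pfctl -s inf` output quoted in the message.
STATE_MISMATCH = 4_799_367
INSERTS = 6_358_571_792
SEARCHES = 126_189_706_387


def mismatch_fraction(mismatches: int, total: int) -> float:
    """Fraction of `total` events that ended in a state mismatch."""
    return mismatches / total


def stall_probability(per_hop: float, hops: int) -> float:
    """Chance that at least one of `hops` routers stalls the flow.

    Assumes (hypothetically) that every hop fails independently with
    the same probability `per_hop`.
    """
    return 1.0 - (1.0 - per_hop) ** hops


if __name__ == "__main__":
    per_search = mismatch_fraction(STATE_MISMATCH, SEARCHES)
    per_flow = mismatch_fraction(STATE_MISMATCH, INSERTS)
    print(f"mismatches per state search: {per_search:.6%}")
    print(f"mismatches per state insert: {per_flow:.4%}")
    # Show how the risk grows with the number of pf routers in the path.
    for hops in (1, 2, 4, 8):
        print(f"{hops} hop(s): {stall_probability(per_flow, hops):.4%}")
```

Under these assumptions the per-insert ratio lands in the same range
as the 0.08% figure quoted above, and for small rates the risk grows
roughly linearly with the number of pf hops.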