From owner-freebsd-pf@FreeBSD.ORG Tue May 3 09:16:23 2011
Date: Tue, 3 May 2011 02:16:19 -0700
From: Jeremy Chadwick
To: Daniel Hartmeier
Cc: freebsd-stable@freebsd.org, freebsd-pf@freebsd.org
Subject: Re: RELENG_8 pf stack issue (state count spiraling out of control)
Message-ID: <20110503091619.GA39329@icarus.home.lan>
References: <20110503015854.GA31444@icarus.home.lan> <20110503084800.GB9657@insomnia.benzedrine.cx>
In-Reply-To: <20110503084800.GB9657@insomnia.benzedrine.cx>

On Tue, May 03, 2011 at 10:48:00AM +0200, Daniel Hartmeier wrote:
> On Mon, May 02, 2011 at 06:58:54PM -0700, Jeremy Chadwick wrote:
>
> > Status: Enabled for 76 days 06:49:10           Debug: Urgent
>
> > The "pf uptime" shown above, by the way, matches system uptime.
>
> > ps -axl
> >
> >   UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
> >     0   422     0   0 -16  0     0     0 pftm   DL    ?? 1362773081:04.00 [pfpurge]
>
> This looks weird, too. 1362773081 minutes would be >2500 years.
>
> Usually, you should see [idle] with almost uptime in minutes, and
> [pfpurge] with much less, like in
>
>   # uptime
>   10:22AM up 87 days, 19:36, 1 user, load averages: 0.00, 0.03, 0.05
>   # echo "((87*24)+19)*60+36" | bc
>   126456
>
>   # ps -axl
>   UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
>     0     7     0   0  44  0     0     8 pftm   DL    ??    0:13.16 [pfpurge]
>     0    11     0   0 171  0     0     8 -      RL    ?? 124311:23.04 [idle]

Agreed -- and that's exactly how things look on the same box right now:

$ ps -axl | egrep 'UID|pfpurge|idle'
  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
    0    11     0   0 171  0     0    64 -      RL    ?? 2375:15.91 [idle]
    0   422     0   0 -16  0     0    16 pftm   DL    ??    0:00.28 [pfpurge]

The ps -axl output I provided earlier came from /var/crash/core.0.txt.
So it's interesting that ps -axl as well as vmstat -i both showed
something off-the-wall. I wonder if this can happen when within ddb?
Unsure.

I do have the core from "call doadump", so I should be able to go back
and re-examine it with kgdb. I just wish I knew what to poke around
looking for in there.

Sadly I don't see a way with bsnmpd(8) to monitor things like interrupt
usage, etc. otherwise I'd be graphing that. The more monitoring the
better; at least then I could say "wow, interrupts really did shoot
through the roof -- the box went crazy!" and RMA the thing. :-)

> How is time handled on your machine? ntpdate on boot and then ntpd?

Yep, you got it:

ntpdate_enable="yes"
ntpdate_config="/conf/ME/ntp.conf"
ntpd_enable="yes"
ntpd_config="/conf/ME/ntp.conf"

I don't use ntpd_sync_on_start because I've never had reason to. I
always set the system/BIOS clock to UTC time when building a system.
I use ntpd's complaint about excessive offset as an indicator that
something bad happened.

/conf/ME/ntp.conf on this machine syncs from another on the private
network (em1) only, and that machine syncs from a series of
geographically-diverse stratum 2 servers and one stratum 1 server. I've
never seen high delays, offsets, or jitter using "ntpq -c peers" on any
box we have.

Actual timecounters (not time itself) are handled by ACPI-safe or
ACPI-fast (varies per boot; I've talked to jhb@ about this before and
it's normal). powerd is in use on all our systems, and this box also
uses processor sleep states (lowest state = C2; the physical CPU only
supports C0-C2 and I wouldn't go any lower than that anyway :-) ).
The /boot/loader.conf entries that pertain to it:

# Enable use of P-state CPU frequency throttling.
# http://wiki.freebsd.org/TuningPowerConsumption
hint.p4tcc.0.disabled="1"
hint.acpi_throttle.0.disabled="1"

There are numerous other systems exactly like this one (literally the
same model of hardware, RAM amount, CPU model, BIOS version and
settings, and system configuration, including pf) that have much higher
load and fire many more interrupts (particularly the NFS server!), yet
haven't exhibited any problems. This box had an uptime of 72 days, and
prior to that around 100 (before being taken down for world/kernel
upgrades). All machines have ECC RAM too, and MCA/MCE is in use.

You don't know how badly I'd love to blame this on a hardware issue
(it's always possible in some way or another), but the way this
manifested itself was extremely specific.

The problem could be super rare and something triggered it that hasn't
been seen before by developers. So far there's only 1 other user who
has seen this behaviour, but his was attributed to use of "reassemble
tcp", which I wasn't using; so the true problem could still be out
there. I feel better knowing I'm not the only one who's seen this
oddity.
Since his post, I've removed all scrub rules from all of our machines
as a precaution. If it ever happens again we'll have one more thing to
safely rule out.

We have other machines (different hardware, running RELENG_7 i386)
which have had 1+ year uptimes also using pf, so the possibility of
just some "crazy fluke" is plausible to me.

> Any manual time changes since the last boot?

None, unless adjkerntz did something during the PST->PDT switchover,
but that would manifest itself as a +1 hour offset difference. Since
the machine rebooted, the system synced its time without issue and well
within an acceptable delta (1.075993 sec). I did not power-cycle the
box during any of this; pure soft reboots.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |