Date:      Mon, 24 Jan 2011 20:14:14 +0100
From:      Pawel Tyll <ptyll@nitronet.pl>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        Brandon Gooch <jamesbrandongooch@gmail.com>, freebsd-ipfw@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>, freebsd-net@freebsd.org
Subject:   Re: [Panic] Dummynet/IPFW related recurring crash.
Message-ID:  <814029043.20110124201414@nitronet.pl>
In-Reply-To: <AANLkTim87RE_ZAR3ZKwbJ6KW3xCPze=76fQ9e8Qo1v_D@mail.gmail.com>
References:  <201953550.20110106221821@nitronet.pl> <AANLkTikozeXLQtePk1niH-N58n9f6xAVGVBTmLPux2e-@mail.gmail.com> <57312439.20110107171430@nitronet.pl> <AANLkTikZAHNRu_3=i5_0FTNOMua8Kr2bW7H%2BRJZ4c=PW@mail.gmail.com> <13247006.20110124020848@nitronet.pl> <AANLkTi=UJHtvEm5k-Cxrtdr-2k_3_hyR_F8GuUwmHmE4@mail.gmail.com> <AANLkTim87RE_ZAR3ZKwbJ6KW3xCPze=76fQ9e8Qo1v_D@mail.gmail.com>

> Just replying so you know I'm seeing it, but something that takes 14 days to
> even happen
> is NOT going to be an easy one to find. As Brandon said, all the info you
> can provide please.
Here's the dump in case you've not seen it before. Somehow 8.1-RELEASE
managed to make a proper dump, which became impossible later on.

http://www.freebsd.org/cgi/query-pr.cgi?pr=152360

I strongly feel that it's related to dummynet. Not only does the panic
always seem to point at it, but this is also the only one of four
identical machines that crashes (and also the only one that uses
dummynet). From its neighbor:
up 166 days, 18:45 (FreeBSD 8.1-RELEASE)

There's also the problem of the machine failing to reboot after a panic,
and failing to dump properly. I think I have one more spare box lying
around somewhere, so I will look into it.

I can trace all this panic business back to one thing I started doing:

# ipfw pipe list | grep flows | wc -l
    2318

# crontab -l
(...)
*/1 * * * * /root/fw/pipestats.sh
(...)

# cat /root/fw/pipestats.sh
/sbin/ipfw pipe list > pipestats-`date "+%Y%m%d-%H%M%S"`

Maybe something overflows here? Don't know :(
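
A minimal sketch of how the per-minute job could also record the flow
count over time, to see whether it grows in the days before a panic.
The function names and the log path are hypothetical and not part of
the setup described above; only /sbin/ipfw and the "grep flows" count
come from the commands shown earlier.

```shell
#!/bin/sh
# Hypothetical helpers; names and paths are assumptions for illustration.

# Count the lines containing "flows" in pipe-list output read from stdin.
# Equivalent to the `grep flows | wc -l` pipeline used above.
count_flows() {
    grep -c flows
}

# Append one "timestamp count" line to the log file named by $1.
# Pipe-list output is read from stdin.
log_flow_count() {
    logfile=$1
    count=$(count_flows)
    printf '%s %s\n' "$(date '+%Y%m%d-%H%M%S')" "$count" >> "$logfile"
}

# Example invocation (as root, e.g. from the same cron job):
#   /sbin/ipfw pipe list | log_flow_count /var/log/pipeflows.log
```

A single growing log of counts is easier to compare against the panic
times than thousands of separate snapshot files.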

Ruleset itself seems to be as simple as it gets:

00010       1397       106004 deny ip from any to any not antispoof in
00020   70490095   5395475808 fwd [...] ip from table(60) to table(61)
00050       3741       173481 allow tcp from any to [...] dst-port 53
00051    1868319    195628380 allow udp from any to [...]
00059      19993      1043277 deny ip from any to [...]
00100     603380     33725224 deny ip from any to table(10) dst-port 131-139,445
00102      21201       874326 fwd [...] tcp from table(1) to not table(5) dst-port 80
00103          0            0 fwd [...] tcp from table(2) to not table(5) dst-port 80
00104         31         2196 fwd [...] tcp from table(3) to not table(5)
00105       4577       296736 deny ip from table(3) to not table(5)
30000  299626026 144893738712 pipe tablearg ip from table(100) to any in
30001  349984632 312762616666 pipe tablearg ip from any to table(101) out
34900    6724440   1768229912 skipto 35001 ip from table(10) to table(10)
35000  344337771 135015696767 fwd [...] ip from 192.168.0.0/16 to not 192.168.0.0/16
65534 1118791481 888359380351 allow ip from any to any
65535          0            0 allow ip from any to any

The two-week interval is rather strange. It never happens much earlier,
and I don't remember panics much later. Too bad I didn't install
uptimed back then; I would have at least 10 panics recorded by now ;)

There's something elusive somewhere here, and the sad part is that it
doesn't happen to others. The machine itself isn't really heavily
loaded, processing 40-60 kpps.

Thank you for your interest, I very much appreciate it.




