Date:      Thu, 20 Jul 2006 17:53:29 +0200
From:      Michal Mertl <mime@traveller.cz>
To:        freebsd-stable@freebsd.org
Subject:   Kernel panic with PF
Message-ID:  <1153410809.1126.66.camel@genius.i.cz>

Hello,

I am deploying a firewall based on FreeBSD application proxies
(www.kernun.com, but not much English there) and am having frequent
panics of RELENG_6_1 under load. The server has IP forwarding disabled.

I've got two machines in a carp cluster and the transparent proxies use
PF to get the data.

I don't know much about kernel internals and PF but from the following
backtrace I understand that the crash happens because rpool->cur on line
2158 in src/sys/contrib/pf/net/pf.c is NULL and is dereferenced. It
probably shouldn't happen yet it does.
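
The fault address 0x28 (tf_addr = 40 in the trap frame) fits a read of a
structure member through a NULL pointer: the faulting address is then
simply the member's offset. The sketch below uses a simplified,
hypothetical layout (struct addr_wrap_sketch and struct pooladdr_sketch
are illustrative stand-ins with field sizes assumed for amd64, not the
real pf definitions) to show how such an offset can come out to 0x28.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real pf structures. */
struct addr_wrap_sketch {
    uint8_t  addr[16];  /* address, sized for IPv6 (assumed) */
    uint8_t  mask[16];  /* netmask (assumed) */
    uint64_t p;         /* pointer-sized slot on amd64 (assumed) */
    uint8_t  type;      /* member read through the NULL pointer */
};

struct pooladdr_sketch {
    struct addr_wrap_sketch addr;  /* first member, so it sits at offset 0 */
};

/*
 * Address that a read of cur->addr.type would touch when cur == NULL:
 * the offset of addr within the outer struct plus the offset of type
 * within the inner struct.
 */
static size_t null_deref_fault_offset(void)
{
    return offsetof(struct pooladdr_sketch, addr)
         + offsetof(struct addr_wrap_sketch, type);
}
```

With this assumed layout, null_deref_fault_offset() evaluates to 0x28.
If the real structures are laid out the same way, that would be
consistent with rpool->cur being NULL, but it should be checked against
the headers of the affected source tree.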

The machines are SMP and were running an SMP kernel. The only places
where pool.cur (or pool->cur) is assigned are in pf_ioctl.c. There seem
to be some lock operations there, though, so the code is presumably
believed to be properly locked.

I ran with kern.smp.disabled=1 for a short while before I put the old
firewall back in place and didn't see the panic, but the time was
definitely too short to convince me that it fixes the issue. Could
setting debug.mpsafenet to 0 possibly also help?

I could probably band-aid this particular failure mode by returning
failure instead of panicking, but the bug is probably elsewhere.
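
A minimal sketch of what such a band-aid could look like, assuming a
lookup function that reports failure with a non-zero return value. All
names here (pool_entry_sketch, pool_sketch, map_addr_sketch) and the
return convention are hypothetical and only illustrate the idea; this is
not the actual pf code.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real pf structures. */
struct pool_entry_sketch {
    int type;
};

struct pool_sketch {
    struct pool_entry_sketch *cur;  /* current pool entry, may be NULL */
};

/*
 * Returns 0 on success and 1 on failure (convention assumed for this
 * sketch). A NULL pool or NULL current entry is reported as a failure
 * instead of being dereferenced.
 */
static int map_addr_sketch(const struct pool_sketch *rpool, int *type_out)
{
    if (rpool == NULL || rpool->cur == NULL)
        return 1;  /* fail the translation instead of faulting */
    *type_out = rpool->cur->type;
    return 0;
}
```

A check like this only hides the symptom: if the pointer is being
changed concurrently without proper locking, it can still become invalid
between the test and the use.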

I've lost the debug kernel that produced this backtrace and therefore
can't dig much further :-(. Unfortunately, so far I can only reproduce
the problem in production, and for obvious reasons I can't put the
faulty system back there.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff801ab528
stack pointer           = 0x10:0xffffffffb1ade650
frame pointer           = 0x10:0xffffff004cc7cc30
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 15 (swi1: net)
trap number             = 12
panic: page fault

#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff803d5137 in boot (howto=260)
    at ../../../kern/kern_shutdown.c:402
#3  0xffffffff803d58a1 in panic (fmt=0xffffff007ba32000 "@\223<A3>{")
    at ../../../kern/kern_shutdown.c:558
#4  0xffffffff80543b3f in trap_fatal (frame=0xffffff007ba32000,
    eva=18446742976272241472) at ../../../amd64/amd64/trap.c:660
#5  0xffffffff80543e5f in trap_pfault (frame=0xffffffffb1ade5a0,
    usermode=0) at ../../../amd64/amd64/trap.c:573
#6  0xffffffff80544113 in trap (frame=
      {tf_rdi = 2, tf_rsi = -1098223465792, tf_rdx = -1098439497700,
       tf_rcx = -1314002464, tf_r8 = 0, tf_r9 = -1314002776, tf_rax = 0,
       tf_rbx = 0, tf_rbp = -1098223465424, tf_r10 = 1, tf_r11 = 257,
       tf_r12 = -1098439497700, tf_r13 = -1314002776, tf_r14 = 2,
       tf_r15 = -1314002464, tf_trapno = 12, tf_addr = 40,
       tf_flags = 216171684640539392, tf_err = 0, tf_rip = -2145733336,
       tf_cs = 8, tf_rflags = 66118, tf_rsp = -1314003360, tf_ss = 16})
    at ../../../amd64/amd64/trap.c:352
#7  0xffffffff8052feab in calltrap ()
    at ../../../amd64/amd64/exception.S:168
#8  0xffffffff801ab528 in pf_map_addr (af=2 '\002',
    r=0xffffff004cc7cac0, saddr=0xffffff003fe7681c,
    naddr=0xffffffffb1ade9e0, init_addr=0x0, sn=0xffffffffb1ade8a8)
    at ../../../contrib/pf/net/pf.c:2163
#9  0xffffffff801acab6 in pf_get_translation (pd=0xffffffffb1ade9c0,
    m=0xffffff0042ede900, off=20, direction=1, kif=0xffffff007b038a00,
    sn=0xffffffffb1ade8a8, saddr=0xffffff003fe7681c, sport=0,
    daddr=0xffffff003fe76820, dport=50881, naddr=0xffffffffb1ade9e0,
    nport=0xffffffffb1ade8b6) at ../../../contrib/pf/net/pf.c:2618
#10 0xffffffff801b315b in pf_test_tcp (rm=0xffffffffb1ade960,
    sm=0xffffffffb1ade950, direction=1, kif=0xffffff007b038a00,
    m=0xffffff0042ede900, off=20, h=0xffffff003fe76810,
    pd=0xffffffffb1ade9c0, am=0xffffffffb1ade968,
    rsm=0xffffffffb1ade970, ifq=0x2, inp=0x0)
    at ../../../contrib/pf/net/pf.c:3013
#11 0xffffffff801b5694 in pf_test (dir=1, ifp=0xffffff0000bee800,
    m0=0xffffffffb1adeaa0, eh=0xffffffffb1ade97e, inp=0x0)
    at ../../../contrib/pf/net/pf.c:6449
#12 0xffffffff801bafb2 in pf_check_in (arg=0x2, m=0xffffffffb1adeaa0,
    ifp=0xffffff004cc7cac0, dir=-1314002464, inp=0xffffffffb1ade9e0)
    at ../../../contrib/pf/net/pf_ioctl.c:3358
#13 0xffffffff80461c2e in pfil_run_hooks (ph=0xffffffff807e0920,
    mp=0xffffffffb1adeb28, ifp=0xffffff0000bee800, dir=1, inp=0x0)
    at ../../../net/pfil.c:139
#14 0xffffffff8048d225 in ip_input (m=0xffffff0042ede900)
    at ../../../netinet/ip_input.c:465
#15 0xffffffff8046180c in netisr_processqueue (ni=0xffffffff807df690)
    at ../../../net/netisr.c:236
#16 0xffffffff80461abd in swi_net (dummy=0x2)
    at ../../../net/netisr.c:349
#17 0xffffffff803bbd99 in ithread_loop (arg=0xffffff00000506a0)
    at ../../../kern/kern_intr.c:684
#18 0xffffffff803ba527 in fork_exit (
    callout=0xffffffff803bbc50 <ithread_loop>, arg=0xffffff00000506a0,
    frame=0xffffffffb1adec50) at ../../../kern/kern_fork.c:805
#19 0xffffffff8053020e in fork_trampoline ()
    at ../../../amd64/amd64/exception.S:394
#20 0x0000000000000000 in ?? ()

The firewall also reports lots of PF problems during operation:

Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: KERN-100-E
[natutil.c:770] ioctl(): Invalid argument (EINVAL=22)
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NATT-111-E
add_rule(): PF ioctl DIOCADDRULE failed              
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NATT-701-E
addnatmap out(): Adding TCP NAT MAP from [127.0.0.1]:60860 to
[212.80.76.13]:80 -> [193.179.161.10]:60860 failed
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NETL-210-E
netbind(server,10): NAT binding failed

The kernel often reports "pool_ticket: 1429 != 1430" (with the numbers
increasing over time).

Thank you very much for any advice.

Regards

Michal



