Date:      Sat, 17 May 2014 11:50:02 GMT
From:      John Baldwin <jhb@FreeBSD.org>
To:        freebsd-amd64@FreeBSD.org
Subject:   Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Message-ID:  <201405171150.s4HBo27Z007784@freefall.freebsd.org>

The following reply was made to PR amd64/189741; it has been noted by GNATS.

From: John Baldwin <jhb@FreeBSD.org>
To: Nick Rogers <ncrogers@gmail.com>
Cc: freebsd-gnats-submit@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>
Subject: Re: amd64/189741: 9/STABLE panic at em_msix_rx w/ em(4) + PF
Date: Sat, 17 May 2014 07:43:26 -0400

 On 5/16/14, 10:51 AM, Nick Rogers wrote:
 > On Thu, May 15, 2014 at 4:32 AM, John Baldwin <jhb@freebsd.org> wrote:
 >> On 5/12/14, 7:43 PM, Nick Rogers wrote:
 >>> GNU gdb 6.1.1 [FreeBSD]
 >>> Copyright 2004 Free Software Foundation, Inc.
 >>> GDB is free software, covered by the GNU General Public License, and you are
 >>> welcome to change it and/or distribute copies of it under certain conditions.
 >>> Type "show copying" to see the conditions.
 >>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
 >>> This GDB was configured as "amd64-marcel-freebsd"...
 >>>
 >>> Unread portion of the kernel message buffer:
 >>>
 >>>
 >>> Fatal trap 12: page fault while in kernel mode
 >>> cpuid = 5; apic id = 05
 >>> fault virtual address = 0x10
 >>> fault code = supervisor read data, page not present
 >>> instruction pointer = 0x20:0xffffffff8033d350
 >>> stack pointer        = 0x28:0xffffff83545384b0
 >>> frame pointer        = 0x28:0xffffff83545384c0
 >>> code segment = base 0x0, limit 0xfffff, type 0x1b
 >>> = DPL 0, pres 1, long 1, def32 0, gran 1
 >>> processor eflags = interrupt enabled, resume, IOPL = 0
 >>> current process = 12 (irq262: em2:rx 0)
 >>> trap number = 12
 >>> panic: page fault
 >>> cpuid = 5
 >>> KDB: stack backtrace:
 >>> #0 0xffffffff80956836 at kdb_backtrace+0x66
 >>> #1 0xffffffff8091c40e at panic+0x1ce
 >>> #2 0xffffffff80d31e70 at trap_fatal+0x290
 >>> #3 0xffffffff80d321d1 at trap_pfault+0x211
 >>> #4 0xffffffff80d327d3 at trap+0x363
 >>> #5 0xffffffff80d1b9d3 at calltrap+0x8
 >>> #6 0xffffffff8034872d at pf_test_rule+0x17ed
 >>> #7 0xffffffff8034ba12 at pf_test+0x1032
 >>> #8 0xffffffff8035112b at pf_check_in+0x2b
 >>> #9 0xffffffff809e952e at pfil_run_hooks+0x9e
 >>> #10 0xffffffff80a5286a at ip_input+0x2ea
 >>> #11 0xffffffff809e8858 at netisr_dispatch_src+0x218
 >>> #12 0xffffffff809df93d at ether_demux+0x14d
 >>> #13 0xffffffff809dfc1e at ether_nh_input+0x1fe
 >>> #14 0xffffffff809e8858 at netisr_dispatch_src+0x218
 >>> #15 0xffffffff809df85f at ether_demux+0x6f
 >>> #16 0xffffffff809dfc1e at ether_nh_input+0x1fe
 >>> #17 0xffffffff809e8858 at netisr_dispatch_src+0x218
 >>> Uptime: 17d7h20m59s
 >>> Dumping 2932 out of 12256 MB: (CTRL-C to abort)
 >>> .1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
 >>>
 >>> Reading symbols from /boot/kernel/aio.ko...Reading symbols from
 >>> /boot/kernel/aio.ko.symbols...done.
 >>> done.
 >>> Loaded symbols for /boot/kernel/aio.ko
 >>> Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
 >>> /boot/kernel/coretemp.ko.symbols...done.
 >>> done.
 >>> Loaded symbols for /boot/kernel/coretemp.ko
 >>> Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from
 >>> /boot/kernel/cc_htcp.ko.symbols...done.
 >>> done.
 >>> Loaded symbols for /boot/kernel/cc_htcp.ko
 >>> #0  doadump (textdump=Variable "textdump" is not available.
 >>> ) at pcpu.h:234
 >>> 234 pcpu.h: No such file or directory.
 >>> in pcpu.h
 >>> (kgdb) list *0xffffffff8033d350
 >>> 0xffffffff8033d350 is in pf_addrcpy (/usr/src/sys/contrib/pf/net/pf.c:512).
 >>> 507 pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af)
 >>> 508 {
 >>> 509 switch (af) {
 >>> 510 #ifdef INET
 >>> 511 case AF_INET:
 >>> 512 dst->addr32[0] = src->addr32[0];
 >>> 513 break;
 >>> 514 #endif /* INET */
 >>> 515 case AF_INET6:
 >>> 516 dst->addr32[0] = src->addr32[0];
 >>> (kgdb) backtrace
 >>> #0  doadump (textdump=Variable "textdump" is not available.
 >>> ) at pcpu.h:234
 >>> #1  0xffffffff8091bee6 in kern_reboot (howto=260) at
 >>> /usr/src/sys/kern/kern_shutdown.c:454
 >>> #2  0xffffffff8091c3e7 in panic (fmt=0x1 <Address 0x1 out of bounds>)
 >>> at /usr/src/sys/kern/kern_shutdown.c:642
 >>> #3  0xffffffff80d31e70 in trap_fatal (frame=0xc, eva=Variable "eva" is
 >>> not available.
 >>> ) at /usr/src/sys/amd64/amd64/trap.c:878
 >>> #4  0xffffffff80d321d1 in trap_pfault (frame=0xffffff8354538400,
 >>> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:794
 >>> #5  0xffffffff80d327d3 in trap (frame=0xffffff8354538400) at
 >>> /usr/src/sys/amd64/amd64/trap.c:456
 >>> #6  0xffffffff80d1b9d3 in calltrap () at
 >>> /usr/src/sys/amd64/amd64/exception.S:232
 >>> #7  0xffffffff8033d350 in pf_addrcpy (dst=0xfffffe010c6416b8,
 >>> src=0x10, af=2 '\002') at /usr/src/sys/contrib/pf/net/pf.c:522
 >>
 >> A 'src' pointer of 0x10 here would explain the crash (and is consistent
 >> with the fault address).
 >>
 >>> #8  0xffffffff8034872d in pf_test_rule (rm=0xffffff8354538788,
 >>> sm=0xffffff8354538780, direction=1, kif=0xfffffe0007d08100,
 >>> m=0xfffffe0030555d00, off=20, h=0xfffffe0030bad00e,
 >>>     pd=0xffffff83545386c0, am=0xffffff8354538790,
 >>> rsm=0xffffff8354538778, ifq=0x0, inp=0x0) at
 >>> /usr/src/sys/contrib/pf/net/pf.c:3900
 >>
 >> This is actually in pf_create_state(), and it would seem that 'nk' would
 >> have to be NULL for this to happen.  However, 'nsn' would have
 >> to be non-NULL.
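The link between a NULL 'nk' and a fault address of 0x10 can be illustrated with a small sketch. It assumes a layout in which the second address of a key structure begins 16 bytes into the structure. The types `addr_sketch` and `key_sketch` and the helper `member_address` are hypothetical stand-ins for illustration, not the real pf definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real pf structures. */
struct addr_sketch {
	uint32_t addr32[4];	/* 16 bytes, room for an IPv6 address */
};

struct key_sketch {
	struct addr_sketch addr[2];	/* addr[1] begins 0x10 bytes in */
	uint16_t port[2];
};

/*
 * Return the address that &k->addr[idx] would evaluate to for a given
 * base pointer value.  Plain integer arithmetic is used so that no
 * NULL pointer is ever dereferenced.
 */
static uintptr_t
member_address(uintptr_t base, size_t idx)
{
	return (base + offsetof(struct key_sketch, addr) +
	    idx * sizeof(struct addr_sketch));
}
```

Under that assumed layout, `member_address(0, 1)` evaluates to 0x10, which is how a NULL structure pointer can surface as a small non-zero 'src' value and fault address rather than as 0x0.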
 >>
 >> I think I see a possible bug that is fixed in 10.  Try this:
 >>
 >> Index: 9/sys/contrib/pf/net/pf_lb.c
 >> ===================================================================
 >> --- 9/sys/contrib/pf/net/pf_lb.c        (revision 266119)
 >> +++ 9/sys/contrib/pf/net/pf_lb.c        (working copy)
 >> @@ -788,6 +788,7 @@
 >>                         pool_put(&pf_state_key_pl, *skp);
 >>  #endif
 >>                         *skw = *sks = *nkp = *skp = NULL;
 >> +                       *sn = NULL;
 >>                         return (NULL);
 >>                 }
 >>         }
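The pattern the patch appears to address can be sketched in isolation: a function that sets an output pointer early and then fails must clear that pointer again, or the caller sees a NULL result together with a stale non-NULL output. The names `node_sketch` and `lookup_sketch`, and the `reset_on_failure` switch, are hypothetical and exist only to show both behaviours side by side. This is not the pf_lb.c code.

```c
#include <stddef.h>

/* Hypothetical tracking node; NOT a real pf structure. */
struct node_sketch {
	int id;
};

/*
 * Hypothetical lookup.  It records a tracking node through *snp
 * before doing later work that can still fail.  When 'fail' is set
 * the function returns NULL; 'reset_on_failure' selects whether the
 * output pointer is cleared first (the patched behaviour) or left
 * pointing at the candidate (the unpatched behaviour).
 */
static int *
lookup_sketch(struct node_sketch **snp, struct node_sketch *candidate,
    int *result, int fail, int reset_on_failure)
{
	*snp = candidate;		/* output set early */

	if (fail) {
		if (reset_on_failure)
			*snp = NULL;	/* keep outputs consistent */
		return (NULL);
	}
	return (result);
}
```

With `reset_on_failure` set to 0, a failed call leaves `*snp` non-NULL while the return value is NULL, which matches the "NULL 'nk' with non-NULL 'nsn'" combination described above. With it set to 1, both are NULL and the caller can test either one safely.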
 >>
 > Thank you! I will give that a shot and let you know if the panic continues.
 
 I just checked and this was the fix made to HEAD in r260377 for PR
 182557.  It just needs to be merged.  I'll try to get to that today.
 
 -- 
 John Baldwin


