Date:      Fri, 8 Mar 2013 11:32:54 +0900
From:      YongHyeon PYUN <pyunyh@gmail.com>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        yongari@freebsd.org, freebsd-stable@freebsd.org, Loïc Blot <loic.blot@unix-experience.fr>
Subject:   Re: Strange reboot since 9.1
Message-ID:  <20130308023254.GC3246@michelle.cdnetworks.com>
In-Reply-To: <20130307163827.GA96983@icarus.home.lan>
References:  <1362560123.16808.4.camel@iMac-LBlot.domain.iogs> <CAJ-UWtTA2P26PUa=6%2B3xR4idC5RqeXnK2s-jw3815Y6Dif-Sng@mail.gmail.com> <1362652057.16808.23.camel@iMac-LBlot.domain.iogs> <51388E42.5040500@FreeBSD.org> <1362661965.16808.36.camel@iMac-LBlot.domain.iogs> <51389ED5.6030207@bsdinfo.com.br> <1362670734.16808.48.camel@iMac-LBlot.domain.iogs> <20130307163827.GA96983@icarus.home.lan>

On Thu, Mar 07, 2013 at 08:38:27AM -0800, Jeremy Chadwick wrote:
> On Thu, Mar 07, 2013 at 04:38:54PM +0100, Loïc Blot wrote:
> > Hi Marcelo, thanks. Here is a better trace:
> > 
> > ---------------------------------
> > 
> > kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.11
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you
> > are
> > welcome to change it and/or distribute copies of it under certain
> > conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for
> > details.
> > This GDB was configured as "amd64-marcel-freebsd"...
> > 
> > Unread portion of the kernel message buffer:
> > 
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address	= 0x0
> > fault code		= supervisor read data, page not present
> > instruction pointer	= 0x20:0xffffffff80a84414
> > stack pointer	        = 0x28:0xffffff822fc267a0
> > frame pointer	        = 0x28:0xffffff822fc26830
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= interrupt enabled, resume, IOPL = 0
> > current process		= 12 (irq265: bce0)
> > trap number		= 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff809208a6 at kdb_backtrace+0x66
> > #1 0xffffffff808ea8be at panic+0x1ce
> > #2 0xffffffff80bd8240 at trap_fatal+0x290
> > #3 0xffffffff80bd857d at trap_pfault+0x1ed
> > #4 0xffffffff80bd8b9e at trap+0x3ce
> > #5 0xffffffff80bc315f at calltrap+0x8
> > #6 0xffffffff80a861d5 at udp_input+0x475
> > #7 0xffffffff80a043dc at ip_input+0xac
> > #8 0xffffffff809adafb at netisr_dispatch_src+0x20b
> > #9 0xffffffff809a35cd at ether_demux+0x14d
> > #10 0xffffffff809a38a4 at ether_nh_input+0x1f4
> > #11 0xffffffff809adafb at netisr_dispatch_src+0x20b
> > #12 0xffffffff80438fd7 at bce_intr+0x487
> > #13 0xffffffff808be8d4 at intr_event_execute_handlers+0x104
> > #14 0xffffffff808c0076 at ithread_loop+0xa6
> > #15 0xffffffff808bb9ef at fork_exit+0x11f
> > #16 0xffffffff80bc368e at fork_trampoline+0xe
> > Uptime: 27m20s
> > Dumping 1265 out of 8162
> > MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..92%
> > 
> > #0  doadump (textdump=Variable "textdump" is not available.
> > ) at pcpu.h:224
> > 224	pcpu.h: No such file or directory.
> > 	in pcpu.h
> > (kgdb) bt f
> > #0  doadump (textdump=Variable "textdump" is not available.
> > ) at pcpu.h:224
> > No locals.
> > #1  0xffffffff808ea3a1 in kern_reboot (howto=260)
> > at /usr/src/sys/kern/kern_shutdown.c:448
> > 	_ep = Variable "_ep" is not available.
> > (kgdb) bt
> > #0  doadump (textdump=Variable "textdump" is not available.
> > ) at pcpu.h:224
> > #1  0xffffffff808ea3a1 in kern_reboot (howto=260)
> > at /usr/src/sys/kern/kern_shutdown.c:448
> > #2  0xffffffff808ea897 in panic (fmt=0x1 <Address 0x1 out of bounds>)
> > at /usr/src/sys/kern/kern_shutdown.c:636
> > #3  0xffffffff80bd8240 in trap_fatal (frame=0xc, eva=Variable "eva" is
> > not available.
> > ) at /usr/src/sys/amd64/amd64/trap.c:857
> > #4  0xffffffff80bd857d in trap_pfault (frame=0xffffff822fc266f0,
> > usermode=0) at /usr/src/sys/amd64/amd64/trap.c:773
> > #5  0xffffffff80bd8b9e in trap (frame=0xffffff822fc266f0)
> > at /usr/src/sys/amd64/amd64/trap.c:456
> > #6  0xffffffff80bc315f in calltrap ()
> > at /usr/src/sys/amd64/amd64/exception.S:228
> > #7  0xffffffff80a84414 in udp_append (inp=0xfffffe019e2a1000,
> > ip=0xfffffe00444b6c80, n=0xfffffe00444b6c00, off=20,
> > udp_in=0xffffff822fc268a0) at /usr/src/sys/netinet/udp_usrreq.c:252
> > #8  0xffffffff80a861d5 in udp_input (m=0xfffffe00444b6c00, off=Variable
> > "off" is not available.
> > ) at /usr/src/sys/netinet/udp_usrreq.c:618
> > #9  0xffffffff80a043dc in ip_input (m=0xfffffe00444b6c00)
> > at /usr/src/sys/netinet/ip_input.c:760
> > #10 0xffffffff809adafb in netisr_dispatch_src (proto=1, source=Variable
> > "source" is not available.
> > ) at /usr/src/sys/net/netisr.c:1013
> > #11 0xffffffff809a35cd in ether_demux (ifp=0xfffffe00053fa000,
> > m=0xfffffe00444b6c00) at /usr/src/sys/net/if_ethersubr.c:940
> > #12 0xffffffff809a38a4 in ether_nh_input (m=Variable "m" is not
> > available.
> > ) at /usr/src/sys/net/if_ethersubr.c:759
> > #13 0xffffffff809adafb in netisr_dispatch_src (proto=9, source=Variable
> > "source" is not available.
> > ) at /usr/src/sys/net/netisr.c:1013
> > #14 0xffffffff80438fd7 in bce_intr (xsc=Variable "xsc" is not available.
> > ) at /usr/src/sys/dev/bce/if_bce.c:6903
> > #15 0xffffffff808be8d4 in intr_event_execute_handlers (p=Variable "p" is
> > not available.
> > ) at /usr/src/sys/kern/kern_intr.c:1262
> > #16 0xffffffff808c0076 in ithread_loop (arg=0xfffffe00057424e0)
> > at /usr/src/sys/kern/kern_intr.c:1275
> > #17 0xffffffff808bb9ef in fork_exit (callout=0xffffffff808bffd0
> > <ithread_loop>, arg=0xfffffe00057424e0, frame=0xffffff822fc26c40)
> > at /usr/src/sys/kern/kern_fork.c:992
> > #18 0xffffffff80bc368e in fork_trampoline ()
> > at /usr/src/sys/amd64/amd64/exception.S:602
> > #19 0x0000000000000000 in ?? ()
> > #20 0x0000000000000000 in ?? ()
> > #21 0x0000000000000001 in ?? ()
> > #22 0x0000000000000000 in ?? ()
> > #23 0x0000000000000000 in ?? ()
> > #24 0x0000000000000000 in ?? ()
> > #25 0x0000000000000000 in ?? ()
> > #26 0x0000000000000000 in ?? ()
> > #27 0x0000000000000000 in ?? ()
> > #28 0x0000000000000000 in ?? ()
> > #29 0x0000000000000000 in ?? ()
> > #30 0x0000000000000000 in ?? ()
> > #31 0x0000000000000000 in ?? ()
> > #32 0x0000000000000000 in ?? ()
> > #33 0x0000000000000000 in ?? ()
> > #34 0x0000000000000000 in ?? ()
> > #35 0x0000000000000000 in ?? ()
> > #36 0x0000000000000000 in ?? ()
> > #37 0x0000000000000000 in ?? ()
> > #38 0x0000000000000000 in ?? ()
> > #39 0x0000000000000000 in ?? ()
> > #40 0x0000000000000000 in ?? ()
> > #41 0x0000000000000000 in ?? ()
> > #42 0x0000000000000000 in ?? ()
> > #43 0x0000000000000002 in ?? ()
> > #44 0xffffffff81241c00 in tdq_cpu ()
> > #45 0xfffffe0005501000 in ?? ()
> > #46 0x0000000000000000 in ?? ()
> > #47 0xffffff822fc266d0 in ?? ()
> > #48 0xffffff822fc26678 in ?? ()
> > #49 0xfffffe019ed11470 in ?? ()
> > #50 0xffffffff8091352e in sched_switch (td=0x0,
> > newtd=0xfffffe00057424e0, flags=Variable "flags" is not available.
> > ) at /usr/src/sys/kern/sched_ule.c:1921
> > Previous frame inner to this frame (corrupt stack?)
> > 

[...]

> CC'ing Yong-Hyeon (yongari@) who helps maintain the bce(4) driver; it
> looks to me the issue is there.  He may have some advice.

I recall there have been a couple of bce(4)-related crash reports
(e.g. kern/171739), but the root cause has not been identified
yet. Given that most of the crash reports point to bce(4)'s RX
path, I suspect the driver modifies mbufs that were passed to the
upper stack. I still have to revive one of my boxes that can host
quad-port bce(4) controllers, but I haven't found the time or a new
motherboard.


