Date:      Thu, 16 Apr 2009 17:33:49 GMT
From:      Ivan Panachev <ivan.panachev@gmail.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/133786: ip_input might cause kernel panic
Message-ID:  <200904161733.n3GHXnRd013108@www.freebsd.org>
Resent-Message-ID: <200904161740.n3GHe7q8054037@freefall.freebsd.org>


>Number:         133786
>Category:       kern
>Synopsis:       ip_input might cause kernel panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 16 17:40:07 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Ivan Panachev
>Release:        7.1-RELEASE-p4
>Organization:
nichego.net
>Environment:
FreeBSD censored 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4 #2: Fri Apr 10 14:04:18 MSD 2009     root@censored:/usr/obj/usr/src/sys/NANO7  i386


>Description:
Some time ago one of my FreeBSD boxes began to panic. I collected two crash dumps and inspected the stack traces. Here is the first one (the second is identical):

$ kgdb kernel vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07f5588
stack pointer           = 0x28:0xe3db8a78
frame pointer           = 0x28:0xe3db8a94
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 14 (swi1: net)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1d4h3m47s
Physical memory: 973 MB
Dumping 65 MB: 50 34 18 2

#0  doadump () at pcpu.h:196
196     pcpu.h: No such file or directory.
        in pcpu.h

(kgdb) backtrace
#0  doadump () at pcpu.h:196
#1  0xc07a68b7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07a6b89 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0af377c in trap_fatal (frame=0xe3db8a38, eva=12) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc0af3a00 in trap_pfault (frame=0xe3db8a38, usermode=0, eva=12) at /usr/src/sys/i386/i386/trap.c:849
#5  0xc0af43bc in trap (frame=0xe3db8a38) at /usr/src/sys/i386/i386/trap.c:528
#6  0xc0ada22b in alltraps_with_regs_pushed () at /usr/src/sys/i386/i386/exception.s:155
#7  0xc0850008 in pfil_head_register (ph=0x30) at /usr/src/sys/net/pfil.c:102
#8  0xc089aff8 in ip_forward (m=0xc46e3100, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1308
#9  0xc089c82c in ip_input (m=0xc46e3100) at /usr/src/sys/netinet/ip_input.c:610
#10 0xc084e5d5 in netisr_dispatch (num=2, m=0xc46e3100) at /usr/src/sys/net/netisr.c:185
#11 0xc08423a1 in ether_demux (ifp=0xc4521c00, m=0xc46e3100) at /usr/src/sys/net/if_ethersubr.c:834
#12 0xc0842793 in ether_input (ifp=0xc4521c00, m=0xc46e3100) at /usr/src/sys/net/if_ethersubr.c:692
#13 0xc084d903 in vlan_input (ifp=0xc400b400, m=0xc46e3100) at /usr/src/sys/net/if_vlan.c:946
#14 0xc08422e7 in ether_demux (ifp=0xc400b400, m=0xc46e3100) at /usr/src/sys/net/if_ethersubr.c:743
#15 0xc0842793 in ether_input (ifp=0xc400b400, m=0xc46e3100) at /usr/src/sys/net/if_ethersubr.c:692
#16 0xc09a375e in sis_rxeof (sc=0xc3fdeb00) at /usr/src/sys/pci/if_sis.c:1476
#17 0xc09a4264 in sis_poll (ifp=0xc400b400, cmd=POLL_ONLY, count=5) at /usr/src/sys/pci/if_sis.c:1589
#18 0xc0799c7b in netisr_poll () at /usr/src/sys/kern/kern_poll.c:432
#19 0xc084e842 in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:254
#20 0xc07847fb in ithread_loop (arg=0xc3e92230) at /usr/src/sys/kern/kern_intr.c:1088
#21 0xc0781369 in fork_exit (callout=0xc0784640 <ithread_loop>, arg=0xc3e92230, frame=0xe3db8d38) at /usr/src/sys/kern/kern_fork.c:804
#22 0xc0ada2a0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#23 0x00000000 in ?? ()

According to the core dump and the kernel message buffer, the panic occurred in the m_copydata function called from frame #8 (m_copydata does not appear in the backtrace; I don't know why).
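
One possible reading of the fault address, offered as a sketch rather than a confirmed diagnosis: in the mbuf header printed further down, mh_len follows three pointer-sized fields, so on i386 it would sit at offset 0xc. That matches "fault virtual address = 0xc" and would be consistent with m_len being read through a NULL mbuf pointer. The struct below is a hypothetical userland model that uses fixed 32-bit fields in place of i386 pointers; it is not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the i386 mbuf header layout. Field order is taken
 * from the kgdb output in this report; 32-bit integers stand in for the
 * 4-byte pointers of an i386 kernel. This is NOT the real struct m_hdr.
 */
struct m_hdr_i386_model {
    uint32_t mh_next;     /* models struct mbuf * (4 bytes on i386) */
    uint32_t mh_nextpkt;  /* models struct mbuf * */
    uint32_t mh_data;     /* models caddr_t */
    int32_t  mh_len;      /* amount of data in this mbuf */
    int32_t  mh_flags;
    int16_t  mh_type;
};

/* Compile-time check: in this model mh_len lands at offset 0xc. */
static_assert(offsetof(struct m_hdr_i386_model, mh_len) == 0xc,
    "mh_len is expected at offset 0xc in the i386 model");

/* Offset of mh_len in the model, for use at run time. */
static size_t
model_mh_len_offset(void)
{
    return (offsetof(struct m_hdr_i386_model, mh_len));
}
```

If the model is accurate, dereferencing a NULL pointer to read mh_len touches address 0xc, the address reported by the trap.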

(kgdb) frame 8
#8  0xc089aff8 in ip_forward (m=0xc46e3100, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1308
1308                    m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));

Inspecting the local variables, I found the cause of the problem:

(kgdb) p mcopy->m_hdr.mh_len
$1 = 204
(kgdb) p mcopy->m_hdr.mh_next
$2 = (struct mbuf *) 0x0
(kgdb) p *mcopy
$3 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc4582a34 "E", mh_len = 212, mh_flags = 2, mh_type = 1, pad = "\000"}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc4521c00, header = 0x0, len = 204,
        csum_flags = 0, csum_data = 0, tso_segsz = 0, ether_vtag = 807, tags = {slh_first = 0x0}}, MH_dat = {MH_ext = {ext_buf = 0x30000045 <Address 0x30000045 out of bounds>, ext_free = 0x409483,
          ext_args = 0xaf760680, ext_size = 2729879744, ref_cnt = 0xa8af90d9, ext_type = -2029987982},
..

(kgdb) p m->m_hdr.mh_len
$4 = 48
(kgdb) p m->m_hdr.mh_next
$5 = (struct mbuf *) 0x0
(kgdb) p *m
$6 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc4705012 "E", mh_len = 48, mh_flags = 3, mh_type = 1, pad = "\000"}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc4521c00, header = 0x0, len = 48,
        csum_flags = 0, csum_data = 0, tso_segsz = 0, ether_vtag = 807, tags = {slh_first = 0x0}}, MH_dat = {MH_ext = {ext_buf = 0xc4705000 "", ext_free = 0, ext_args = 0x0, ext_size = 2048,
..


So the ip_forward function called m_copydata with a source mbuf m that is shorter (48 octets) than the requested copy length (204 octets), and this causes the kernel to panic. I have since applied a quick fix (see the attached patch), and it seems to solve the problem.
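
To illustrate the mismatch outside the kernel, here is a minimal userland sketch. The toy_mbuf type and both helper functions are hypothetical names invented for this example and are not part of the FreeBSD mbuf API. The sketch only shows that a request for 204 bytes from a 48-byte source has to be clamped to what the source actually holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for a kernel mbuf: only the fields
 * needed to show the length mismatch. Not the real struct mbuf.
 */
struct toy_mbuf {
    struct toy_mbuf *next;  /* next buffer in the chain, or NULL */
    const char      *data;  /* start of valid data */
    size_t           len;   /* bytes of valid data in this buffer */
};

/* Total number of bytes available in the whole chain. */
static size_t
toy_chain_len(const struct toy_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return (total);
}

/*
 * Copy up to `want` bytes from the chain into dst, but never more than
 * the chain holds. Returns the number of bytes actually copied.
 */
static size_t
toy_copy_clamped(const struct toy_mbuf *m, size_t want, char *dst)
{
    size_t avail = toy_chain_len(m);
    size_t todo = want < avail ? want : avail;
    size_t copied = 0;

    assert(dst != NULL);
    /* todo <= avail, so m stays non-NULL while bytes remain to copy. */
    while (copied < todo) {
        size_t left = todo - copied;
        size_t n = m->len < left ? m->len : left;

        memcpy(dst + copied, m->data, n);
        copied += n;
        m = m->next;
    }
    return (copied);
}
```

With a single 48-byte buffer and a request for 204 bytes, toy_copy_clamped copies 48 bytes and stops instead of walking past the end of the chain.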

P.S. If any Core Team member wants the full core dump, the kernel build, or anything else, please contact me at the e-mail address given above.

>How-To-Repeat:
I have never tried to reproduce it deliberately; it recurs on its own :)

>Fix:


Patch attached with submission follows:

--- /usr/src/sys/netinet/ip_input.c.orig	2009-04-10 13:53:58.000000000 +0400
+++ /usr/src/sys/netinet/ip_input.c	2009-04-10 13:56:35.000000000 +0400
@@ -1305,6 +1305,10 @@
 	if (mcopy != NULL) {
 		mcopy->m_len = min(ip->ip_len, M_TRAILINGSPACE(mcopy));
 		mcopy->m_pkthdr.len = mcopy->m_len;
+		if (mcopy->m_len > m->m_len) {
+			/* m can be shorter than ip_len claims; clamp the copy */
+			mcopy->m_len = m->m_len;
+		}
 		m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));
 	}
 


>Release-Note:
>Audit-Trail:
>Unformatted:


