Date: Mon, 14 Apr 2003 10:05:13 -0700 (PDT)
From: Evan Oldford <eoldford@verniernetworks.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: mark@verniernetworks.com
Subject: kern/50951: kernel ran out of mbufs, m_copy() failed and the return value wasn't checked before dereferencing the pointer
Message-ID: <200304141705.h3EH5DiX093581@lax.verniernetworks.com>
Resent-Message-ID: <200304141710.h3EHA0t5032163@freefall.freebsd.org>
>Number:         50951
>Category:       kern
>Synopsis:       kernel ran out of mbufs, m_copy() failed and the return value wasn't checked before dereferencing the pointer
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 14 10:10:00 PDT 2003
>Closed-Date:
>Last-Modified:
>Originator:     Evan Oldford
>Release:        FreeBSD 4.7-RELEASE i386
>Organization:   Vernier Networks
>Environment:
System: FreeBSD net7-20.eoldford.com 4.7-p1-RELEASE FreeBSD 4.7-p1-RELEASE #2: Fri Apr 11 12:01:39 PDT 2003 root@lax.verniernetworks.com:/usr/build/ambit2-3.1/freebsd/sys/compile/AMBIT i386

>Description:
Summary from Mark Gooderum:

This is a FreeBSD bug. The debug info is off by one line on the source - not uncommon for optimized code. The fault is actually on line 352 of file sys/net/if_ethersubr.c.

	struct mbuf *n = m_copy(m, 0, (int)M_COPYALL);
	n->m_pkthdr.csum_flags |= csum_flags;	<==== Dies Here
	if (csum_flags & CSUM_DATA_VALID)
		n->m_pkthdr.csum_data = 0xffff;
	(void) if_simloop(ifp, n, dst->sa_family, hlen);

The fault address is 0x20:

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x20
fault code		= supervisor write, page not present
instruction pointer	= 0x8:0xc019137c
stack pointer		= 0x10:0xc02cc4a8
frame pointer		= 0x10:0xc02cc4e4
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		=
trap number		= 12
panic: page fault
All mbufs exhausted, please see tuning(7).

The return value is always in %eax (x86 ABI). The compiler moves it to %edx and then does an indexed reference of 0x20 off of this address, which is the offset of the csum_flags field. With a NULL return, that reference lands on address 0x20, matching the fault address. %ebx holds the registerized csum_flags variable.
So that's why we know this is line 352:

#6  0xc019137c in ether_output (ifp=0xc1d56400, m=0xc0786600, dst=0xc02cc568,
    rt0=0xc1e67500) at ../../net/if_ethersubr.c:350

0xc0191369 <ether_output+729>:	push   $0x1
0xc019136b <ether_output+731>:	push   $0x3b9aca00
0xc0191370 <ether_output+736>:	push   $0x0
0xc0191372 <ether_output+738>:	pushl  0xc(%ebp)
0xc0191375 <ether_output+741>:	call   0xc0170b10 <m_copym>
0xc019137a <ether_output+746>:	mov    %eax,%edx
0xc019137c <ether_output+748>:	or     %ebx,0x20(%edx)	<=== faulting instruction
0xc019137f <ether_output+751>:	add    $0x10,%esp
0xc0191382 <ether_output+754>:	test   $0x4,%bh

	struct mbuf *n = m_copy(m, 0, (int)M_COPYALL);
	n->m_pkthdr.csum_flags |= csum_flags;	<=== 352
	if (csum_flags & CSUM_DATA_VALID)
		n->m_pkthdr.csum_data = 0xffff;
	(void) if_simloop(ifp, n, dst->sa_family, hlen);

We also have a kernel printf reporting that mbufs were exhausted right before the panic (it would have been emitted by m_copym(); m_copy is a macro around m_copym).

So the kernel ran out of mbufs, m_copy() failed and the return value wasn't checked before dereferencing the pointer.

The code on lines 352-354 was added in January 2002 (to 1.70.2.23). Before that, a failed copy would have triggered a KASSERT() in if_simloop() if you had a debug load, or just a crash if not. That code was added in June of 1998 when there was a general cleanup of the loopback, so the bug has been latent since FreeBSD 3.1.
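The 0x20 offset discussed above can be sanity-checked in user space. The sketch below is only an illustration: the struct names (m_hdr32, pkthdr32, mbuf32) and the helper csum_flags_addr() are hypothetical, and the field layout is an assumption modeled on the 4.x i386 mbuf header, with 32-bit integers standing in for pointers so the arithmetic also holds on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the 4.x i386 mbuf header (an assumption,
 * not copied from sys/mbuf.h). Pointers are modeled as 32-bit values.
 */
struct m_hdr32 {
	uint32_t mh_next;	/* struct mbuf * on i386 */
	uint32_t mh_nextpkt;	/* struct mbuf * on i386 */
	uint32_t mh_data;	/* caddr_t on i386 */
	int32_t  mh_len;
	int16_t  mh_type;
	int16_t  mh_flags;
};

struct pkthdr32 {
	uint32_t rcvif;		/* struct ifnet * on i386 */
	int32_t  len;
	uint32_t header;	/* void * on i386 */
	int32_t  csum_flags;
	int32_t  csum_data;
};

struct mbuf32 {
	struct m_hdr32  m_hdr;
	struct pkthdr32 m_pkthdr;
};

/* Address that a store to n->m_pkthdr.csum_flags would touch for a given n. */
uint32_t
csum_flags_addr(uint32_t n)
{
	return n + (uint32_t)(offsetof(struct mbuf32, m_pkthdr) +
	    offsetof(struct pkthdr32, csum_flags));
}
```

Under these assumptions, csum_flags_addr(0) evaluates to 0x20, which is consistent with the `or %ebx,0x20(%edx)` instruction faulting at virtual address 0x20 when m_copym() returns NULL.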
The backtrace:

(kgdb) where
#0  dumpsys () at ../../kern/kern_shutdown.c:504
#1  0xc01563a5 in boot (howto=260) at ../../kern/kern_shutdown.c:324
#2  0xc0156841 in panic (fmt=0xc02b29ac "%s") at ../../kern/kern_shutdown.c:634
#3  0xc023867f in trap_fatal (frame=0xc02cc468, eva=32)
    at ../../i386/i386/trap.c:974
#4  0xc023832d in trap_pfault (frame=0xc02cc468, usermode=0, eva=32)
    at ../../i386/i386/trap.c:867
#5  0xc0237ea7 in trap (frame={tf_fs = 6422544, tf_es = 16, tf_ds = 6422544,
      tf_edi = 1, tf_esi = -1070807824, tf_ebp = -1070807836,
      tf_isp = -1070807916, tf_ebx = 0, tf_edx = 0, tf_ecx = 0, tf_eax = 0,
      tf_trapno = 12, tf_err = 2, tf_eip = -1072098436, tf_cs = 8,
      tf_eflags = 66118, tf_esp = -1065851392, tf_ss = 0})
    at ../../i386/i386/trap.c:466
#6  0xc019137c in ether_output (ifp=0xc1d56400, m=0xc0786600, dst=0xc02cc568,
    rt0=0xc1e67500) at ../../net/if_ethersubr.c:350
#7  0xc01ae0ac in ip_output (m0=0xc0786600, opt=0x0, ro=0xc02cc564, flags=34,
    imo=0x0, inp=0x0) at ../../netinet/ip_output.c:1047
#8  0xc0254d7b in ext_iface_write_dsock (pktp=0xc02cc600, freepkt=0,
    sin=0xc03022c0) at ../../ambitsrc/combined/ext_iface.c:468
#9  0xc0254a9f in ext_iface_send_ip (pkt=0xc2001880)
    at ../../ambitsrc/combined/ext_iface.c:309
#10 0xc02576e8 in int_iface_deliver (pkt=0xc2001880, client=0xc1e6a200,
    tun_ip={s_addr = 0}, was_tunneled=0)
    at ../../ambitsrc/combined/int_iface.c:1846
#11 0xc02574de in int_iface_outgoing_step_3 (client=0xc1e6a200,
    pkt=0xc2001880, tun_ip={s_addr = 0})
    at ../../ambitsrc/combined/int_iface.c:1684
#12 0xc02568ab in int_iface_outgoing_step_2 (client=0xc1e6a200,
    pkt=0xc2001880, encrypted={s_addr = 0})
    at ../../ambitsrc/combined/int_iface.c:993
#13 0xc0256455 in int_iface_outgoing_step_1 (client=0xc1e6a200,
    pkt=0xc2001880) at ../../ambitsrc/combined/int_iface.c:713
#14 0xc0256178 in int_iface_process_in_pkt (iface=0xc1d97800, pkt=0xc2001880,
    outgoing=0, vlan_id=65535) at ../../ambitsrc/combined/int_iface.c:563
#15 0xc02628db in ng_ambit_rx_int (hook=0xc1d6d1c0, m=0xc0657500, meta=0x0,
    link_num=2) at ../../ambitsrc/ng_ambit/ng_ambit.c:1274
#16 0xc0261f52 in ng_ambit_rcvdata (hook=0xc1d6d1c0, m=0xc0657500, meta=0x0)
    at ../../ambitsrc/ng_ambit/ng_ambit.c:608
#17 0xc0197eb1 in ng_send_dataq (hook=0xc1d6d1e0, m=0xc0657500, meta=0x0)
    at ../../netgraph/ng_base.c:1676
#18 0xc019836f in ngintr () at ../../netgraph/ng_base.c:2011
#19 0xc022ce19 in swi_net_next ()

>How-To-Repeat:
This is very hard to reproduce. You need your system to run out of mbufs while receiving broadcast messages -- e.g. a DHCP request.

>Fix:
Here's a patch that does the appropriate error checking.

--- sys/net/if_ethersubr.c.orig	Wed Feb 12 13:03:23 2003
+++ sys/net/if_ethersubr.c	Thu Apr 10 13:05:52 2003
@@ -349,6 +349,12 @@
 		if ((m->m_flags & M_BCAST) || (loop_copy > 0)) {
 			struct mbuf *n = m_copy(m, 0, (int)M_COPYALL);
 
+			/* m_copy failed */
+			if (n == NULL) {
+				error = 0;
+				goto bad;
+			}
+
 			n->m_pkthdr.csum_flags |= csum_flags;
 			if (csum_flags & CSUM_DATA_VALID)
 				n->m_pkthdr.csum_data = 0xffff;

>Release-Note:
>Audit-Trail:
>Unformatted:
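The guard added by the patch can be exercised outside the kernel with a small stand-in. This is a sketch under stated assumptions, not the kernel code: struct pkt, pkt_copy(), make_loop_copy() and DEMO_CSUM_DATA_VALID are hypothetical names, the pool_empty argument simulates mbuf exhaustion, and the 0x0400 flag value is inferred from the `test $0x4,%bh` instruction in the disassembly. Unlike the patch, which jumps to the function's `bad` label, the sketch simply returns NULL to its caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the packet header fields touched by the fix. */
struct pkt {
	int csum_flags;
	int csum_data;
};

/* Assumed flag value, inferred from "test $0x4,%bh" (bit 0x0400). */
#define DEMO_CSUM_DATA_VALID 0x0400

/* Copy a packet; returns NULL when the (simulated) pool is exhausted. */
struct pkt *
pkt_copy(const struct pkt *src, int pool_empty)
{
	struct pkt *n;

	if (pool_empty)
		return NULL;
	n = malloc(sizeof(*n));
	if (n != NULL)
		memcpy(n, src, sizeof(*n));
	return n;
}

/*
 * Build the loopback copy. The NULL check comes before any store
 * through n, which is the step the original code was missing.
 */
struct pkt *
make_loop_copy(const struct pkt *m, int csum_flags, int pool_empty)
{
	struct pkt *n = pkt_copy(m, pool_empty);

	if (n == NULL)
		return NULL;	/* copy failed: nothing to loop back */

	n->csum_flags |= csum_flags;
	if (csum_flags & DEMO_CSUM_DATA_VALID)
		n->csum_data = 0xffff;
	return n;
}
```

With pool_empty set, make_loop_copy() returns NULL instead of writing through a null pointer; with a working allocator it returns a copy whose csum_data is 0xffff when the valid flag is passed in.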