Date:      Mon, 17 Nov 2014 13:35:15 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 195102] New: dummynet_send() may panic the kernel (bad switch -256)
Message-ID:  <bug-195102-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195102

            Bug ID: 195102
           Summary: dummynet_send() may panic the kernel (bad switch -256)
           Product: Base System
           Version: 8.4-STABLE
          Hardware: Any
                OS: Any
            Status: Needs Triage
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: eugen@grosbein.net

Created attachment 149513
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=149513&action=edit
crashdump details

This panic occurred for me on 8.4-STABLE. I know stable/8 is approaching its
EOL, but the code in question is the same in HEAD, so I believe the bug is not
fixed there either.

I sometimes get similar kernel panics on my mpd5/PPPoE access server, which
acts as a traffic shaper using dummynet with io_fast enabled. This time I got a
good crashdump and spent some time reading the code, and I believe the culprit
is the dummynet_send() function in ipfw/ip_dn_io.c.

Kgdb backtrace and additional info are attached.

Here is part of kernel log:

Nov 17 17:02:21 m-19-pc-2 kernel: dummynet: bad switch -256!
Nov 17 17:02:21 m-19-pc-2 kernel:
Nov 17 17:02:21 m-19-pc-2 kernel:
Nov 17 17:02:21 m-19-pc-2 kernel: Fatal trap 12: page fault while in kernel mode
Nov 17 17:02:21 m-19-pc-2 kernel: cpuid = 0; apic id = 00
Nov 17 17:02:21 m-19-pc-2 kernel: fault virtual address = 0x1
Nov 17 17:02:21 m-19-pc-2 kernel: fault code            = supervisor read instruction, page not present
Nov 17 17:02:21 m-19-pc-2 kernel: instruction pointer   = 0x20:0x1
Nov 17 17:02:21 m-19-pc-2 kernel: stack pointer         = 0x28:0xffffff8122b0ba20

As one can see from the dummynet_send() code, "bad switch" in the log means
that tag = m_tag_first(m) was not NULL at the moment of the check. However,
kgdb shows (see attachment) that it was NULL at the moment of the kernel panic.
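
To make that reasoning concrete, here is a minimal user-space sketch of the
dispatch logic as described above. It is not the kernel source: every model_*
name and the numeric direction values are hypothetical stand-ins, and the real
function does considerably more work. The only point is that the "bad switch"
branch can be reached only when a tag was present at the check, because a
missing tag is mapped to a plain drop.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's direction codes; the real
 * values live in the dummynet headers and are not reproduced here. */
enum model_dir { MODEL_DIR_OUT = 0, MODEL_DIR_IN = 1, MODEL_DIR_DROP = 2 };

enum model_result { MODEL_SENT, MODEL_DROPPED, MODEL_BAD_SWITCH };

struct model_tag { int dn_dir; };             /* direction carried by the tag */
struct model_pkt { struct model_tag *tag; };  /* packet with an optional tag */

enum model_result
model_dispatch(const struct model_pkt *p)
{
	int dst;

	if (p->tag == NULL)
		dst = MODEL_DIR_DROP;   /* no tag: the packet is quietly dropped */
	else
		dst = p->tag->dn_dir;   /* tag present: its direction is trusted */

	switch (dst) {
	case MODEL_DIR_OUT:
	case MODEL_DIR_IN:
		return MODEL_SENT;
	case MODEL_DIR_DROP:
		return MODEL_DROPPED;
	default:
		/* Only reachable with a non-NULL tag holding an unknown value. */
		return MODEL_BAD_SWITCH;
	}
}
```

In this model, a packet whose tag carries the direction -256 lands in the
default branch, while a packet with no tag at all never does.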

It seems to me that we have some kind of race here: the mbuf is processed and
freed between these two moments, and UMA panics due to a double free attempt.
I see no protection against this kind of race.
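
The suspected interleaving can be replayed deterministically in user space.
The sketch below only illustrates the hypothesis and is not a reproduction of
the kernel behaviour: the sketch_* names are hypothetical, and "another
context releasing the packet" is modelled simply by clearing the tag pointer
between the two observations.

```c
#include <stddef.h>

struct sketch_tag { int dir; };
struct sketch_pkt { struct sketch_tag *tag; };

struct sketch_observation {
	int tag_present_at_check;  /* what the "bad switch" check saw */
	int tag_present_at_panic;  /* what a later debugger inspection sees */
};

/* Hypothetical stand-in for whatever other context releases the packet. */
static void
sketch_concurrent_release(struct sketch_pkt *p)
{
	p->tag = NULL;
}

struct sketch_observation
sketch_replay(struct sketch_pkt *p, int other_context_runs)
{
	struct sketch_observation o;

	o.tag_present_at_check = (p->tag != NULL);  /* first moment */
	if (other_context_runs)
		sketch_concurrent_release(p);       /* window between the two */
	o.tag_present_at_panic = (p->tag != NULL);  /* second moment */
	return o;
}
```

With other_context_runs set, the two observations disagree in the same way as
the log line and the kgdb output; without it, they agree.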

The box has 4 CPU cores (hyperthreading disabled) and these tunables set:

net.isr.bindthreads=1
net.isr.maxthreads=4
net.inet.ip.fastforwarding=1
net.inet.ip.dummynet.pipe_slot_limit=1000
net.inet.ip.dummynet.io_fast=1

The sysctls net.isr.direct and net.isr.direct_force are at their default value
of 1.

-- 
You are receiving this mail because:
You are the assignee for the bug.