Date:      Thu, 25 Jun 2009 22:57:45 -0700
From:      Kip Macy <kmacy@freebsd.org>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, Sam Leffler <sam@freebsd.org>, src-committers@freebsd.org
Subject:   Re: svn commit: r194909 - head/sys/dev/mxge
Message-ID:  <3c1674c90906252257q7fe4c725m3840b2bb3c8aeb51@mail.gmail.com>
In-Reply-To: <4A42D629.8060806@cs.duke.edu>
References:  <200906242109.n5OL9uVb029380@svn.freebsd.org> <4A42983C.6050307@freebsd.org> <4A42D629.8060806@cs.duke.edu>

This is my bug. Hold off for a day or two.

-Kip

On Wed, Jun 24, 2009 at 6:43 PM, Andrew Gallatin <gallatin@cs.duke.edu> wrote:
> Sam Leffler wrote:
>
> >> There's something else wrong.  This is just covering up the real bug.
>
> I'm pretty sure the "real bug" is in bpf, but I'm not sure it's a bug,
> and I suspect there are probably other, similar, bugs lurking when
> you try to tear down a busy interface.
>
> What I was doing was:
>
> - point a packet generator offering 1.5Mpps at the NIC
>
> - in a tight shell loop, do
>
> while (1)
>        tcpdump -ni mxge0 host 172.31.0.1
> end
>
> - in another shell loop:
>
> while (1)
>        ifconfig mxge0 192.168.1.22 up
>        sleep 1
>        kldunload if_mxge
> end
>
> Before the commit, with the old order:
>
>       lock()
>       close()
>       unlock()
>       ether_ifdetach()
>
> I'd see either an exhaustion of mbufs because tcpdump snuck in after
> I'd closed the device and re-opened it on me (so I never closed it
> again, resulting in leaked mbufs), or a panic.
>
> I then moved the ether_ifdetach() to the new position:
>       ether_ifdetach()
>       lock()
>       close()
>       unlock()
>
> This worked great until I started the packet generator,
> then it crashed.  The stack I saw (which I don't have
> saved, so this is from memory) when I had ether_ifdetach()
> first was:
>
>
> panic: mtx_lock() (don't remember exact text)
> bpf_mtap()
> ether_input()
> mxge_rx_done_small()
> mxge_clean_rx_done()
> mxge_intr()
> <...>
>
> When I looked at the ifp in kgdb, I noticed that all the operations
> (if_input(), if_output(), etc) pointed to ifdead_*
> The machine I'm using for this is a MacPro, and I can't get ddb
> to work on the USB based console, so I'm working purely from dumps.
> I don't know how to get a stack of another process in kgdb on
> amd64, so that's all the information I have.
>
> My assumption is that my interrupt thread was running when
> ether_ifdetach() called bpfdetach(), and was starting bpf_mtap()
> while bpfdetach() was destroying the bpf_if.  There doesn't
> seem to be anything to prevent bpfdetach() from racing with
> bpf_mtap().
>
> By calling my close() routine (with a dying flag so nothing can
> sneak in before detach), I'm assured that my NIC is quiescent,
> and cannot be calling into the stack while the interface is being
> torn down.  I'd prefer to leave my commit as-is because:
>
> 1) it works, and fixes a bug
> 2) it can be MFC'ed as is
> 3) it just feels wrong to be blasting packets up into the stack
>    while detaching.  With this NIC, the best way to make it
>    quiescent is to call close().  There's an interrupt handshake
>    done with the NIC to ensure it is quiescent, so doing something
>    like disabling its interrupt could leave things in a weird state.
>
> Drew
>
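
As an illustration only, the ordering described in the quoted message (set a dying flag and close() under the lock, and only then detach) can be modelled in user space roughly as follows. This is a minimal sketch under stated assumptions: struct fake_softc and the fake_* functions are hypothetical names invented here, a pthread mutex stands in for the driver lock, and the attached flag stands in for ether_ifdetach(). None of it is the actual mxge or bpf code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver softc; not the real mxge softc. */
struct fake_softc {
    pthread_mutex_t lock;
    bool dying;    /* set once at detach; refuses any later open */
    bool running;  /* true while the device may hand packets up */
    bool attached; /* true while the interface is visible to the stack */
};

void fake_init(struct fake_softc *sc)
{
    pthread_mutex_init(&sc->lock, NULL);
    sc->dying = false;
    sc->running = false;
    sc->attached = true;
}

/* Open path. Returns 0 on success, -1 if the device is being detached. */
int fake_open(struct fake_softc *sc)
{
    int rc = -1;

    pthread_mutex_lock(&sc->lock);
    if (!sc->dying) {
        sc->running = true;
        rc = 0;
    }
    pthread_mutex_unlock(&sc->lock);
    return rc;
}

/* Receive path. Returns 1 if a packet would be passed up, 0 otherwise. */
int fake_rx(struct fake_softc *sc)
{
    int delivered;

    pthread_mutex_lock(&sc->lock);
    delivered = (sc->running && sc->attached) ? 1 : 0;
    pthread_mutex_unlock(&sc->lock);
    return delivered;
}

/* Detach in the "close first, then detach" order. */
void fake_detach(struct fake_softc *sc)
{
    pthread_mutex_lock(&sc->lock);
    sc->dying = true;     /* nothing can re-open after this point */
    sc->running = false;  /* models close(): device is now quiescent */
    pthread_mutex_unlock(&sc->lock);

    /* Because dying is set, running cannot become true again. */
    assert(!sc->running);
    sc->attached = false; /* models ether_ifdetach() */
}
```

In this model, an open that arrives after fake_detach() has taken the lock is refused because of the dying flag, which corresponds to the "tcpdump snuck in and re-opened it" case, and the receive path has stopped delivering before the interface is torn down.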



-- 
When bad men combine, the good must associate; else they will fall one
by one, an unpitied sacrifice in a contemptible struggle.

    Edmund Burke


