Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 23 Jun 2014 11:32:47 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-net@freebsd.org
Cc:        Roger Pau =?iso-8859-15?q?Monn=E9?= <roger.pau@citrix.com>
Subject:   Re: Ordering problem in if_detach_internal regarding if_bridge
Message-ID:  <201406231132.47487.jhb@freebsd.org>
In-Reply-To: <53A4527F.90008@citrix.com>
References:  <53A4527F.90008@citrix.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Friday, June 20, 2014 11:25:51 am Roger Pau Monn=E9 wrote:
> Hello,
>=20
> I've stumbled across the following panic when testing Xen netback with=20
> if_bridge:
>=20
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex if_bridge (if_bridge) r =3D 0 (0xfffff80006306c18)=
=20
locked @ /usr/src/sys/m
> KDB: stack backtrace:
> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfffffe0000213490
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0000213540
> witness_warn() at witness_warn+0x4a8/frame 0xfffffe0000213600
> trap() at trap+0xc9d/frame 0xfffffe00002136a0
> trap() at trap+0x669/frame 0xfffffe00002138b0
> calltrap() at calltrap+0x8/frame 0xfffffe00002138b0
> --- trap 0xc, rip =3D 0xffffffff8221a0ef, rsp =3D 0xfffffe0000213970, rbp=
 =3D=20
0xfffffe00002139e0 ---
> bridge_input() at bridge_input+0x5ff/frame 0xfffffe00002139e0
> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfffffe0000213a10
> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfffffe0000213a80
> ether_ifattach() at ether_ifattach+0x19f/frame 0xfffffe0000213ab0
> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame=20
0xfffffe0000213b30
> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame=20
0xfffffe0000213b70
> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfffffe0000213bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe0000213bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000213bf0
> --- trap 0, rip =3D 0, rsp =3D 0xfffffe0000213cb0, rbp =3D 0 ---
>=20
> I've tracked this down to if_detach_internal setting ifp->if_addr to=20
> NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which=20
> causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries=20
> to perform IF_LLADDR on an interface that's in the process of being=20
> destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event=
=20
> has not fired yet).
>=20
> I have the following naive patch that moves the firing of the event=20
> before if_addr is set to NULL, but I'm not familiar with the ordering=20
> in if_detach_internal, so I'm not sure if this might cause problems in=20
> other parts of the code, could someone familiar with the net stuff=20
> comment on the best way to deal with it?

Hmmm, I have no idea if this is ok or not.  I do think the route message=20
should go out at the same time as the devctl_notify() call however.  My gue=
ss=20
is it is actually better to do this earlier so that we allow outside consum=
ers
to detach from an interface before it is destroyed.  I'm not sure if it wou=
ld
break things, but I would be tempted to move this even earlier right after =
it
is removed from the global ifnet list but before the taskqueue_drain, etc.

=2D-=20
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201406231132.47487.jhb>