Date: Tue, 4 Feb 2014 05:52:29 +0100
From: Marko Zec <zec@fer.hr>
To: Vijay Singh <vijju.singh@gmail.com>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Re: vnet deletion panic
Message-ID: <20140204055229.4a52ec15@x23.lan>
In-Reply-To: <CALCNsJQSfqyXUuiGUPwmuXH3OCdmMRVSZtZSDQEBTb9csQAe4Q@mail.gmail.com>
References: <CALCNsJQSfqyXUuiGUPwmuXH3OCdmMRVSZtZSDQEBTb9csQAe4Q@mail.gmail.com>
On Mon, 3 Feb 2014 19:33:21 -0800
Vijay Singh <vijju.singh@gmail.com> wrote:

> I'm running into a crash on vnet deletion in the presence of
> routing sockets. The root cause seems to originate from:
>
> if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
> rt_dispatch()
>
> In rt_dispatch() we have:
>
> #ifdef VIMAGE
>         if (V_loif)
>                 m->m_pkthdr.rcvif = V_loif;
> #endif
>         netisr_queue(NETISR_ROUTE, m);
>
> Now since this would be processed async, and the ifp above is the
> loopback of the vnet being deleted, we run into accessing a freed
> pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
> fix this. I am thinking that we could do something like the following
> in rt_dispatch():
>
> #ifdef VIMAGE
>         if (V_loif) {
>                 if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
>                         CURVNET_SET_QUIET(vnet0);
>                         m->m_pkthdr.rcvif = V_loif;
>                         CURVNET_RESTORE();
>                 } else
>                         m->m_pkthdr.rcvif = V_loif;
>         }
> #endif
>
> So basically switch to the default vnet for the mbuf with the routing
> socket message. Thoughts?

By design, the vnet teardown procedure should not commence before the
last socket attached to that vnet is closed, so I doubt that the
proposed approach would actually cure the panics you're observing.
Furthermore, it would certainly cause bogus routing messages to appear
in vnet0 and possibly confuse routing socket consumers running there.

Plus, in rt_dispatch() there's no ifp context to check against V_loif
at all, as your patch proposes to do?

Perhaps it could be possible to walk through all the netisr queues just
before V_loif gets destroyed, and prune all queued mbufs which have
m->m_pkthdr.rcvif pointing to V_loif?
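To make the pruning idea concrete, here is a rough userland sketch in C. It is only a model of the technique, not kernel code: struct fake_ifnet, struct fake_mbuf, struct fake_queue, fake_enqueue() and fake_prune_rcvif() are all hypothetical names invented for illustration, and the real netisr code has its own queue structures and locking, none of which are modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ifnet; only identity matters here. */
struct fake_ifnet {
	int if_index;
};

/* Hypothetical stand-in for an mbuf carrying a packet header. */
struct fake_mbuf {
	struct fake_mbuf  *m_next;	/* next queued packet */
	struct fake_ifnet *rcvif;	/* models m->m_pkthdr.rcvif */
};

/* Hypothetical FIFO standing in for a single netisr work queue. */
struct fake_queue {
	struct fake_mbuf *head;
	struct fake_mbuf *tail;
	size_t		  len;
};

/* Append a packet tagged with rcvif; returns 0 on success, -1 on ENOMEM. */
static int
fake_enqueue(struct fake_queue *q, struct fake_ifnet *rcvif)
{
	struct fake_mbuf *m = calloc(1, sizeof(*m));

	if (m == NULL)
		return (-1);
	m->rcvif = rcvif;
	if (q->tail != NULL)
		q->tail->m_next = m;
	else
		q->head = m;
	q->tail = m;
	q->len++;
	return (0);
}

/*
 * Unlink and free every queued packet whose rcvif points at ifp,
 * keeping the remaining packets in their original order.
 * Returns the number of packets pruned.
 */
static size_t
fake_prune_rcvif(struct fake_queue *q, const struct fake_ifnet *ifp)
{
	struct fake_mbuf **pp = &q->head;
	struct fake_mbuf *m, *last = NULL;
	size_t pruned = 0;

	while ((m = *pp) != NULL) {
		if (m->rcvif == ifp) {
			*pp = m->m_next;	/* unlink before freeing */
			free(m);
			q->len--;
			pruned++;
		} else {
			last = m;
			pp = &m->m_next;
		}
	}
	q->tail = last;			/* NULL if the queue is now empty */
	return (pruned);
}
```

In this model the teardown path would call fake_prune_rcvif() on each queue with the loopback ifnet just before freeing it, so no queued packet is left holding a stale rcvif pointer.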
Since the vnet teardown procedure cannot be initiated before all
(routing) sockets attached to that vnet have been closed, once all
ifnets other than V_loif have also been destroyed it should no longer be
possible for new mbufs to be queued with rcvif pointing back to V_loif.
So at least conceptually, that approach might work correctly.

Marko