Date:      Tue, 4 Feb 2014 05:52:29 +0100
From:      Marko Zec <zec@fer.hr>
To:        Vijay Singh <vijju.singh@gmail.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: vnet deletion panic
Message-ID:  <20140204055229.4a52ec15@x23.lan>
In-Reply-To: <CALCNsJQSfqyXUuiGUPwmuXH3OCdmMRVSZtZSDQEBTb9csQAe4Q@mail.gmail.com>
References:  <CALCNsJQSfqyXUuiGUPwmuXH3OCdmMRVSZtZSDQEBTb9csQAe4Q@mail.gmail.com>

On Mon, 3 Feb 2014 19:33:21 -0800
Vijay Singh <vijju.singh@gmail.com> wrote:

> I'm running into a crash on vnet deletion in the presence of
> routing sockets. The root cause seems to originate from:
> 
> if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
> rt_dispatch()
> 
> In rt_dispatch() we have:
> 
> #ifdef VIMAGE
>         if (V_loif)
>                 m->m_pkthdr.rcvif = V_loif;
> #endif
>         netisr_queue(NETISR_ROUTE, m);
> 
> Now since this would be processed async, and the ifp above is the
> loopback of the vnet being deleted, we run into accessing a freed
> pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
> fix this. I am thinking that we could do something like the following
> in rt_dispatch():
> 
> #ifdef VIMAGE
>         if (V_loif) {
>                 if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
>                         CURVNET_SET_QUIET(vnet0);
>                         m->m_pkthdr.rcvif = V_loif;
>                         CURVNET_RESTORE();
>                 } else
>                         m->m_pkthdr.rcvif = V_loif;
>         }
> #endif
> 
> So basically switch to the default vnet for the mbuf with the routing
> socket message. Thoughts?

By design, the vnet teardown procedure should not commence before the
last socket attached to that vnet is closed, so I doubt that the
proposed approach would actually cure the panics you're observing.
Furthermore, it would certainly cause bogus routing messages to appear
in vnet0 and possibly confuse routing socket consumers running there.
Plus, rt_dispatch() has no ifp context to check against V_loif at all,
which your patch assumes.

Perhaps it would be possible to walk through all the netisr queues just
before V_loif gets destroyed, and prune all queued mbufs which have
m->m_pkthdr.rcvif pointing to V_loif?  The vnet teardown procedure
cannot be initiated before all (routing) sockets attached to that vnet
have been closed.  So once all other ifnets except V_loif have also
been destroyed, it should not be possible for new mbufs to be queued
with rcvif pointing back to V_loif, and at least conceptually that
approach might work correctly.
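
To make the pruning idea concrete, here is a minimal user-space sketch of
the queue walk.  It is not kernel code: struct ifnet_model, struct
mbuf_model, struct queue_model and the queue_*() functions are
hypothetical stand-ins invented for illustration, and the sketch ignores
locking entirely.  A real version would have to walk the actual netisr
queues under whatever locking netisr requires and free the packets with
m_freem().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct ifnet. */
struct ifnet_model {
	int if_index;
};

/* Hypothetical stand-in for an mbuf packet header. */
struct mbuf_model {
	struct mbuf_model *m_nextpkt;	/* next packet in the queue */
	struct ifnet_model *rcvif;	/* models m->m_pkthdr.rcvif */
};

/* Hypothetical stand-in for one netisr queue (singly linked FIFO). */
struct queue_model {
	struct mbuf_model *head;
	struct mbuf_model *tail;
	size_t len;
};

void
queue_init(struct queue_model *q)
{
	assert(q != NULL);
	q->head = NULL;
	q->tail = NULL;
	q->len = 0;
}

/* Append a packet tagged with rcvif; returns 0 on success, -1 on ENOMEM. */
int
queue_enqueue(struct queue_model *q, struct ifnet_model *rcvif)
{
	struct mbuf_model *m;

	assert(q != NULL);
	m = calloc(1, sizeof(*m));
	if (m == NULL)
		return (-1);
	m->rcvif = rcvif;
	if (q->tail != NULL)
		q->tail->m_nextpkt = m;
	else
		q->head = m;
	q->tail = m;
	q->len++;
	return (0);
}

/*
 * Unlink and free every queued packet whose rcvif points to ifp, so that
 * no later consumer can dereference the interface once it is destroyed.
 * Returns the number of packets dropped.
 */
size_t
queue_prune_ifp(struct queue_model *q, const struct ifnet_model *ifp)
{
	struct mbuf_model **link, *m, *last;
	size_t dropped;

	assert(q != NULL);
	link = &q->head;
	last = NULL;
	dropped = 0;
	while ((m = *link) != NULL) {
		if (m->rcvif == ifp) {
			*link = m->m_nextpkt;	/* unlink before freeing */
			free(m);
			q->len--;
			dropped++;
		} else {
			last = m;
			link = &m->m_nextpkt;
		}
	}
	q->tail = last;		/* NULL if the queue is now empty */
	return (dropped);
}
```

In this model, calling queue_prune_ifp() on every queue with the
loopback ifnet of the dying vnet, right before that ifnet is freed,
leaves only packets that reference other interfaces.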

Marko


