From owner-freebsd-net@FreeBSD.ORG Fri Jul 29 19:54:24 2011 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E7D51065676; Fri, 29 Jul 2011 19:54:24 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id EFEC58FC1B; Fri, 29 Jul 2011 19:54:23 +0000 (UTC) Received: from julian-mac.elischer.org (home-nat.elischer.org [67.100.89.137]) (authenticated bits=0) by vps1.elischer.org (8.14.4/8.14.4) with ESMTP id p6TJRA2t048676 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 29 Jul 2011 12:27:11 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <4E330989.7030409@freebsd.org> Date: Fri, 29 Jul 2011 12:27:05 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: Gleb Smirnoff References: <20110714154457.GI70776@FreeBSD.org> In-Reply-To: <20110714154457.GI70776@FreeBSD.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: gnn@freebsd.org, bz@freebsd.org, rwatson@freebsd.org, net@freebsd.org Subject: Re: m_pkthdr.rcvif dangling pointer problem X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Jul 2011 19:54:24 -0000 On 7/14/11 8:44 AM, Gleb Smirnoff wrote: > Hi! > > This problem is definitely known and is as old as network stack > is parallel. Those, who know the problem may skip to next paragraph. > Short description is the following: every mbuf that is allocated in > an interface receive procedure has a field where a pointer to the > corresponding struct ifnet is stored. Everything is okay, until you > are running a box, where configuration of interfaces is changing > rapidly, for example a PPP access concentrator. Utilizing any facility > that can delay packet processing for example netgraph(4) or dummynet(4) > would make probability of crash much bigger. Running INVARIANTS kernel > would crash a box with dummynet and disappearing interfaces in a few > minutes. > > I see three approaches to this problem: > > 1) Every m_pkthdr.rcvif should if_ref() the interface. Releasing > references can be done in the mbuf allocator: mb_dtor_mbuf(), > mb_dtor_pack(). I'm afraid this approach would degrate performance, > since adding at least a couple of atomic ops on every mbuf for its > lifetime. That is +2 atomic ops per packet. I did that in the 3.x timeframe in a branch.. it's probably in the cvs tree still but was expensive. better idea would be to use ifnet number and a generation number... > 2) kib@ suggested to allocate ifnets from a UMA_ZONE_NOFREE zone. > I've made a compilable& working patch: > > http://people.freebsd.org/~glebius/patches/ifnet.no_free > > But on second though I find this a bad idea, this is just fooling > of INVARIANTS. Yes, we avoid thrashing of freed memory and rewriting > it by some other kernel allocation. But still out pointer point to > invalid ifnet. Even, if we make a check for IFF_DYING flag, we still > can not guarantee that an interface had been re-allocated for a new > instance. This would be not a panic condition, but subtle bugs in > firewalls. > > 3) As we now have a straight if_index table that can grow, what about > storing the if_index in the m_pkthdr? Lookup of interface by index > is fast enough if done lockless. Doing it lockless isn't perfect, but > better than current pointer dereferncing. Optionally it could be > done with locking and with putting a reference. To avoid situation > with with getting to a re-allocated interface with the same index, > we can use a unique cookie, that is incremented in if_alloc(). > Smth like: > > struct ifnet * > mbuf_rcvif(struct mbuf *m) > { > struct ifnet *ifp; > > M_ASSERTPKTHDR(m); > > if (m->m_pkthdr.rcvif_idx == 0) > return (NULL); > > ifp = ifnet_byindex_locked(m->m_pkthdr.rcvif_idx); > if (ifp == NULL) > return (NULL); > if (ifp->if_unique != m->m_pkthdr.rcvif_unique) > return (NULL); > > return (ifp); > } > > struct ifnet * > mbuf_rcvif_ref(struct mbuf *m) > { > struct ifnet *ifp; > > M_ASSERTPKTHDR(m); > > if (m->m_pkthdr.rcvif_idx == 0) > return (NULL); > > ifp = ifnet_byindex_ref(m->m_pkthdr.rcvif_idx); > if (ifp == NULL) > return (NULL); > if (ifp->if_unique != m->m_pkthdr.rcvif_unique) { > if_rele(ifp); > return (NULL); > } > > return (ifp); > } > > The size of struct mbuf isn't going to change on amd64: > > @@ -111,7 +111,8 @@ > * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set. > */ > struct pkthdr { > - struct ifnet *rcvif; /* rcv interface */ > + uint32_t rcvif_idx; /* rcv interface index */ > + uint32_t rcvif_unique; /* rcv interface unique id */ > > A proof-of-concept patch is available here: > > http://people.freebsd.org/~glebius/patches/pkthdr.rcvif_idx > > It doesn't cover entire kernel, LINT won't compile. It covers kern, netinet, > netinet6, several interface drivers and netgraph. > > One of the problems is ongoing namespace pollution: not all code that include > mbuf.h includes if_var.h and vice versa. I've cut ifnet from m_devget(), but > my new function do declare it in parameter list :(. To deal with that I had > to declare functions in mbuf.h but implement them in if.c. > > Other problem is net80211 code that abuses the rcvif pointer in mbuf packet > header for its own purposes casting it to private type. This can be fixed > utilizing mbuf_tags(9), I think. I haven't made this yet, that's why LINT > doesn't compile. > > Comments are requested! > > Any alternative ideas on solving the problem are welcome! >