From owner-svn-src-stable@FreeBSD.ORG Thu Nov 6 22:11:58 2008 Return-Path: Delivered-To: svn-src-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6730E106567C; Thu, 6 Nov 2008 22:11:58 +0000 (UTC) (envelope-from julian@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 537168FC08; Thu, 6 Nov 2008 22:11:58 +0000 (UTC) (envelope-from julian@FreeBSD.org) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id mA6MBwik066992; Thu, 6 Nov 2008 22:11:58 GMT (envelope-from julian@svn.freebsd.org) Received: (from julian@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id mA6MBvdU066984; Thu, 6 Nov 2008 22:11:57 GMT (envelope-from julian@svn.freebsd.org) Message-Id: <200811062211.mA6MBvdU066984@svn.freebsd.org> From: Julian Elischer Date: Thu, 6 Nov 2008 22:11:57 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org X-SVN-Group: stable-7 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r184739 - in stable/7/sys: . modules/cxgb net netinet X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Nov 2008 22:11:58 -0000 Author: julian Date: Thu Nov 6 22:11:57 2008 New Revision: 184739 URL: http://svn.freebsd.org/changeset/base/184739 Log: MFC a rewrite of rt_check(). also revert the addition of rt_check_fib() which we discovered is not needed. fixes some hangs people have seen Approved by: re (ken) Modified: stable/7/sys/ (props changed) stable/7/sys/modules/cxgb/ (props changed) stable/7/sys/net/if_atmsubr.c stable/7/sys/net/if_fwsubr.c stable/7/sys/net/if_iso88025subr.c stable/7/sys/net/route.c stable/7/sys/net/route.h stable/7/sys/netinet/if_ether.c stable/7/sys/netinet/in_rmx.c stable/7/sys/netinet/in_var.h Modified: stable/7/sys/net/if_atmsubr.c ============================================================================== --- stable/7/sys/net/if_atmsubr.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/net/if_atmsubr.c Thu Nov 6 22:11:57 2008 (r184739) @@ -158,8 +158,7 @@ atm_output(struct ifnet *ifp, struct mbu * check route */ if (rt0 != NULL) { - error = rt_check_fib(&rt, &rt0, - dst, rt0->rt_fibnum); + error = rt_check(&rt, &rt0, dst); if (error) goto bad; RT_UNLOCK(rt); Modified: stable/7/sys/net/if_fwsubr.c ============================================================================== --- stable/7/sys/net/if_fwsubr.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/net/if_fwsubr.c Thu Nov 6 22:11:57 2008 (r184739) @@ -103,7 +103,7 @@ firewire_output(struct ifnet *ifp, struc } if (rt0 != NULL) { - error = rt_check_fib(&rt, &rt0, dst, rt0->rt_fibnum); + error = rt_check(&rt, &rt0, dst); if (error) goto bad; RT_UNLOCK(rt); Modified: stable/7/sys/net/if_iso88025subr.c ============================================================================== --- stable/7/sys/net/if_iso88025subr.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/net/if_iso88025subr.c Thu Nov 6 22:11:57 2008 (r184739) @@ -259,8 +259,7 @@ iso88025_output(ifp, m, dst, rt0) /* Calculate routing info length based on arp table entry */ /* XXX any better way to do this ? */ if (rt0 != NULL) { -/* XXX MRT *//* Guess only */ - error = rt_check_fib(&rt, &rt0, dst, rt0->rt_fibnum); + error = rt_check(&rt, &rt0, dst); if (error) goto bad; RT_UNLOCK(rt); Modified: stable/7/sys/net/route.c ============================================================================== --- stable/7/sys/net/route.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/net/route.c Thu Nov 6 22:11:57 2008 (r184739) @@ -1549,84 +1549,120 @@ rtinit(struct ifaddr *ifa, int cmd, int * final destination if directly reachable); * *lrt0 points to the cached route to the final destination; * *lrt is not meaningful; - * fibnum is the index to the correct network fib for this packet + * (*lrt0 has no ref held on it by us so REMREF is not needed. + * Refs only account for major structural references and not usages, + * which is actually a bit of a problem.) * * === Operation === * If the route is marked down try to find a new route. If the route * to the gateway is gone, try to setup a new route. Otherwise, * if the route is marked for packets to be rejected, enforce that. + * Note that rtalloc returns an rtentry with an extra REF that we may + * need to lose. * * === On return === * *dst is unchanged; * *lrt0 points to the (possibly new) route to the final destination - * *lrt points to the route to the next hop + * *lrt points to the route to the next hop [LOCKED] * * Their values are meaningful ONLY if no error is returned. + * + * To follow this you have to remember that: + * RT_REMREF reduces the reference count by 1 but doesn't check it for 0 (!) + * RTFREE_LOCKED includes an RT_REMREF (or an rtfree if refs == 1) + * and an RT_UNLOCK + * RTFREE does an RT_LOCK and an RTFREE_LOCKED + * The gwroute pointer counts as a reference on the rtentry to which it points. + * so when we add it we use the ref that rtalloc gives us and when we lose it + * we need to remove the reference. + * RT_TEMP_UNLOCK does an RT_ADDREF before freeing the lock, and + * RT_RELOCK locks it (it can't have gone away due to the ref) and + * drops the ref, possibly freeing it and zeroing the pointer if + * the ref goes to 0 (unlocking in the process). */ int rt_check(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *dst) { - return (rt_check_fib(lrt, lrt0, dst, 0)); -} - -int -rt_check_fib(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *dst, - u_int fibnum) -{ struct rtentry *rt; struct rtentry *rt0; - int error; + u_int fibnum; KASSERT(*lrt0 != NULL, ("rt_check")); - rt = rt0 = *lrt0; + rt0 = *lrt0; + rt = NULL; + fibnum = rt0->rt_fibnum; /* NB: the locking here is tortuous... */ - RT_LOCK(rt); - if ((rt->rt_flags & RTF_UP) == 0) { - RT_UNLOCK(rt); - rt = rtalloc1_fib(dst, 1, 0UL, fibnum); - if (rt != NULL) { - RT_REMREF(rt); - /* XXX what about if change? */ - } else + RT_LOCK(rt0); +retry: + if (rt0 && (rt0->rt_flags & RTF_UP) == 0) { + /* Current rt0 is useless, try get a replacement. */ + RT_UNLOCK(rt0); + rt0 = NULL; + } + if (rt0 == NULL) { + rt0 = rtalloc1_fib(dst, 1, 0UL, fibnum); + if (rt0 == NULL) { return (EHOSTUNREACH); - rt0 = rt; + } + RT_REMREF(rt0); /* don't need the reference. */ } - /* XXX BSD/OS checks dst->sa_family != AF_NS */ - if (rt->rt_flags & RTF_GATEWAY) { - if (rt->rt_gwroute == NULL) - goto lookup; - rt = rt->rt_gwroute; - RT_LOCK(rt); /* NB: gwroute */ - if ((rt->rt_flags & RTF_UP) == 0) { - RTFREE_LOCKED(rt); /* unlock gwroute */ - rt = rt0; - rt0->rt_gwroute = NULL; - lookup: - RT_UNLOCK(rt0); -/* XXX MRT link level looked up in table 0 */ - rt = rtalloc1_fib(rt->rt_gateway, 1, 0UL, 0); - if (rt == rt0) { - RT_REMREF(rt0); - RT_UNLOCK(rt0); + + if (rt0->rt_flags & RTF_GATEWAY) { + if ((rt = rt0->rt_gwroute) != NULL) { + RT_LOCK(rt); /* NB: gwroute */ + if ((rt->rt_flags & RTF_UP) == 0) { + /* gw route is dud. ignore/lose it */ + RTFREE_LOCKED(rt); /* unref (&unlock) gwroute */ + rt = rt0->rt_gwroute = NULL; + } + } + + if (rt == NULL) { /* NOT AN ELSE CLAUSE */ + RT_TEMP_UNLOCK(rt0); /* MUST return to undo this */ + rt = rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum); + if ((rt == rt0) || (rt == NULL)) { + /* the best we can do is not good enough */ + if (rt) { + RT_REMREF(rt); /* assumes ref > 0 */ + RT_UNLOCK(rt); + } + RTFREE(rt0); /* lock, unref, (unlock) */ return (ENETUNREACH); } - RT_LOCK(rt0); - if (rt0->rt_gwroute != NULL) - RTFREE(rt0->rt_gwroute); - rt0->rt_gwroute = rt; - if (rt == NULL) { - RT_UNLOCK(rt0); - return (EHOSTUNREACH); + /* + * Relock it and lose the added reference. + * All sorts of things could have happenned while we + * had no lock on it, so check for them. + */ + RT_RELOCK(rt0); + if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0)) + /* Ru-roh.. what we had is no longer any good */ + goto retry; + /* + * While we were away, someone replaced the gateway. + * Since a reference count is involved we can't just + * overwrite it. + */ + if (rt0->rt_gwroute) { + if (rt0->rt_gwroute != rt) { + RTFREE_LOCKED(rt); + goto retry; + } + } else { + rt0->rt_gwroute = rt; } } + RT_LOCK_ASSERT(rt); RT_UNLOCK(rt0); + } else { + /* think of rt as having the lock from now on.. */ + rt = rt0; } /* XXX why are we inspecting rmx_expire? */ - error = (rt->rt_flags & RTF_REJECT) && - (rt->rt_rmx.rmx_expire == 0 || - time_uptime < rt->rt_rmx.rmx_expire); - if (error) { + if ((rt->rt_flags & RTF_REJECT) && + (rt->rt_rmx.rmx_expire == 0 || + time_uptime < rt->rt_rmx.rmx_expire)) { RT_UNLOCK(rt); return (rt == rt0 ? EHOSTDOWN : EHOSTUNREACH); } Modified: stable/7/sys/net/route.h ============================================================================== --- stable/7/sys/net/route.h Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/net/route.h Thu Nov 6 22:11:57 2008 (r184739) @@ -312,19 +312,35 @@ struct rt_addrinfo { } while (0) #define RTFREE_LOCKED(_rt) do { \ - if ((_rt)->rt_refcnt <= 1) \ - rtfree(_rt); \ - else { \ - RT_REMREF(_rt); \ - RT_UNLOCK(_rt); \ - } \ - /* guard against invalid refs */ \ - _rt = 0; \ - } while (0) + if ((_rt)->rt_refcnt <= 1) \ + rtfree(_rt); \ + else { \ + RT_REMREF(_rt); \ + RT_UNLOCK(_rt); \ + } \ + /* guard against invalid refs */ \ + _rt = 0; \ +} while (0) #define RTFREE(_rt) do { \ - RT_LOCK(_rt); \ - RTFREE_LOCKED(_rt); \ - } while (0) + RT_LOCK(_rt); \ + RTFREE_LOCKED(_rt); \ +} while (0) + +#define RT_TEMP_UNLOCK(_rt) do { \ + RT_ADDREF(_rt); \ + RT_UNLOCK(_rt); \ +} while (0) + +#define RT_RELOCK(_rt) do { \ + RT_LOCK(_rt); \ + if ((_rt)->rt_refcnt <= 1) { \ + rtfree(_rt); \ + _rt = 0; /* signal that it went away */ \ + } else { \ + RT_REMREF(_rt); \ + /* note that _rt is still valid */ \ + } \ +} while (0) extern struct radix_node_head *rt_tables[][AF_MAX+1]; @@ -352,6 +368,7 @@ int rt_setgate(struct rtentry *, struct int rtexpunge(struct rtentry *); void rtfree(struct rtentry *); +int rt_check(struct rtentry **, struct rtentry **, struct sockaddr *); /* XXX MRT COMPAT VERSIONS THAT SET UNIVERSE to 0 */ /* Thes are used by old code not yet converted to use multiple FIBS */ @@ -366,7 +383,6 @@ void rtredirect(struct sockaddr *, stru int rtrequest(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **); int rtrequest1(int, struct rt_addrinfo *, struct rtentry **); -int rt_check(struct rtentry **, struct rtentry **, struct sockaddr *); /* defaults to "all" FIBs */ int rtinit_fib(struct ifaddr *, int, int); @@ -385,7 +401,6 @@ void rtredirect_fib(struct sockaddr *, int rtrequest_fib(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **, u_int); int rtrequest1_fib(int, struct rt_addrinfo *, struct rtentry **, u_int); -int rt_check_fib(struct rtentry **, struct rtentry **, struct sockaddr *, u_int); #include typedef void (*rtevent_arp_update_fn)(void *, struct rtentry *, uint8_t *, struct sockaddr *); Modified: stable/7/sys/netinet/if_ether.c ============================================================================== --- stable/7/sys/netinet/if_ether.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/netinet/if_ether.c Thu Nov 6 22:11:57 2008 (r184739) @@ -362,7 +362,7 @@ arpresolve(struct ifnet *ifp, struct rte struct rtentry *rt = NULL; struct sockaddr_dl *sdl; int error; - int fibnum = 0; + int fibnum = -1; if (m) { @@ -379,7 +379,7 @@ arpresolve(struct ifnet *ifp, struct rte if (rt0 != NULL) { /* Look for a cached arp (ll) entry. */ - error = in_rt_check(&rt, &rt0, dst, fibnum); + error = rt_check(&rt, &rt0, dst); if (error) { m_freem(m); return error; @@ -388,14 +388,23 @@ arpresolve(struct ifnet *ifp, struct rte if (la == NULL) RT_UNLOCK(rt); } + + /* + * If we had no mbuf and no route, then hope the caller + * has a fib in mind because we are running out of ideas. + * I think this should not happen in current code. + * (kmacy would know). + */ + if (fibnum == -1) + fibnum = curthread->td_proc->p_fibnum; /* last gasp */ + if (la == NULL) { /* * We enter this block if rt0 was NULL, - * or if rt found by in_rt_check() didn't have llinfo. - * we should get a cloned route, which since it should - * come from the local interface should have a ll entry. - * if may be incoplete but that's ok. - * XXXMRT if we haven't found a fibnum is that OK? + * or if rt found by rt_check() didn't have llinfo. + * We should get a cloned route from the local interface, + * so it should have an ll entry. + * It may be incomplete but that's ok. */ rt = arplookup(SIN(dst)->sin_addr.s_addr, 1, 0, fibnum); if (rt == NULL) { Modified: stable/7/sys/netinet/in_rmx.c ============================================================================== --- stable/7/sys/netinet/in_rmx.c Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/netinet/in_rmx.c Thu Nov 6 22:11:57 2008 (r184739) @@ -465,13 +465,6 @@ in_rtalloc1(struct sockaddr *dst, int re return (rtalloc1_fib(dst, report, ignflags, fibnum)); } -int -in_rt_check(struct rtentry **lrt, struct rtentry **lrt0, - struct sockaddr *dst, u_int fibnum) -{ - return (rt_check_fib(lrt, lrt0, dst, fibnum)); -} - void in_rtredirect(struct sockaddr *dst, struct sockaddr *gateway, Modified: stable/7/sys/netinet/in_var.h ============================================================================== --- stable/7/sys/netinet/in_var.h Thu Nov 6 21:47:02 2008 (r184738) +++ stable/7/sys/netinet/in_var.h Thu Nov 6 22:11:57 2008 (r184739) @@ -314,7 +314,6 @@ void in_rtredirect(struct sockaddr *, s struct sockaddr *, int, struct sockaddr *, u_int); int in_rtrequest(int, struct sockaddr *, struct sockaddr *, struct sockaddr *, int, struct rtentry **, u_int); -int in_rt_check(struct rtentry **, struct rtentry **, struct sockaddr *, u_int); #if 0 int in_rt_getifa(struct rt_addrinfo *, u_int fibnum);