From owner-svn-src-stable@FreeBSD.ORG Wed Feb 25 11:00:13 2009 Return-Path: Delivered-To: svn-src-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A8FE106564A; Wed, 25 Feb 2009 11:00:13 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 665B08FC1B; Wed, 25 Feb 2009 10:59:57 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id n1PAxviH094200; Wed, 25 Feb 2009 10:59:57 GMT (envelope-from rwatson@svn.freebsd.org) Received: (from rwatson@localhost) by svn.freebsd.org (8.14.3/8.14.3/Submit) id n1PAxv4d094194; Wed, 25 Feb 2009 10:59:57 GMT (envelope-from rwatson@svn.freebsd.org) Message-Id: <200902251059.n1PAxv4d094194@svn.freebsd.org> From: Robert Watson Date: Wed, 25 Feb 2009 10:59:57 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org X-SVN-Group: stable-7 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r189026 - in stable/7/sys: . contrib/pf dev/ath/ath_hal dev/cxgb net netinet netinet6 X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2009 11:00:15 -0000 Author: rwatson Date: Wed Feb 25 10:59:56 2009 New Revision: 189026 URL: http://svn.freebsd.org/changeset/base/189026 Log: Merge r185747, r185774, r185807, r185849, r185964, r185965, r186051, r186052 from head to stable/7; note that only the locking fixes and invariants checking are added from r185747, but not the move to an rwlock which would modify the kernel binary interface, nor the move to a non-recursible lock, which is still seeing problem reports in head. This corrects, among other things, a deadlock that may occur when processing incoming ICMP redirects. r185747: - convert radix node head lock from mutex to rwlock - make radix node head lock not recursive - fix LOR in rtexpunge - fix LOR in rtredirect Reviewed by: sam r185774: - avoid recursively locking the radix node head lock - assert that it is held if RTF_RNH_LOCKED is not passed r185807: Fix a bug introduced in r185747: rather than dereferencing an uninitialized *rt to something undefined, use the fibnum that came in as function argument. Found with: Coverity Prevent(tm) CID: 4168 r185849: fix a reported panic when adding a route and one hit here when deleting a route - pass RTF_RNH_LOCKED to rtalloc1_fib in 2 cases where the lock is held - make sure the rnh lock is held across rt_setgate and rt_getifa_fib r185964: Pass RTF_RNH_LOCKED to rtalloc1 sunce the node head is locked, this avoids a recursive lock panic on inet6 detach. Reviewed by: kmacy r185965: RTF_RNH_LOCKED needs to be passed in the flags arg not report, apologies to thompsa r186051: in6_addroute is called through rnh_addadr which is always called with the radix node head lock held exclusively. Pass RTF_RNH_LOCKED to rtalloc so that rtalloc1_fib will not try to re-acquire the lock. r186052: don't acquire lock recursively Oiginal commits to head were by kmacy, except r185964 by thompsa and r185807 by bz. A subset of this is a potential errata patch candidate. Reviewed by: bz, kmacy Tested by: Pete French sa_family != AF_INET) /* Only INET supports > 1 fib now */ @@ -290,7 +291,13 @@ rtalloc1_fib(struct sockaddr *dst, int r rtstat.rts_unreach++; goto miss2; } - RADIX_NODE_HEAD_LOCK(rnh); + needlock = !(ignflags & RTF_RNH_LOCKED); + if (needlock) + RADIX_NODE_HEAD_LOCK(rnh); +#ifdef INVARIANTS + else + RADIX_NODE_HEAD_LOCK_ASSERT(rnh); +#endif if ((rn = rnh->rnh_matchaddr(dst, rnh)) && (rn->rn_flags & RNF_ROOT) == 0) { /* @@ -343,7 +350,8 @@ rtalloc1_fib(struct sockaddr *dst, int r RT_LOCK(newrt); RT_ADDREF(newrt); } - RADIX_NODE_HEAD_UNLOCK(rnh); + if (needlock) + RADIX_NODE_HEAD_UNLOCK(rnh); } else { /* * Either we hit the root or couldn't find any match, @@ -352,7 +360,8 @@ rtalloc1_fib(struct sockaddr *dst, int r */ rtstat.rts_unreach++; miss: - RADIX_NODE_HEAD_UNLOCK(rnh); + if (needlock) + RADIX_NODE_HEAD_UNLOCK(rnh); miss2: if (report) { /* * If required, report the failure to the supervising @@ -482,6 +491,8 @@ rtredirect_fib(struct sockaddr *dst, short *stat = NULL; struct rt_addrinfo info; struct ifaddr *ifa; + struct radix_node_head *rnh = + rt_tables[fibnum][dst->sa_family]; /* verify the gateway is directly reachable */ if ((ifa = ifa_ifwithnet(gateway)) == NULL) { @@ -531,6 +542,8 @@ rtredirect_fib(struct sockaddr *dst, info.rti_info[RTAX_NETMASK] = netmask; info.rti_ifa = ifa; info.rti_flags = flags; + if (rt0 != NULL) + RT_UNLOCK(rt0); /* drop lock to avoid LOR with RNH */ error = rtrequest1_fib(RTM_ADD, &info, &rt, fibnum); if (rt != NULL) { RT_LOCK(rt); @@ -538,7 +551,7 @@ rtredirect_fib(struct sockaddr *dst, flags = rt->rt_flags; } if (rt0) - RTFREE_LOCKED(rt0); + RTFREE(rt0); stat = &rtstat.rts_dynamic; } else { @@ -554,8 +567,12 @@ rtredirect_fib(struct sockaddr *dst, /* * add the key and gateway (in one malloc'd chunk). */ + RT_UNLOCK(rt); + RADIX_NODE_HEAD_LOCK(rnh); + RT_LOCK(rt); rt_setgate(rt, rt_key(rt), gateway); - gwrt = rtalloc1(gateway, 1, 0); + gwrt = rtalloc1(gateway, 1, RTF_RNH_LOCKED); + RADIX_NODE_HEAD_UNLOCK(rnh); EVENTHANDLER_INVOKE(route_redirect_event, rt, gwrt, dst); RTFREE_LOCKED(gwrt); } @@ -641,7 +658,7 @@ ifa_ifwithroute_fib(int flags, struct so if (ifa == NULL) ifa = ifa_ifwithnet(gateway); if (ifa == NULL) { - struct rtentry *rt = rtalloc1_fib(gateway, 0, 0UL, fibnum); + struct rtentry *rt = rtalloc1_fib(gateway, 0, RTF_RNH_LOCKED, fibnum); if (rt == NULL) return (NULL); /* @@ -788,7 +805,9 @@ rtexpunge(struct rtentry *rt) struct ifaddr *ifa; int error = 0; + rnh = rt_tables[rt->rt_fibnum][rt_key(rt)->sa_family]; RT_LOCK_ASSERT(rt); + RADIX_NODE_HEAD_LOCK_ASSERT(rnh); #if 0 /* * We cannot assume anything about the reference count @@ -804,8 +823,6 @@ rtexpunge(struct rtentry *rt) if (rnh == NULL) return (EAFNOSUPPORT); - RADIX_NODE_HEAD_LOCK(rnh); - /* * Remove the item from the tree; it should be there, * but when callers invoke us blindly it may not (sigh). @@ -860,7 +877,6 @@ rtexpunge(struct rtentry *rt) */ rttrash++; bad: - RADIX_NODE_HEAD_UNLOCK(rnh); return (error); } @@ -874,7 +890,7 @@ int rtrequest1_fib(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt, u_int fibnum) { - int error = 0; + int error = 0, needlock = 0; register struct rtentry *rt; register struct radix_node *rn; register struct radix_node_head *rnh; @@ -891,7 +907,12 @@ rtrequest1_fib(int req, struct rt_addrin rnh = rt_tables[fibnum][dst->sa_family]; if (rnh == NULL) return (EAFNOSUPPORT); - RADIX_NODE_HEAD_LOCK(rnh); + needlock = ((flags & RTF_RNH_LOCKED) == 0); + flags &= ~RTF_RNH_LOCKED; + if (needlock) + RADIX_NODE_HEAD_LOCK(rnh); + else + RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* * If we are adding a host route then we don't want to put * a netmask in the tree, nor do we want to clone it. @@ -1036,7 +1057,7 @@ rtrequest1_fib(int req, struct rt_addrin * then we just blow it away and retry the insertion * of the new one. */ - rt2 = rtalloc1_fib(dst, 0, 0, fibnum); + rt2 = rtalloc1_fib(dst, 0, RTF_RNH_LOCKED, fibnum); if (rt2 && rt2->rt_parent) { rtexpunge(rt2); RT_UNLOCK(rt2); @@ -1125,7 +1146,8 @@ rtrequest1_fib(int req, struct rt_addrin error = EOPNOTSUPP; } bad: - RADIX_NODE_HEAD_UNLOCK(rnh); + if (needlock) + RADIX_NODE_HEAD_UNLOCK(rnh); return (error); #undef senderr } @@ -1153,7 +1175,7 @@ rt_fixdelete(struct radix_node *rn, void if (rt->rt_parent == rt0 && !(rt->rt_flags & (RTF_PINNED | RTF_CLONING))) { return rtrequest_fib(RTM_DELETE, rt_key(rt), NULL, rt_mask(rt), - rt->rt_flags, NULL, rt->rt_fibnum); + rt->rt_flags|RTF_RNH_LOCKED, NULL, rt->rt_fibnum); } return 0; } @@ -1230,6 +1252,7 @@ rt_setgate(struct rtentry *rt, struct so again: RT_LOCK_ASSERT(rt); + RADIX_NODE_HEAD_LOCK_ASSERT(rnh); /* * A host route with the destination equal to the gateway @@ -1256,7 +1279,7 @@ again: struct rtentry *gwrt; RT_UNLOCK(rt); /* XXX workaround LOR */ - gwrt = rtalloc1_fib(gate, 1, 0, rt->rt_fibnum); + gwrt = rtalloc1_fib(gate, 1, RTF_RNH_LOCKED, rt->rt_fibnum); if (gwrt == rt) { RT_REMREF(rt); return (EADDRINUSE); /* failure */ @@ -1327,12 +1350,8 @@ again: arg.rnh = rnh; arg.rt0 = rt; - RT_UNLOCK(rt); /* XXX workaround LOR */ - RADIX_NODE_HEAD_LOCK(rnh); - RT_LOCK(rt); rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt), rt_fixchange, &arg); - RADIX_NODE_HEAD_UNLOCK(rnh); } return 0; Modified: stable/7/sys/net/route.h ============================================================================== --- stable/7/sys/net/route.h Wed Feb 25 10:52:09 2009 (r189025) +++ stable/7/sys/net/route.h Wed Feb 25 10:59:56 2009 (r189026) @@ -172,6 +172,7 @@ struct ortentry { #define RTF_BROADCAST 0x400000 /* route represents a bcast address */ #define RTF_MULTICAST 0x800000 /* route represents a mcast address */ /* 0x1000000 and up unassigned */ +#define RTF_RNH_LOCKED 0x40000000 /* radix node head locked by caller */ /* Mask of RTF flags that are allowed to be modified by RTM_CHANGE. */ #define RTF_FMASK \ Modified: stable/7/sys/net/rtsock.c ============================================================================== --- stable/7/sys/net/rtsock.c Wed Feb 25 10:52:09 2009 (r189025) +++ stable/7/sys/net/rtsock.c Wed Feb 25 10:59:56 2009 (r189026) @@ -628,9 +628,11 @@ route_output(struct mbuf *m, struct sock !sa_equal(info.rti_info[RTAX_IFA], rt->rt_ifa->ifa_addr))) { RT_UNLOCK(rt); + RADIX_NODE_HEAD_LOCK(rnh); if ((error = rt_getifa_fib(&info, rt->rt_fibnum)) != 0) senderr(error); + RADIX_NODE_HEAD_UNLOCK(rnh); RT_LOCK(rt); } if (info.rti_ifa != NULL && @@ -642,8 +644,14 @@ route_output(struct mbuf *m, struct sock IFAFREE(rt->rt_ifa); } if (info.rti_info[RTAX_GATEWAY] != NULL) { - if ((error = rt_setgate(rt, rt_key(rt), - info.rti_info[RTAX_GATEWAY])) != 0) { + RT_UNLOCK(rt); + RADIX_NODE_HEAD_LOCK(rnh); + RT_LOCK(rt); + + error = rt_setgate(rt, rt_key(rt), + info.rti_info[RTAX_GATEWAY]); + RADIX_NODE_HEAD_UNLOCK(rnh); + if (error != 0) { RT_UNLOCK(rt); senderr(error); } Modified: stable/7/sys/netinet/in_rmx.c ============================================================================== --- stable/7/sys/netinet/in_rmx.c Wed Feb 25 10:52:09 2009 (r189025) +++ stable/7/sys/netinet/in_rmx.c Wed Feb 25 10:59:56 2009 (r189026) @@ -111,7 +111,7 @@ in_addroute(void *v_arg, void *n_arg, st * ARP entry and delete it if so. */ rt2 = in_rtalloc1((struct sockaddr *)sin, 0, - RTF_CLONING, rt->rt_fibnum); + RTF_CLONING|RTF_RNH_LOCKED, rt->rt_fibnum); if (rt2) { if (rt2->rt_flags & RTF_LLINFO && rt2->rt_flags & RTF_HOST && Modified: stable/7/sys/netinet6/in6_ifattach.c ============================================================================== --- stable/7/sys/netinet6/in6_ifattach.c Wed Feb 25 10:52:09 2009 (r189025) +++ stable/7/sys/netinet6/in6_ifattach.c Wed Feb 25 10:59:56 2009 (r189026) @@ -823,7 +823,7 @@ in6_ifdetach(struct ifnet *ifp) /* XXX grab lock first to avoid LOR */ if (rt_tables[0][AF_INET6] != NULL) { RADIX_NODE_HEAD_LOCK(rt_tables[0][AF_INET6]); - rt = rtalloc1((struct sockaddr *)&sin6, 0, 0UL); + rt = rtalloc1((struct sockaddr *)&sin6, 0, RTF_RNH_LOCKED); if (rt) { if (rt->rt_ifp == ifp) rtexpunge(rt); Modified: stable/7/sys/netinet6/in6_rmx.c ============================================================================== --- stable/7/sys/netinet6/in6_rmx.c Wed Feb 25 10:52:09 2009 (r189025) +++ stable/7/sys/netinet6/in6_rmx.c Wed Feb 25 10:59:56 2009 (r189026) @@ -154,7 +154,7 @@ in6_addroute(void *v_arg, void *n_arg, s * Find out if it is because of an * ARP entry and delete it if so. */ - rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_CLONING); + rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_RNH_LOCKED|RTF_CLONING); if (rt2) { if (rt2->rt_flags & RTF_LLINFO && rt2->rt_flags & RTF_HOST && @@ -181,7 +181,7 @@ in6_addroute(void *v_arg, void *n_arg, s * net route entry, 3ffe:0501:: -> if0. * This case should not raise an error. */ - rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_CLONING); + rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_RNH_LOCKED|RTF_CLONING); if (rt2) { if ((rt2->rt_flags & (RTF_CLONING|RTF_HOST|RTF_GATEWAY)) == RTF_CLONING