From: Matt Macy
Date: Wed, 23 May 2018 21:02:15 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r334118 - in head/sys: compat/linprocfs compat/linux compat/linuxkpi/common/include/linux dev/mlx5/mlx5_ib dev/wtap net net/altq netinet netinet/netdump netinet6 netpfil/pf nfs ofed/dri...
Message-Id: <201805232102.w4NL2FaJ073854@repo.freebsd.org>

Author: mmacy
Date: Wed May 23 21:02:14 2018
New Revision: 334118
URL: https://svnweb.freebsd.org/changeset/base/334118

Log:
  UDP: further performance improvements on tx

  Cumulative throughput while running 64
    netperf -H $DUT -t UDP_STREAM -- -m 1
  on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

  Single stream throughput increases from 910kpps to 1.18Mpps

  Baseline:
  https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

  - Protect read access to global ifnet list with epoch
  https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

  - Protect short lived ifaddr references with epoch
  https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

  - Convert if_afdata read lock path to epoch
  https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

  A fix for the inpcbhash contention is pending sufficient time
  on a canary at LLNW.

  Reviewed by:	gallatin
  Sponsored by:	Limelight Networks
  Differential Revision:	https://reviews.freebsd.org/D15409

Modified:
  head/sys/compat/linprocfs/linprocfs.c
  head/sys/compat/linux/linux_ioctl.c
  head/sys/compat/linuxkpi/common/include/linux/inetdevice.h
  head/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c
  head/sys/dev/wtap/if_wtap.c
  head/sys/net/altq/altq_subr.c
  head/sys/net/bridgestp.c
  head/sys/net/if.c
  head/sys/net/if_llatbl.c
  head/sys/net/if_llatbl.h
  head/sys/net/if_var.h
  head/sys/net/route.c
  head/sys/net/rtsock.c
  head/sys/netinet/igmp.c
  head/sys/netinet/in.c
  head/sys/netinet/in_mcast.c
  head/sys/netinet/in_pcb.c
  head/sys/netinet/in_var.h
  head/sys/netinet/ip_divert.c
  head/sys/netinet/ip_icmp.c
  head/sys/netinet/ip_input.c
  head/sys/netinet/ip_mroute.c
  head/sys/netinet/ip_options.c
  head/sys/netinet/ip_output.c
  head/sys/netinet/netdump/netdump_client.c
  head/sys/netinet/raw_ip.c
  head/sys/netinet/sctp_bsd_addr.c
  head/sys/netinet6/icmp6.c
  head/sys/netinet6/in6.c
  head/sys/netinet6/in6_ifattach.c
  head/sys/netinet6/in6_pcb.c
  head/sys/netinet6/ip6_input.c
  head/sys/netinet6/nd6.c
  head/sys/netinet6/raw_ip6.c
  head/sys/netpfil/pf/pf_if.c
  head/sys/nfs/bootp_subr.c
  head/sys/nfs/nfs_diskless.c
  head/sys/ofed/drivers/infiniband/core/ib_roce_gid_mgmt.c

Modified: head/sys/compat/linprocfs/linprocfs.c
==============================================================================
--- head/sys/compat/linprocfs/linprocfs.c	Wed May 23 20:55:01 2018	(r334117)
+++ head/sys/compat/linprocfs/linprocfs.c	Wed May 23 21:02:14 2018	(r334118)
@@ -1138,7 +1138,7 @@ linux_ifname(struct ifnet *ifp, char *buffer, size_t b
 
 	/* Determine the (relative) unit number for ethernet interfaces */
 	ethno = 0;
-	TAILQ_FOREACH(ifscan, &V_ifnet, if_link) {
+	CK_STAILQ_FOREACH(ifscan, &V_ifnet, if_link) {
 		if (ifscan == ifp)
 			return (snprintf(buffer, buflen, "eth%d", ethno));
 		if (IFP_IS_ETH(ifscan))
@@ -1166,7 +1166,7 @@ linprocfs_donetdev(PFS_FILL_ARGS)
 
 	CURVNET_SET(TD_TO_VNET(curthread));
 	IFNET_RLOCK();
-	TAILQ_FOREACH(ifp,
&V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { linux_ifname(ifp, ifname, sizeof ifname); sbuf_printf(sb, "%6.6s: ", ifname); sbuf_printf(sb, "%7ju %7ju %4ju %4ju %4lu %5lu %10lu %9ju ", Modified: head/sys/compat/linux/linux_ioctl.c ============================================================================== --- head/sys/compat/linux/linux_ioctl.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/compat/linux/linux_ioctl.c Wed May 23 21:02:14 2018 (r334118) @@ -2150,7 +2150,7 @@ ifname_linux_to_bsd(struct thread *td, const char *lxn is_eth = (len == 3 && !strncmp(lxname, "eth", len)) ? 1 : 0; CURVNET_SET(TD_TO_VNET(td)); IFNET_RLOCK(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { /* * Allow Linux programs to use FreeBSD names. Don't presume * we never have an interface named "eth", so don't make @@ -2188,7 +2188,7 @@ linux_ioctl_ifname(struct thread *td, struct l_ifreq * index = 1; /* ifr.ifr_ifindex starts from 1 */ ethno = 0; error = ENODEV; - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (ifr.ifr_ifindex == index) { if (IFP_IS_ETH(ifp)) snprintf(ifr.ifr_name, LINUX_IFNAMSIZ, @@ -2240,7 +2240,7 @@ linux_ifconf(struct thread *td, struct ifconf *uifc) if ((l_uintptr_t)ifc.ifc_buf == PTROUT(NULL)) { ifc.ifc_len = 0; IFNET_RLOCK(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { struct sockaddr *sa = ifa->ifa_addr; if (sa->sa_family == AF_INET) @@ -2271,7 +2271,7 @@ again: /* Return all AF_INET addresses of all interfaces */ IFNET_RLOCK(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { int addrs = 0; bzero(&ifr, sizeof(ifr)); Modified: head/sys/compat/linuxkpi/common/include/linux/inetdevice.h ============================================================================== --- 
head/sys/compat/linuxkpi/common/include/linux/inetdevice.h Wed May 23 20:55:01 2018 (r334117) +++ head/sys/compat/linuxkpi/common/include/linux/inetdevice.h Wed May 23 21:02:14 2018 (r334118) @@ -44,16 +44,17 @@ ip_dev_find(struct vnet *vnet, uint32_t addr) sin.sin_addr.s_addr = addr; sin.sin_len = sizeof(sin); sin.sin_family = AF_INET; + NET_EPOCH_ENTER(); CURVNET_SET_QUIET(vnet); ifa = ifa_ifwithaddr((struct sockaddr *)&sin); CURVNET_RESTORE(); if (ifa) { ifp = ifa->ifa_ifp; if_ref(ifp); - ifa_free(ifa); } else { ifp = NULL; } + NET_EPOCH_EXIT(); return (ifp); } @@ -69,6 +70,7 @@ ip6_dev_find(struct vnet *vnet, struct in6_addr addr) sin6.sin6_addr = addr; sin6.sin6_len = sizeof(sin6); sin6.sin6_family = AF_INET6; + NET_EPOCH_ENTER(); CURVNET_SET_QUIET(vnet); if (IN6_IS_SCOPE_LINKLOCAL(&addr) || IN6_IS_ADDR_MC_INTFACELOCAL(&addr)) { @@ -85,8 +87,8 @@ ip6_dev_find(struct vnet *vnet, struct in6_addr addr) if (ifa != NULL) { ifp = ifa->ifa_ifp; if_ref(ifp); - ifa_free(ifa); } + NET_EPOCH_EXIT(); CURVNET_RESTORE(); return (ifp); } Modified: head/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c ============================================================================== --- head/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c Wed May 23 21:02:14 2018 (r334118) @@ -2771,7 +2771,7 @@ static int mlx5_enable_roce(struct mlx5_ib_dev *dev) VNET_FOREACH(vnet_iter) { IFNET_RLOCK(); CURVNET_SET_QUIET(vnet_iter); - TAILQ_FOREACH(idev, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(idev, &V_ifnet, if_link) { /* check if network interface belongs to mlx5en */ if (!mlx5_netdev_match(idev, dev->mdev, "mce")) continue; Modified: head/sys/dev/wtap/if_wtap.c ============================================================================== --- head/sys/dev/wtap/if_wtap.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/dev/wtap/if_wtap.c Wed May 23 21:02:14 2018 (r334118) @@ -108,7 +108,7 @@ wtap_node_write(struct cdev *dev, struct uio 
*uio, int CURVNET_SET(TD_TO_VNET(curthread)); IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { printf("ifp->if_xname = %s\n", ifp->if_xname); if(strcmp(devtoname(dev), ifp->if_xname) == 0){ printf("found match, correspoding wtap = %s\n", Modified: head/sys/net/altq/altq_subr.c ============================================================================== --- head/sys/net/altq/altq_subr.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/net/altq/altq_subr.c Wed May 23 21:02:14 2018 (r334118) @@ -434,8 +434,8 @@ tbr_timeout(arg) VNET_LIST_RLOCK_NOSLEEP(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); - for (ifp = TAILQ_FIRST(&V_ifnet); ifp; - ifp = TAILQ_NEXT(ifp, if_link)) { + for (ifp = CK_STAILQ_FIRST(&V_ifnet); ifp; + ifp = CK_STAILQ_NEXT(ifp, if_link)) { /* read from if_snd unlocked */ if (!TBR_IS_ENABLED(&ifp->if_snd)) continue; Modified: head/sys/net/bridgestp.c ============================================================================== --- head/sys/net/bridgestp.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/net/bridgestp.c Wed May 23 21:02:14 2018 (r334118) @@ -2043,7 +2043,7 @@ bstp_reinit(struct bstp_state *bs) * bridges in the same STP domain. 
*/ IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (ifp->if_type != IFT_ETHER) continue; /* Not Ethernet */ Modified: head/sys/net/if.c ============================================================================== --- head/sys/net/if.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/net/if.c Wed May 23 21:02:14 2018 (r334118) @@ -256,7 +256,7 @@ struct mbuf *(*tbr_dequeue_ptr)(struct ifaltq *, int) static void if_attachdomain(void *); static void if_attachdomain1(struct ifnet *); static int ifconf(u_long, caddr_t); -static void if_grow(void); +static void *if_grow(void); static void if_input_default(struct ifnet *, struct mbuf *); static int if_requestencap_default(struct ifnet *, struct if_encap_req *); static void if_route(struct ifnet *, int flag, int fam); @@ -345,9 +345,7 @@ ifnet_byindex(u_short idx) { struct ifnet *ifp; - IFNET_RLOCK_NOSLEEP(); ifp = ifnet_byindex_locked(idx); - IFNET_RUNLOCK_NOSLEEP(); return (ifp); } @@ -372,12 +370,11 @@ ifnet_byindex_ref(u_short idx) * failure. */ static u_short -ifindex_alloc(void) +ifindex_alloc(void **old) { u_short idx; IFNET_WLOCK_ASSERT(); -retry: /* * Try to find an empty slot below V_if_index. If we fail, take the * next slot. @@ -389,8 +386,8 @@ retry: /* Catch if_index overflow. 
*/ if (idx >= V_if_indexlim) { - if_grow(); - goto retry; + *old = if_grow(); + return (USHRT_MAX); } if (idx > V_if_index) V_if_index = idx; @@ -419,23 +416,12 @@ ifindex_free(u_short idx) } static void -ifnet_setbyindex_locked(u_short idx, struct ifnet *ifp) +ifnet_setbyindex(u_short idx, struct ifnet *ifp) { - IFNET_WLOCK_ASSERT(); - V_ifindex_table[idx] = ifp; } -static void -ifnet_setbyindex(u_short idx, struct ifnet *ifp) -{ - - IFNET_WLOCK(); - ifnet_setbyindex_locked(idx, ifp); - IFNET_WUNLOCK(); -} - struct ifaddr * ifaddr_byindex(u_short idx) { @@ -460,12 +446,15 @@ ifaddr_byindex(u_short idx) static void vnet_if_init(const void *unused __unused) { + void *old; - TAILQ_INIT(&V_ifnet); - TAILQ_INIT(&V_ifg_head); + CK_STAILQ_INIT(&V_ifnet); + CK_STAILQ_INIT(&V_ifg_head); IFNET_WLOCK(); - if_grow(); /* create initial table */ + old = if_grow(); /* create initial table */ IFNET_WUNLOCK(); + epoch_wait_preempt(net_epoch_preempt); + free(old, M_IFNET); vnet_if_clone_init(); } VNET_SYSINIT(vnet_if_init, SI_SUB_INIT_IF, SI_ORDER_SECOND, vnet_if_init, @@ -476,9 +465,9 @@ static void vnet_if_uninit(const void *unused __unused) { - VNET_ASSERT(TAILQ_EMPTY(&V_ifnet), ("%s:%d tailq &V_ifnet=%p " + VNET_ASSERT(CK_STAILQ_EMPTY(&V_ifnet), ("%s:%d tailq &V_ifnet=%p " "not empty", __func__, __LINE__, &V_ifnet)); - VNET_ASSERT(TAILQ_EMPTY(&V_ifg_head), ("%s:%d tailq &V_ifg_head=%p " + VNET_ASSERT(CK_STAILQ_EMPTY(&V_ifg_head), ("%s:%d tailq &V_ifg_head=%p " "not empty", __func__, __LINE__, &V_ifg_head)); free((caddr_t)V_ifindex_table, M_IFNET); @@ -492,7 +481,7 @@ vnet_if_return(const void *unused __unused) struct ifnet *ifp, *nifp; /* Return all inherited interfaces to their parent vnets. 
*/ - TAILQ_FOREACH_SAFE(ifp, &V_ifnet, if_link, nifp) { + CK_STAILQ_FOREACH_SAFE(ifp, &V_ifnet, if_link, nifp) { if (ifp->if_home_vnet != ifp->if_vnet) if_vmove(ifp, ifp->if_home_vnet); } @@ -501,13 +490,16 @@ VNET_SYSUNINIT(vnet_if_return, SI_SUB_VNET_DONE, SI_OR vnet_if_return, NULL); #endif -static void + +static void * if_grow(void) { int oldlim; u_int n; struct ifnet **e; + void *old; + old = NULL; IFNET_WLOCK_ASSERT(); oldlim = V_if_indexlim; IFNET_WUNLOCK(); @@ -516,14 +508,15 @@ if_grow(void) IFNET_WLOCK(); if (V_if_indexlim != oldlim) { free(e, M_IFNET); - return; + return (NULL); } if (V_ifindex_table != NULL) { memcpy((caddr_t)e, (caddr_t)V_ifindex_table, n/2); - free((caddr_t)V_ifindex_table, M_IFNET); + old = V_ifindex_table; } V_if_indexlim <<= 1; V_ifindex_table = e; + return (old); } /* @@ -536,11 +529,19 @@ if_alloc(u_char type) { struct ifnet *ifp; u_short idx; + void *old; ifp = malloc(sizeof(struct ifnet), M_IFNET, M_WAITOK|M_ZERO); + restart: IFNET_WLOCK(); - idx = ifindex_alloc(); - ifnet_setbyindex_locked(idx, IFNET_HOLD); + idx = ifindex_alloc(&old); + if (__predict_false(idx == USHRT_MAX)) { + IFNET_WUNLOCK(); + epoch_wait_preempt(net_epoch_preempt); + free(old, M_IFNET); + goto restart; + } + ifnet_setbyindex(idx, IFNET_HOLD); IFNET_WUNLOCK(); ifp->if_index = idx; ifp->if_type = type; @@ -563,7 +564,7 @@ if_alloc(u_char type) IF_AFDATA_LOCK_INIT(ifp); CK_STAILQ_INIT(&ifp->if_addrhead); CK_STAILQ_INIT(&ifp->if_multiaddrs); - TAILQ_INIT(&ifp->if_groups); + CK_STAILQ_INIT(&ifp->if_groups); #ifdef MAC mac_ifnet_init(ifp); #endif @@ -609,6 +610,15 @@ if_free_internal(struct ifnet *ifp) free(ifp, M_IFNET); } +static void +if_destroy(epoch_context_t ctx) +{ + struct ifnet *ifp; + + ifp = __containerof(ctx, struct ifnet, if_epoch_ctx); + if_free_internal(ifp); +} + /* * Deregister an interface and free the associated storage. 
*/ @@ -627,7 +637,7 @@ if_free(struct ifnet *ifp) IFNET_WUNLOCK(); if (refcount_release(&ifp->if_refcount)) - if_free_internal(ifp); + epoch_call(net_epoch_preempt, &ifp->if_epoch_ctx, if_destroy); CURVNET_RESTORE(); } @@ -650,7 +660,7 @@ if_rele(struct ifnet *ifp) if (!refcount_release(&ifp->if_refcount)) return; - if_free_internal(ifp); + epoch_call(net_epoch_preempt, &ifp->if_epoch_ctx, if_destroy); } void @@ -883,7 +893,7 @@ if_attach_internal(struct ifnet *ifp, int vmove, struc #endif IFNET_WLOCK(); - TAILQ_INSERT_TAIL(&V_ifnet, ifp, if_link); + CK_STAILQ_INSERT_TAIL(&V_ifnet, ifp, if_link); #ifdef VIMAGE curvnet->vnet_ifcnt++; #endif @@ -915,7 +925,7 @@ if_attachdomain(void *dummy) { struct ifnet *ifp; - TAILQ_FOREACH(ifp, &V_ifnet, if_link) + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) if_attachdomain1(ifp); } SYSINIT(domainifattach, SI_SUB_PROTO_IFATTACHDOMAIN, SI_ORDER_SECOND, @@ -1050,9 +1060,9 @@ if_detach_internal(struct ifnet *ifp, int vmove, struc ifp->if_vnet->vnet_state < SI_SUB_VNET_DONE) ? 1 : 0; #endif IFNET_WLOCK(); - TAILQ_FOREACH(iter, &V_ifnet, if_link) + CK_STAILQ_FOREACH(iter, &V_ifnet, if_link) if (iter == ifp) { - TAILQ_REMOVE(&V_ifnet, ifp, if_link); + CK_STAILQ_REMOVE(&V_ifnet, ifp, ifnet, if_link); found = 1; break; } @@ -1080,7 +1090,7 @@ if_detach_internal(struct ifnet *ifp, int vmove, struc #ifdef VIMAGE curvnet->vnet_ifcnt--; #endif - + epoch_wait_preempt(net_epoch_preempt); /* * In any case (destroy or vmove) detach us from the groups * and remove/wait for pending events on the taskq. @@ -1220,6 +1230,7 @@ if_vmove(struct ifnet *ifp, struct vnet *new_vnet) { struct if_clone *ifc; u_int bif_dlt, bif_hdrlen; + void *old; int rc; /* @@ -1260,10 +1271,16 @@ if_vmove(struct ifnet *ifp, struct vnet *new_vnet) * Switch to the context of the target vnet. 
*/ CURVNET_SET_QUIET(new_vnet); - + restart: IFNET_WLOCK(); - ifp->if_index = ifindex_alloc(); - ifnet_setbyindex_locked(ifp->if_index, ifp); + ifp->if_index = ifindex_alloc(&old); + if (__predict_false(ifp->if_index == USHRT_MAX)) { + IFNET_WUNLOCK(); + epoch_wait_preempt(net_epoch_preempt); + free(old, M_IFNET); + goto restart; + } + ifnet_setbyindex(ifp->if_index, ifp); IFNET_WUNLOCK(); if_attach_internal(ifp, 1, ifc); @@ -1400,7 +1417,7 @@ if_addgroup(struct ifnet *ifp, const char *groupname) return (EINVAL); IFNET_WLOCK(); - TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) + CK_STAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) if (!strcmp(ifgl->ifgl_group->ifg_group, groupname)) { IFNET_WUNLOCK(); return (EEXIST); @@ -1419,7 +1436,7 @@ if_addgroup(struct ifnet *ifp, const char *groupname) return (ENOMEM); } - TAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) + CK_STAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) if (!strcmp(ifg->ifg_group, groupname)) break; @@ -1433,8 +1450,8 @@ if_addgroup(struct ifnet *ifp, const char *groupname) } strlcpy(ifg->ifg_group, groupname, sizeof(ifg->ifg_group)); ifg->ifg_refcnt = 0; - TAILQ_INIT(&ifg->ifg_members); - TAILQ_INSERT_TAIL(&V_ifg_head, ifg, ifg_next); + CK_STAILQ_INIT(&ifg->ifg_members); + CK_STAILQ_INSERT_TAIL(&V_ifg_head, ifg, ifg_next); new = 1; } @@ -1443,8 +1460,8 @@ if_addgroup(struct ifnet *ifp, const char *groupname) ifgm->ifgm_ifp = ifp; IF_ADDR_WLOCK(ifp); - TAILQ_INSERT_TAIL(&ifg->ifg_members, ifgm, ifgm_next); - TAILQ_INSERT_TAIL(&ifp->if_groups, ifgl, ifgl_next); + CK_STAILQ_INSERT_TAIL(&ifg->ifg_members, ifgm, ifgm_next); + CK_STAILQ_INSERT_TAIL(&ifp->if_groups, ifgl, ifgl_next); IF_ADDR_WUNLOCK(ifp); IFNET_WUNLOCK(); @@ -1464,9 +1481,10 @@ if_delgroup(struct ifnet *ifp, const char *groupname) { struct ifg_list *ifgl; struct ifg_member *ifgm; + int freeifgl; IFNET_WLOCK(); - TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) + CK_STAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) if (!strcmp(ifgl->ifgl_group->ifg_group, 
groupname)) break; if (ifgl == NULL) { @@ -1474,27 +1492,30 @@ if_delgroup(struct ifnet *ifp, const char *groupname) return (ENOENT); } + freeifgl = 0; IF_ADDR_WLOCK(ifp); - TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next); + CK_STAILQ_REMOVE(&ifp->if_groups, ifgl, ifg_list, ifgl_next); IF_ADDR_WUNLOCK(ifp); - TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) + CK_STAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) if (ifgm->ifgm_ifp == ifp) break; - if (ifgm != NULL) { - TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, ifgm_next); - free(ifgm, M_TEMP); - } + if (ifgm != NULL) + CK_STAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, ifg_member, ifgm_next); if (--ifgl->ifgl_group->ifg_refcnt == 0) { - TAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_next); - IFNET_WUNLOCK(); + CK_STAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_group, ifg_next); + freeifgl = 1; + } + IFNET_WUNLOCK(); + + epoch_wait_preempt(net_epoch_preempt); + if (freeifgl) { EVENTHANDLER_INVOKE(group_detach_event, ifgl->ifgl_group); free(ifgl->ifgl_group, M_TEMP); - } else - IFNET_WUNLOCK(); - + } + free(ifgm, M_TEMP); free(ifgl, M_TEMP); EVENTHANDLER_INVOKE(group_change_event, groupname); @@ -1511,38 +1532,38 @@ if_delgroups(struct ifnet *ifp) struct ifg_list *ifgl; struct ifg_member *ifgm; char groupname[IFNAMSIZ]; + int ifglfree; IFNET_WLOCK(); - while (!TAILQ_EMPTY(&ifp->if_groups)) { - ifgl = TAILQ_FIRST(&ifp->if_groups); + while (!CK_STAILQ_EMPTY(&ifp->if_groups)) { + ifgl = CK_STAILQ_FIRST(&ifp->if_groups); strlcpy(groupname, ifgl->ifgl_group->ifg_group, IFNAMSIZ); IF_ADDR_WLOCK(ifp); - TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next); + CK_STAILQ_REMOVE(&ifp->if_groups, ifgl, ifg_list, ifgl_next); IF_ADDR_WUNLOCK(ifp); - TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) + CK_STAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next) if (ifgm->ifgm_ifp == ifp) break; - if (ifgm != NULL) { - TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, + if (ifgm 
!= NULL) + CK_STAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm, ifg_member, ifgm_next); - free(ifgm, M_TEMP); + ifglfree = 0; + if (--ifgl->ifgl_group->ifg_refcnt == 0) { + CK_STAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_group, ifg_next); + ifglfree = 1; } - if (--ifgl->ifgl_group->ifg_refcnt == 0) { - TAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_next); - IFNET_WUNLOCK(); + epoch_wait_preempt(net_epoch_preempt); + free(ifgm, M_TEMP); + if (ifglfree) { EVENTHANDLER_INVOKE(group_detach_event, - ifgl->ifgl_group); + ifgl->ifgl_group); free(ifgl->ifgl_group, M_TEMP); - } else - IFNET_WUNLOCK(); - - free(ifgl, M_TEMP); - + } EVENTHANDLER_INVOKE(group_change_event, groupname); IFNET_WLOCK(); @@ -1589,7 +1610,7 @@ if_getgroup(struct ifgroupreq *ifgr, struct ifnet *ifp if (ifgr->ifgr_len == 0) { IF_ADDR_RLOCK(ifp); - TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) + CK_STAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) ifgr->ifgr_len += sizeof(struct ifg_req); IF_ADDR_RUNLOCK(ifp); return (0); @@ -1599,7 +1620,7 @@ if_getgroup(struct ifgroupreq *ifgr, struct ifnet *ifp ifgp = ifgr_groups_get(ifgr); /* XXX: wire */ IF_ADDR_RLOCK(ifp); - TAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) { + CK_STAILQ_FOREACH(ifgl, &ifp->if_groups, ifgl_next) { if (len < sizeof(ifgrq)) { IF_ADDR_RUNLOCK(ifp); return (EINVAL); @@ -1631,7 +1652,7 @@ if_getgroupmembers(struct ifgroupreq *ifgr) int len, error; IFNET_RLOCK(); - TAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) + CK_STAILQ_FOREACH(ifg, &V_ifg_head, ifg_next) if (!strcmp(ifg->ifg_group, ifgr->ifgr_name)) break; if (ifg == NULL) { @@ -1640,7 +1661,7 @@ if_getgroupmembers(struct ifgroupreq *ifgr) } if (ifgr->ifgr_len == 0) { - TAILQ_FOREACH(ifgm, &ifg->ifg_members, ifgm_next) + CK_STAILQ_FOREACH(ifgm, &ifg->ifg_members, ifgm_next) ifgr->ifgr_len += sizeof(ifgrq); IFNET_RUNLOCK(); return (0); @@ -1648,7 +1669,7 @@ if_getgroupmembers(struct ifgroupreq *ifgr) len = ifgr->ifgr_len; ifgp = ifgr_groups_get(ifgr); - TAILQ_FOREACH(ifgm, 
&ifg->ifg_members, ifgm_next) { + CK_STAILQ_FOREACH(ifgm, &ifg->ifg_members, ifgm_next) { if (len < sizeof(ifgrq)) { IFNET_RUNLOCK(); return (EINVAL); @@ -1897,22 +1918,18 @@ ifa_switch_loopback_route(struct ifaddr *ifa, struct s * Locate an interface based on a complete address. */ /*ARGSUSED*/ -static struct ifaddr * -ifa_ifwithaddr_internal(const struct sockaddr *addr, int getref) +struct ifaddr * +ifa_ifwithaddr(const struct sockaddr *addr) { struct ifnet *ifp; struct ifaddr *ifa; - IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { - IF_ADDR_RLOCK(ifp); + MPASS(in_epoch()); + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != addr->sa_family) continue; if (sa_equal(addr, ifa->ifa_addr)) { - if (getref) - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); goto done; } /* IP6 doesn't have broadcast */ @@ -1920,32 +1937,24 @@ ifa_ifwithaddr_internal(const struct sockaddr *addr, i ifa->ifa_broadaddr && ifa->ifa_broadaddr->sa_len != 0 && sa_equal(ifa->ifa_broadaddr, addr)) { - if (getref) - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); goto done; } } - IF_ADDR_RUNLOCK(ifp); } ifa = NULL; done: - IFNET_RUNLOCK_NOSLEEP(); return (ifa); } -struct ifaddr * -ifa_ifwithaddr(const struct sockaddr *addr) -{ - - return (ifa_ifwithaddr_internal(addr, 1)); -} - int ifa_ifwithaddr_check(const struct sockaddr *addr) { + int rc; - return (ifa_ifwithaddr_internal(addr, 0) != NULL); + NET_EPOCH_ENTER(); + rc = (ifa_ifwithaddr(addr) != NULL); + NET_EPOCH_EXIT(); + return (rc); } /* @@ -1958,11 +1967,10 @@ ifa_ifwithbroadaddr(const struct sockaddr *addr, int f struct ifnet *ifp; struct ifaddr *ifa; - IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + MPASS(in_epoch()); + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if ((fibnum != RT_ALL_FIBS) && (ifp->if_fib != fibnum)) continue; - IF_ADDR_RLOCK(ifp); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != 
addr->sa_family) continue; @@ -1970,16 +1978,12 @@ ifa_ifwithbroadaddr(const struct sockaddr *addr, int f ifa->ifa_broadaddr && ifa->ifa_broadaddr->sa_len != 0 && sa_equal(ifa->ifa_broadaddr, addr)) { - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); goto done; } } - IF_ADDR_RUNLOCK(ifp); } ifa = NULL; done: - IFNET_RUNLOCK_NOSLEEP(); return (ifa); } @@ -1993,28 +1997,23 @@ ifa_ifwithdstaddr(const struct sockaddr *addr, int fib struct ifnet *ifp; struct ifaddr *ifa; - IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + MPASS(in_epoch()); + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if ((ifp->if_flags & IFF_POINTOPOINT) == 0) continue; if ((fibnum != RT_ALL_FIBS) && (ifp->if_fib != fibnum)) continue; - IF_ADDR_RLOCK(ifp); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != addr->sa_family) continue; if (ifa->ifa_dstaddr != NULL && sa_equal(addr, ifa->ifa_dstaddr)) { - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); goto done; } } - IF_ADDR_RUNLOCK(ifp); } ifa = NULL; done: - IFNET_RUNLOCK_NOSLEEP(); return (ifa); } @@ -2031,6 +2030,7 @@ ifa_ifwithnet(const struct sockaddr *addr, int ignore_ u_int af = addr->sa_family; const char *addr_data = addr->sa_data, *cplim; + MPASS(in_epoch()); /* * AF_LINK addresses can be looked up directly by their index number, * so do that if we can. @@ -2047,11 +2047,9 @@ ifa_ifwithnet(const struct sockaddr *addr, int ignore_ * on ifa_maybe once we find one, as we release the IF_ADDR_RLOCK() that * kept it stable when we move onto the next interface. 
*/ - IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if ((fibnum != RT_ALL_FIBS) && (ifp->if_fib != fibnum)) continue; - IF_ADDR_RLOCK(ifp); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { const char *cp, *cp2, *cp3; @@ -2069,7 +2067,6 @@ next: continue; */ if (ifa->ifa_dstaddr != NULL && sa_equal(addr, ifa->ifa_dstaddr)) { - ifa_ref(ifa); IF_ADDR_RUNLOCK(ifp); goto done; } @@ -2103,21 +2100,14 @@ next: continue; ifa_preferred(ifa_maybe, ifa) || rn_refines((caddr_t)ifa->ifa_netmask, (caddr_t)ifa_maybe->ifa_netmask)) { - if (ifa_maybe != NULL) - ifa_free(ifa_maybe); ifa_maybe = ifa; - ifa_ref(ifa_maybe); } } } - IF_ADDR_RUNLOCK(ifp); } ifa = ifa_maybe; ifa_maybe = NULL; done: - IFNET_RUNLOCK_NOSLEEP(); - if (ifa_maybe != NULL) - ifa_free(ifa_maybe); return (ifa); } @@ -2136,7 +2126,7 @@ ifaof_ifpforaddr(const struct sockaddr *addr, struct i if (af >= AF_MAX) return (NULL); - IF_ADDR_RLOCK(ifp); + MPASS(in_epoch()); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != af) continue; @@ -2166,9 +2156,6 @@ ifaof_ifpforaddr(const struct sockaddr *addr, struct i } ifa = ifa_maybe; done: - if (ifa != NULL) - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); return (ifa); } @@ -2204,14 +2191,15 @@ link_rtrequest(int cmd, struct rtentry *rt, struct rt_ if (cmd != RTM_ADD || ((ifa = rt->rt_ifa) == NULL) || ((ifp = ifa->ifa_ifp) == NULL) || ((dst = rt_key(rt)) == NULL)) return; + NET_EPOCH_ENTER(); ifa = ifaof_ifpforaddr(dst, ifp); if (ifa) { oifa = rt->rt_ifa; rt->rt_ifa = ifa; - ifa_free(oifa); if (ifa->ifa_rtrequest && ifa->ifa_rtrequest != link_rtrequest) ifa->ifa_rtrequest(cmd, rt, info); } + NET_EPOCH_EXIT(); } struct sockaddr_dl * @@ -2414,7 +2402,7 @@ ifunit_ref(const char *name) struct ifnet *ifp; IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (strncmp(name, ifp->if_xname, IFNAMSIZ) == 0 && !(ifp->if_flags & 
IFF_DYING)) break; @@ -2431,7 +2419,7 @@ ifunit(const char *name) struct ifnet *ifp; IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { if (strncmp(name, ifp->if_xname, IFNAMSIZ) == 0) break; } @@ -3256,7 +3244,7 @@ again: valid_len = 0; IFNET_RLOCK(); - TAILQ_FOREACH(ifp, &V_ifnet, if_link) { + CK_STAILQ_FOREACH(ifp, &V_ifnet, if_link) { int addrs; /* @@ -3609,7 +3597,7 @@ if_delmulti(struct ifnet *ifp, struct sockaddr *sa) struct ifnet *oifp; IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(oifp, &V_ifnet, if_link) + CK_STAILQ_FOREACH(oifp, &V_ifnet, if_link) if (ifp == oifp) break; if (ifp != oifp) @@ -3683,7 +3671,7 @@ if_delmulti_ifma_flags(struct ifmultiaddr *ifma, int f struct ifnet *oifp; IFNET_RLOCK_NOSLEEP(); - TAILQ_FOREACH(oifp, &V_ifnet, if_link) + CK_STAILQ_FOREACH(oifp, &V_ifnet, if_link) if (ifp == oifp) break; if (ifp != oifp) { @@ -3810,23 +3798,24 @@ if_setlladdr(struct ifnet *ifp, const u_char *lladdr, struct sockaddr_dl *sdl; struct ifaddr *ifa; struct ifreq ifr; + int rc; - IF_ADDR_RLOCK(ifp); + rc = 0; + NET_EPOCH_ENTER(); ifa = ifp->if_addr; if (ifa == NULL) { - IF_ADDR_RUNLOCK(ifp); - return (EINVAL); + rc = EINVAL; + goto out; } - ifa_ref(ifa); - IF_ADDR_RUNLOCK(ifp); + sdl = (struct sockaddr_dl *)ifa->ifa_addr; if (sdl == NULL) { - ifa_free(ifa); - return (EINVAL); + rc = EINVAL; + goto out; } if (len != sdl->sdl_alen) { /* don't allow length to change */ - ifa_free(ifa); - return (EINVAL); + rc = EINVAL; + goto out; } switch (ifp->if_type) { case IFT_ETHER: @@ -3835,11 +3824,10 @@ if_setlladdr(struct ifnet *ifp, const u_char *lladdr, case IFT_BRIDGE: case IFT_IEEE8023ADLAG: bcopy(lladdr, LLADDR(sdl), len); - ifa_free(ifa); break; default: - ifa_free(ifa); - return (ENODEV); + rc = ENODEV; + goto out; } /* @@ -3860,7 +3848,9 @@ if_setlladdr(struct ifnet *ifp, const u_char *lladdr, } } EVENTHANDLER_INVOKE(iflladdr_event, ifp); - return (0); + out: + NET_EPOCH_EXIT(); + return (rc); } /* 
Modified: head/sys/net/if_llatbl.c ============================================================================== --- head/sys/net/if_llatbl.c Wed May 23 20:55:01 2018 (r334117) +++ head/sys/net/if_llatbl.c Wed May 23 21:02:14 2018 (r334118) @@ -146,7 +146,7 @@ htable_foreach_lle(struct lltable *llt, llt_foreach_cb error = 0; for (i = 0; i < llt->llt_hsize; i++) { - LIST_FOREACH_SAFE(lle, &llt->lle_head[i], lle_next, next) { + CK_LIST_FOREACH_SAFE(lle, &llt->lle_head[i], lle_next, next) { error = f(llt, lle, farg); if (error != 0) break; @@ -173,7 +173,7 @@ htable_link_entry(struct lltable *llt, struct llentry lle->lle_tbl = llt; lle->lle_head = lleh; lle->la_flags |= LLE_LINKED; - LIST_INSERT_HEAD(lleh, lle, lle_next); + CK_LIST_INSERT_HEAD(lleh, lle, lle_next); } static void @@ -182,7 +182,7 @@ htable_unlink_entry(struct llentry *lle) if ((lle->la_flags & LLE_LINKED) != 0) { IF_AFDATA_WLOCK_ASSERT(lle->lle_tbl->llt_ifp); - LIST_REMOVE(lle, lle_next); + CK_LIST_REMOVE(lle, lle_next); lle->la_flags &= ~(LLE_VALID | LLE_LINKED); #if 0 lle->lle_tbl = NULL; @@ -224,7 +224,7 @@ htable_prefix_free(struct lltable *llt, const struct s pmd.addr = addr; pmd.mask = mask; pmd.flags = flags; - LIST_INIT(&pmd.dchain); + CK_LIST_INIT(&pmd.dchain); IF_AFDATA_WLOCK(llt->llt_ifp); /* Push matching lles to chain */ @@ -514,7 +514,7 @@ lltable_free(struct lltable *llt) lltable_unlink(llt); - LIST_INIT(&dchain); + CK_LIST_INIT(&dchain); IF_AFDATA_WLOCK(llt->llt_ifp); /* Push all lles to @dchain */ lltable_foreach_lle(llt, lltable_free_cb, &dchain); @@ -544,7 +544,7 @@ lltable_drain(int af) continue; for (i=0; i < llt->llt_hsize; i++) { - LIST_FOREACH(lle, &llt->lle_head[i], lle_next) { + CK_LIST_FOREACH(lle, &llt->lle_head[i], lle_next) { LLE_WLOCK(lle); if (lle->la_hold) { m_freem(lle->la_hold); @@ -620,7 +620,7 @@ lltable_allocate_htbl(uint32_t hsize) M_LLTABLE, M_WAITOK | M_ZERO); for (i = 0; i < llt->llt_hsize; i++) - LIST_INIT(&llt->lle_head[i]); + 
CK_LIST_INIT(&llt->lle_head[i]);
 
 	/* Set some default callbacks */
 	llt->llt_link_entry = htable_link_entry;
@@ -917,7 +917,7 @@ llatbl_llt_show(struct lltable *llt)

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
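
The if_grow() change in this commit follows a common deferred-reclamation pattern: the grow routine no longer frees the old index table itself but hands it back, and the caller frees it only after a grace period (epoch_wait_preempt() in the diff). A minimal userspace sketch of that pattern, as an illustration only and not the kernel code: struct index_table, table_grow(), table_grow_and_reclaim() and wait_for_readers() are hypothetical names, and wait_for_readers() is an empty stub standing in for a real grace-period wait.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the V_ifindex_table / V_if_indexlim pair. */
struct index_table {
	void	**slots;	/* array of entry pointers */
	size_t	  limit;	/* number of slots in the array */
};

/*
 * Hypothetical stub for the grace-period wait (epoch_wait_preempt() in
 * the diff). A real implementation blocks until every reader that might
 * still hold a pointer into the old array has left its read section.
 */
static void
wait_for_readers(void)
{
}

/*
 * Double the table (or create it with 8 slots). The old array is NOT
 * freed here; it is returned so the caller can free it once no reader
 * can still be using it. Returns NULL when there was no old array.
 */
static void *
table_grow(struct index_table *t)
{
	size_t newlimit = (t->limit == 0) ? 8 : t->limit * 2;
	void **e = calloc(newlimit, sizeof(*e));
	void *old = NULL;

	assert(e != NULL);
	if (t->slots != NULL) {
		/* Copy the existing entries; the new upper half stays NULL. */
		memcpy(e, t->slots, t->limit * sizeof(*e));
		old = t->slots;
	}
	t->slots = e;
	t->limit = newlimit;
	return (old);
}

/* Caller side: publish the new table, wait, then free the old one. */
static void
table_grow_and_reclaim(struct index_table *t)
{
	void *old = table_grow(t);

	wait_for_readers();
	free(old);		/* free(NULL) is a no-op */
}
```

In the diff, the same split shows up in if_alloc() and if_vmove(): both drop IFNET_WLOCK before calling epoch_wait_preempt() and free(old, M_IFNET), then retry the index allocation.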