From owner-dev-commits-src-all@freebsd.org Wed Sep 29 20:44:04 2021 Return-Path: Delivered-To: dev-commits-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id EBC6067BE65; Wed, 29 Sep 2021 20:44:04 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4HKSyr5dZlz4rfx; Wed, 29 Sep 2021 20:44:04 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 926BA1EEE6; Wed, 29 Sep 2021 20:44:04 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 18TKi4WG065420; Wed, 29 Sep 2021 20:44:04 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 18TKi43M065419; Wed, 29 Sep 2021 20:44:04 GMT (envelope-from git) Date: Wed, 29 Sep 2021 20:44:04 GMT Message-Id: <202109292044.18TKi43M065419@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Marko Zec Subject: git: 94ad8d7c7a35 - stable/13 - [fib_algo][dxr] Split unused range chunk list in multiple buckets MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: zec X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 94ad8d7c7a35ff5eadada2d542df1b23ba69fed0 Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-all@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commit messages for all branches of the src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2021 20:44:05 -0000 The branch stable/13 has been updated by zec: URL: https://cgit.FreeBSD.org/src/commit/?id=94ad8d7c7a35ff5eadada2d542df1b23ba69fed0 commit 94ad8d7c7a35ff5eadada2d542df1b23ba69fed0 Author: Marko Zec AuthorDate: 2021-09-25 04:29:48 +0000 Commit: Marko Zec CommitDate: 2021-09-29 20:40:56 +0000 [fib_algo][dxr] Split unused range chunk list in multiple buckets Traversing a single list of unused range chunks in search for a block of optimal size was suboptimal. The experience with real-world BGP workloads has shown that on average unused range chunks are tiny, mostly in length from 1 to 4 or 5, when DXR is configured with K = 20 which is the current default (D16X4R). Therefore, introduce a limited amount of buckets to accomodate descriptors of empty blocks of fixed (small) size, so that those can be found in O(1) time. If no empty chunks of the requested size can be found in fixed-size buckets, the search continues in an unsorted list of empty chunks of variable lengths, which should only happen infrequently. This change should permit us to manage significantly more empty range chunks without sacrifying the speed of incremental range table updating. MFC after: 3 days --- sys/netinet/in_fib_dxr.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/sys/netinet/in_fib_dxr.c b/sys/netinet/in_fib_dxr.c index c4f42abdda27..6a9f414c3ab0 100644 --- a/sys/netinet/in_fib_dxr.c +++ b/sys/netinet/in_fib_dxr.c @@ -115,6 +115,8 @@ CTASSERT(DXR_TRIE_BITS >= 16 && DXR_TRIE_BITS <= 24); #define XTBL_SIZE_INCR (DIRECT_TBL_SIZE / 16) +#define UNUSED_BUCKETS 8 + /* Lookup structure elements */ struct direct_entry { @@ -181,7 +183,7 @@ struct dxr_aux { struct trie_desc *trietbl[D_TBL_SIZE]; LIST_HEAD(, chunk_desc) chunk_hashtbl[CHUNK_HASH_SIZE]; LIST_HEAD(, chunk_desc) all_chunks; - LIST_HEAD(, chunk_desc) unused_chunks; /* abuses hash link entry */ + LIST_HEAD(, chunk_desc) unused_chunks[UNUSED_BUCKETS]; LIST_HEAD(, trie_desc) trie_hashtbl[TRIE_HASH_SIZE]; LIST_HEAD(, trie_desc) all_trie; LIST_HEAD(, trie_desc) unused_trie; /* abuses hash link entry */ @@ -387,6 +389,7 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) uint32_t base = fdesc->base; uint32_t size = chunk_size(da, fdesc); uint32_t hash = chunk_hash(da, fdesc); + int i; /* Find an existing descriptor */ LIST_FOREACH(cdp, &da->chunk_hashtbl[hash & CHUNK_HASH_MASK], @@ -401,15 +404,18 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) return (0); } - /* No matching chunks found. Recycle an empty or allocate a new one */ - cdp = NULL; - LIST_FOREACH(empty_cdp, &da->unused_chunks, cd_hash_le) - if (empty_cdp->cd_max_size >= size && (cdp == NULL || - empty_cdp->cd_max_size < cdp->cd_max_size)) { - cdp = empty_cdp; - if (empty_cdp->cd_max_size == size) - break; - } + /* No matching chunks found. Find an empty one to recycle. */ + for (cdp = NULL, i = size; cdp == NULL && i < UNUSED_BUCKETS; i++) + cdp = LIST_FIRST(&da->unused_chunks[i]); + + if (cdp == NULL) + LIST_FOREACH(empty_cdp, &da->unused_chunks[0], cd_hash_le) + if (empty_cdp->cd_max_size >= size && (cdp == NULL || + empty_cdp->cd_max_size < cdp->cd_max_size)) { + cdp = empty_cdp; + if (empty_cdp->cd_max_size == size) + break; + } if (cdp != NULL) { /* Copy from heap into the recycled chunk */ @@ -423,11 +429,17 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) empty_cdp = uma_zalloc(chunk_zone, M_NOWAIT); if (empty_cdp == NULL) return (1); + LIST_INSERT_BEFORE(cdp, empty_cdp, cd_all_le); + empty_cdp->cd_base = cdp->cd_base + size; empty_cdp->cd_cur_size = 0; empty_cdp->cd_max_size = cdp->cd_max_size - size; - empty_cdp->cd_base = cdp->cd_base + size; - LIST_INSERT_BEFORE(cdp, empty_cdp, cd_all_le); - LIST_INSERT_AFTER(cdp, empty_cdp, cd_hash_le); + + i = empty_cdp->cd_max_size; + if (i >= UNUSED_BUCKETS) + i = 0; + LIST_INSERT_HEAD(&da->unused_chunks[i], empty_cdp, + cd_hash_le); + da->all_chunks_cnt++; da->unused_chunks_cnt++; cdp->cd_max_size = size; @@ -480,6 +492,7 @@ chunk_unref(struct dxr_aux *da, uint32_t chunk) uint32_t base = fdesc->base; uint32_t size = chunk_size(da, fdesc); uint32_t hash = chunk_hash(da, fdesc); + int i; /* Find the corresponding descriptor */ LIST_FOREACH(cdp, &da->chunk_hashtbl[hash & CHUNK_HASH_MASK], @@ -538,7 +551,10 @@ chunk_unref(struct dxr_aux *da, uint32_t chunk) return; } - LIST_INSERT_HEAD(&da->unused_chunks, cdp, cd_hash_le); + i = cdp->cd_max_size; + if (i >= UNUSED_BUCKETS) + i = 0; + LIST_INSERT_HEAD(&da->unused_chunks[i], cdp, cd_hash_le); } #ifdef DXR2 @@ -891,7 +907,8 @@ dxr_build(struct dxr *dxr) LIST_REMOVE(cdp, cd_all_le); uma_zfree(chunk_zone, cdp); } - LIST_INIT(&da->unused_chunks); + for (i = 0; i < UNUSED_BUCKETS; i++) + LIST_INIT(&da->unused_chunks[i]); da->all_chunks_cnt = da->unused_chunks_cnt = 0; da->rtbl_top = 0; da->updates_low = 0;