From owner-svn-src-all@FreeBSD.ORG Thu Jun 12 12:43:49 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A89A72C6; Thu, 12 Jun 2014 12:43:49 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7BF27274C; Thu, 12 Jun 2014 12:43:49 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.8/8.14.8) with ESMTP id s5CChnaS026981; Thu, 12 Jun 2014 12:43:49 GMT (envelope-from mav@svn.freebsd.org) Received: (from mav@localhost) by svn.freebsd.org (8.14.8/8.14.8/Submit) id s5CChnfg026979; Thu, 12 Jun 2014 12:43:49 GMT (envelope-from mav@svn.freebsd.org) Message-Id: <201406121243.s5CChnfg026979@svn.freebsd.org> From: Alexander Motin Date: Thu, 12 Jun 2014 12:43:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r267392 - head/sys/kern X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Jun 2014 12:43:49 -0000 Author: mav Date: Thu Jun 12 12:43:48 2014 New Revision: 267392 URL: http://svnweb.freebsd.org/changeset/base/267392 Log: Implement simple direct-mapped cache for popular filesystem identifiers to avoid congestion on global mountlist_mtx mutex in vfs_busyfs(), while traversing through the list of mount points. This change significantly improves NFS server scalability, since it had to do this translation for every request, and the global lock becomes quite congested. This code is more optimized for relatively small number of mount points. On systems with hundreds of active mount points this simple cache may have many collisions. But the original traversal code in that case should also behave much worse, so we are not loosing much. Reviewed by: attilio MFC after: 2 weeks Sponsored by: iXsystems, Inc. Modified: head/sys/kern/vfs_subr.c Modified: head/sys/kern/vfs_subr.c ============================================================================== --- head/sys/kern/vfs_subr.c Thu Jun 12 11:57:07 2014 (r267391) +++ head/sys/kern/vfs_subr.c Thu Jun 12 12:43:48 2014 (r267392) @@ -500,23 +500,53 @@ vfs_getvfs(fsid_t *fsid) /* * Lookup a mount point by filesystem identifier, busying it before * returning. + * + * To avoid congestion on mountlist_mtx, implement simple direct-mapped + * cache for popular filesystem identifiers. The cache is lockess, using + * the fact that struct mount's are never freed. In worst case we may + * get pointer to unmounted or even different filesystem, so we have to + * check what we got, and go slow way if so. */ struct mount * vfs_busyfs(fsid_t *fsid) { +#define FSID_CACHE_SIZE 256 + typedef struct mount * volatile vmp_t; + static vmp_t cache[FSID_CACHE_SIZE]; struct mount *mp; int error; + uint32_t hash; CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid); + hash = fsid->val[0] ^ fsid->val[1]; + hash = (hash >> 16 ^ hash) & (FSID_CACHE_SIZE - 1); + mp = cache[hash]; + if (mp == NULL || + mp->mnt_stat.f_fsid.val[0] != fsid->val[0] || + mp->mnt_stat.f_fsid.val[1] != fsid->val[1]) + goto slow; + if (vfs_busy(mp, 0) != 0) { + cache[hash] = NULL; + goto slow; + } + if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] && + mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) + return (mp); + else + vfs_unbusy(mp); + +slow: mtx_lock(&mountlist_mtx); TAILQ_FOREACH(mp, &mountlist, mnt_list) { if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] && mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) { error = vfs_busy(mp, MBF_MNTLSTLOCK); if (error) { + cache[hash] = NULL; mtx_unlock(&mountlist_mtx); return (NULL); } + cache[hash] = mp; return (mp); } }