Date:      Fri, 13 Mar 2015 21:07:05 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Tiwei Bie <btw@mail.ustc.edu.cn>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Ryan Stone <rysto32@gmail.com>, wca@freebsd.org
Subject:   Re: [PATCH] Finish the task 'Convert mountlist_mtx to rwlock'
Message-ID:  <20150313200705.GA32157@dft-labs.eu>
In-Reply-To: <20150312132338.GA57932@freebsd>
References:  <1426079434-51568-1-git-send-email-btw@mail.ustc.edu.cn> <CAFMmRNyLJgLuk-VPSuyBCpO1bkdmyGf3s89X-An1vCCMwn=B=A@mail.gmail.com> <20150312132338.GA57932@freebsd>

On Thu, Mar 12, 2015 at 09:26:00PM +0800, Tiwei Bie wrote:
> On Thu, Mar 12, 2015 at 12:06:02AM -0400, Ryan Stone wrote:
> > On Wed, Mar 11, 2015 at 9:10 AM, Tiwei Bie <btw@mail.ustc.edu.cn> wrote:
> > > Hi, Mateusz!
> > >
> > > I have finished the task: Convert mountlist_mtx to rwlock [1].
> > 
> > My first comment is, are we sure that we actually want an rwlock here
> > instead of an rmlock?  An rmlock will offer much better performance in
> > workloads that mostly only take read locks, and rmlocks do not suffer
> > the priority inversion problems that rwlocks do.  From the description
> > on the wiki page, it sounds like an rmlock would be ideal here:
> > 
> > > Interested person can upgrade this task to non-junior by coming up with a
> > > solution exploiting rare need to modify the list. Example approaches include
> > > designing a locking primitive with cheap shared locking (think: per-cpu) at
> > > the expense of exclusive locking.
> > 
> 
> I think rmlock is okay. But one more argument needs to be added
> to vfs_busy(), that is 'struct rm_priotracker *tracker' which is
> used to track read owners of mountlist_lock in vfs_busy().

So, this is not a simple s/rw_lock/rm_rlock and the like after all, sorry. :)
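
To illustrate why the tracker has to travel with the lock: in the patch
below, vfs_busy() with MBF_MNTLSTLOCK drops the read lock on behalf of the
caller, so it needs the same tracker the caller used to acquire it. Here is
a minimal single-threaded userspace sketch of that hand-off. Every demo_*
name is a hypothetical stand-in invented for this example (not the rmlock(9)
API and not the real vfs_busy()), and the "lock" is only a counter:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Single-threaded stand-ins for illustration only; these are NOT the
 * kernel's rmlock(9) types or functions.
 */
struct demo_tracker { const void *owner; };	/* role of struct rm_priotracker */
struct demo_rmlock  { int readers; };		/* role of struct rmlock */

#define	DEMO_MNTLSTLOCK	0x01			/* role of MBF_MNTLSTLOCK */

static struct demo_rmlock demo_mountlist_lock;

static void
demo_rlock(struct demo_rmlock *rm, struct demo_tracker *t)
{
	assert(t->owner == NULL);	/* one acquisition per tracker */
	t->owner = rm;
	rm->readers++;
}

static void
demo_runlock(struct demo_rmlock *rm, struct demo_tracker *t)
{
	assert(t->owner == rm);		/* must be the acquiring tracker */
	t->owner = NULL;
	rm->readers--;
}

/* Hypothetical analogue of vfs_busy(mp, flags, tracker). */
static int
demo_busy(int *lockref, int flags, struct demo_tracker *tracker)
{
	/* The callee releases the caller's read lock, hence the argument. */
	if (flags & DEMO_MNTLSTLOCK)
		demo_runlock(&demo_mountlist_lock, tracker);
	(*lockref)++;
	return (0);
}

/*
 * Caller pattern: take the read lock, then let demo_busy() drop it.
 * Returns the number of readers still holding the lock afterwards.
 */
static int
demo_caller(int *lockref)
{
	struct demo_tracker tracker = { NULL };

	demo_rlock(&demo_mountlist_lock, &tracker);
	(void)demo_busy(lockref, DEMO_MNTLSTLOCK, &tracker);
	return (demo_mountlist_lock.readers);
}
```

Callers that do not hold the list lock have nothing to hand over, which is
why the patch passes NULL for the tracker at those call sites.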

I'll have to go over this and possibly gather some ACKs before
committing.

> 
> Following is my patch:
> 
> ---
>  share/man/man9/vfs_busy.9                          |  7 ++-
>  .../compat/opensolaris/kern/opensolaris_lookup.c   |  2 +-
>  sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c |  5 ++-
>  .../opensolaris/uts/common/fs/zfs/zfs_ioctl.c      |  6 ++-
>  .../opensolaris/uts/common/fs/zfs/zfs_vfsops.c     |  6 ++-
>  sys/compat/linprocfs/linprocfs.c                   |  6 ++-
>  sys/fs/fuse/fuse_vnops.c                           |  4 +-
>  sys/fs/nfsclient/nfs_clstate.c                     |  2 +-
>  sys/fs/nfsclient/nfs_clvfsops.c                    |  2 +-
>  sys/fs/nfsclient/nfs_clvnops.c                     |  4 +-
>  sys/fs/nfsserver/nfs_nfsdport.c                    |  4 +-
>  sys/fs/nfsserver/nfs_nfsdserv.c                    |  2 +-
>  sys/fs/pseudofs/pseudofs_vnops.c                   |  6 +--
>  sys/fs/smbfs/smbfs_vnops.c                         |  4 +-
>  sys/fs/tmpfs/tmpfs_vnops.c                         |  2 +-
>  sys/geom/journal/g_journal.c                       | 13 +++---
>  sys/kern/vfs_extattr.c                             |  2 +-
>  sys/kern/vfs_lookup.c                              |  2 +-
>  sys/kern/vfs_mount.c                               | 26 ++++++-----
>  sys/kern/vfs_mountroot.c                           | 14 +++---
>  sys/kern/vfs_subr.c                                | 51 ++++++++++++----------
>  sys/kern/vfs_syscalls.c                            | 33 +++++++-------
>  sys/kern/vfs_vnops.c                               |  4 +-
>  sys/sys/mount.h                                    |  8 ++--
>  sys/ufs/ffs/ffs_softdep.c                          |  8 ++--
>  sys/ufs/ffs/ffs_suspend.c                          |  2 +-
>  sys/ufs/ufs/ufs_quota.c                            |  2 +-
>  27 files changed, 127 insertions(+), 100 deletions(-)
> 
> diff --git a/share/man/man9/vfs_busy.9 b/share/man/man9/vfs_busy.9
> index 8b9ba86..155e1e6 100644
> --- a/share/man/man9/vfs_busy.9
> +++ b/share/man/man9/vfs_busy.9
> @@ -36,7 +36,7 @@
>  .In sys/param.h
>  .In sys/mount.h
>  .Ft int
> -.Fn vfs_busy "struct mount *mp" "int flags"
> +.Fn vfs_busy "struct mount *mp" "int flags" "struct rm_priotracker *tracker"
>  .Sh DESCRIPTION
>  The
>  .Fn vfs_busy
> @@ -68,8 +68,11 @@ do not sleep if
>  .Dv MNTK_UNMOUNT
>  is set.
>  .It Dv MBF_MNTLSTLOCK
> -drop the mountlist_mtx in the critical path.
> +drop the mountlist_lock in the critical path.
>  .El
> +.It Fa tracker
> +The tracker used to track read owners of mountlist_lock
> +for priority propagation.
>  .El
>  .Sh RETURN VALUES
>  A 0 value is returned on success.
> diff --git a/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c b/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c
> index 848007e..8e1c657b 100644
> --- a/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c
> +++ b/sys/cddl/compat/opensolaris/kern/opensolaris_lookup.c
> @@ -88,7 +88,7 @@ traverse(vnode_t **cvpp, int lktype)
>  		vfsp = vn_mountedvfs(cvp);
>  		if (vfsp == NULL)
>  			break;
> -		error = vfs_busy(vfsp, 0);
> +		error = vfs_busy(vfsp, 0, NULL);
>  		/*
>  		 * tvp is NULL for *cvpp vnode, which we can't unlock.
>  		 * At least some callers expect the reference to be
> diff --git a/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c b/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
> index a2532f8..9476641 100644
> --- a/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
> +++ b/sys/cddl/compat/opensolaris/kern/opensolaris_vfs.c
> @@ -36,6 +36,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/vfs.h>
>  #include <sys/priv.h>
>  #include <sys/libkern.h>
> +#include <sys/rmlock.h>
>  
>  MALLOC_DECLARE(M_MOUNT);
>  
> @@ -222,9 +223,9 @@ mount_snapshot(kthread_t *td, vnode_t **vpp, const char *fstype, char *fspath,
>  
>  	vp->v_mountedhere = mp;
>  	/* Put the new filesystem on the mount list. */
> -	mtx_lock(&mountlist_mtx);
> +	rm_wlock(&mountlist_lock);
>  	TAILQ_INSERT_TAIL(&mountlist, mp, mnt_list);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_wunlock(&mountlist_lock);
>  	vfs_event_signal(NULL, VQ_MOUNT, 0);
>  	if (VFS_ROOT(mp, LK_EXCLUSIVE, &mvp))
>  		panic("mount: lost mount");
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
> index a829b06..a684516 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
> @@ -142,6 +142,7 @@
>  #include <sys/lock.h>
>  #include <sys/malloc.h>
>  #include <sys/mutex.h>
> +#include <sys/rmlock.h>
>  #include <sys/proc.h>
>  #include <sys/errno.h>
>  #include <sys/uio.h>
> @@ -3014,15 +3015,16 @@ static vfs_t *
>  zfs_get_vfs(const char *resource)
>  {
>  	vfs_t *vfsp;
> +	struct rm_priotracker tracker;
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(vfsp, &mountlist, mnt_list) {
>  		if (strcmp(refstr_value(vfsp->vfs_resource), resource) == 0) {
>  			VFS_HOLD(vfsp);
>  			break;
>  		}
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	return (vfsp);
>  }
>  
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
> index 415db9e..55e11a1 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c
> @@ -62,6 +62,7 @@
>  #include <sys/dmu_objset.h>
>  #include <sys/spa_boot.h>
>  #include <sys/jail.h>
> +#include <sys/rmlock.h>
>  #include "zfs_comutil.h"
>  
>  struct mtx zfs_debug_mtx;
> @@ -2532,12 +2533,13 @@ zfsvfs_update_fromname(const char *oldname, const char *newname)
>  {
>  	char tmpbuf[MAXPATHLEN];
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	char *fromname;
>  	size_t oldlen;
>  
>  	oldlen = strlen(oldname);
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  		fromname = mp->mnt_stat.f_mntfromname;
>  		if (strcmp(fromname, oldname) == 0) {
> @@ -2554,6 +2556,6 @@ zfsvfs_update_fromname(const char *oldname, const char *newname)
>  			continue;
>  		}
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  }
>  #endif
> diff --git a/sys/compat/linprocfs/linprocfs.c b/sys/compat/linprocfs/linprocfs.c
> index 8607646..2d5a538 100644
> --- a/sys/compat/linprocfs/linprocfs.c
> +++ b/sys/compat/linprocfs/linprocfs.c
> @@ -63,6 +63,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/proc.h>
>  #include <sys/ptrace.h>
>  #include <sys/resourcevar.h>
> +#include <sys/rmlock.h>
>  #include <sys/sbuf.h>
>  #include <sys/sem.h>
>  #include <sys/smp.h>
> @@ -327,6 +328,7 @@ linprocfs_domtab(PFS_FILL_ARGS)
>  {
>  	struct nameidata nd;
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	const char *lep;
>  	char *dlep, *flep, *mntto, *mntfrom, *fstype;
>  	size_t lep_len;
> @@ -344,7 +346,7 @@ linprocfs_domtab(PFS_FILL_ARGS)
>  	}
>  	lep_len = strlen(lep);
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	error = 0;
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  		/* determine device name */
> @@ -387,7 +389,7 @@ linprocfs_domtab(PFS_FILL_ARGS)
>  		/* a real Linux mtab will also show NFS options */
>  		sbuf_printf(sb, " 0 0\n");
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	free(flep, M_TEMP);
>  	return (error);
>  }
> diff --git a/sys/fs/fuse/fuse_vnops.c b/sys/fs/fuse/fuse_vnops.c
> index 12b9778..b9827df 100644
> --- a/sys/fs/fuse/fuse_vnops.c
> +++ b/sys/fs/fuse/fuse_vnops.c
> @@ -902,11 +902,11 @@ calldaemon:
>  			 */
>  			mp = dvp->v_mount;
>  			ltype = VOP_ISLOCKED(dvp);
> -			err = vfs_busy(mp, MBF_NOWAIT);
> +			err = vfs_busy(mp, MBF_NOWAIT, NULL);
>  			if (err != 0) {
>  				vfs_ref(mp);
>  				VOP_UNLOCK(dvp, 0);
> -				err = vfs_busy(mp, 0);
> +				err = vfs_busy(mp, 0, NULL);
>  				vn_lock(dvp, ltype | LK_RETRY);
>  				vfs_rel(mp);
>  				if (err)
> diff --git a/sys/fs/nfsclient/nfs_clstate.c b/sys/fs/nfsclient/nfs_clstate.c
> index 2600b80..8737003 100644
> --- a/sys/fs/nfsclient/nfs_clstate.c
> +++ b/sys/fs/nfsclient/nfs_clstate.c
> @@ -3621,7 +3621,7 @@ nfscl_getmnt(int minorvers, uint8_t *sessionid, u_int32_t cbident,
>  	mp = clp->nfsc_nmp->nm_mountp;
>  	vfs_ref(mp);
>  	NFSUNLOCKCLSTATE();
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	vfs_rel(mp);
>  	if (error != 0)
>  		return (NULL);
> diff --git a/sys/fs/nfsclient/nfs_clvfsops.c b/sys/fs/nfsclient/nfs_clvfsops.c
> index 9758b4c..aa8f8bc 100644
> --- a/sys/fs/nfsclient/nfs_clvfsops.c
> +++ b/sys/fs/nfsclient/nfs_clvfsops.c
> @@ -287,7 +287,7 @@ nfs_statfs(struct mount *mp, struct statfs *sbp)
>  
>  	td = curthread;
>  
> -	error = vfs_busy(mp, MBF_NOWAIT);
> +	error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  	if (error)
>  		return (error);
>  	error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, LK_EXCLUSIVE);
> diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
> index 513abf4..64a2eea 100644
> --- a/sys/fs/nfsclient/nfs_clvnops.c
> +++ b/sys/fs/nfsclient/nfs_clvnops.c
> @@ -1228,11 +1228,11 @@ nfs_lookup(struct vop_lookup_args *ap)
>  
>  	if (flags & ISDOTDOT) {
>  		ltype = NFSVOPISLOCKED(dvp);
> -		error = vfs_busy(mp, MBF_NOWAIT);
> +		error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  		if (error != 0) {
>  			vfs_ref(mp);
>  			NFSVOPUNLOCK(dvp, 0);
> -			error = vfs_busy(mp, 0);
> +			error = vfs_busy(mp, 0, NULL);
>  			NFSVOPLOCK(dvp, ltype | LK_RETRY);
>  			vfs_rel(mp);
>  			if (error == 0 && (dvp->v_iflag & VI_DOOMED)) {
> diff --git a/sys/fs/nfsserver/nfs_nfsdport.c b/sys/fs/nfsserver/nfs_nfsdport.c
> index 0ea48cd..c8a80c8 100644
> --- a/sys/fs/nfsserver/nfs_nfsdport.c
> +++ b/sys/fs/nfsserver/nfs_nfsdport.c
> @@ -1985,7 +1985,7 @@ again:
>  	mp = vp->v_mount;
>  	vfs_ref(mp);
>  	NFSVOPUNLOCK(vp, 0);
> -	nd->nd_repstat = vfs_busy(mp, 0);
> +	nd->nd_repstat = vfs_busy(mp, 0, NULL);
>  	vfs_rel(mp);
>  	if (nd->nd_repstat != 0) {
>  		vrele(vp);
> @@ -2134,7 +2134,7 @@ again:
>  					    nvp->v_type == VDIR &&
>  					    nvp->v_mountedhere != NULL) {
>  						new_mp = nvp->v_mountedhere;
> -						r = vfs_busy(new_mp, 0);
> +						r = vfs_busy(new_mp, 0, NULL);
>  						vput(nvp);
>  						nvp = NULL;
>  						if (r == 0) {
> diff --git a/sys/fs/nfsserver/nfs_nfsdserv.c b/sys/fs/nfsserver/nfs_nfsdserv.c
> index e6e02d7..39f87c4 100644
> --- a/sys/fs/nfsserver/nfs_nfsdserv.c
> +++ b/sys/fs/nfsserver/nfs_nfsdserv.c
> @@ -271,7 +271,7 @@ nfsrvd_getattr(struct nfsrv_descript *nd, int isdgram,
>  						at_root = 0;
>  				}
>  				if (nd->nd_repstat == 0)
> -					nd->nd_repstat = vfs_busy(mp, 0);
> +					nd->nd_repstat = vfs_busy(mp, 0, NULL);
>  				vfs_rel(mp);
>  				if (nd->nd_repstat == 0) {
>  					(void)nfsvno_fillattr(nd, mp, vp, &nva,
> diff --git a/sys/fs/pseudofs/pseudofs_vnops.c b/sys/fs/pseudofs/pseudofs_vnops.c
> index f00b4b2..a0852ce 100644
> --- a/sys/fs/pseudofs/pseudofs_vnops.c
> +++ b/sys/fs/pseudofs/pseudofs_vnops.c
> @@ -392,7 +392,7 @@ pfs_vptocnp(struct vop_vptocnp_args *ap)
>  	pfs_unlock(pd);
>  
>  	mp = vp->v_mount;
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	if (error)
>  		return (error);
>  
> @@ -481,11 +481,11 @@ pfs_lookup(struct vop_cachedlookup_args *va)
>  	if (cnp->cn_flags & ISDOTDOT) {
>  		if (pd->pn_type == pfstype_root)
>  			PFS_RETURN (EIO);
> -		error = vfs_busy(mp, MBF_NOWAIT);
> +		error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  		if (error != 0) {
>  			vfs_ref(mp);
>  			VOP_UNLOCK(vn, 0);
> -			error = vfs_busy(mp, 0);
> +			error = vfs_busy(mp, 0, NULL);
>  			vn_lock(vn, LK_EXCLUSIVE | LK_RETRY);
>  			vfs_rel(mp);
>  			if (error != 0)
> diff --git a/sys/fs/smbfs/smbfs_vnops.c b/sys/fs/smbfs/smbfs_vnops.c
> index 8ea1198..e0736fd 100644
> --- a/sys/fs/smbfs/smbfs_vnops.c
> +++ b/sys/fs/smbfs/smbfs_vnops.c
> @@ -1324,11 +1324,11 @@ smbfs_lookup(ap)
>  	}
>  	if (flags & ISDOTDOT) {
>  		mp = dvp->v_mount;
> -		error = vfs_busy(mp, MBF_NOWAIT);
> +		error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  		if (error != 0) {
>  			vfs_ref(mp);
>  			VOP_UNLOCK(dvp, 0);
> -			error = vfs_busy(mp, 0);
> +			error = vfs_busy(mp, 0, NULL);
>  			vn_lock(dvp, LK_EXCLUSIVE | LK_RETRY);
>  			vfs_rel(mp);
>  			if (error) {
> diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
> index 885f84c..a366170 100644
> --- a/sys/fs/tmpfs/tmpfs_vnops.c
> +++ b/sys/fs/tmpfs/tmpfs_vnops.c
> @@ -789,7 +789,7 @@ tmpfs_rename(struct vop_rename_args *v)
>  	if (fdvp != tdvp && fdvp != tvp) {
>  		if (vn_lock(fdvp, LK_EXCLUSIVE | LK_NOWAIT) != 0) {
>  			mp = tdvp->v_mount;
> -			error = vfs_busy(mp, 0);
> +			error = vfs_busy(mp, 0, NULL);
>  			if (error != 0) {
>  				mp = NULL;
>  				goto out;
> diff --git a/sys/geom/journal/g_journal.c b/sys/geom/journal/g_journal.c
> index 9cc324c..3c2066e 100644
> --- a/sys/geom/journal/g_journal.c
> +++ b/sys/geom/journal/g_journal.c
> @@ -34,6 +34,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/limits.h>
>  #include <sys/lock.h>
>  #include <sys/mutex.h>
> +#include <sys/rmlock.h>
>  #include <sys/bio.h>
>  #include <sys/sysctl.h>
>  #include <sys/malloc.h>
> @@ -2867,6 +2868,7 @@ g_journal_do_switch(struct g_class *classp)
>  	const struct g_journal_desc *desc;
>  	struct g_geom *gp;
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	struct bintime bt;
>  	char *mountpoint;
>  	int error, save;
> @@ -2888,7 +2890,7 @@ g_journal_do_switch(struct g_class *classp)
>  	g_topology_unlock();
>  	PICKUP_GIANT();
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  		if (mp->mnt_gjprovider == NULL)
>  			continue;
> @@ -2897,9 +2899,10 @@ g_journal_do_switch(struct g_class *classp)
>  		desc = g_journal_find_desc(mp->mnt_stat.f_fstypename);
>  		if (desc == NULL)
>  			continue;
> -		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK))
> +		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK, &tracker))
>  			continue;
> -		/* mtx_unlock(&mountlist_mtx) was done inside vfs_busy() */
> +		/* rm_runlock(&mountlist_lock, &tracker) was done inside
> +		 * vfs_busy() */
>  
>  		DROP_GIANT();
>  		g_topology_lock();
> @@ -2977,10 +2980,10 @@ g_journal_do_switch(struct g_class *classp)
>  
>  		vfs_write_resume(mp, 0);
>  next:
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		vfs_unbusy(mp);
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  
>  	sc = NULL;
>  	for (;;) {
> diff --git a/sys/kern/vfs_extattr.c b/sys/kern/vfs_extattr.c
> index 24935ce..a82eddaf 100644
> --- a/sys/kern/vfs_extattr.c
> +++ b/sys/kern/vfs_extattr.c
> @@ -104,7 +104,7 @@ sys_extattrctl(td, uap)
>  	if (error)
>  		goto out;
>  	mp = nd.ni_vp->v_mount;
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	if (error) {
>  		NDFREE(&nd, 0);
>  		mp = NULL;
> diff --git a/sys/kern/vfs_lookup.c b/sys/kern/vfs_lookup.c
> index f2ffab2..610aa5f 100644
> --- a/sys/kern/vfs_lookup.c
> +++ b/sys/kern/vfs_lookup.c
> @@ -779,7 +779,7 @@ unionlookup:
>  	 */
>  	while (dp->v_type == VDIR && (mp = dp->v_mountedhere) &&
>  	       (cnp->cn_flags & NOCROSSMOUNT) == 0) {
> -		if (vfs_busy(mp, 0))
> +		if (vfs_busy(mp, 0, NULL))
>  			continue;
>  		vput(dp);
>  		if (dp != ndp->ni_dvp)
> diff --git a/sys/kern/vfs_mount.c b/sys/kern/vfs_mount.c
> index 09fa7ed..636a55f 100644
> --- a/sys/kern/vfs_mount.c
> +++ b/sys/kern/vfs_mount.c
> @@ -51,6 +51,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/proc.h>
>  #include <sys/filedesc.h>
>  #include <sys/reboot.h>
> +#include <sys/rmlock.h>
>  #include <sys/sbuf.h>
>  #include <sys/syscallsubr.h>
>  #include <sys/sysproto.h>
> @@ -85,8 +86,8 @@ static uma_zone_t mount_zone;
>  struct mntlist mountlist = TAILQ_HEAD_INITIALIZER(mountlist);
>  
>  /* For any iteration/modification of mountlist */
> -struct mtx mountlist_mtx;
> -MTX_SYSINIT(mountlist, &mountlist_mtx, "mountlist", MTX_DEF);
> +struct rmlock mountlist_lock;
> +RM_SYSINIT(mountlist, &mountlist_lock, "mountlist");
>  
>  /*
>   * Global opts, taken by all filesystems
> @@ -462,7 +463,7 @@ vfs_mount_alloc(struct vnode *vp, struct vfsconf *vfsp, const char *fspath,
>  	TAILQ_INIT(&mp->mnt_activevnodelist);
>  	mp->mnt_activevnodelistsize = 0;
>  	mp->mnt_ref = 0;
> -	(void) vfs_busy(mp, MBF_NOWAIT);
> +	(void) vfs_busy(mp, MBF_NOWAIT, NULL);
>  	atomic_add_acq_int(&vfsp->vfc_refcount, 1);
>  	mp->mnt_op = vfsp->vfc_vfsops;
>  	mp->mnt_vfc = vfsp;
> @@ -852,9 +853,9 @@ vfs_domount_first(
>  	VI_UNLOCK(vp);
>  	vp->v_mountedhere = mp;
>  	/* Place the new filesystem at the end of the mount list. */
> -	mtx_lock(&mountlist_mtx);
> +	rm_wlock(&mountlist_lock);
>  	TAILQ_INSERT_TAIL(&mountlist, mp, mnt_list);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_wunlock(&mountlist_lock);
>  	vfs_event_signal(NULL, VQ_MOUNT, 0);
>  	if (VFS_ROOT(mp, LK_EXCLUSIVE, &newdp))
>  		panic("mount: lost mount");
> @@ -918,7 +919,7 @@ vfs_domount_update(
>  		vput(vp);
>  		return (error);
>  	}
> -	if (vfs_busy(mp, MBF_NOWAIT)) {
> +	if (vfs_busy(mp, MBF_NOWAIT, NULL)) {
>  		vput(vp);
>  		return (EBUSY);
>  	}
> @@ -1137,6 +1138,7 @@ sys_unmount(td, uap)
>  {
>  	struct nameidata nd;
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	char *pathbuf;
>  	int error, id0, id1;
>  
> @@ -1161,13 +1163,13 @@ sys_unmount(td, uap)
>  			return (EINVAL);
>  		}
>  
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) {
>  			if (mp->mnt_stat.f_fsid.val[0] == id0 &&
>  			    mp->mnt_stat.f_fsid.val[1] == id1)
>  				break;
>  		}
> -		mtx_unlock(&mountlist_mtx);
> +		rm_runlock(&mountlist_lock, &tracker);
>  	} else {
>  		/*
>  		 * Try to find global path for path argument.
> @@ -1181,12 +1183,12 @@ sys_unmount(td, uap)
>  			if (error == 0 || error == ENODEV)
>  				vput(nd.ni_vp);
>  		}
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		TAILQ_FOREACH_REVERSE(mp, &mountlist, mntlist, mnt_list) {
>  			if (strcmp(mp->mnt_stat.f_mntonname, pathbuf) == 0)
>  				break;
>  		}
> -		mtx_unlock(&mountlist_mtx);
> +		rm_runlock(&mountlist_lock, &tracker);
>  	}
>  	free(pathbuf, M_TEMP);
>  	if (mp == NULL) {
> @@ -1353,9 +1355,9 @@ dounmount(mp, flags, td)
>  			VOP_UNLOCK(coveredvp, 0);
>  		return (error);
>  	}
> -	mtx_lock(&mountlist_mtx);
> +	rm_wlock(&mountlist_lock);
>  	TAILQ_REMOVE(&mountlist, mp, mnt_list);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_wunlock(&mountlist_lock);
>  	EVENTHANDLER_INVOKE(vfs_unmounted, mp, td);
>  	if (coveredvp != NULL) {
>  		coveredvp->v_mountedhere = NULL;
> diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
> index a050099..ef3e9c9 100644
> --- a/sys/kern/vfs_mountroot.c
> +++ b/sys/kern/vfs_mountroot.c
> @@ -55,6 +55,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/proc.h>
>  #include <sys/filedesc.h>
>  #include <sys/reboot.h>
> +#include <sys/rmlock.h>
>  #include <sys/sbuf.h>
>  #include <sys/stat.h>
>  #include <sys/syscallsubr.h>
> @@ -231,9 +232,9 @@ vfs_mountroot_devfs(struct thread *td, struct mount **mpp)
>  	TAILQ_INIT(opts);
>  	mp->mnt_opt = opts;
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_wlock(&mountlist_lock);
>  	TAILQ_INSERT_HEAD(&mountlist, mp, mnt_list);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_wunlock(&mountlist_lock);
>  
>  	*mpp = mp;
>  	set_rootvnode();
> @@ -257,7 +258,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>  	mpnroot = TAILQ_NEXT(mpdevfs, mnt_list);
>  
>  	/* Shuffle the mountlist. */
> -	mtx_lock(&mountlist_mtx);
> +	rm_wlock(&mountlist_lock);
>  	mporoot = TAILQ_FIRST(&mountlist);
>  	TAILQ_REMOVE(&mountlist, mpdevfs, mnt_list);
>  	if (mporoot != mpdevfs) {
> @@ -265,7 +266,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>  		TAILQ_INSERT_HEAD(&mountlist, mpnroot, mnt_list);
>  	}
>  	TAILQ_INSERT_TAIL(&mountlist, mpdevfs, mnt_list);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_wunlock(&mountlist_lock);
>  
>  	cache_purgevfs(mporoot);
>  	if (mporoot != mpdevfs)
> @@ -933,6 +934,7 @@ void
>  vfs_mountroot(void)
>  {
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	struct sbuf *sb;
>  	struct thread *td;
>  	time_t timebase;
> @@ -968,14 +970,14 @@ vfs_mountroot(void)
>  	 * timestamps we encounter.
>  	 */
>  	timebase = 0;
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	mp = TAILQ_FIRST(&mountlist);
>  	while (mp != NULL) {
>  		if (mp->mnt_time > timebase)
>  			timebase = mp->mnt_time;
>  		mp = TAILQ_NEXT(mp, mnt_list);
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	inittodr(timebase);
>  
>  	/* Keep prison0's root in sync with the global rootvnode. */
> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> index fda80c9..3733178 100644
> --- a/sys/kern/vfs_subr.c
> +++ b/sys/kern/vfs_subr.c
> @@ -68,6 +68,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/pctrie.h>
>  #include <sys/priv.h>
>  #include <sys/reboot.h>
> +#include <sys/rmlock.h>
>  #include <sys/rwlock.h>
>  #include <sys/sched.h>
>  #include <sys/sleepqueue.h>
> @@ -375,7 +376,7 @@ SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
>  
>  /*
>   * Mark a mount point as busy. Used to synchronize access and to delay
> - * unmounting. Eventually, mountlist_mtx is not released on failure.
> + * unmounting. Eventually, mountlist_lock is not released on failure.
>   *
>   * vfs_busy() is a custom lock, it can block the caller.
>   * vfs_busy() only sleeps if the unmount is active on the mount point.
> @@ -410,7 +411,7 @@ SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
>   * dounmount() locks B while F is drained.
>   */
>  int
> -vfs_busy(struct mount *mp, int flags)
> +vfs_busy(struct mount *mp, int flags, struct rm_priotracker *tracker)
>  {
>  
>  	MPASS((flags & ~MBF_MASK) == 0);
> @@ -439,15 +440,15 @@ vfs_busy(struct mount *mp, int flags)
>  			return (ENOENT);
>  		}
>  		if (flags & MBF_MNTLSTLOCK)
> -			mtx_unlock(&mountlist_mtx);
> +			rm_runlock(&mountlist_lock, tracker);
>  		mp->mnt_kern_flag |= MNTK_MWAIT;
>  		msleep(mp, MNT_MTX(mp), PVFS | PDROP, "vfs_busy", 0);
>  		if (flags & MBF_MNTLSTLOCK)
> -			mtx_lock(&mountlist_mtx);
> +			rm_rlock(&mountlist_lock, tracker);
>  		MNT_ILOCK(mp);
>  	}
>  	if (flags & MBF_MNTLSTLOCK)
> -		mtx_unlock(&mountlist_mtx);
> +		rm_runlock(&mountlist_lock, tracker);
>  	mp->mnt_lockref++;
>  	MNT_IUNLOCK(mp);
>  	return (0);
> @@ -481,18 +482,19 @@ struct mount *
>  vfs_getvfs(fsid_t *fsid)
>  {
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  
>  	CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid);
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  		if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] &&
>  		    mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) {
>  			vfs_ref(mp);
> -			mtx_unlock(&mountlist_mtx);
> +			rm_runlock(&mountlist_lock, &tracker);
>  			return (mp);
>  		}
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid);
>  	return ((struct mount *) 0);
>  }
> @@ -501,7 +503,7 @@ vfs_getvfs(fsid_t *fsid)
>   * Lookup a mount point by filesystem identifier, busying it before
>   * returning.
>   *
> - * To avoid congestion on mountlist_mtx, implement simple direct-mapped
> + * To avoid congestion on mountlist_lock, implement simple direct-mapped
>   * cache for popular filesystem identifiers.  The cache is lockess, using
>   * the fact that struct mount's are never freed.  In worst case we may
>   * get pointer to unmounted or even different filesystem, so we have to
> @@ -514,6 +516,7 @@ vfs_busyfs(fsid_t *fsid)
>  	typedef struct mount * volatile vmp_t;
>  	static vmp_t cache[FSID_CACHE_SIZE];
>  	struct mount *mp;
> +	struct rm_priotracker tracker;
>  	int error;
>  	uint32_t hash;
>  
> @@ -525,7 +528,7 @@ vfs_busyfs(fsid_t *fsid)
>  	    mp->mnt_stat.f_fsid.val[0] != fsid->val[0] ||
>  	    mp->mnt_stat.f_fsid.val[1] != fsid->val[1])
>  		goto slow;
> -	if (vfs_busy(mp, 0) != 0) {
> +	if (vfs_busy(mp, 0, NULL) != 0) {
>  		cache[hash] = NULL;
>  		goto slow;
>  	}
> @@ -536,14 +539,14 @@ vfs_busyfs(fsid_t *fsid)
>  	    vfs_unbusy(mp);
>  
>  slow:
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  		if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] &&
>  		    mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) {
> -			error = vfs_busy(mp, MBF_MNTLSTLOCK);
> +			error = vfs_busy(mp, MBF_MNTLSTLOCK, &tracker);
>  			if (error) {
>  				cache[hash] = NULL;
> -				mtx_unlock(&mountlist_mtx);
> +				rm_runlock(&mountlist_lock, &tracker);
>  				return (NULL);
>  			}
>  			cache[hash] = mp;
> @@ -551,7 +554,7 @@ slow:
>  		}
>  	}
>  	CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid);
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	return ((struct mount *) 0);
>  }
>  
> @@ -891,6 +894,7 @@ vnlru_proc(void)
>  	struct mount *mp, *nmp;
>  	int done;
>  	struct proc *p = vnlruproc;
> +	struct rm_priotracker tracker;
>  
>  	EVENTHANDLER_REGISTER(shutdown_pre_sync, kproc_shutdown, p,
>  	    SHUTDOWN_PRI_FIRST);
> @@ -909,18 +913,18 @@ vnlru_proc(void)
>  		}
>  		mtx_unlock(&vnode_free_list_mtx);
>  		done = 0;
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
> -			if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
> +			if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK, &tracker)) {
>  				nmp = TAILQ_NEXT(mp, mnt_list);
>  				continue;
>  			}
>  			done += vlrureclaim(mp);
> -			mtx_lock(&mountlist_mtx);
> +			rm_rlock(&mountlist_lock, &tracker);
>  			nmp = TAILQ_NEXT(mp, mnt_list);
>  			vfs_unbusy(mp);
>  		}
> -		mtx_unlock(&mountlist_mtx);
> +		rm_runlock(&mountlist_lock, &tracker);
>  		if (done == 0) {
>  #if 0
>  			/* These messages are temporary debugging aids */
> @@ -3397,6 +3401,7 @@ sysctl_vnode(SYSCTL_HANDLER_ARGS)
>  	struct xvnode *xvn;
>  	struct mount *mp;
>  	struct vnode *vp;
> +	struct rm_priotracker tracker;
>  	int error, len, n;
>  
>  	/*
> @@ -3413,9 +3418,9 @@ sysctl_vnode(SYSCTL_HANDLER_ARGS)
>  		return (error);
>  	xvn = malloc(len, M_TEMP, M_ZERO | M_WAITOK);
>  	n = 0;
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
> -		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK))
> +		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK, &tracker))
>  			continue;
>  		MNT_ILOCK(mp);
>  		TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
> @@ -3465,12 +3470,12 @@ sysctl_vnode(SYSCTL_HANDLER_ARGS)
>  			++n;
>  		}
>  		MNT_IUNLOCK(mp);
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		vfs_unbusy(mp);
>  		if (n == len)
>  			break;
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  
>  	error = SYSCTL_OUT(req, xvn, n * sizeof *xvn);
>  	free(xvn, M_TEMP);
> @@ -3760,7 +3765,7 @@ sync_fsync(struct vop_fsync_args *ap)
>  	 * Walk the list of vnodes pushing all that are dirty and
>  	 * not already on the sync list.
>  	 */
> -	if (vfs_busy(mp, MBF_NOWAIT) != 0)
> +	if (vfs_busy(mp, MBF_NOWAIT, NULL) != 0)
>  		return (0);
>  	if (vn_start_write(NULL, &mp, V_NOWAIT) != 0) {
>  		vfs_unbusy(mp);
> diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
> index 14be379..0d3ae31 100644
> --- a/sys/kern/vfs_syscalls.c
> +++ b/sys/kern/vfs_syscalls.c
> @@ -60,6 +60,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/filio.h>
>  #include <sys/limits.h>
>  #include <sys/linker.h>
> +#include <sys/rmlock.h>
>  #include <sys/rwlock.h>
>  #include <sys/sdt.h>
>  #include <sys/stat.h>
> @@ -129,11 +130,12 @@ sys_sync(td, uap)
>  	struct sync_args *uap;
>  {
>  	struct mount *mp, *nmp;
> +	struct rm_priotracker tracker;
>  	int save;
>  
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
> -		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
> +		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK, &tracker)) {
>  			nmp = TAILQ_NEXT(mp, mnt_list);
>  			continue;
>  		}
> @@ -145,11 +147,11 @@ sys_sync(td, uap)
>  			curthread_pflags_restore(save);
>  			vn_finished_write(mp);
>  		}
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		nmp = TAILQ_NEXT(mp, mnt_list);
>  		vfs_unbusy(mp);
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	return (0);
>  }
>  
> @@ -190,7 +192,7 @@ sys_quotactl(td, uap)
>  	mp = nd.ni_vp->v_mount;
>  	vfs_ref(mp);
>  	vput(nd.ni_vp);
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	vfs_rel(mp);
>  	if (error != 0)
>  		return (error);
> @@ -297,7 +299,7 @@ kern_statfs(struct thread *td, char *path, enum uio_seg pathseg,
>  	vfs_ref(mp);
>  	NDFREE(&nd, NDF_ONLY_PNBUF);
>  	vput(nd.ni_vp);
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	vfs_rel(mp);
>  	if (error != 0)
>  		return (error);
> @@ -383,7 +385,7 @@ kern_fstatfs(struct thread *td, int fd, struct statfs *buf)
>  		error = EBADF;
>  		goto out;
>  	}
> -	error = vfs_busy(mp, 0);
> +	error = vfs_busy(mp, 0, NULL);
>  	vfs_rel(mp);
>  	if (error != 0)
>  		return (error);
> @@ -449,6 +451,7 @@ kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
>      enum uio_seg bufseg, int flags)
>  {
>  	struct mount *mp, *nmp;
> +	struct rm_priotracker tracker;
>  	struct statfs *sfsp, *sp, sb;
>  	size_t count, maxcount;
>  	int error;
> @@ -460,18 +463,18 @@ kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
>  		sfsp = *buf;
>  	else /* if (bufseg == UIO_SYSSPACE) */ {
>  		count = 0;
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		TAILQ_FOREACH(mp, &mountlist, mnt_list) {
>  			count++;
>  		}
> -		mtx_unlock(&mountlist_mtx);
> +		rm_runlock(&mountlist_lock, &tracker);
>  		if (maxcount > count)
>  			maxcount = count;
>  		sfsp = *buf = malloc(maxcount * sizeof(struct statfs), M_TEMP,
>  		    M_WAITOK);
>  	}
>  	count = 0;
> -	mtx_lock(&mountlist_mtx);
> +	rm_rlock(&mountlist_lock, &tracker);
>  	for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
>  		if (prison_canseemount(td->td_ucred, mp) != 0) {
>  			nmp = TAILQ_NEXT(mp, mnt_list);
> @@ -483,7 +486,7 @@ kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
>  			continue;
>  		}
>  #endif
> -		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
> +		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK, &tracker)) {
>  			nmp = TAILQ_NEXT(mp, mnt_list);
>  			continue;
>  		}
> @@ -504,7 +507,7 @@ kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
>  			if (((flags & (MNT_LAZY|MNT_NOWAIT)) == 0 ||
>  			    (flags & MNT_WAIT)) &&
>  			    (error = VFS_STATFS(mp, sp))) {
> -				mtx_lock(&mountlist_mtx);
> +				rm_rlock(&mountlist_lock, &tracker);
>  				nmp = TAILQ_NEXT(mp, mnt_list);
>  				vfs_unbusy(mp);
>  				continue;
> @@ -527,11 +530,11 @@ kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
>  			sfsp++;
>  		}
>  		count++;
> -		mtx_lock(&mountlist_mtx);
> +		rm_rlock(&mountlist_lock, &tracker);
>  		nmp = TAILQ_NEXT(mp, mnt_list);
>  		vfs_unbusy(mp);
>  	}
> -	mtx_unlock(&mountlist_mtx);
> +	rm_runlock(&mountlist_lock, &tracker);
>  	if (sfsp && count > maxcount)
>  		td->td_retval[0] = maxcount;
>  	else
> @@ -741,7 +744,7 @@ sys_fchdir(td, uap)
>  	AUDIT_ARG_VNODE1(vp);
>  	error = change_dir(vp, td);
>  	while (!error && (mp = vp->v_mountedhere) != NULL) {
> -		if (vfs_busy(mp, 0))
> +		if (vfs_busy(mp, 0, NULL))
>  			continue;
>  		error = VFS_ROOT(mp, LK_SHARED, &tdp);
>  		vfs_unbusy(mp);
> diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
> index ed4ad4d..8d38305 100644
> --- a/sys/kern/vfs_vnops.c
> +++ b/sys/kern/vfs_vnops.c
> @@ -2059,11 +2059,11 @@ vn_vget_ino_gen(struct vnode *vp, vn_get_ino_t alloc, void *alloc_arg,
>  	ltype = VOP_ISLOCKED(vp);
>  	KASSERT(ltype == LK_EXCLUSIVE || ltype == LK_SHARED,
>  	    ("vn_vget_ino: vp not locked"));
> -	error = vfs_busy(mp, MBF_NOWAIT);
> +	error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  	if (error != 0) {
>  		vfs_ref(mp);
>  		VOP_UNLOCK(vp, 0);
> -		error = vfs_busy(mp, 0);
> +		error = vfs_busy(mp, 0, NULL);
>  		vn_lock(vp, ltype | LK_RETRY);
>  		vfs_rel(mp);
>  		if (error != 0)
> diff --git a/sys/sys/mount.h b/sys/sys/mount.h
> index 6fb2d08..6c47d7f 100644
> --- a/sys/sys/mount.h
> +++ b/sys/sys/mount.h
> @@ -38,7 +38,9 @@
>  #ifdef _KERNEL
>  #include <sys/lock.h>
>  #include <sys/lockmgr.h>
> +#include <sys/pcpu.h>
>  #include <sys/_mutex.h>
> +#include <sys/_rmlock.h>
>  #include <sys/_sx.h>
>  #endif
>  
> @@ -147,7 +149,7 @@ struct vfsopt {
>   * put on a doubly linked list.
>   *
>   * Lock reference:
> - *	m - mountlist_mtx
> + *	m - mountlist_lock
>   *	i - interlock
>   *	v - vnode freelist mutex
>   *
> @@ -864,7 +866,7 @@ int	vfs_setopts(struct vfsoptlist *opts, const char *name,
>  int	vfs_setpublicfs			    /* set publicly exported fs */
>  	    (struct mount *, struct netexport *, struct export_args *);
>  void	vfs_msync(struct mount *, int);
> -int	vfs_busy(struct mount *, int);
> +int	vfs_busy(struct mount *, int, struct rm_priotracker *);
>  int	vfs_export			 /* process mount export info */
>  	    (struct mount *, struct export_args *);
>  void	vfs_allocate_syncvnode(struct mount *);
> @@ -890,7 +892,7 @@ int	vfs_suser(struct mount *, struct thread *);
>  void	vfs_unbusy(struct mount *);
>  void	vfs_unmountall(void);
>  extern	TAILQ_HEAD(mntlist, mount) mountlist;	/* mounted filesystem list */
> -extern	struct mtx mountlist_mtx;
> +extern	struct rmlock mountlist_lock;
>  extern	struct nfs_public nfs_pub;
>  extern	struct sx vfsconf_sx;
>  #define	vfsconf_lock()		sx_xlock(&vfsconf_sx)
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index 700854e..8545df0 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -12274,11 +12274,11 @@ restart:
>  		FREE_LOCK(ump);
>  		if (ffs_vgetf(mp, parentino, LK_NOWAIT | LK_EXCLUSIVE, &pvp,
>  		    FFSV_FORCEINSMQ)) {
> -			error = vfs_busy(mp, MBF_NOWAIT);
> +			error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  			if (error != 0) {
>  				vfs_ref(mp);
>  				VOP_UNLOCK(vp, 0);
> -				error = vfs_busy(mp, 0);
> +				error = vfs_busy(mp, 0, NULL);
>  				vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
>  				vfs_rel(mp);
>  				if (error != 0)
> @@ -13427,7 +13427,7 @@ clear_remove(mp)
>  			/*
>  			 * Let unmount clear deps
>  			 */
> -			error = vfs_busy(mp, MBF_NOWAIT);
> +			error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  			if (error != 0)
>  				goto finish_write;
>  			error = ffs_vgetf(mp, ino, LK_EXCLUSIVE, &vp,
> @@ -13503,7 +13503,7 @@ clear_inodedeps(mp)
>  		if (vn_start_write(NULL, &mp, V_NOWAIT) != 0)
>  			continue;
>  		FREE_LOCK(ump);
> -		error = vfs_busy(mp, MBF_NOWAIT); /* Let unmount clear deps */
> +		error = vfs_busy(mp, MBF_NOWAIT, NULL); /* Let unmount clear deps */
>  		if (error != 0) {
>  			vn_finished_write(mp);
>  			ACQUIRE_LOCK(ump);
> diff --git a/sys/ufs/ffs/ffs_suspend.c b/sys/ufs/ffs/ffs_suspend.c
> index b50fadc..6513a21 100644
> --- a/sys/ufs/ffs/ffs_suspend.c
> +++ b/sys/ufs/ffs/ffs_suspend.c
> @@ -286,7 +286,7 @@ ffs_susp_ioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags,
>  			error = ENOENT;
>  			break;
>  		}
> -		error = vfs_busy(mp, 0);
> +		error = vfs_busy(mp, 0, NULL);
>  		vfs_rel(mp);
>  		if (error != 0)
>  			break;
> diff --git a/sys/ufs/ufs/ufs_quota.c b/sys/ufs/ufs/ufs_quota.c
> index 4fbb8a1..3d17622 100644
> --- a/sys/ufs/ufs/ufs_quota.c
> +++ b/sys/ufs/ufs/ufs_quota.c
> @@ -519,7 +519,7 @@ quotaon(struct thread *td, struct mount *mp, int type, void *fname)
>  	}
>  	NDFREE(&nd, NDF_ONLY_PNBUF);
>  	vp = nd.ni_vp;
> -	error = vfs_busy(mp, MBF_NOWAIT);
> +	error = vfs_busy(mp, MBF_NOWAIT, NULL);
>  	vfs_rel(mp);
>  	if (error == 0) {
>  		if (vp->v_type != VREG) {
> -- 
> 2.1.2
> 
> > The rmlock primitive does exactly this optimization.  See the manpage
> > for the API (it's mostly very similar to the rwlock API):
> > https://www.freebsd.org/cgi/man.cgi?query=rmlock&apropos=0&sektion=0&manpath=FreeBSD+10.1-RELEASE&arch=default&format=html
> > 
> > 
> > > void   zfs_mountlist_wlock(void);
> > > void   zfs_mountlist_wunlock(void);
> > > void   zfs_mountlist_rlock(void);
> > > void   zfs_mountlist_runlock(void);
> > 
> > Why not:
> > 
> > #define zfs_mountlist_wlock() _zfs_mountlist_wlock(__FILE__, __LINE__)
> > void _zfs_mountlist_wlock(const char *file, int line);
> > 
> > /* etc, etc */
> > 
> > (This may be moot if we switch to an rmlock though)
> 
> Tiwei Bie
> 

-- 
Mateusz Guzik <mjguzik@gmail.com>


