Date:      Wed, 5 Jun 2019 09:50:06 +0200
From:      Peter Eriksson <pen@lysator.liu.se>
To:        Rick Macklem <rmacklem@uoguelph.ca>, Alexander Motin <mav@FreeBSD.org>, "mmacy@ixsystems.com" <mmacy@ixsystems.com>, "ryan@ixsystems.com" <ryan@ixsystems.com>, "pjd@freebsd.org" <pjd@freebsd.org>, "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Subject:   Re: RFC: patching fsshare in ZFS
Message-ID:  <FFB04DDA-8DC8-4DBA-89AB-943E9638175D@lysator.liu.se>
In-Reply-To: <YQXPR01MB3128B7972F1DCDF2A163859DDD160@YQXPR01MB3128.CANPRD01.PROD.OUTLOOK.COM>
References:  <YQXPR01MB3128B7972F1DCDF2A163859DDD160@YQXPR01MB3128.CANPRD01.PROD.OUTLOOK.COM>

Hi all!

I’ve been experimenting a little with adding support for a simple
Berkeley DB-based “exports” database to mountd in order to speed things
up for the ZFS share code. The changes to mountd are fairly simple, and
the corresponding changes were pretty simple to add to the ZFS code too,
last I tried it. It speeds things up quite a bit: no need to do linear
searches through the /etc/zfs/exports file and no need to rewrite the
file for changes either… With N*10000 NFS-shared filesystems like we
have, this can be pretty nice to have.

My current DB-based code supports multiple exports entries per
filesystem by separating the “rows” in the database entry for a
filesystem with NUL characters.
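
To make the NUL-separated layout concrete, here is a minimal sketch in
Python (not the actual mountd/ZFS C code). It assumes the database is
keyed by the filesystem path and that each "row" is one exports option
string; both are guesses for illustration. Python's dbm.dumb stands in
for Berkeley DB, and pack_rows, unpack_rows, add_export and demo are
hypothetical names.

```python
import dbm.dumb
import os
import tempfile


def pack_rows(rows):
    """Join the export rows of one filesystem into a NUL-separated value."""
    for row in rows:
        if "\0" in row:
            raise ValueError("an export row may not contain NUL")
    return "\0".join(rows).encode("utf-8")


def unpack_rows(value):
    """Split a stored value back into its individual export rows."""
    return value.decode("utf-8").split("\0") if value else []


def add_export(db, path, row):
    """Add one export row for `path`, touching only that one record."""
    key = path.encode("utf-8")  # assumption: the key is the filesystem path
    rows = unpack_rows(db[key]) if key in db else []
    rows.append(row)
    db[key] = pack_rows(rows)


def demo():
    """Store two rows for one filesystem and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        # dbm.dumb is only a stand-in for a Berkeley DB file here.
        db = dbm.dumb.open(os.path.join(tmp, "exports"), "c")
        try:
            add_export(db, "/export/home/alice", "-maproot=root -network 10.0.0.0/8")
            add_export(db, "/export/home/alice", "-ro -network 192.168.1.0/24")
            return unpack_rows(db[b"/export/home/alice"])
        finally:
            db.close()
```

Looking up or changing one filesystem is then a single keyed access
rather than a scan or rewrite of the whole exports file.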

Let me know if there is some interest in this for others than just me.

- Peter



> 2 - Peter has some NFS servers with 20000-72000+ file systems being exported.
>       The current code in fsshare.c copies the exports file and then appends the new
>       entry for a file system and then replaces the exports file with the new one.
>       I think this file copying happens for every file system, which seems like a lot
>       of overhead. (I forget what Peter said w.r.t. how long this takes, but I think it
>       was quite a while.)
>       My guess is that Pawel did this so that the update to the file would happen
>       atomically.
>       It seems to me that if mountd held a read lock on the export file while reading it
>       and fsshare() held a write lock on the file while appending the new entry, that
>       the file copying could be avoided?
>       - The main problem I see w.r.t. doing this is that an old mountd binary that doesn't
>         read lock the file could be broken by the fsshare() change.
>         --> One way to avoid this would be to have the new mountd write more than
>               just the pid in the MOUNTD_PID file so that fsshare() could tell if mountd was
>               going to be read locking the file.
>               OR
>               Just don't MFC the change and assume that the new mountd would be
>               released when the new fsshare() is (in FreeBSD13?).
>
> Anyhow, I can tweak mountd.c and fsshare.c, but that's as far as I can take it.
>
> Others would need to do testing and whatever it takes to get a change to fsshare.c
> into the ZFS sources.
>
> So, what do you think about this? rick
>
>
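
The locking scheme proposed in the quoted text could look roughly like
the sketch below. It is Python rather than the C of mountd.c and
fsshare.c, it assumes flock(2)-style advisory locks (the quoted text
does not say which locking call would be used), and append_export and
read_exports are hypothetical names.

```python
import fcntl
import os


def append_export(exports_path, line):
    """Writer side (the fsshare() role): append one entry under a write lock."""
    fd = os.open(exports_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while any reader holds LOCK_SH
        os.write(fd, (line.rstrip("\n") + "\n").encode("utf-8"))
        os.fsync(fd)
    finally:
        os.close(fd)  # closing the descriptor also releases the lock


def read_exports(exports_path):
    """Reader side (the mountd role): read all entries under a read lock."""
    with open(exports_path, "r", encoding="utf-8") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_SH)  # shared with other readers
        try:
            return [ln.rstrip("\n") for ln in f if ln.strip()]
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

Because these locks are advisory, a reader that never takes the shared
lock is not held off by the writer, which is the old-mountd problem the
quoted text describes.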



