Date:      Fri, 28 Dec 2018 00:20:08 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Peter Eriksson <peter@ifm.liu.se>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Suggestion for hardware for ZFS fileserver
Message-ID:  <YQBPR01MB0388B1A87193C374F69E6F86DDB70@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
References:  <CAEW%2BogZnWC07OCSuzO7E4TeYGr1E9BARKSKEh9ELCL9Zc4YY3w@mail.gmail.com> <C839431D-628C-4C73-8285-2360FE6FFE88@gmail.com> <CAEW%2BogYWKPL5jLW2H_UWEsCOiz=8fzFcSJ9S5k8k7FXMQjywsw@mail.gmail.com> <4f816be7-79e0-cacb-9502-5fbbe343cfc9@denninger.net>, <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>, <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>

I wrote:
>Peter Eriksson wrote:
>[good stuff snipped]
>>This has caused some interesting problems...
>>
>>First thing we noticed was that booting would take forever... Mounting the
>>20-100k filesystems _and_ enabling them to be shared via NFS is not done
>>efficiently at all (for each filesystem it re-reads /etc/zfs/exports (a
>>couple of times) before appending one line to the end. Repeat 20-100,000
>>times...). Not to mention the big kernel lock for NFS "hold all NFS activity
>>while we flush and reinstall all sharing information per filesystem" being
>>done by mountd...
>Yes, /etc/exports and mountd were implemented in the 1980s, when a dozen
>file systems would have been a large server. Scaling to 10,000 or more file
>systems wasn't even conceivable back then.
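
To put a rough number on the boot-time cost Peter describes: if the exports
file is fully re-read before each one-line append, the total work grows
quadratically with the number of file systems. The sketch below is only a
model. The function names and the choice of two re-reads per append (standing
in for "a couple of times") are assumptions for illustration, not taken from
the ZFS or mountd sources.

```python
def lines_read_for_appends(n_filesystems, rereads_per_append=2):
    """Count lines read when a file is fully re-read before every append.

    Before the i-th append (0-based) the file already holds i lines, so
    each re-read of it costs i line reads.
    """
    total = 0
    for existing_lines in range(n_filesystems):
        total += rereads_per_append * existing_lines
    return total


def closed_form(n_filesystems, rereads_per_append=2):
    """Same count via the arithmetic series: r * n * (n - 1) / 2."""
    return rereads_per_append * n_filesystems * (n_filesystems - 1) // 2


if __name__ == "__main__":
    # A 1980s-sized server versus the sizes mentioned in this thread.
    for n in (12, 20_000, 100_000):
        print(n, closed_form(n))
```

Under these assumptions a dozen file systems cost 132 line reads, while
100,000 file systems cost 9,999,900,000.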

>>Wish list item #1: A BerkeleyDB-based 'sharetab' that replaces the horribly
>>slow /etc/zfs/exports text file.
>>Wish list item #2: A reimplementation of mountd and the kernel interface to
>>allow a "diff" between the contents of the DB-based sharetab above to be
>>input into the kernel instead of the brute-force way it's done now.
>The parser in mountd for /etc/exports is already an ugly beast and I think
>implementing a "diff" version will be difficult, especially figuring out what
>needs to be deleted.
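
As a minimal sketch of what a "diff" between two export tables would have to
produce, assume each table is flattened to one options string per exported
path. That flat model, and the name diff_exports, are hypothetical: real
/etc/exports entries can span several lines per file system with per-host and
per-network options, which is a large part of why the deletions are hard to
work out.

```python
def diff_exports(old, new):
    """Compare two export tables given as {path: options_string} dicts.

    Returns (added, removed, changed):
      added   - paths only in `new`, with their options
      removed - sorted list of paths only in `old`
      changed - paths in both whose options differ, with the new options
    """
    old_paths = set(old)
    new_paths = set(new)
    added = {p: new[p] for p in new_paths - old_paths}
    removed = sorted(old_paths - new_paths)
    changed = {p: new[p] for p in old_paths & new_paths if old[p] != new[p]}
    return added, removed, changed


if __name__ == "__main__":
    old = {"/export/a": "-maproot=root", "/export/b": "-ro"}
    new = {"/export/b": "-ro -network 10.0.0.0/8", "/export/c": "-ro"}
    added, removed, changed = diff_exports(old, new)
    print("add:", added)
    print("delete:", removed)
    print("update:", changed)
```

Only the three result sets would need to be pushed into the kernel, rather
than flushing and reloading every export.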
>
>I do have a couple of questions related to this:
>1 - Would your case work if there was an "add these lines to /etc/exports"?
>     (Basically adding entries for file systems, but not trying to delete
>      anything previously exported. I am not a ZFS guy, but I think ZFS just
>      generates another exports file and then gets mountd to export
>      everything again.)
>2 - Are all (or maybe most) of these ZFS file systems exported with the same
>      arguments?
>      - Here I am thinking that a "default-for-all-ZFS-filesystems" line
>         could be put in /etc/exports that would apply to all ZFS file
>         systems not exported by explicit lines in the exports file(s).
>      This would be fairly easy to implement and would avoid trying to handle
>      1000s of entries.
>
>In particular, #2 above could be easily implemented on top of what is already
>there, using a new type of line in /etc/exports and handling that as a special
>case by the NFS server code, when no specific export for the file system to
>the client is found.
Unfortunately, it doesn't sound like #2 above would be useful for Peter.
Although it is easy to implement a single default export for all ZFS file
systems not already exported, it would not be easy to say "export all file
systems below /foo/bar this way", since the kernel code basically doesn't know
the directory structure. It has vnodes for file objects and mount points to
work with. (The kernel exports hang off of the mount points.)
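
A toy model of the fallback described above, assuming exports are attached to
mount points and looked up per mount. The Mount class and effective_export
function are hypothetical names for illustration and do not correspond to the
FreeBSD kernel's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mount:
    """A mounted file system; any export hangs off the mount itself."""
    fstype: str
    export: Optional[str] = None  # explicit export options, if any


def effective_export(mount: Mount, zfs_default: Optional[str]) -> Optional[str]:
    """Return the export options that apply to `mount`.

    An explicit export always wins. Otherwise a ZFS mount falls back to
    the single default; any other file system stays unexported.
    """
    if mount.export is not None:
        return mount.export
    if mount.fstype == "zfs":
        return zfs_default
    return None


if __name__ == "__main__":
    default = "-ro"
    print(effective_export(Mount("zfs"), default))           # uses the default
    print(effective_export(Mount("zfs", "-maproot=root"), default))
    print(effective_export(Mount("ufs"), default))           # not exported
```

Note that the lookup takes only the mount, never a path. In this model there
is nothing to match "below /foo/bar" against, which is the limitation
described above.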
>>(I've written some code that implements item #1 above and it helps quite a
>>bit. Nothing near production quality yet though. I have looked at item #2 a
>>bit too but not done anything about it.)
Btw, this "item #2" (Peter's wish list item) is not the #2 I am referring to
above.
[more good stuff snipped]

rick



