Date:      Mon, 3 May 2021 00:27:42 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        freebsd-stable <freebsd-stable@freebsd.org>
Cc:        Peter Eriksson <pen@lysator.liu.se>, Ryan Moeller <freqlabs@FreeBSD.org>,  Garrett Wollman <wollman@hergotha.csail.mit.edu>, Alan Somers <asomers@freebsd.org>, Juraj Lutter <otis@FreeBSD.org>
Subject:   Re: wanna solve the Linux NFSv4 client puzzle?
Message-ID:  <YQXPR0101MB09685D1285AF7DE46E5D7738DD5C9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQXPR0101MB09682E0EEF2995E3FBC20BB8DD409@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References:  <YQXPR0101MB09682E0EEF2995E3FBC20BB8DD409@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>

Rick Macklem wrote:
>Hi,
>
>I posted recently that enabling delegations should be avoided at this time,
>especially if your FreeBSD NFS server has Linux client mounts...
>
>I thought some of you might be curious why, and I thought it would be
>more fun if you look for yourselves.
>To play the game, you need to download a packet capture:
>fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
>and then load it into wireshark.
>
>192.168.1.5 - FreeBSD server with all recent patches
>192.168.1.6 - FedoraCore 30 (Linux 5.2 kernel) client
>192.168.1.13 - FreeBSD client
>
>A few hints buried in RFC5661:
>- A fore channel is used for normal client->server RPCs and a back channel
>  is used for server->client callback RPCs.
>- After a new TCP connection is created, neither the fore nor back channels
>  are bound to the connection.
>- Binding channel(s) to a connection is done by BindConnectionToSession,
>  but an implicit binding for the fore channel is created when the first RPC
>  request with a Sequence operation in it is sent on the new TCP connection.
>- A server->client callback cannot be done until the back channel is bound
>  via BindConnectionToSession.
>
>Ok, so we are ready...
>- Look at packet #s 3518->3605.
>  - What is going on here?
Ok, so here's my solution...
packet #s 3518, 3520 and 3521 are delegation recalls (CB_RECALL)
for 3 different delegations on three different session slots.
Time: 137.5

Expected response from the Linux client --> 3 replies to the CB_RECALLs.
What does it actually do?
--> Creates a new TCP connection using the same port#. You can see it send
      a FIN (packet# 3523) and a SYN (packet# 3527).
      This means that the client is no longer obliged to reply to the CB_RECALLs
      and the FreeBSD server will probably need to retry them.
      --> It also means that no back channel is bound to the session, so the
             server cannot do callbacks (ie. cannot retry the CB_RECALLs yet).

packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
--> This means the fore channel is implicitly bound to the new TCP
      connection, but no back channel, so the server cannot retry the CB_RECALLs.

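The binding rules at work here can be sketched as a toy model. This is only an illustration under stated assumptions: the class, its method names and the "tcp-1"/"tcp-2" labels are hypothetical, and the first connection is simply assumed to have had its back channel bound explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Toy model of which TCP connections carry each channel of a session."""
    fore: set = field(default_factory=set)
    back: set = field(default_factory=set)

    def sequence(self, conn: str) -> None:
        # An RPC with a Sequence operation implicitly binds the fore channel only.
        self.fore.add(conn)

    def bind_conn_to_session(self, conn: str, want_back: bool = True) -> None:
        # Explicit binding; this is the only way the back channel gets bound.
        self.fore.add(conn)
        if want_back:
            self.back.add(conn)

    def drop(self, conn: str) -> None:
        # The connection was closed, so every binding on it is gone.
        self.fore.discard(conn)
        self.back.discard(conn)

    def can_callback(self) -> bool:
        # The server can only send CB_RECALL if some back channel is bound.
        return bool(self.back)


s = Session()
s.bind_conn_to_session("tcp-1")      # assumed starting state
before = s.can_callback()
s.drop("tcp-1")                      # client sends FIN, then SYN
s.sequence("tcp-2")                  # Setattr with Sequence (packet# 3530)
after_setattr = s.can_callback()
s.bind_conn_to_session("tcp-2")      # packets# 3604/3605
after_bind = s.can_callback()
print(before, after_setattr, after_bind)  # True False True
```

The middle `False` is the window in which the server has recalls to send but no connection to send them on.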
You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
This tells the Linux client to "try again later".
--> It happens because the FreeBSD server cannot perform the Setattr
      until the client returns a delegation.
      --> That requires a CB_RECALL.

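A minimal sketch of the "try again later" loop, assuming a hypothetical `send` callable that returns an NFSv4 status code. The error value 10008 is the one RFC5661 assigns to NFS4ERR_DELAY; the fake server and the retry limit are made up for illustration and are not how the Linux client is implemented.

```python
NFS4_OK = 0
NFS4ERR_DELAY = 10008  # status value defined in RFC5661


def setattr_with_retry(send, max_tries=10):
    """Call send() until it stops returning NFS4ERR_DELAY or tries run out.

    Returns (last_status, attempts_made).
    """
    status = NFS4ERR_DELAY
    attempts = 0
    while attempts < max_tries:
        attempts += 1
        status = send()
        if status != NFS4ERR_DELAY:
            break
        # A real client would wait here before retrying.
    return status, attempts


def make_server(delays_before_ok):
    """Hypothetical server: replies NFS4ERR_DELAY until the delegation is back."""
    calls = {"n": 0}

    def send():
        calls["n"] += 1
        return NFS4_OK if calls["n"] > delays_before_ok else NFS4ERR_DELAY

    return send


print(setattr_with_retry(make_server(3)))  # (0, 4)
```

In the capture the retries cannot succeed until the delegation is returned, and that cannot happen until the server is able to send the CB_RECALL again.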
packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
--> This is the FreeBSD server telling the Linux client that the callback path
       is down (the back channel is not bound to the new TCP connection).
Time: 137.6  (took about 0.1sec for the server to notice that the callback
                     path/back channel is not working).

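Checking for this flag is a single bit test on the status flags word of the Sequence reply. A small sketch, assuming the RFC5661 value 0x00000001 for SEQ4_STATUS_CB_PATH_DOWN (worth confirming against the RFC's XDR) and hypothetical helper names:

```python
SEQ4_STATUS_CB_PATH_DOWN = 0x00000001  # assumed value, from RFC5661's XDR


def callback_path_down(sr_status_flags: int) -> bool:
    """True if the server reports that no usable callback path exists."""
    return bool(sr_status_flags & SEQ4_STATUS_CB_PATH_DOWN)


def next_client_step(sr_status_flags: int) -> str:
    """Hypothetical client reaction to the flags in a Sequence reply."""
    if callback_path_down(sr_status_flags):
        return "BindConnectionToSession"
    return "carry on"


print(next_client_step(0x00000001))  # BindConnectionToSession
print(next_client_step(0x00000000))  # carry on
```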
packet# 3604 Linux client does a BindConnectionToSession to bind the
       back channel.
--> This is not permitted by RFC5661, which requires it to be done on
      the new TCP connection before any implicit binding. The implicit
      binding (fore channel only) was already done by packet# 3530.
packet# 3605 FreeBSD server violates RFC5661 and allows the binding
     to be done, so that CB_RECALLs can again be done.
Time: 152.7

  - How long does this take?
    152.7 - 137.5 = 15.2 seconds

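The same arithmetic, split into its two parts, using only the three timestamps quoted above. The variable names are made up for illustration, and `Decimal` is used so the subtraction is exact.

```python
from decimal import Decimal

# Timestamps (seconds into the capture) taken from the walk-through above.
t_recalls_sent = Decimal("137.5")    # CB_RECALLs, packets 3518/3520/3521
t_path_down_seen = Decimal("137.6")  # SEQ4_STATUS_CB_PATH_DOWN, packet 3582
t_back_bound = Decimal("152.7")      # BindConnectionToSession, packets 3604/3605

detect = t_path_down_seen - t_recalls_sent   # server notices the path is down
client_wait = t_back_bound - t_path_down_seen  # client waits before rebinding
stall = t_back_bound - t_recalls_sent        # total time callbacks are stuck

print(detect, client_wait, stall)  # 0.1 15.1 15.2
```

Nearly all of the stall is the client waiting after it has been told the callback path is down.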
>--> One more hint. Starting with #3605, things are working again.
      --> Things start working again because the FreeBSD server
            cheats and allows the BindConnectionToSession to be done.
            RFC5661 specifies a reply of NFS4ERR_INVAL for this.

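The difference between the strict reply and the "cheat" can be sketched as one decision. This is not the FreeBSD server code: the function and its parameters are hypothetical, and 22 is the RFC5661 value of NFS4ERR_INVAL.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22  # status value defined in RFC5661


def bind_conn_to_session_status(fore_implicitly_bound: bool, strict: bool) -> int:
    """Status for a BindConnectionToSession arriving on a connection.

    strict=True  -> reject a late bind, the reply the text says RFC5661 specifies
    strict=False -> accept it anyway, so callbacks can resume
    """
    if fore_implicitly_bound and strict:
        return NFS4ERR_INVAL
    return NFS4_OK


# Packet# 3604 arrives after the implicit binding done by packet# 3530.
print(bind_conn_to_session_status(True, strict=True))   # 22
print(bind_conn_to_session_status(True, strict=False))  # 0
```

With the strict reply, a 5.2 client in this state would have no way to get its back channel bound on that connection.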
>There are actually 3 other examples of this in the packet capture.
Every time multiple concurrent callbacks are attempted, the Linux
client "bails out" by creating a new TCP connection.
--> This is said to be fixed in Linux 5.3, but I haven't tested a newer
       kernel than 5.2 yet.

>Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
>in Linux 5.7, although I have not yet upgraded my kernel and tested this.
The "do BindConnectionToSession after an implicit binding" is said to be fixed
in Linux 5.7, however the fix is not exactly what I would have expected.
--> I would have expected a BindConnectionToSession to be done right
      away when a new TCP connection is created.
      --> Linux 5.7 and newer is said to still wait (15sec?) to do the
            BindConnectionToSession, but fixes the bug by creating yet
            another new TCP connection just before doing the
            BindConnectionToSession RPC.
      --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
            reply is what triggers the BindConnectionToSession and that is still
            required for 5.7 or newer, but I'll need to test to see how long it
            takes for newer kernels.

The old "cheat", which is still in the released server code (recently removed
by a patch in main, stable/12 and stable/13), implicitly bound both the fore
and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
in unpatched code...
	/*
	 * If this session handles the backchannel, save the nd_xprt for this
	 * RPC, since this is the one being used.
	 * RFC-5661 specifies that the fore channel will be implicitly
	 * bound by a Sequence operation.  However, since some NFSv4.1 clients
	 * erroneously assumed that the back channel would be implicitly
	 * bound as well, do the implicit binding unless a
	 * BindConnectiontoSession has already been done on the session.
	 */

--> This worked fine and avoided most of the above craziness, but...
       (A) It violated RFC5661.
       and
       (B) It broke the Linux client badly when the "nconnect" mount
            option (added fairly recently) was used.
       --> So I felt I had to get rid of it. (The non-conformance with
              RFC5661 was reported by redhat.)

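What the quoted comment describes can be sketched side by side with the patched behaviour. This is a Python paraphrase of the comment, not the code in nfs_nfsdstate.c: the class, field and function names are hypothetical, and a string stands in for nd_xprt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionState:
    handles_backchannel: bool
    explicit_bind_done: bool = False   # set by BindConnectionToSession
    back_xprt: Optional[str] = None    # transport used for callbacks


def on_sequence_old(sess: SessionState, xprt: str) -> None:
    """Old "cheat": a Sequence also binds the back channel, unless an
    explicit BindConnectionToSession was already done on the session."""
    if sess.handles_backchannel and not sess.explicit_bind_done:
        sess.back_xprt = xprt


def on_sequence_patched(sess: SessionState, xprt: str) -> None:
    """Patched: a Sequence leaves the back channel binding alone."""
    return None


old = SessionState(handles_backchannel=True)
on_sequence_old(old, "new-tcp")
new = SessionState(handles_backchannel=True)
on_sequence_patched(new, "new-tcp")
print(old.back_xprt, new.back_xprt)  # new-tcp None
```

In this sketch the old path leaves the server able to retry the CB_RECALLs as soon as the first Sequence arrives on the new connection, which is why it hid the 15 second wait.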
Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
--> Even with a completely patched server, you will still get 15 second pauses
      every time the server attempts multiple concurrent callbacks.

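If you keep a list of your Linux clients' kernel releases (for example the output of `uname -r` on each), the rule of thumb above can be checked mechanically. A rough sketch with hypothetical helper names; it only compares the leading major.minor numbers, so it knows nothing about fixes a distribution may have backported into an older kernel.

```python
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Compare the leading "major.minor" of a kernel release string."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


def delegations_advisable(linux_client_releases) -> bool:
    """True only if every listed Linux client is 5.3 or newer."""
    return all(kernel_at_least(r, 5, 3) for r in linux_client_releases)


# Example release strings, made up for illustration.
print(delegations_advisable(["5.2.18-200.fc30.x86_64", "5.10.0"]))  # False
print(delegations_advisable(["5.3.7", "5.10.0"]))                   # True
```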
>Have fun with it, rick
At least you can now see why I have "fun with it";-) rick

_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"


