Date:      Thu, 8 Mar 2018 22:54:19 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        NAGY Andreas <Andreas.Nagy@frequentis.com>, "'freebsd-stable@freebsd.org'" <freebsd-stable@freebsd.org>
Subject:   Re: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client
Message-ID:  <YQBPR0101MB1042D788A9B3DBF769052244DDDF0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <D890568E1D8DD044AA846C56245166780124AFCFF8@vie196nt>
References:  <c5c624de-42bb-45cf-8cf0-b25be56e5f58@frequentis.com> <YQBPR0101MB1042DEF0825996764CBCA829DDC40@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFB90E@vie196nt> <YQBPR0101MB1042479407CAA253674BBAEBDDDB0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM> <D890568E1D8DD044AA846C56245166780124AFBD21@vie196nt>, <D890568E1D8DD044AA846C56245166780124AFBD91@vie196nt> <YQBPR0101MB104225B6884FEC70A03C61CCDDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFC0E2@vie196nt>, <YQBPR0101MB1042040D2BFB3681E940D271DDDA0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <2feda1e2-16d5-43b5-98eb-dcc71cc67c6f@frequentis.com> <YQBPR0101MB10427C97161C74A5C441D1DCDDD80@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFCABC@vie196nt> <YQBPR0101MB1042B17763E2605A7CE72EF5DDDF0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <D890568E1D8DD044AA846C56245166780124AFCFF8@vie196nt>

NAGY Andreas wrote:
>Thank you, it is really great how fast you adapt the source/make patches
>for this. I saw so many posts where people did not get NFS41 working with
>ESXi and FreeBSD, and now I have it already running with your changes.
>
>I have now compiled the kernel with all 4 patches, and it works now.
Ok. Sounds like we are making progress. It also takes someone willing to
test patches, so thanks for doing so.
>Some problems are still left:
>
>- the "Server returned improper reason for no delegation: 2" warnings are
>still in the vmkernel.log.
>                2018-03-08T11:41:20.290Z cpu0:68011 opID=488969b0)WARNING: NFS41: NFS41ValidateDelegation:608: Server returned improper reason for no delegation: 2
I'll take another look and see if I can guess why it doesn't like "2" as a
reason for not issuing a delegation. (As noted before, I don't think this is
serious, but???)

>- can't delete a folder with the VMware host client datastore browser:
>                2018-03-08T11:34:00.349Z cpu1:67981 opID=f5159ce3)WARNING: NFS41: NFS41FileOpReaddir:4728: Failed to process READDIR result for fh 0x43046e4cb158: Transient file system condition, suggest retry
[more of these snipped]
>                2018-03-08T11:34:00.352Z cpu1:67981 opID=f5159ce3)WARNING: UserFile: 2155: hostd-worker: Directory changing too often to perform readdir operation (11 retries), returning busy
This one is a mystery to me. The client seems to be upset that the directory
is changing (I assume it is looking at either the Change or ModifyTime
attribute). However, if entries are being deleted, the directory is changing
and, as far as I know, the Change and ModifyTime attributes are supposed to
change.
I might try posting on nfsv4@ietf.org in case somebody involved with this
client reads that list and can explain what this is.

>- after a reboot of the FreeBSD machine the ESXi does not restore the NFS
>datastore again with following warning (just disconnecting the links is
>fine)
>                2018-03-08T12:39:44.602Z cpu23:66484)WARNING: NFS41: NFS41_Bug:2361: BUG - Invalid BIND_CONN_TO_SESSION error: NFS4ERR_NOTSUPP
Hmm. Normally after a server reboot, the clients will try some RPC that
starts with a Sequence (the session op) and the server will reply
NFS4ERR_BAD_SESSION. This triggers recovery in the client.
The BindConnectiontoSession operation is done in an RPC by itself, so there
is no Sequence op to trigger NFS4ERR_BAD_SESSION.
Maybe this client expects to see NFS4ERR_BAD_SESSION for the
BindConnectiontoSession.
I'll post a patch that modifies the BindConnectiontoSession to do that.
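
As a rough illustration of the recovery trigger described above, here is a
toy model in Python. It is only a sketch and not the FreeBSD nfsd code:
ToyServer, compound() and the "patched" flag are names made up for this
example, the case of a still-valid session after the patch is an assumption,
and the error constant uses the RFC 5661 spelling NFS4ERR_BADSESSION.

```python
# Error codes as defined in RFC 5661 (NFSv4.1).
NFS4_OK = 0
NFS4ERR_NOTSUPP = 10004
NFS4ERR_BADSESSION = 10052


class ToyServer:
    """Hypothetical model of a server that forgets all sessions on reboot."""

    def __init__(self):
        self.sessions = set()

    def create_session(self, session_id):
        self.sessions.add(session_id)

    def reboot(self):
        # Session state does not survive a server reboot.
        self.sessions.clear()

    def compound(self, ops, session_id, patched=False):
        """Return the status of the first op of a compound RPC."""
        first = ops[0]
        if first == "SEQUENCE":
            # Normal case: a stale session is reported on the Sequence op.
            if session_id in self.sessions:
                return NFS4_OK
            return NFS4ERR_BADSESSION
        if first == "BIND_CONN_TO_SESSION":
            # This op travels alone, so no Sequence op precedes it.
            if patched and session_id not in self.sessions:
                # Proposed behaviour: report the stale session here too.
                return NFS4ERR_BADSESSION
            # Unpatched reply seen in the vmkernel.log; the patched reply
            # for a valid session is an assumption of this sketch.
            return NFS4ERR_NOTSUPP
        return NFS4ERR_NOTSUPP


def client_starts_recovery(status):
    """A client begins session recovery when it sees the bad-session error."""
    return status == NFS4ERR_BADSESSION
```

With this model, a rebooted server answers a lone BIND_CONN_TO_SESSION with
NFS4ERR_NOTSUPP unless patched=True, in which case client_starts_recovery()
returns True, the same as for an RPC that starts with Sequence.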

>Actually I have only made some quick benchmarks with ATTO in a Windows VM
>which has a vmdk on the NFS41 datastore which is mounted over two 1GB links
>in different subnets.
>Read is nearly double that of a single connection and write is just a bit
>faster. I don't know if write speed could be improved; the share is UFS on
>a HW raid controller which has local write speeds of about 500MB/s.
Yes, before I posted that I didn't understand why multiple TCP links would
be faster.
I didn't notice at the time that you mentioned using different subnets and,
as such, links couldn't be trunked below TCP. In your case trunking above
TCP makes sense.

Getting slower write rates than read rates from NFS is normal.
Did you try "sysctl vfs.nfsd.async=1"?
The other thing that might help for UFS is increasing the size of the buffer
cache.
(If this server is mainly an NFS server you could probably make the buffer
 cache greater than half of the machine's ram.
 Note to others, since ZFS doesn't use the buffer cache, the opposite is
 true for ZFS.)

rick



