Date:      Fri, 13 Jan 2017 23:02:22 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Eugene Grosbein <eugen@grosbein.net>, Michael Sinatra <michael+lists@burnttofu.net>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: NFSv4 stuck
Message-ID:  <YTXPR01MB01893E61D99D1B7D1F5D10E4DD780@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20170112232016.GM30374@zxy.spb.ru>
References:  <20170111220818.GD30374@zxy.spb.ru> <YTXPR01MB0189449C0DC06F53E93A3EF9DD660@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> <20170111225922.GE30374@zxy.spb.ru> <bfe09d16-8fdd-81b1-082b-bdf409d57be4@burnttofu.net> <20170111235020.GF30374@zxy.spb.ru> <58771EA6.1020104@grosbein.net> <20170112131504.GG30374@zxy.spb.ru> <YTXPR01MB0189D38D9CA9AA98614EE8B8DD790@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>, <20170112232016.GM30374@zxy.spb.ru>

Slawa Olhovchenkov wrote:
[stuff snipped]
>> >
>> >What data? In my case no data.
You have a file system with no files in it. (It is file data I am referring to.)
Admittedly a read-only file system won't get corrupted, but you will still have
trouble reading files, since NFSv4 requires that they be Open'd before reading.
>> Certain NFSv4 operations (such as open and byte range locking) are strictly ordered using a
>> seqid#. If you fail an RPC in progress (via a soft timeout or intr via a signal) then this seqid gets
>> out of sync between client and server and your mount is badly broken.
>
>Can the mount be dropped? An automatic forced unmount?
>Or can the application be killed manually to allow a manual unmount?
>That would be perfect for me. It would be better than the current behavior.
Well, since recently written data could be lost, I can't see this ever being automatic.
The manual "umount -f <mount-path>" should work, but only if a "umount <mount-path>" has
not already been done. (The latter gets stuck in the kernel, usually after locking the
mounted-on vnode, and that blocks the subsequent "umount -f <mount-path>".)

Someday, I plan on adding a new option to "umount" that goes directly to NFS (via the
nfssvc(2) syscall) to force a dismount, but I haven't gotten around to doing it.

Until then, it's "umount -f" or reboot. And please don't use the "soft,intr" options; they
won't usually help and will break the mount for opening files sooner or later.
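
The seqid problem described above can be illustrated with a toy model. This is only a sketch, not the FreeBSD client or server code: the class names, the "fault" values and the "assume_processed" policy are hypothetical, and the model ignores details of the real protocol such as the server's handling of retransmitted requests. It assumes a server that accepts an open only if it carries the next expected seqid, and a client that must guess whether an abandoned RPC (soft timeout or signal) was processed.

```python
# Toy model of strictly ordered NFSv4 opens. All names are illustrative.

class ToyServer:
    """Accepts an open only if it carries the next expected seqid."""

    def __init__(self):
        self.expected = 1

    def handle_open(self, seqid):
        if seqid != self.expected:
            return "BAD_SEQID"
        self.expected += 1
        return "OK"


class ToyClient:
    """Tracks its own seqid; must guess after an abandoned RPC."""

    def __init__(self, server, assume_processed):
        self.server = server
        self.seqid = 1
        # Policy applied when an RPC is abandoned without a reply.
        self.assume_processed = assume_processed

    def open(self, fault=None):
        # fault is None, "request_lost" or "reply_lost".
        status = None
        if fault != "request_lost":
            status = self.server.handle_open(self.seqid)
            if fault == "reply_lost":
                status = None  # the server acted, but the client never hears
        if status is None:
            # Soft timeout or signal: the client gives up on this RPC
            # and cannot know whether the server advanced its seqid.
            if self.assume_processed:
                self.seqid += 1
            return "ABANDONED"
        if status == "OK":
            self.seqid += 1
        return status


def run(fault, assume_processed):
    """Do three opens, injecting the fault into the second one."""
    server = ToyServer()
    client = ToyClient(server, assume_processed)
    return [client.open(), client.open(fault), client.open()]


if __name__ == "__main__":
    for fault in (None, "request_lost", "reply_lost"):
        for policy in (False, True):
            print(fault, policy, run(fault, policy))
```

In this model neither guess is safe: assuming "not processed" fails when only the reply was lost, and assuming "processed" fails when the request itself was lost. A hard mount avoids the guess by retrying until a reply arrives.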
>
>> I do not believe this caused your hang though, since processes were sleeping on rpccon, which
>> means they were trying to do a new TCP connection to the server unsuccessfully.
>> - Which normally indicates a problem with your underlying network fabric.
>
>The network can fail at any time.
>This should not block the system.
Would you expect a local filesystem to keep working when the JBOD interface to a drive is broken?
For NFS, a broken network means "can't talk to the file system" just like a broken JBOD to a
file system's drive would mean this.

For NFS to work well, you want the most reliable network fabric possible.
Once the network is fixed, it should again be possible for the mount to work.
(The processes in "rpccon" are trying to create a new TCP connection and when they
 succeed the mount point should again start working.)
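
As a rough userland analogy for the "rpccon" behaviour described above (not the kernel RPC code), the sketch below keeps trying to open a TCP connection until one succeeds. The function name, the delay and the optional attempt limit are assumptions made for illustration; a hard NFS mount corresponds to the unlimited case.

```python
import socket
import time


def connect_with_retry(host, port, delay=0.5, max_attempts=None):
    """Retry a TCP connect until it succeeds.

    Returns (socket, attempts) on success, or (None, attempts) if
    max_attempts is set and exhausted. With max_attempts=None the
    loop never gives up, much like a process waiting in "rpccon".
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            sock = socket.create_connection((host, port), timeout=2.0)
            return sock, attempt
        except OSError:
            # Connection refused, unreachable, timed out, etc.
            # Wait and try again once the network may have recovered.
            time.sleep(delay)
    return None, attempt
```

The caller simply blocks until the network path is usable again, which is why fixing the network lets the stuck processes continue without any further action.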

rick


