Date:      Fri, 14 Oct 2005 14:56:43 +0200
From:      Nicolas KOWALSKI <Nicolas.Kowalski@imag.fr>
To:        freebsd-fs@FreeBSD.org
Subject:   Re: FreeBSD NFS server not responding to TCP SYN packets from Linux/SunOS clients
Message-ID:  <vqou0fkw92s.fsf@obiou.imag.fr>
In-Reply-To: <20051014045824.V5343@odysseus.silby.com>
References:  <Pine.LNX.4.64.0510141021290.22064@corbeau.imag.fr> <20051014160128.hev160v52ossokg0@wwws.cs.ait.ac.th> <20051014045824.V5343@odysseus.silby.com>

Mike Silbersack <silby@silby.com> writes:

> On Fri, 14 Oct 2005, on@cs.ait.ac.th wrote:
>
>> Nicolas KOWALSKI wrote:
>>> Our FreeBSD 4.10 NFS server has some problems serving files by NFS
>>> over TCP (no problem with UDP) when the Linux (2.6) or Solaris (5.9)
>>> clients shut down in an unclean manner (power failure). When the
>>> clients try to mount the shares from the server after an unclean
>>> shutdown, the mount process hangs for several minutes (the delay
>>> varies), then succeeds.
>>
>> This is just a wild guess, but NFS mounting always happens at the
>> same stage of the boot, so it may use the same source port number
>> each time, and you could be facing the problem that the old
>> connection is still waiting for termination on the server
>> (close_wait or fin_wait or something)... The source port in the
>> working example is 798 and the source port in the failing example
>> is 799: certainly not random.
>
> The socket on the server would still be in the ESTABLISHED state,
> which is even worse than the close_wait or fin_wait states in this
> case.  The SYN will be accepted if it's greater than the previous
> sequence number, so that's a 50% chance it'll work.

Thanks for this explanation.
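
To make the "50% chance" concrete: TCP sequence numbers are 32-bit and
are compared modulo 2^32, so one number counts as "greater" than another
only when it falls in the half of the sequence space ahead of it. Below
is a minimal Python sketch of that comparison. The function name seq_gt
and the sample values are illustrative assumptions, not taken from the
FreeBSD sources, and the real acceptance logic in the stack involves
more than this single test.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_gt(a: int, b: int) -> bool:
    """Return True if sequence number a is 'after' b in modulo-2^32 order."""
    diff = (a - b) % MOD
    # a is ahead of b when the forward distance is non-zero and less than
    # half of the sequence space.
    return 0 < diff < (1 << 31)


# Hypothetical value: last sequence number seen on the stale connection.
previous = 4_000_000_000

# Candidate initial sequence numbers from the rebooted client.
ahead = (previous + 500_000_000) % MOD   # wraps past 2^32, still "greater"
behind = (previous - 1000) % MOD         # just behind, so not "greater"

print(seq_gt(ahead, previous))
print(seq_gt(behind, previous))
```

A randomly chosen initial sequence number lands in the "greater" half
about half of the time, which is where the 50% figure comes from.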

> Assuming that port reuse is the problem, there is no quick fix for
> this; simply resetting connections whenever a SYN comes in would be
> a really big security problem.

Really? Are Linux and Solaris that insecure because of this behaviour?

> Actually, there may be a quick fix for this specific machine.  If you
> set net.inet.tcp.keepidle to 1 minute (60*whatever kern.hz is),
> that'll cause keepalive packets to be sent every minute to an idle
> connection, rather than every 2 hours.  That would kill the stuck
> connections much quicker.

Unfortunately, this does not work as expected. I just tested with my
workstation (Linux 2.6), with NFS filesystems mounted over TCP; when
the station rebooted abruptly, mounting the same NFS filesystems hung
for more than 1 minute (15 minutes just now). During this hang, netstat
on the server showed the nfsd connection to my workstation still in
the ESTABLISHED state.
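
For reference, the keepidle value suggested in the quoted message
(60 times kern.hz) can be worked out as in the sketch below. It assumes,
as that message does, that the sysctl is expressed in clock ticks, and
it uses kern.hz = 100 purely as an example value; the real one should be
read with "sysctl kern.hz", and the unit should be checked on the
release in use. The helper name keepidle_ticks is hypothetical.

```python
def keepidle_ticks(idle_seconds: int, hz: int) -> int:
    """Convert an idle time in seconds to clock ticks (seconds * kern.hz)."""
    return idle_seconds * hz


hz = 100  # assumed example value; read the real one with `sysctl kern.hz`
value = keepidle_ticks(60, hz)

# Print the setting in name=value form.
print(f"net.inet.tcp.keepidle={value}")
```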

Any other tip?

Many thanks in advance,
-- 
Nicolas


