Date:      Fri, 14 Oct 2005 11:05:53 -0500
From:      Eric Anderson <anderson@centtech.com>
To:        Nicolas KOWALSKI <Nicolas.Kowalski@imag.fr>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: FreeBSD NFS server not responding to TCP SYN packets from Linux/SunOS clients
Message-ID:  <434FD761.3050506@centtech.com>
In-Reply-To: <vqou0fkw92s.fsf@obiou.imag.fr>
References:  <Pine.LNX.4.64.0510141021290.22064@corbeau.imag.fr>	<20051014160128.hev160v52ossokg0@wwws.cs.ait.ac.th>	<20051014045824.V5343@odysseus.silby.com> <vqou0fkw92s.fsf@obiou.imag.fr>

Nicolas KOWALSKI wrote:
> Mike Silbersack <silby@silby.com> writes:
> 
> 
>>On Fri, 14 Oct 2005, on@cs.ait.ac.th wrote:
>>
>>
>>>Nicolas KOWALSKI wrote:
>>>
>>>>Our FreeBSD 4.10 NFS server has some problems serving files by NFS
>>>>on TCP (no problem with UDP) when the Linux (2.6) or Solaris (5.9)
>>>>clients shut down in an unclean manner (power failure). When the
>>>>clients try to mount the shares from the server after an unclean
>>>>shutdown, the mount process hangs for several minutes (the delay
>>>>varies), then succeeds.
>>>
>>>That is just a wild guess, but NFS mounting would happen always at
>>>the same stage of the boot, so maybe with the same source port
>>>number and you could be facing the problem that the connection is
>>>waiting for termination on the server (close_wait or fin_wait or
>>>something)... The source port in the working example is 798 and the
>>>source port in the failing example is 799, certainly not random.
>>
>>The socket on the server would still be in the ESTABLISHED state,
>>which is even worse than the close_wait or fin_wait states in this
>>case.  The SYN will be accepted if it's greater than the previous
>>sequence number, so that's a 50% chance it'll work.
> 
> 
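
For what it's worth, here is a rough sketch of where the "50%" figure
comes from. It assumes the server compares the new SYN against the old
connection's sequence number with the usual signed 32-bit difference,
and that the rebooted client picks its initial sequence number uniformly
at random. The function names are made up for illustration; this is not
the kernel code.

```python
import random

MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_gt(a, b):
    """True if a is 'after' b modulo 2**32, like (int)(a - b) > 0 in C."""
    diff = (a - b) % MOD
    return 0 < diff < (1 << 31)


def fraction_accepted(prev_seq, trials=100_000, seed=0):
    """Estimate how often a random initial sequence number is 'greater'
    than prev_seq (assumption: the client's ISN is uniformly random)."""
    rng = random.Random(seed)
    hits = sum(seq_gt(rng.randrange(MOD), prev_seq) for _ in range(trials))
    return hits / trials
```

Half of the 32-bit space counts as "after" any given sequence number, so
fraction_accepted() comes out close to 0.5 whatever prev_seq is.
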
> Thanks for this explanation.
> 
> 
>>Assuming that port reuse is the problem, there is no quick fix for
>>this; just resetting connections when a SYN comes in would be a
>>really big security problem.
> 
> 
> Really? Are Linux and Solaris that insecure because of this behaviour?
> 
> 
>>Actually, there may be a quick fix for this specific machine.  If you
>>set net.inet.tcp.keepidle to 1 minute (60*whatever kern.hz is),
>>that'll cause keepalive packets to be sent every minute to an idle
>>connection, rather than every 2 hours.  That would kill the stuck
>>connections much quicker.
> 
> 
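
The arithmetic behind "60*whatever kern.hz is", as a small sketch. It
follows the assumption made above that the sysctl is expressed in clock
ticks; tcp(4) on some FreeBSD releases documents it in milliseconds, so
check the units on your release before setting anything. The helper name
and the hz=100 example are mine, not taken from the server in question.

```python
def keepidle_ticks(idle_seconds, hz):
    """Value for net.inet.tcp.keepidle, ASSUMING the sysctl counts ticks.

    hz is whatever 'sysctl kern.hz' reports on the server.
    """
    return idle_seconds * hz


# Example with an assumed kern.hz of 100:
one_minute = keepidle_ticks(60, hz=100)          # 6000
two_hours = keepidle_ticks(2 * 60 * 60, hz=100)  # 720000
```

If the units on your release turn out to be milliseconds, the one-minute
value would be 60000 instead.
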
> Unfortunately, this does not work as expected. I just tested with my
> workstation (Linux 2.6), with NFS filesystems mounted with TCP; when
> the station rebooted abruptly, mounting the same NFS filesystems hung
> for more than 1 minute (15 minutes just now). During this hang, I saw on
> the server, using netstat, the nfsd process related to my workstation
> in ESTABLISHED state.
> 
> Any other tip?
> 
> Many Thanks in advance,


Man fixmount?

      -A      Issues a command to the remote mountd declaring that all of its
              file systems have been unmounted.  This should be used with
              caution, as it removes all remote mount entries pertaining to
              the local system, whether or not any file systems are still
              mounted locally.



-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------


