From owner-freebsd-stable  Wed Jan 16 13:35:21 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from emmi.physik.TU-Berlin.DE (emmi.physik.TU-Berlin.DE [130.149.160.103])
	by hub.freebsd.org (Postfix) with ESMTP id 3706B37B416
	for <freebsd-stable@freebsd.org>; Wed, 16 Jan 2002 13:35:13 -0800 (PST)
Received: (from jschlesn@localhost)
	by emmi.physik.TU-Berlin.DE (8.11.6/8.11.6) id g0GLYq375192;
	Wed, 16 Jan 2002 22:34:52 +0100 (CET)
	(envelope-from jschlesn)
Date: Wed, 16 Jan 2002 22:34:52 +0100
From: Jan Schlesner <jschlesn@physik.TU-Berlin.DE>
To: Steve Shorter <steve@nomad.tor.lets.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: "server not responding" / "is alive again" NFS tunables
Message-ID: <20020116223452.A74841@physik.TU-Berlin.DE>
References: <20020116101212.A610@nomad.lets.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20020116101212.A610@nomad.lets.net>; from steve@nomad.tor.lets.net on Wed, Jan 16, 2002 at 10:12:12AM -0500
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-stable.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-stable>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-stable>
X-Loop: FreeBSD.ORG

Hi,

Some weeks ago, the same problem was posted to this list. Here is
the answer from Ian Dowse:
--
These are a side-effect of the operation of the NFS dynamic retransmit
timeout code. The NFS client measures the request response time for
various types of operations and it sets a timeout based on the mean
and deviation of observed times.

The time taken by the server to perform some operations can vary
wildly though, so occasionally when a large number of operations
complete with very little delay, the response time estimate and
hence the timeout become very small. Then when one request is
unusually slow to complete (such as when the disk on the server is
busy), the client thinks that the server isn't responding and prints
those warnings. A fraction of a second later the request completes
and the client prints an 'is alive again' message.

On non-soft mounts these messages are completely harmless because
the client will just wait for the server to eventually reply. On
soft mounts, the effect can cause problems because applications
occasionally see an EINTR error.

The dynamic retransmit timeout code can be disabled with the `-d'
flag to mount_nfs; this is often recommended for fast networks that
see very little packet loss.
--
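
To make the effect Ian describes concrete, here is a small simulation.
It is only a sketch, not the FreeBSD kernel code: the class name, the
smoothing gains (1/8 and 1/4), the "mean + 4 * deviation" rule and all
the sample times are my own assumptions, borrowed from the way TCP
usually estimates its retransmit timer. It shows how a long run of fast
replies shrinks the timeout until one slow reply looks like a dead
server.

```python
class RttEstimator:
    """Smoothed mean/deviation estimator (illustrative, TCP-style).

    The gains and the 4x deviation multiplier are assumptions made for
    this sketch; they are not taken from the NFS client source.
    """

    def __init__(self, initial_rtt):
        self.srtt = initial_rtt        # smoothed mean response time
        self.dev = initial_rtt / 2     # smoothed mean deviation

    def observe(self, rtt):
        """Fold one measured response time into the estimate."""
        err = rtt - self.srtt
        self.srtt += err / 8
        self.dev += (abs(err) - self.dev) / 4

    def timeout(self):
        """Current retransmit timeout derived from mean and deviation."""
        return self.srtt + 4 * self.dev


def simulate(samples, initial_rtt=0.1):
    """Return (index, rtt, timeout) for every reply slower than the timeout."""
    est = RttEstimator(initial_rtt)
    events = []
    for i, rtt in enumerate(samples):
        limit = est.timeout()
        if rtt > limit:
            # The client would log "not responding" here, and
            # "is alive again" as soon as the reply arrives.
            events.append((i, rtt, limit))
        est.observe(rtt)
    return events


# 200 fast replies (2 ms), one slow reply (500 ms, e.g. a busy disk),
# then fast replies again. All times are made-up values in seconds.
samples = [0.002] * 200 + [0.5] + [0.002] * 5
spurious = simulate(samples)

for index, rtt, limit in spurious:
    print(f"request {index}: took {rtt:.3f}s, timeout was {limit:.4f}s")
```

With a fixed timeout (which is what disabling the dynamic estimator
gives you) the slow reply would simply be waited for, as long as it
arrives within that fixed interval.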

On Wed, Jan 16, 2002 at 10:12:12AM -0500, Steve Shorter wrote:
> 
> 	I have a dedicated NFS server with 16 nfsd's running, connected
> to SCSI raid/softupdates and good network connectivity/switching. Under
> moderate or even sometimes light load the clients (7 of them) log messages
> 
>      nfs server 192.168.10.2:/mnt: not responding
>      nfs server 192.168.10.2:/mnt: is alive again
> 
>  several times per minute. They always have the same timestamp. Performance
> is not noticeably impaired, but I am wondering if this situation will eventually
> become a performance barrier as the system ramps up to full production, if
> the above log messages mean that packets must be delayed or retransmitted.

-- 
[ gpg key: http://wwwnlds.physik.tu-berlin.de/~schlesner/jschlesn.gpg ]
[ key fingerprint: 4236 3497 C4CF 4F3A 274F  B6E2 C4F6 B639 1DF4 CF0A ]
--
It's better to reign in hell,
	than to serve in heaven...
