From: Jan Schlesner
To: Steve Shorter
Cc: freebsd-stable@freebsd.org
Date: Wed, 16 Jan 2002 22:34:52 +0100
Subject: Re: "server not responding" / "is alive again" NFS tunables
Message-ID: <20020116223452.A74841@physik.TU-Berlin.DE>
In-Reply-To: <20020116101212.A610@nomad.lets.net>

Hi,

some weeks ago, the same problem was posted in this newsgroup. Here is
the answer from Ian Dowse:

--
These are a side-effect of the operation of the NFS dynamic retransmit
timeout code. The NFS client measures the request response time for
various types of operations and it sets a timeout based on the mean and
deviation of observed times. The time taken by the server to perform
some operations can vary wildly though, so occasionally when a large
number of operations complete with very little delay, the response time
estimate and hence the timeout become very small. Then when one request
is unusually slow to complete (such as when the disk on the server is
busy), the client thinks that the server isn't responding and prints
those warnings. A fraction of a second later the request completes and
the client prints an 'is alive again' message.

On non-soft mounts these messages are completely harmless because the
client will just wait for the server to eventually reply. On soft
mounts, the effect can cause problems because applications occasionally
see an EINTR error. The dynamic retransmit timeout code can be disabled
with the `-d' flag to mount_nfs; this is often recommended for fast
networks that see very little packet loss.
--

On Wed, Jan 16, 2002 at 10:12:12AM -0500, Steve Shorter wrote:
>
> I have a dedicated NFS server with 16 nfsd's running, connected
> to SCSI raid/softupdates and good network connectivity/switching. Under
> moderate or even sometimes light load the clients (7 of them) log messages
>
>     nfs server 192.168.10.2:/mnt: not responding
>     nfs server 192.168.10.2:/mnt: is alive again
>
> several times per minute. They always have the same timestamp. Performance
> is not noticeably impaired, but I am wondering if this situation will
> eventually become a performance barrier as the system ramps up to full
> production, if the above log messages mean that packets must be delayed
> or retransmitted.

--
[ gpg key: http://wwwnlds.physik.tu-berlin.de/~schlesner/jschlesn.gpg ]
[ key fingerprint: 4236 3497 C4CF 4F3A 274F  B6E2 C4F6 B639 1DF4 CF0A ]
--
It's better to reign in hell, than to serve in heaven...
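
For anyone who wants to see the effect Ian describes in numbers, here is
a rough Python sketch of a mean-plus-deviation timeout estimator. It is
not the FreeBSD kernel code: the class name, the gains (1/8 and 1/4),
the deviation multiplier of 4 and the sample response times are all
assumptions picked for illustration. It only shows how a long run of
fast replies can shrink the timeout so far that one slow reply looks
like a dead server.

```python
class RetransmitEstimator:
    """Toy mean/deviation timeout estimator (illustrative, not kernel code)."""

    def __init__(self, initial_rtt, gain=0.125, dev_gain=0.25, k=4.0):
        # All constants here are assumed values, not FreeBSD's.
        self.srtt = initial_rtt          # smoothed response time (seconds)
        self.rttvar = initial_rtt / 2.0  # smoothed mean deviation (seconds)
        self.gain = gain
        self.dev_gain = dev_gain
        self.k = k

    def observe(self, rtt):
        """Fold one measured response time into the running estimates."""
        err = rtt - self.srtt
        self.srtt += self.gain * err
        self.rttvar += self.dev_gain * (abs(err) - self.rttvar)

    def timeout(self):
        """Current retransmit timeout: mean plus k times the deviation."""
        return self.srtt + self.k * self.rttvar


def simulate(fast_rtt=0.001, slow_rtt=0.5, n_fast=200):
    """Feed many fast replies, then check how one slow reply is judged."""
    est = RetransmitEstimator(initial_rtt=0.1)
    for _ in range(n_fast):
        est.observe(fast_rtt)

    timeout_before = est.timeout()
    # The client would log "not responding" if the reply outlasts the timeout.
    looks_dead = slow_rtt > timeout_before

    # Once the slow reply arrives ("is alive again"), the estimate widens.
    est.observe(slow_rtt)
    timeout_after = est.timeout()
    return timeout_before, looks_dead, timeout_after


if __name__ == "__main__":
    before, looks_dead, after = simulate()
    print("timeout after fast replies: %.6f s" % before)
    print("slow reply flagged as 'not responding':", looks_dead)
    print("timeout after the slow reply: %.6f s" % after)
```

With the estimator switched off (the `-d' flag mentioned above), the
client would use a fixed timeout instead, so a burst of fast replies
could not shrink it in this way.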