Date:      Thu, 12 Aug 2010 10:35:49 -0700
From:      Mark Morley <mark@islandnet.com>
To:        FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   NFS stalling on 8.1-STABLE
Message-ID:  <20100812175029.76D811065696@hub.freebsd.org>

Hi all,

I have five front-end web servers that all mount their content from the same server via NFS.  If I stress the link on any one of the machines (e.g. copy a large directory with a lot of files to/from the mounted file system), the client will pause.  That is, all processes trying to access that mount will freeze.  The log files fill with hundreds or thousands of "nfs server not responding" / "is alive again" messages.  After 60 seconds it returns to normal, unless the load is still there, in which case it continues to pause.

This has only started happening since I upgraded the client machines to 8.1-STABLE (previously four of them were 8.0 and one was 7.3).  The server is 7.1-RELEASE-p11.  No other changes have taken place in terms of hardware or software or mount options, etc.

All NICs involved are gigabit em cards, and they are on a private network (web access to the boxes is via an external interface).

If I truss a command such as "df", it gets to getfsstat() and pauses there.

Mount options are currently "rw,tcp,nolockd,noatime,nosuid,bg,intr,soft,rsize=32768,wsize=32768" but I've tried all sorts of things and it doesn't seem to make a difference.

Here's sample output from nfsstat -c on one of the boxes (uptime 14 days):

Client Info:
Rpc Counts:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
  75552107   3008653 300569929    253365   2426554   4748471   2035545   3015497
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
    864598     50887      7462     11895   1137933  16160386         0  31593291
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0  22510271         5         0   3569465
Rpc Info:
  TimedOut   Invalid X Replies   Retries  Requests
         0         0         0         0 467516377
Cache Info:
 Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
1461457650  75552057 963440449 300536041  37404178   2359677   9467719   4748471
 BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
  14409992    253365  29508747  16119060  22292421     23233
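
For anyone who wants to compare these counters against their own clients, here is a rough Python sketch that turns the "Cache Info" hit/miss pairs above into hit rates. The numbers are copied from the output above; the helper name hit_rate and the dictionary keys are made up for illustration and are not part of nfsstat.

```python
# Rough sketch: convert the nfsstat -c "Cache Info" counters into hit rates.
# The counter values are copied from the output above; hit_rate() and the
# dictionary keys are illustrative names, not anything nfsstat provides.

def hit_rate(hits, misses):
    """Return hits / (hits + misses), or 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0

# (hits, misses) pairs taken from the Cache Info section.
cache = {
    "attr": (1461457650, 75552057),
    "lookup": (963440449, 300536041),
    "bio_read": (37404178, 2359677),
    "bio_write": (9467719, 4748471),
}

rates = {name: hit_rate(h, m) for name, (h, m) in cache.items()}

for name, rate in rates.items():
    print(f"{name:10s} {rate:6.1%}")

# Rpc Info: TimedOut and Retries counters against the total request count.
timed_out, retries, requests = 0, 0, 467516377
retry_fraction = retries / requests
print(f"timed out: {timed_out}, retry fraction: {retry_fraction:.6f}")
```

Note that the Rpc Info section reports zero timeouts and zero retries even though the logs show the server as not responding, so those two counters do not seem to reflect the stalls.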

Any thoughts?

Mark


