Date:      Tue, 10 May 2005 16:55:23 +0200
From:      Heinrich Rebehn <rebehn@ant.uni-bremen.de>
To:        freebsd-net@freebsd.org
Subject:   nfsrvstats.srvrpc_errs rapidly increasing
Message-ID:  <4280CB5B.1080007@ant.uni-bremen.de>

Hi all,

In order to find the cause of the problems with our Linux NFS clients, I
took a look at 'nfsstat -s' on our FreeBSD server (RELENG_5_3).
I noticed that "Server Ret-Failed" was rapidly increasing. After 1 day
of uptime, it is already at 643936:

#######################################################################
root@antsrv1 [~] # nfsstat -s

Server Info:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
   2501670    234193   1051157     12421    365378    185952     61166     74050
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
     60646     19767       246      1494       354      2265     50548   4465364
     Mknod    Fsstat    Fsinfo  PathConf    Commit
        12       588       141         0    103946
Server Ret-Failed
            643936
Server Faults
             0
Server Cache Stats:
    Inprog      Idem  Non-idem    Misses
         3         5         0    162819
Server Write Gathering:
  WriteOps  WriteRPC   Opsaved
    185952    185952         0
root@antsrv1 [~] # uptime
  4:24PM  up 1 day, 17 mins, 4 users, load averages: 0.02, 0.03, 0.00
######################################################################
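
To put the counter in perspective, here is a minimal sketch in C that turns the
two numbers above into an average rate. The function names are made up for
illustration, and the inputs are simply read off the nfsstat and uptime output
by hand. With 643936 failures in 1 day and 17 minutes, it works out to roughly
7 failed replies per second.

```c
#include <assert.h>

/* Convert an uptime of "D days, H hours, M mins" into seconds.
 * (Hypothetical helper, not part of nfsstat.) */
unsigned long
uptime_to_seconds(unsigned long days, unsigned long hours, unsigned long mins)
{
        return days * 86400UL + hours * 3600UL + mins * 60UL;
}

/* Average number of "Server Ret-Failed" increments per second.
 * (Hypothetical helper; both inputs are copied from the output above.) */
double
failed_per_second(unsigned long ret_failed, unsigned long uptime_seconds)
{
        assert(uptime_seconds > 0);     /* avoid division by zero */
        return (double)ret_failed / (double)uptime_seconds;
}
```

Calling failed_per_second(643936, uptime_to_seconds(1, 0, 17)) gives the
average over the whole uptime only; it says nothing about bursts.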

Looking into nfsstat's source, I found that "nfsrvstats.srvrpc_errs" is
the counter shown. Grepping the kernel sources showed that it is
incremented in /usr/src/sys/nfsserver/nfs_srvsock.c.
It seems to be a catch-all for unexpected RPC errors.
The function that increments it, nfs_rephead(), is called from
nfs_srvcache.c, where rp->rc_status is supplied as the error value.
At this point I am unable to track things any further, as I am not
familiar with the kernel sources.

Question: is there a list of error codes somewhere?

I hacked a log output into nfs_srvsock.c:

--- nfs_srvsock.c       Sat Jul 24 04:07:09 2004
+++ nfs_srvsock.ANT.c   Tue May 10 16:30:52 2005
@@ -213,8 +213,10 @@
         }
         *mbp = mb;
         *bposp = bpos;
-       if (err != 0 && err != NFSERR_RETVOID)
+        if (err != 0 && err != NFSERR_RETVOID){
                 nfsrvstats.srvrpc_errs++;
+                log(LOG_WARNING, "ANT: unknown RPC error %d\n", err);
+        }
         return mreq;
  }

Most errors (>90%) are "2", but I also see 1, 13, 17, 66, and 70.
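
As a first guess at decoding these: the values coincide with NFSv3 status
codes defined in RFC 1813, which use the same numbers as the corresponding
FreeBSD errno values. A minimal sketch in C, assuming that this mapping is
what the server is actually returning; the table covers only the codes seen
in my log, and the name nfs_status_name() is made up for illustration:

```c
#include <stddef.h>

struct nfs_status {
        int         code;       /* value logged by the patch above */
        const char *name;       /* NFSv3 status name from RFC 1813 */
        const char *meaning;
};

/* Only the codes that showed up in the log; not a complete list. */
static const struct nfs_status nfs_status_table[] = {
        {  1, "NFS3ERR_PERM",     "not owner" },
        {  2, "NFS3ERR_NOENT",    "no such file or directory" },
        { 13, "NFS3ERR_ACCES",    "permission denied" },
        { 17, "NFS3ERR_EXIST",    "file exists" },
        { 66, "NFS3ERR_NOTEMPTY", "directory not empty" },
        { 70, "NFS3ERR_STALE",    "stale file handle" },
};

/* Return the status name for a logged code, or "unknown". */
const char *
nfs_status_name(int code)
{
        size_t i;
        size_t n = sizeof(nfs_status_table) / sizeof(nfs_status_table[0]);

        for (i = 0; i < n; i++) {
                if (nfs_status_table[i].code == code)
                        return nfs_status_table[i].name;
        }
        return "unknown";
}
```

If this reading is right, the dominant "2" would just be lookups of names that
do not exist, which may be normal client behaviour rather than a fault.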

Any thoughts on this? We do have annoying problems with Linux clients
(2.6.8) occasionally hanging when mounting from the FreeBSD machine. I
don't know if this is related, but at least it is a place to start.

Thanks for any help,

	Heinrich Rebehn
-- 

Heinrich Rebehn

University of Bremen
Physics / Electrical and Electronics Engineering
- Department of Telecommunications -

Phone : +49/421/218-4664
Fax   :            -3341


