Date:      Tue, 23 May 2006 11:10:41 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Rong-en Fan <grafan@gmail.com>
Cc:        freebsd-stable@freebsd.org, Howard Leadmon <howard@leadmon.net>, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Trouble with NFSd under 6.1-Stable, any ideas?
Message-ID:  <20060523081041.GL54541@deviant.kiev.zoral.com.ua>
In-Reply-To: <6eb82e0605221443m5cc3c93bwaf9126ff2fb59667@mail.gmail.com>
References:  <017301c67784$45377a90$071872cf@Leadmon.local> <20060515024958.GA99002@xor.obsecurity.org> <6eb82e0605221443m5cc3c93bwaf9126ff2fb59667@mail.gmail.com>

On Mon, May 22, 2006 at 05:43:32PM -0400, Rong-en Fan wrote:
> On 5/14/06, Kris Kennaway <kris@obsecurity.org> wrote:
> >On Sun, May 14, 2006 at 02:28:55PM -0400, Howard Leadmon wrote:
> >>
> >>    Hello All,
> >>
> >>  I have been running FBSD a long while, and actually running since the
> >> 5.x releases on the server I am having troubles with.   I basically have
> >> a small network and just use NIS/NFS to link my various FBSD and Solaris
> >> machines together.
> >>
> >>  This has all been running fine up till a few days ago, when all of a
> >> sudden NFS came to a crawl, and CPU usage so high the box appears to
> >> freeze almost.  When I had 6.1-RC running all seemed well, then came the
> >> announcement for the official 6.1 release, so I did the cvs updates, made
> >> world, kernel, and ran mergemaster to get everything up to the 6.1 stable
> >> version.
> >>
> >>  Now after doing this, something is wrong with NFS.   It works, it will
> >> return information and open files, just it's very very slow, and while
> >> performing a request the CPU spike is astounding.  A simple du of my home
> >> directory can take minutes, and machine all but locks up if the request
> >> is done over NFS.  Here is top snip:
> >>
> >>   PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> >>   497 root         1   4    0  1252K   780K -      2  50:42 188.48% nfsd
> >>
> >>
> >>  This is a nice IBM eServer with dual P4-XEON's and a couple GB of RAM on
> >> a disk array, and locally it screams, heck NFS used to scream till I
> >> updated.  I am not really sure what info would be useful in debugging, so
> >> won't post tons of misc junk in this eMail, but if anyone has any ideas
> >> as to how best to figure out and resolve this issue it would sure be
> >> appreciated...
> >
> >Use tcpdump and related tools to find out what traffic is being sent.
> >
> >Also verify that you did not change your system configuration in any
> >way: there have been no changes to NFS since the release, so it is
> >unclear why an update would cause the problem to suddenly occur.
> >
> >Kris
>
> Hi Kris and Howard,
>
> As I posted a few days ago, I have problems similar to Howard's
> (some details in the thread "6.1-RELEASE, em0 high interrupt rate
> and nfsd eats lots of cpu" on stable@). After binary searching
> the source tree, I found that
>
> RELENG_6_1, 2006.04.30.03.57 ok
> RELENG_6_1, 2006.04.30.04.00 bad
>
> The only commit is kern/vfs_lookup.c, an MFC of rev 1.90 and 1.91.
> With 04.30 03.57's source + manually patched vfs_lookup.c rev 1.90,
> the same problem occurs.
>
> Let me restate the problems I'm seeing:
>
> 1. a client (no matter Linux 2.6.16 or FreeBSD 6.1) runs du on
>   an nfs directory
> 2. on server-side, nfsd starts to eat lots of CPU
> 3. the du finishes
> 4. on server-side, nfsd still eats lots of CPU, but there is no
>   nfs traffic. Wait for 5 minutes, you can still see that nfsd is
>   "running" and eats lots of CPU.
>
> On FreeBSD 6.1R client, it uses UDP mount and fstab is like
> "rw,-L,nosuid,bg,nodev". On Linux client, it uses UDP mount and
> fstab is like "defaults,udp,hard,intr,nfsvers=3,rsize=8192,wsize=8192".
> The server's kernel conf is at
>
> http://www.rafan.org/FreeBSD/nfs/KERNEL
>
> Some related configuration files:
>
> /etc/exports
>  /export/dir1 host1 host2...
>  /export/dir2 host1 host2...
>
> /etc/rc.conf
> nfs_server_enable="YES"
> nfs_server_flags="-u -t -n 16"
> mountd_enable="YES"
> mountd_flags="-r -l -n"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
> rpcbind_enable="YES"
>
> /etc/fstab:
> /dev/...  /export/dir1 ufs rw,nosuid,noexec 2 2
> /dev/...  /export/dir2 ufs rw,nosuid,noexec,userquota 2 2
>
> The NFS server is also using amd to mount some backup directories
> from another NFS server. the amd.conf is
>
> [global]
> browsable_dirs = yes
> map_type = file
> mount_type = nfs
> auto_dir = /nfs
> fully_qualified_hosts = no
> log_file = syslog
> nfs_proto = udp
> nfs_allow_insecure_port = no
> nfs_vers = 3
> # plock = yes
> selectors_on_default = yes
> restart_mounts = yes
>
> [/backup]
> map_options = type:=direct
> map_name = /etc/amd.direct
>
> /etc/amd.direct:
> /defaults
> opts:=rw,grpid,resvport,vers=3,proto=udp,nosuid,nodev,rsize=8192,wsize=8192
> backup          type:=nfs;rhost:=nfs2;rfs:=/nfs2/${host}
>
>
> If there is anything I can provide to help track this down, please
> let me know. By the way, I tried truss/kdump to see what happens
> when nfsd eats lots of CPU, but in vain. They do not return anything.
>
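For reference, the binary search over checkout timestamps described in
the quoted message can be sketched roughly as below. This is only an
illustration in Python, not the procedure that was actually used: the
is_bad callback is hypothetical and stands for "check out the tree as
of that timestamp, rebuild, rerun the du test, and report whether nfsd
keeps spinning".

```python
from datetime import datetime, timedelta


def bisect_checkout(good, bad, is_bad, resolution=timedelta(minutes=1)):
    """Narrow a (good, bad) pair of checkout timestamps.

    good   -- a timestamp known not to show the problem
    bad    -- a later timestamp known to show the problem
    is_bad -- hypothetical callback: returns True if the tree checked
              out as of the given timestamp shows the problem
    Stops once the two timestamps are within `resolution` of each other.
    """
    while bad - good > resolution:
        mid = good + (bad - good) / 2
        if is_bad(mid):
            bad = mid    # the regression is at or before mid
        else:
            good = mid   # the regression is after mid
    return good, bad
```

Each step halves the window, so even a window of several weeks needs
only on the order of fifteen builds to get down to a single minute.
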
I tried your recipe on 7-CURRENT with a locally exported fs, remounted
over nfs. I did not get the behaviour you described.

Could you please provide a backtrace from ddb for the nfsd that is
eating the CPU? I think it would be helpful to get several
backtraces (i.e., bt <nfsd pid>, cont, bt <nfsd pid>, ...) to
see where it is running.
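
The point of taking several samples is that functions which show up in
most of them are the likely place where the process is spinning. As a
rough sketch only, the samples could be tallied with something like the
Python below. The frame pattern ("... at name+0xoffset") is my
assumption about how the trace lines look and may need adjusting to the
real output; tally_frames is just an illustrative name.

```python
import re
from collections import Counter

# Assumed frame shape: "name(args) at name+0xoffset".
# Adjust the pattern if the real trace output differs.
FRAME = re.compile(r"\bat\s+([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-fA-F]+")


def tally_frames(backtraces):
    """Count in how many of the sampled backtraces each function appears.

    backtraces -- iterable of strings, one string per captured backtrace
    """
    counts = Counter()
    for text in backtraces:
        # set() so that a function counts once per sample, even if it
        # occurs in several frames of the same backtrace
        counts.update(set(FRAME.findall(text)))
    return counts
```

Calling tally_frames(samples).most_common() then lists the functions
seen in the largest number of samples first.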

Also, just in case: does the exported filesystem that shows the problem
have quotas enabled? One line of your fstab has userquota, the other does not.



