Date:      Wed, 10 Dec 2014 07:56:42 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Loïc Blot <loic.blot@unix-experience.fr>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: High Kernel Load with nfsv4
Message-ID:  <1280247055.9141285.1418216202088.JavaMail.root@uoguelph.ca>
In-Reply-To: <1e19554bc0d4eb3e8dab74e2056b5ec4@mail.unix-experience.fr>

Loic Blot wrote:
> Hi Rick,
> I'm trying NFSv3.
> Some jails start up very well, but now I have an issue with lockd
> after a few minutes:
>
> nfs server 10.10.X.8:/jails: lockd not responding
> nfs server 10.10.X.8:/jails lockd is alive again
>
> I looked at the mbufs, but it seems there is no problem there.
>
Well, if you need locks to be visible across multiple clients, then
I'm afraid you are stuck with using NFSv4 and the performance you get
from it. (There is no way to do file handle affinity for NFSv4 because
the read and write ops are buried in the compound RPC and not easily
recognized.)

If the locks don't need to be visible across multiple clients, I'd
suggest trying the "nolockd" option with nfsv3.
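
For illustration, an NFSv3 mount with client-local locking might look
roughly like this (a sketch only; the export path is taken from the log
messages above, and the /jails mount point is an assumption):

    # Locks taken through this mount stay local to this client and are
    # not visible to other NFS clients.
    umount /jails
    mount -t nfs -o nfsv3,nolockd 10.10.X.8:/jails /jails

With "nolockd", file locking on this mount is handled locally on the
client, so the "lockd not responding" messages should stop, but only if
no other client needs to see those locks.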

> Here is my rc.conf on server:
>
> nfs_server_enable="YES"
> nfsv4_server_enable="YES"
> nfsuserd_enable="YES"
> nfsd_server_flags="-u -t -n 256"
> mountd_enable="YES"
> mountd_flags="-r"
> nfsuserd_flags="-usertimeout 0 -force 20"
> rpcbind_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
>
> Here is the client:
>
> nfsuserd_enable="YES"
> nfsuserd_flags="-usertimeout 0 -force 20"
> nfscbd_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
>
> Do you have any ideas?
>
> Regards,
>
> Loïc Blot,
> UNIX Systems, Network and Security Engineer
> http://www.unix-experience.fr
>
> On 9 December 2014 at 04:31, "Rick Macklem" <rmacklem@uoguelph.ca> wrote:
> > Loic Blot wrote:
> >
> >> Hi Rick,
> >>
> >> I waited 3 hours (no lag at jail launch) and then ran: sysrc
> >> memcached_flags="-v -m 512"
> >> The command was very, very slow...
> >>
> >> Here is a dd over NFS:
> >>
> >> 601062912 bytes transferred in 21.060679 secs (28539579 bytes/sec)
> >
> > Can you try the same read using an NFSv3 mount?
> > (If it runs much faster, you have probably been bitten by the ZFS
> > "sequential vs random" read heuristic, which I've been told thinks
> > NFS is doing "random" reads without file handle affinity. File
> > handle affinity is very hard to do for NFSv4, so it isn't done.)
> >
I was actually suggesting that you try the "dd" over nfsv3 to see how
the performance compares with nfsv4. If you do that, please post both
results so they can be compared.
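
Something like the following would give a rough comparison (a sketch
only; the mount points and the test file name are assumptions, not
taken from your setup):

    # Mount the same export once over NFSv3 and once over NFSv4.
    mkdir -p /mnt/v3 /mnt/v4
    mount -t nfs -o nfsv3 10.10.X.8:/jails /mnt/v3
    mount -t nfs -o nfsv4 10.10.X.8:/jails /mnt/v4

    # Read the same large file through each mount.
    dd if=/mnt/v3/somefile of=/dev/null bs=1m
    dd if=/mnt/v4/somefile of=/dev/null bs=1m

Using a file larger than the client's memory (or unmounting and
remounting between runs) keeps the second read from being satisfied
from the client's cache.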

Someday I would like to try to get ZFS's sequential vs. random read
heuristic modified, and any info on what difference that makes to NFS
performance would be useful.

rick

> > rick
> >
> >> This is quite slow...
> >>
> >> You can find some nfsstat output below (the command hasn't finished yet)
> >>
> >> nfsstat -c -w 1
> >>
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 0 0 0 0 0 0 0 0
> >> 4 0 0 0 0 0 16 0
> >> 2 0 0 0 0 0 17 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 4 0 0 0 0 4 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 4 0 0 0 0 0 3 0
> >> 0 0 0 0 0 0 3 0
> >> 37 10 0 8 0 0 14 1
> >> 18 16 0 4 1 2 4 0
> >> 78 91 0 82 6 12 30 0
> >> 19 18 0 2 2 4 2 0
> >> 0 0 0 0 2 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 1 0 0 0 0 1 0
> >> 4 6 0 0 6 0 3 0
> >> 2 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 1 0 0 0 0 0 0 0
> >> 0 0 0 0 1 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 6 108 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 98 54 0 86 11 0 25 0
> >> 36 24 0 39 25 0 10 1
> >> 67 8 0 63 63 0 41 0
> >> 34 0 0 35 34 0 0 0
> >> 75 0 0 75 77 0 0 0
> >> 34 0 0 35 35 0 0 0
> >> 75 0 0 74 76 0 0 0
> >> 33 0 0 34 33 0 0 0
> >> 0 0 0 0 5 0 0 0
> >> 0 0 0 0 0 0 6 0
> >> 11 0 0 0 0 0 11 0
> >> 0 0 0 0 0 0 0 0
> >> 0 17 0 0 0 0 1 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 4 5 0 0 0 0 12 0
> >> 2 0 0 0 0 0 26 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 4 0 0 0 0 4 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 4 0 0 0 0 0 2 0
> >> 2 0 0 0 0 0 24 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 4 0 0 0 0 0 7 0
> >> 2 1 0 0 0 0 1 0
> >> 0 0 0 0 2 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 6 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 4 6 0 0 0 0 3 0
> >> 0 0 0 0 0 0 0 0
> >> 2 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 4 71 0 0 0 0 0 0
> >> 0 1 0 0 0 0 0 0
> >> 2 36 0 0 0 0 1 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 1 0 0 0 0 0 1 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 79 6 0 79 79 0 2 0
> >> 25 0 0 25 26 0 6 0
> >> 43 18 0 39 46 0 23 0
> >> 36 0 0 36 36 0 31 0
> >> 68 1 0 66 68 0 0 0
> >> GtAttr Lookup Rdlink Read Write Rename Access Rddir
> >> 36 0 0 36 36 0 0 0
> >> 48 0 0 48 49 0 0 0
> >> 20 0 0 20 20 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 3 14 0 1 0 0 11 0
> >> 0 0 0 0 0 0 0 0
> >> 0 0 0 0 0 0 0 0
> >> 0 4 0 0 0 0 4 0
> >> 0 0 0 0 0 0 0 0
> >> 4 22 0 0 0 0 16 0
> >> 2 0 0 0 0 0 23 0
> >>
> >> Regards,
> >>
> >> Loïc Blot,
> >> UNIX Systems, Network and Security Engineer
> >> http://www.unix-experience.fr
> >>
> >> On 8 December 2014 at 09:36, "Loïc Blot" <loic.blot@unix-experience.fr> wrote:
> >>> Hi Rick,
> >>> I stopped the jails this weekend and started them this morning;
> >>> I'll give you some stats this week.
> >>>
> >>> Here is my nfsstat -m output (with your rsize/wsize tweaks)
> >>>
> >>> nfsv4,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=1,wcommitsize=773136,timeout=120,retrans=2147483647
> >>>
> >>> On the server side my disks are behind a RAID controller which
> >>> presents a 512b-sector volume, and write performance is quite
> >>> decent (dd if=/dev/zero of=/jails/test.dd bs=4096
> >>> count=100000000 => 450MBps)
> >>>
> >>> Regards,
> >>>
> >>> Loïc Blot,
> >>> UNIX Systems, Network and Security Engineer
> >>> http://www.unix-experience.fr
> >>>
> >>> On 5 December 2014 at 15:14, "Rick Macklem" <rmacklem@uoguelph.ca>
> >>> wrote:
> >>>
> >>>> Loic Blot wrote:
> >>>>
> >>>>> Hi,
> >>>>> I'm trying to create a virtualisation environment based on
> >>>>> jails. Those jails are stored under a big ZFS pool on a FreeBSD
> >>>>> 9.3 host which exports an NFSv4 volume. This NFSv4 volume is
> >>>>> mounted on a big hypervisor (2 Xeon E5v3 CPUs, 128GB memory and
> >>>>> 8 network ports, but only 1 is used at this time).
> >>>>>
> >>>>> The problem is simple: my hypervisor runs 6 jails (using roughly
> >>>>> 1% CPU, 10GB RAM and less than 1MB/s of bandwidth) and works
> >>>>> fine at first, but the system slows down and after 2-3 days
> >>>>> becomes unusable. When I look at top I see 80-100% system CPU
> >>>>> and commands are very, very slow. Many processes are tagged
> >>>>> with nfs_cl*.
> >>>>
> >>>> To be honest, I would expect the slowness to be because of slow
> >>>> response
> >>>> from the NFSv4 server, but if you do:
> >>>> # ps axHl
> >>>> on a client when it is slow and post that, it would give us some
> >>>> more
> >>>> information on where the client side processes are sitting.
> >>>> If you also do something like:
> >>>> # nfsstat -c -w 1
> >>>> and let it run for a while, that should show you how many RPCs
> >>>> are
> >>>> being done and which ones.
> >>>>
> >>>> # nfsstat -m
> >>>> will show you what your mount is actually using.
> >>>> The only mount option I can suggest trying is
> >>>> "rsize=32768,wsize=32768",
> >>>> since some network environments have difficulties with 64K.
> >>>>
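
For example, the client-side information asked for above could be
captured while the system is slow with something like this (a sketch;
the output file names are arbitrary):

    # Thread states of all processes, to see where they are sleeping.
    ps axHl > /tmp/ps-axHl.out
    # The options the NFS mount is actually using.
    nfsstat -m > /tmp/nfsstat-m.out
    # Per-second RPC counts; let it run for a while, then interrupt it.
    nfsstat -c -w 1 | tee /tmp/nfsstat-c.out

The smaller I/O sizes would then be requested by adding
"rsize=32768,wsize=32768" to the mount options (or to the options field
of the corresponding /etc/fstab entry).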
> >>>> There are a few things you can try on the NFSv4 server side, if
> >>>> it
> >>>> appears
> >>>> that the clients are generating a large RPC load.
> >>>> - disabling the DRC cache for TCP by setting vfs.nfsd.cachetcp=0
> >>>> - If the server is seeing a large write RPC load, then
> >>>> "sync=disabled"
> >>>> might help, although it does run a risk of data loss when the
> >>>> server
> >>>> crashes.
> >>>> Then there are a couple of other ZFS related things (I'm not a
> >>>> ZFS
> >>>> guy,
> >>>> but these have shown up on the mailing lists).
> >>>> - make sure your volumes are 4K aligned and ashift=12 (in case a
> >>>> drive
> >>>> that uses 4K sectors is pretending to be 512-byte sectored)
> >>>> - never run over 70-80% full if write performance is an issue
> >>>> - use a zil on an SSD with good write performance
> >>>>
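
For illustration, those server-side knobs map to commands roughly like
the following (a sketch; "jails" is assumed to be the pool name and the
log device name is hypothetical):

    # Disable the NFS server's duplicate request cache for TCP.
    sysctl vfs.nfsd.cachetcp=0

    # Check the pool's ashift (should report 12 for 4K-sector drives).
    zdb -C jails | grep ashift

    # Trade write durability for latency; data may be lost on a crash.
    zfs set sync=disabled jails

    # Add an SSD-backed separate log (ZIL) device to the pool.
    zpool add jails log gpt/ssd-slog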
> >>>> The only NFSv4 thing I can tell you is that it is known that
> >>>> ZFS's
> >>>> algorithm for determining sequential vs random I/O fails for
> >>>> NFSv4
> >>>> during writing and this can be a performance hit. The only
> >>>> workaround
> >>>> is to use NFSv3 mounts, since file handle affinity apparently
> >>>> fixes
> >>>> the problem and this is only done for NFSv3.
> >>>>=20
> >>>> rick
> >>>>
> >>>>> I saw that there are TSO issues with igb, so I tried disabling
> >>>>> it with sysctl, but that did not solve the problem.
> >>>>>
> >>>>> Does anyone have any ideas? I can give you more information if
> >>>>> you need it.
> >>>>>
> >>>>> Thanks in advance.
> >>>>> Regards,
> >>>>>
> >>>>> Loïc Blot,
> >>>>> UNIX Systems, Network and Security Engineer
> >>>>> http://www.unix-experience.fr
> >>>>> _______________________________________________
> >>>>> freebsd-fs@freebsd.org mailing list
> >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >>>>> To unsubscribe, send any mail to
> >>>>> "freebsd-fs-unsubscribe@freebsd.org"
> >>>
> >>> _______________________________________________
> >>> freebsd-fs@freebsd.org mailing list
> >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >>> To unsubscribe, send any mail to
> >>> "freebsd-fs-unsubscribe@freebsd.org"
>


