Date: Wed, 20 Jul 2011 15:13:46 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Clinton Adams <clinton.adams@gmail.com>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel
Message-ID: <1487604530.809805.1311189226746.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <CAEuopLZMEvm56s3N5MgN+7mGCdoP_RkZa3R2zd5QG1dbLtVqaA@mail.gmail.com>
Clinton Adams wrote:
> On Wed, Jul 20, 2011 at 3:29 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> > Clinton Adams wrote:
> >> On Wed, Jul 20, 2011 at 1:09 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> >> > Please try the patch, which is at:
> >> >   http://people.freebsd.org/~rmacklem/noopen.patch
> >> > (This patch is against the file in -current, so patch may not like it,
> >> > but it should be easy to do by hand, if patch fails.)
> >> >
> >> > Again, good luck with it and please let me know how it goes, rick
> >>
> >> Thanks for your help with this; trying the patches now. Tests with one
> >> client look good so far: values for OpenOwner and CacheSize are more in
> >> line. We'll test with more clients later today. We were hitting the
> >> nfsrc_floodlevel with just three clients before, all using NFSv4-mounted
> >> home and other directories. Clients are running Ubuntu 10.04.2 LTS.
> >> Usage has been general desktop usage, nothing unusual that we could see.
> >>
> >> Relevant snippet of /proc/mounts on the client (rsize,wsize are being
> >> automatically negotiated, not specified in the automount options):
> >>
> >> pez.votesmart.org:/public /export/public nfs4
> >> rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=192.168.255.112,minorversion=0,addr=192.168.255.25 0 0
> >> pez.votesmart.org:/home/clinton /home/clinton nfs4
> >> rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=192.168.255.112,minorversion=0,addr=192.168.255.25 0 0
> >>
> >> nfsstat -e -s, with patches, after some stress testing:
> >>
> >> Server Info:
> >>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>     95334         1     28004        50    297125         2         0         0
> >>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>         0         0         0         0         0      1242         0      1444
> >>     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
> >>         0         0         0         0         2         0         4         4
> >>      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
> >>    176735         0         0     21175         0         0     49171         0
> >>     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
> >>         0         0     21184         0         0    549853         0        17
> >>     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
> >>         0     21186    176735         0         0         0
> >> Server:
> >> Retfailed    Faults   Clients
> >>         0         0         1
> >> OpenOwner     Opens LockOwner     Locks    Delegs
> >>       291         2         0         0         0
> >> Server Cache Stats:
> >>    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
> >>         0         0         0    549969       291      2827
> >>
> > Yes, these stats look reasonable.
> > (and sorry if the mail system I use munged the whitespace)
> >
> >> nfsstat -e -s, prior to patches, general usage:
> >>
> >> Server Info:
> >>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
> >>   2813477     62661    382636      1419    837492   2115422         0     33976
> >>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
> >>     31164      1310         0         0         0     15678        10    307236
> >>     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
> >>         0         0         2         1    144550         0        43        43
> >>      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
> >>   1462595         0       595     11267         0         0    550761    280674
> >>     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
> >>       155    212299    286615         0         0   6651006         0      1234
> >>     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
> >>    256784    320761   1495805         0         0       738
> >> Server:
> >> Retfailed    Faults   Clients
> >>         0         0         3
> >> OpenOwner     Opens LockOwner     Locks    Delegs
> >>         6       178      8012         2         0
> >> Server Cache Stats:
> >>    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
> >>         0         0        96   6876610      8084     13429
> >>
> > Hmm. LockOwners have the same property as OpenOwners in that the
> > server is required to hold onto the last reply in the cache until
> > the Open/Lock Owner is released. Unfortunately, a server can't
> > release a LockOwner until either the client issues a ReleaseLockOwner
> > operation to tell the server that it will no longer use the LockOwner,
> > or the open is closed.
> >
> > These stats suggest that the client tried to do byte-range locking
> > over 8000 times with different LockOwners (I don't know how the Linux
> > client decided to use a different LockOwner?), for file(s) that were
> > still open. (When I test using the Fedora 15 client, I do see
> > ReleaseLockOwner operations, but usually just before a close. I don't
> > know how recently that was added to the Linux client.)
> > ReleaseLockOwner was added just before the RFC was published to try
> > and deal with a situation where the client uses a lot of LockOwners
> > that the server must hold onto until the file is closed.
> >
> > If this is legitimate, all that can be done is increase
> > NFSRVCACHE_FLOODLEVEL and hope that you can find a value large enough
> > that the clients don't bump into it without exhausting mbufs. (I'd
> > increase "kern.ipc.nmbclusters" to something larger than what you set
> > NFSRVCACHE_FLOODLEVEL to.)
> >
> > However, I suspect the 8084 LockOwners is a result of some other
> > problem. Fingers and toes crossed that it was a side effect of the
> > cache SMP bugs fixed by cache.patch. (noopen.patch won't help for
> > this case, because it appears to be lockowners and not openowners
> > that are holding the cached entries, but it won't do any harm,
> > either.)
> >
> > If you see very large LockOwner counts again, with the patched
> > kernel, all I can suggest is doing a packet capture and emailing it
> > to me. "tcpdump -s 0 -w xxx", run on the server for a short enough
> > time that "xxx" isn't huge, might catch some issue (like the client
> > retrying a lock over and over and over again). A packet capture
> > might also show if the Ubuntu client is doing ReleaseLockOwner
> > operations. (Btw, you can look at the trace using wireshark, which
> > knows about NFSv4.)
>
> Running four clients now and the LockOwners are steadily climbing;
> nfsstat consistently reported it as 0 prior to users logging into the
> NFSv4 test systems - my testing via ssh didn't show anything like
> this. The attached tcpdump file is from when I first noticed the jump
> in LockOwners from 0 to ~600. I tried wireshark on this and didn't
> see any releaselockowner operations.
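[Editorial note: the capture-and-inspect procedure suggested above could be run along these lines. The interface name, file paths, and the tshark field name are assumptions, not from the thread; RELEASE_LOCKOWNER is operation 39 in the NFSv4.0 spec (RFC 3530).]

```shell
# Capture NFS traffic on the server (interface name assumed), stopping
# with Ctrl-C before the file grows huge, as suggested above.
tcpdump -s 0 -i em0 -w /tmp/nfsv4.pcap port 2049

# Then look for RELEASE_LOCKOWNER compound ops in the trace.
# 'nfs.opcode' is the Wireshark NFSv4 operation field (name assumed);
# opcode 39 is RELEASE_LOCKOWNER in NFSv4.0.
tshark -r /tmp/nfsv4.pcap -Y 'nfs.opcode == 39'
```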
>
> Server Info:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>   1226807     47083     54617       175   1128558    806036         0       695
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>       606        72         0         0         0      3189         0     13848
>     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>         0         0         0         0      7645         0         9         9
>      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>    246079         0        22     73672         0         0    141287      7076
>     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>        10      6218     89443         0         0   2516897         0      1836
>     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>         0     90421    246804         0         0        47
> Server:
> Retfailed    Faults   Clients
>         0         0         4
> OpenOwner     Opens LockOwner     Locks    Delegs
>         6       242      2481        22         0
> Server Cache Stats:
>    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>         0         0         2   2518251      2502      4772
>
> Thanks again for your help on this

Well, I looked at the packet trace and I'm afraid what I see is pretty
well a worst-case scenario for an NFSv4.0 server.

At packets #261 and #309, the client does a Lock op with a different,
new lock_owner for the same open/file. I see a few opens/closes, but no
close for this open and no ReleaseLockOwner op. I suspect the file (fh
CRC 0x1091fd96) is being kept open (until the user logs out?) and every
now and again a fresh lock_owner is created, followed by a few lock
ops. If you look at the Lock ops at packets #261 and #309, you'll
notice the sequence#s are a56 and a57 respectively. This indicates that
2647 operations like these Locks have been done in the Open.

Bottom line... The server can't throw the lock_owners away until the
client closes the file, and it looks like the # of lock_owners (each
with a cached reply, which is also required by the protocol) is just
going to keep growing. I think you'll either need to figure out a way
to get the file closed (user logging out and then back in, maybe?) and
then increase the flood level enough that the users don't hit it, OR
switch to an NFSv3 mount.

rick
ps: I hope you didn't mind me putting the mailing list back on the cc.
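[Editorial note: the two workarounds described above might look roughly like this. The flood-level and nmbclusters values are illustrative assumptions, not recommendations from the thread; the server name and paths are taken from the /proc/mounts snippet earlier in the thread.]

```shell
# Workaround 1: raise the DRC flood level on the FreeBSD server.
# NFSRVCACHE_FLOODLEVEL is a compile-time constant (sys/fs/nfs/nfsrvcache.h
# in this era), so it requires a kernel rebuild; value below is illustrative:
#
#   #define NFSRVCACHE_FLOODLEVEL 32768
#
# and, per the advice above, keep mbuf clusters above that level via
# /boot/loader.conf:
#
#   kern.ipc.nmbclusters="65536"

# Workaround 2: fall back to NFSv3 on the Linux clients, e.g.:
mount -t nfs -o vers=3,proto=tcp,hard,sec=krb5 \
    pez.votesmart.org:/home/clinton /home/clinton
```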