From owner-freebsd-fs@FreeBSD.ORG Tue Sep 10 09:44:16 2013
Message-ID: <522EE9EB.4010706@fsn.hu>
Date: Tue, 10 Sep 2013 11:44:11 +0200
From: Attila Nagy
To: Rick Macklem
Cc: freebsd-fs
Subject: Re: High CPU usage with newnfs(d) - seems to be a cache issue
In-Reply-To: <1721695444.20803113.1378767479011.JavaMail.root@uoguelph.ca>

Hi,

On 09/10/13 00:57, Rick Macklem wrote:
> Attila Nagy wrote:
>> Hi,
>>
>> I've observed some insane CPU usage on stable/9@r255367.
>> About the machine:
>> CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2400.14-MHz K8-class CPU)
>> real memory = 34359738368 (32768 MB)
>> FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
>> FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 SMT threads
>>
>> It does some NFS serving like this (now running oldnfs) - not quite
>> peak times actually:
>> # nfsstat -w 1 -os
>>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>>     763   7206      1    175     92      0    915   3589
>>     748   7665     10    131     60      0    905   2923
>>     787   9657     23    204     50      0    974   2387
>>     517   9881      9    150     41      0    572   2321
>>     709   8708     71    235     70      0   1220   3271
>>     621   9157      9    254    208      0    928   2563
>>     699   5336     29    271    103      0   1242   3448
>>     656   4291     11    201    209      0   1119   3908
>>     506   3722      0    215    183      0    970   2516
>>     698   1476      1    151     66      0    903   2094
>>     501   2865     11    268    117      0    995   1392
>>     638   6284     46    233     47      0   1096   4847
>>     893   7909     47    175     73      0    870   4070
>>     651   3936     48    255     51      0    955   2514
>>     424   4211     17    223     29      0    745   1458
>>     589   8197     26    199     39      0    918   2983
>>
>> It's being hammered by about 40 machines on multiple connections (it
>> has 35 UFS file systems exported).
>>
>> When running newnfs (admittedly in some stupid way, with -n 32, the
>> profiling was made with this, maybe this causes some lock contention),
>> it occasionally eats 1600% CPU (means: 0 idle).
>> Lowering the thread number doesn't really solve the problem; I've seen
>> -n X*100 CPU usage peaks lately on machines with lower (4-8) -n
>> counts...
>>
>> Doing some profiling with pmc shows that most of the time is spent in
>> nfsrvd_updatecache and nfsrvd_getcache:
>> http://pastebin.com/knyppv4d
>>
>> Switching back to oldnfsd (even with -n 32) gives a stable 50-60% CPU
>> usage (out of the "possible" 1600%) when loaded.
>>
>> I know that there are some changes regarding this cache in the CURRENT
>> code (along with the possibility to set some values with sysctls), but
>> I can't run CURRENT.
>>
>> Any ideas on how to improve newnfsd, so we can continue serving NFS in
>> the future days, where I won't be able to switch back to the old one?
>> :)
>>
> Well, I put a 1 month MFC on r254337 (which I believe fixes this), so
> it should be in stable/9 in about a week. Alternately, an uglier (but
> semantically equivalent) patch can be found at:
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch
>
Great, I'm eagerly waiting for this to happen then. :)