From owner-freebsd-bugs@FreeBSD.ORG Tue Mar 30 17:00:20 2010
Date: Tue, 30 Mar 2010 17:00:19 GMT
Message-Id: <201003301700.o2UH0JcD040377@freefall.freebsd.org>
From: Rich
To: freebsd-bugs@FreeBSD.org
Subject: Re: misc/145189: nfsd performs abysmally under load

The following reply was made to PR misc/145189; it has been noted by GNATS.

From: Rich
To: Bruce Evans
Cc: freebsd-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org
Subject: Re: misc/145189: nfsd performs abysmally under load
Date: Tue, 30 Mar 2010 12:29:37 -0400

On Tue, Mar 30, 2010 at 11:50 AM, Bruce Evans wrote:
> Does it work better when limited to 1 thread (nfsd -n 1)?  In at least
> some versions of it (or maybe in nfsiod), multiple threads fight each
> other under load.

It doesn't seem to - nfsd -n 1 still ranges between 1-3 MB/s for files
larger than RAM on the server or client (6 and 4 GB, respectively).

>> For instance, copying a 4GB file over NFSv3 from a ZFS filesystem with
>> the following flags
>> [rw,nosuid,hard,intr,nofsc,tcp,vers=3,rsize=8192,wsize=8192,sloppy,addr=X.X.X.X]
>> (Linux client, the above is the server), I achieve 2 MB/s, fluctuating
>> between 1 and 3. (pv reports 2.23 MB/s avg)
>>
>> Locally, on the server, I achieve 110-140 MB/s (at the end of pv, it
>> reports 123 MB/s avg).
>>
>> I'd assume network latency, but nc with no flags other than the port
>> achieves 30-50 MB/s between server and client.
>>
>> Latency is also abysmal - ls on a randomly chosen homedir full of
>> files, according to time, takes:
>> real    0m15.634s
>> user    0m0.012s
>> sys     0m0.097s
>> while on the local machine:
>> real    0m0.266s
>> user    0m0.007s
>> sys     0m0.000s
>
> It probably is latency.  nfs is very latency-sensitive when there are
> lots of small files.  Transfers of large files shouldn't be affected so
> much.

Sure, and next on my TODO is to look into whether 9.0-CURRENT makes
certain high-latency ZFS operations perform better.
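For the record, the tests behind the numbers above are nothing fancy -
roughly the following, with the mount point and file names standing in
for our real paths:

  # throughput: read the test file through the NFS mount and let pv
  # report the average rate
  pv /mnt/nfs/testfile > /dev/null

  # latency: time a directory listing over the mount
  time ls /mnt/nfs/homedir > /dev/null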
>> The server in question is a 3GHz Core 2 Duo, running FreeBSD RELENG_8.
>> The kernel conf, DTRACE_POLL, is just the stock AMD64 kernel with all
>> of the DTRACE-related options turned on, as well as the option to
>> enable polling in the NIC drivers, since we were wondering if that
>> would improve our performance.
>
> Enabling polling is a good way to destroy latency.  A ping latency of
> more than about 50us causes noticeable loss of performance for nfs, but
> LAN latency is usually a few times higher than that, and polling
> without increasing the clock interrupt frequency to an excessively high
> value gives a latency at least 20 times higher than that.  Also,
> -current with debugging options is so bloated that even localhost has a
> ping latency of about 50us on a Core2 (up from 2us for FreeBSD-4 on an
> AthlonXP).  Anyway, try nfs on localhost to see if reducing the latency
> helps.

Actually, we noticed that polling appeared to make throughput marginally
better while causing occasional bursts of crushing latency - but yes, we
have it compiled into the kernel without enabling it on any actual NICs
at present. :)

But yes, I'm getting 40-90+ MB/s, occasionally slowing to 20-30 MB/s and
averaging 52.7 MB/s after copying a 6.5 GB file, on localhost IPv4 with
no additional mount flags.

{r,w}size=8192 on localhost goes up to 80-100 MB/s, with occasional
sinks to 60 (average after copying another, separate 6.5 GB file: 77.3
MB/s).

Also:
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.015 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.049 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.012 ms
64 bytes from [actual IP]: icmp_seq=0 ttl=64 time=0.019 ms
64 bytes from [actual IP]: icmp_seq=1 ttl=64 time=0.015 ms

>> We tested this with a UFS directory as well, because we were curious
>> whether this was an NFS/ZFS interaction - we still got 1-2 MB/s read
>> speed and horrible latency while achieving fast throughput and latency
>> local to the server, so we're reasonably certain it's not "just" ZFS,
>> if there is indeed any interaction there.
>
> After various tuning and bug fixing (now partly committed by others) I
> get improvements like the following on low-end systems with ffs (I
> don't use zfs):
> - very low end with 100Mbps ethernet: little change; bulk transfers
>   always went at near wire speed (about 10 MB/s)
> - low end with 1Gbps ethernet: bulk transfers up from 20 MB/s to 45
>   MB/s (local ffs 50 MB/s).  buildworld over nfs of a 5.2 world down
>   from 1200 seconds to 800 seconds (this one is very latency-sensitive;
>   it takes about 750 seconds on local ffs).

Is this on 9.0-CURRENT, or RELENG_8, or something else?

>> Read speed of a randomly generated 6500 MB file on UFS over NFSv3 with
>> the same flags as above: 1-3 MB/s, averaging 2.11 MB/s
>> Read speed of the same file, local to the server: consistently between
>> 40-60 MB/s, averaging 61.8 MB/s [it got faster over time - presumably
>> UFS was aggressively caching the file, or something?]
>
> You should use a file size larger than the size of main memory to
> prevent caching, especially for reads.  That is 1GB on my low-end
> systems.

I didn't mention the server's RAM explicitly, but it has 6 GB of real
RAM, and the files used were 6.5-7 GB each in that case (I did use a
4GB file earlier - I've avoided doing that again here).
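For what it's worth, the test files were generated with something like
the following - the output path is a placeholder, the data is random so
compression can't shrink it, and the size is comfortably larger than the
server's 6 GB of RAM so reads can't be served from cache:

  # create a ~7 GB file of random data on the exported filesystem
  dd if=/dev/urandom of=/tank/testfile bs=1m count=7000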
>> Read speed of the same file over NFS again, after the local test:
>> Amusingly, worse (768 KB/s-2.2 MB/s, with random stalls - average
>> reported 270 KB/s(!)).
>
> The random stalls are typical of the problem with the nfsd's getting in
> each other's way, and/or of related problems.  The stalls that I saw
> were very easy to see in real time using "netstat -I <interface> 1" --
> they happened every few seconds and lasted a second or two.  But they
> were never long enough to reduce the throughput by more than a factor
> of 3, so I always got over 19 MB/s.  The throughput was reduced by
> approximately the ratio of stalled time to non-stalled time.

I believe it. I'm seeing at least partially similar behavior here - the
performance drops I mentioned, where the transfer briefly pauses and
then picks up again, show up even in the localhost case, and even with
nfsd -n 1 and nfsiod -n 1.

- Rich
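P.S. For anyone else watching for the same stalls: the netstat
invocation above is just the traditional interval mode, which prints a
line of per-interface traffic counters every second.  The interface
name is whatever your NIC actually is - em0 below is only an example:

  # print packets/bytes in and out on em0 once a second
  netstat -I em0 1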