Date: Tue, 31 May 2011 01:25:03 +1000 (EST)
From: Bruce Evans
To: mdf@FreeBSD.org
Cc: svn-src-head@FreeBSD.org, Pieter de Goeje, svn-src-all@FreeBSD.org,
    src-committers@FreeBSD.org
Subject: Re: svn commit: r221853 - in head/sys: dev/md dev/null sys vm
Message-ID: <20110531004247.C4034@besplex.bde.org>
References: <201105131848.p4DIm1j7079495@svn.freebsd.org>
    <201105282103.43370.pieter@degoeje.nl>

On Sat, 28 May 2011 mdf@FreeBSD.org wrote:

> On Sat, May 28, 2011 at 12:03 PM, Pieter de Goeje wrote:
>> On Friday 13 May 2011 20:48:01 Matthew D Fleming wrote:
>>> Author: mdf
>>> Date: Fri May 13 18:48:00 2011
>>> New Revision: 221853
>>> URL: http://svn.freebsd.org/changeset/base/221853
>>>
>>> Log:
>>>   Usa a globally visible region of zeros for both /dev/zero and the md
>>>   device.  There are likely other kernel uses of "blob of zeros" than can
>>>   be converted.
>>>
>>>   Reviewed by:        alc
>>>   MFC after:  1 week
>>
>> This change seems to reduce /dev/zero performance by 68% as measured by this
>> command: dd if=/dev/zero of=/dev/null bs=64k count=100000.
>>
>> x dd-8-stable
>> + dd-9-current
>> +-------------------------------------------------------------------------+
>> |+                                                                        |

Argh, hard \xa0.

[...binary garbage deleted]

>> This particular measurement was against 8-stable but the results are the same
>> for -current just before this commit. Basically througput drops from
>> ~13GB/sec to 4GB/sec.
>>
>> Hardware is a Phenom II X4 945 with 8GB of 800Mhz DDR2 memory. FreeBSD/amd64
>> is installed. This processor has 6MB of L3 cache.
>>
>> To me it looks like it's not able to cache the zeroes anymore. Is this
>> intentional? I tried to change ZERO_REGION_SIZE back to 64K but that didn't
>> help.
>
> Hmm.  I don't have access to my FreeBSD box over the weekend, but I'll
> run this on my box when I get back to work.
>
> Meanwhile you could try setting ZERO_REGION_SIZE to PAGE_SIZE and I
> think that will restore things to the original performance.

Using /dev/zero always thrashes caches by the amount
<source buffer size> + <target buffer size> (unless the arch uses
nontemporal memory accesses for uiomove, which none do AFAIK).  So a
large source buffer is always just a pessimization.  A large target
buffer size is also a pessimization, but for the target buffer a fairly
large size is needed to amortize the large syscall costs.  In this PR,
the target buffer size is 64K.  ZERO_REGION_SIZE is 64K on i386 and 2M
on amd64.  64K+64K on i386 is good for thrashing the L1 cache.  It will
only have a noticeable impact on a current L2 cache in competition with
other threads.  It is hard to fit everything in the L1 cache even with
non-bloated buffer sizes and 1 thread (16 for the source (I)cache, 0 for
the source (D)cache and 4K for the target cache might work).  On amd64,
2M+2M is good for thrashing most L2 caches.  In this PR, the thrashing
is limited by the target buffer size to about 64K+64K, up from 4K+64K,
and it is marginal whether the extra thrashing from the larger source
buffer makes much difference.

The old zbuf source buffer size of PAGE_SIZE was already too large.
The source buffer size only needs to be large enough to amortize
loop overhead.  1 cache line is enough in most cases.  uiomove()
and copyout() unfortunately don't support copying from register space,
so there must be a source buffer.  This may limit the bandwidth
by a factor of 2 in some cases, since most modern CPUs can execute
either 2 64-bit stores or 1 64-bit store and 1 64-bit load per cycle
if everything is already in the L1 cache.  However, target buffers
for /dev/zero (or any user i/o) probably need to be larger than the
L1 cache to amortize the syscall overhead, so there are usually plenty
of cycles to spare for the unnecessary loads while the stores wait for
caches.

This behaviour is easy to see for regular files too (regular files get
copied out from the buffer cache).

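The point about source buffer size can be illustrated with a userland
sketch.  This is not the kernel code: memcpy() stands in for
uiomove()/copyout(), and the names zsrc, ZSRC_SIZE and
zero_fill_from_source() are hypothetical.  ZSRC_SIZE is set to 64 bytes
on the assumption that this is one cache line.  The loop re-reads the
same small source for every chunk, so the memory touched is roughly
ZSRC_SIZE + len rather than len + len.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's zero region (assumed 1 cache line). */
#define	ZSRC_SIZE	64
static const unsigned char zsrc[ZSRC_SIZE];	/* static storage, so all zeros */

/*
 * Write `len` zero bytes to `dst` by copying from the small source buffer
 * in chunks of at most ZSRC_SIZE bytes.  Returns the number of chunk
 * copies performed, i.e. the loop overhead that the source size amortizes.
 */
static size_t
zero_fill_from_source(void *dst, size_t len)
{
	unsigned char *p = dst;
	size_t ncopies = 0;

	assert(dst != NULL || len == 0);
	while (len > 0) {
		size_t n = len < ZSRC_SIZE ? len : ZSRC_SIZE;

		memcpy(p, zsrc, n);	/* stands in for uiomove()/copyout() */
		p += n;
		len -= n;
		ncopies++;
	}
	return (ncopies);
}
```

With a 64K target this makes 1024 chunk copies.  Whether that loop
overhead is measurable against the cost of the stores depends on the
machine, so the sketch shows the shape of the tradeoff rather than a
tuned value.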
You have limited control over the amount of thrashing by changing the
target buffer size, and can determine cache sizes by looking at
throughputs.

Bruce
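
One way to look at throughput as a function of target buffer size,
besides varying bs= in dd, is a small timing helper along these lines.
It is a sketch under assumptions: timed_read() is a hypothetical name,
and it relies only on POSIX open(), read() and clock_gettime().

```c
#define	_POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Read up to `total` bytes from `path` using a target buffer of `bufsize`
 * bytes, storing the elapsed wall-clock seconds in *secs.  Returns the
 * number of bytes actually read, or -1 if the buffer or file cannot be
 * set up.
 */
static long long
timed_read(const char *path, size_t bufsize, long long total, double *secs)
{
	struct timespec t0, t1;
	long long done = 0;
	char *buf;
	int fd;

	assert(bufsize > 0 && secs != NULL);
	buf = malloc(bufsize);
	if (buf == NULL)
		return (-1);
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		free(buf);
		return (-1);
	}
	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (done < total) {
		long long left = total - done;
		size_t want = left < (long long)bufsize ? (size_t)left : bufsize;
		ssize_t n = read(fd, buf, want);

		if (n <= 0)	/* error or EOF: stop with a short count */
			break;
		done += n;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);
	free(buf);
	*secs = (double)(t1.tv_sec - t0.tv_sec) +
	    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return (done);
}
```

Calling this on /dev/zero with bufsize doubling from 4K up to a few
megabytes and printing bytes/secs for each size should show throughput
changing where source + target stops fitting in a cache level.  The
exact sizes at which that happens depend on the machine.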