Date: Tue, 10 Jan 2012 01:43:28 +1100 (EST)
From: Bruce Evans
To: Bruce Evans
Cc: Marc Olzheim, Garrett Cooper, freebsd-performance@freebsd.org, Dieter BSD
Subject: Re: cmp(1) has a bottleneck, but where?
Message-ID: <20120110013455.D2530@besplex.bde.org>
In-Reply-To: <20120104000111.K6684@besplex.bde.org>

On Wed, 4 Jan 2012, Bruce Evans wrote:

> On Tue, 3 Jan 2012, Marc Olzheim wrote:
>
>> On Tue, Jan 03, 2012 at 12:21:10AM -0800, Garrett Cooper wrote:
>>> The file is 3.0GB in size. Look at all those page faults though!
>>> Thanks!
>>> -Garrett
>>
>> From usr.bin/cmp/c_regular.c:
>>
>> #define MMAP_CHUNK (8*1024*1024)
>> ...
>> for (..) {
>> 	mmap() chunk of size MMAP_CHUNK.
>> 	compare
>> 	munmap()
>> }
>>
>> That 8 MB chunk size sounds like a bad plan to me. I can imagine
>> something needed to be done to compare files larger than X GB on a 32bit
>> system, but 8MB is pretty small...
>
> 8MB is more than large enough.  It works at disk speed in my tests.  cp
> still uses this value.  Old versions of cmp used the bogus value of
> ...
> In my tests, using "-" for one of the files mainly takes lots more user
> time.  It only reduces the real time by 25%.  This is on a core2.  On
> a system with a slow CPU, it is easy for getc() to be much slower than
> the disk.

More careful tests showed serious slowness when the combined file sizes
exceeded the cache size.  cmp takes an enormous amount of CPU (see another
reply), and this seems to be done mostly in series with i/o, so the total
time increases too much.  A smaller mmap() size or not using mmap() at all
might improve parallelism.

Bruce
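
For illustration only, the "not using mmap() at all" alternative could look
roughly like the sketch below: both inputs are read with plain read(2) into
fixed-size buffers and compared with memcmp(3).  This is not the code in
usr.bin/cmp; the names read_full() and compare_fds() and the buffer size
passed by the caller are assumptions made for this example, and whether it
actually overlaps CPU work with i/o better than the mmap() loop would have
to be measured.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes, retrying on short reads
 * and EINTR.  Returns the number of bytes read (less than len only at
 * EOF), or -1 on error.
 */
static ssize_t
read_full(int fd, unsigned char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			break;		/* EOF */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}

/*
 * Hypothetical helper: compare two descriptors chunk by chunk without
 * mmap().  Returns 0 if the streams are identical, 1 if they differ
 * (in content or length), -1 on error.
 */
static int
compare_fds(int fd1, int fd2, size_t bufsize)
{
	unsigned char *b1 = malloc(bufsize);
	unsigned char *b2 = malloc(bufsize);
	int result = -1;

	if (b1 == NULL || b2 == NULL || bufsize == 0)
		goto out;
	for (;;) {
		ssize_t n1 = read_full(fd1, b1, bufsize);
		ssize_t n2 = read_full(fd2, b2, bufsize);

		if (n1 < 0 || n2 < 0)
			break;		/* read error */
		if (n1 != n2 || memcmp(b1, b2, (size_t)n1) != 0) {
			result = 1;	/* different length or content */
			break;
		}
		if (n1 == 0) {
			result = 0;	/* both at EOF, no difference seen */
			break;
		}
	}
out:
	free(b1);
	free(b2);
	return (result);
}
```

A caller would open(2) both files and call, say,
compare_fds(fd1, fd2, 64 * 1024); the 64K figure is only a placeholder, not
a tuned value.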