From owner-freebsd-current@FreeBSD.ORG  Wed Jun 23 06:42:18 2004
From: Mikhail Teterin <mi+kde@aldan.algebra.com>
To: Peter Wemm <peter@wemm.org>
Date: Wed, 23 Jun 2004 02:41:17 -0400
User-Agent: KMail/1.6.2
References: <Pine.BSF.4.21.0406201716191.23541-100000@InterJet.elischer.org>
	<200406220108.31366@aldan> <200406222027.30702.peter@wemm.org>
In-Reply-To: <200406222027.30702.peter@wemm.org>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain;
  charset="koi8-u"
Content-Transfer-Encoding: 7bit
Message-Id: <200406230241.18132@aldan>
X-Mailman-Approved-At: Wed, 23 Jun 2004 11:54:04 +0000
cc: current@freebsd.org
cc: questions@freebsd.org
cc: Mikhail Teterin <Mikhail.Teterin@Murex.com>
cc: freebsd-current@freebsd.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: read vs. mmap (or io vs. page faults)
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>

On Tuesday 22 June 2004 11:27 pm, Peter Wemm wrote:

= mmap is more valuable as a programmer convenience these days. Don't
= make the mistake of assuming it's faster, especially since the cost of
= a copy has gone way down.

Actually, let me back off from agreeing with you here :-) On I/O-bound
machines (such as my laptop), there is no discernible difference in
either the CPU or the elapsed time -- md5-ing a file with mmap or read
is (curiously) slightly faster than just cat-ing it into /dev/null.
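
For reference, a minimal sketch of the two approaches being compared.
It is written in Python rather than the C of a real md5 utility, and
the function names and the 64KB buffer size are assumptions made for
illustration, not details of the program that produced these timings.

```python
import hashlib
import mmap
import os


def md5_read(path, bufsize=64 * 1024):
    """Hash a file by copying it into a user buffer with read()."""
    digest = hashlib.md5()
    # buffering=0 gives a raw file object, so each read() below is
    # a single read(2) call into a fresh buffer.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def md5_mmap(path):
    """Hash a file by mapping it; page faults bring the data in."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return digest.hexdigest()  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            digest.update(m)  # hashes straight out of the mapping
    return digest.hexdigest()
```

Both functions return the same digest for the same file; they differ
only in how the bytes reach the hash function.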

On a dual P2 450MHz, with a single process, mmap always wins the CPU
time and sometimes the elapsed time. Sometimes it wins handsomely:

	mmap: 35.271u 4.004s 1:06.08 59.4%   10+190k 0+0io 4185pf+0w
	read: 32.134u 15.797s 1:58.72 40.3%  408+302k 11228+0io 12pf+0w

or

	mmap: 35.039u 4.558s 1:10.27 56.3%    10+190k 5+0io 5028pf+0w
	read: 29.931u 27.848s 2:07.17 45.4%   10+187k 11219+0io 5pf+0w

Mind you, both processors are Xeons with _2MB of cache each_, so
memory copying should be even cheaper on them than usual. And
yet mmap manages to win...
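
The figures above look like the output of the csh/tcsh built-in
`time` (user, system, elapsed, %CPU, memory, input+output operations,
page faults+swaps); that reading of the columns is an assumption. Note
how the mmap runs show thousands of page faults and almost no io,
while the read runs show about 11000 input operations and almost no
faults. A rough sketch of collecting the same counters around a single
call, where `measure` is a hypothetical helper name:

```python
import resource
import time


def measure(func, *args):
    """Run func(*args) and return its result plus resource deltas."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_SELF)
    stats = {
        "user": after.ru_utime - before.ru_utime,            # "u" column
        "system": after.ru_stime - before.ru_stime,          # "s" column
        "elapsed": elapsed,                                  # m:ss.ss column
        "block_reads": after.ru_inblock - before.ru_inblock,  # N in "N+0io"
        "major_faults": after.ru_majflt - before.ru_majflt,   # N in "Npf+0w"
    }
    return result, stats
```

The counters are per-process, so the comparison is only meaningful if
each variant runs in a fresh process against a file that is not
already cached.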

On a single P2 400MHz (standard 512KB cache) mmap always wins the CPU
time, and, thanks to that, can win the elapsed time on a busy system.
For example, running two of these processes in parallel (on two separate
copies of the same huge file residing on distinct disks) yields (same
1462726660-byte file as in the dual Xeon stats above):

	mmap: 66.989u 7.584s 3:01.76 41.0%    5+238k 90+0io 22456pf+0w
	      65.474u 7.729s 2:38.59 46.1%    5+241k 90+0io 22401pf+0w
	read: 60.724u 42.394s 3:37.01 47.5%   5+241k 22541+0io 0pf+0w
	      61.778u 41.987s 3:35.36 48.1%   5+239k 11256+0io 0pf+0w

That's 182 vs. 215 seconds, or a 15% elapsed-time win for mmap.
Evidently, mmap runs through that "nasty nasty code" faster than read
runs through its own. mmap loses on an idle system, I presume, because
page-faulting is not smart enough to fault ahead as efficiently as
read reads ahead.
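
The arithmetic behind those figures can be checked with a short
script. It assumes that 182 and 215 are the slower of the two mmap
processes and the faster of the two read processes, each rounded to
whole seconds; that pairing is inferred from the numbers, not stated.

```python
def to_seconds(elapsed):
    """Convert an m:ss.ss elapsed-time field to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


# Elapsed times of the two parallel processes on the single P2.
mmap_elapsed = ["3:01.76", "2:38.59"]
read_elapsed = ["3:37.01", "3:35.36"]

# Slower mmap process against the faster read process, whole seconds.
mmap_worst = round(max(to_seconds(t) for t in mmap_elapsed))
read_best = round(min(to_seconds(t) for t in read_elapsed))
saving = (read_best - mmap_worst) / read_best

# CPU time (user + system) per process, from the same runs.
cpu = {
    "mmap": [66.989 + 7.584, 65.474 + 7.729],
    "read": [60.724 + 42.394, 61.778 + 41.987],
}

print(f"elapsed: {mmap_worst}s vs. {read_best}s, {saving:.0%} less for mmap")
for name, totals in cpu.items():
    print(f"{name} CPU seconds: " + ", ".join(f"{t:.3f}" for t in totals))
```

The CPU totals show where the win comes from: read spends slightly
less user time but several times more system time.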

Why am I complaining then? Because I want the "nasty nasty code"
improved so that using mmap is beneficial for the single process too.

Thank you very much! Yours,

	-mi