From owner-freebsd-stable@FreeBSD.ORG  Sat Mar 25 14:20:25 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: stable@freebsd.org
Delivered-To: freebsd-stable@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4D8F816A444;
	Sat, 25 Mar 2006 14:20:25 +0000 (UTC)
	(envelope-from mi+kde@aldan.algebra.com)
Received: from aldan.algebra.com (aldan.algebra.com [216.254.65.224])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B9EDF43D55;
	Sat, 25 Mar 2006 14:20:24 +0000 (GMT)
	(envelope-from mi+kde@aldan.algebra.com)
Received: from aldan.algebra.com (aldan [127.0.0.1])
	by aldan.algebra.com (8.13.6/8.13.6) with ESMTP id k2PEKEpo006283
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 25 Mar 2006 09:20:14 -0500 (EST)
	(envelope-from mi+kde@aldan.algebra.com)
Received: from localhost (localhost [[UNIX: localhost]])
	by aldan.algebra.com (8.13.6/8.13.6/Submit) id k2PEKEtC006282;
	Sat, 25 Mar 2006 09:20:14 -0500 (EST)
	(envelope-from mi+kde@aldan.algebra.com)
From: Mikhail Teterin <mi+kde@aldan.algebra.com>
To: Peter Jeremy <peterjeremy@optushome.com.au>
Date: Sat, 25 Mar 2006 09:20:13 -0500
User-Agent: KMail/1.8.2
References: <200603232352.k2NNqPS8018729@gate.bitblocks.com>
	<200603241518.01027.mi+mx@aldan.algebra.com>
	<20060325103927.GE703@turion.vk2pj.dyndns.org>
In-Reply-To: <20060325103927.GE703@turion.vk2pj.dyndns.org>
X-Face: %UW#n0|w>ydeGt/b@1-.UFP=K^~-:0f#O:D7w<gv/&E-lL7twZCT8B~/PA4|\t$ti+22K">hJ5G_<5143Bb3kOIs9XpX+"V+~$adGP:J|SLieM31VIhqXeLBli"<kcG^EOVihy+z3/UR{6SCQ
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200603250920.14208@aldan>
Cc: alc@freebsd.org, Mikhail Teterin <mi+mx@aldan.algebra.com>,
	stable@freebsd.org
Subject: Re: Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Mar 2006 14:20:25 -0000

On Saturday 25 March 2006 05:39 am, Peter Jeremy wrote:
= On Fri, 2006-Mar-24 15:18:00 -0500, Mikhail Teterin wrote:
= >which there is not with the read. Read also requires fairly large
= >buffers in the user space to be efficient -- *in addition* to the
= >buffers in the kernel. 
= 
= I disagree.  With a filesystem read, the kernel is solely responsible
= for handling physical I/O with an efficient buffer size. The userland
= buffers simply amortise the cost of the system call and copyout
= overheads.

I don't see a disagreement in the above :-) The mmap API can be slightly faster 
than read -- the kernel is still "responsible for handling physical I/O with an 
efficient buffer size". But instead of copying the data out after reading, it 
can read it directly into the process's memory.
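
To make the comparison concrete, here is a minimal sketch of the two access
patterns, summing every byte of a file. The helper names sum_with_read and
sum_with_mmap and the 64 KB buffer size are assumptions made for illustration;
they are not taken from mzip.c or from any program discussed in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sum all bytes using read(2) into a user-space buffer. */
int sum_with_read(const char *path, uint64_t *sum)
{
    unsigned char buf[65536];   /* buffer size is a guess, as with any read loop */
    ssize_t n;
    int fd;

    assert(sum != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    *sum = 0;
    /* Each read() copies data from kernel buffers into buf. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        for (ssize_t i = 0; i < n; i++)
            *sum += buf[i];
    close(fd);
    return n < 0 ? -1 : 0;
}

/* Hypothetical helper: the same sum, reading the file through a mapping. */
int sum_with_mmap(const char *path, uint64_t *sum)
{
    struct stat st;
    unsigned char *p;
    int fd;

    assert(sum != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    *sum = 0;
    if (st.st_size == 0) {      /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;
    /* No user buffer and no explicit copy: pages are faulted in on first touch. */
    for (off_t i = 0; i < st.st_size; i++)
        *sum += p[i];
    munmap(p, (size_t)st.st_size);
    return 0;
}
```

Both functions return 0 on success and -1 on error. Which one is faster on a
given system is exactly the question being argued here; the sketch only shows
where the extra buffer and copy sit in the read version.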

= >I'm also quite certain that fulfilling my "demands" would add quite a
= >bit of complexity to the mmap support in kernel, but hey, that's what the
= > kernel is there for :-)
= 
= Unfortunately, your patches to implement this seem to have become detached
= from your e-mail. :-)

If I manage to *convince* someone that there is a problem to solve, I'll 
consider it a good contribution to the project...

= mmap can lend itself to cleaner implementations because there's no
= need to have a nested loop to read buffers and then process them.  You
= can mmap the entire file and process it.  The downside is that on a
= 32-bit architecture, this limits you to processing files that are
= somewhat less than 2GB.

First, only one of our architectures is 32-bit :-) On 64-bit systems, the 
addressable memory (kind of) matches the maximum file size. Second, even with 
the loop reading/processing chunks at a time, the implementation is cleaner, 
because it does not need to allocate any memory, nor try to guess which 
buffer size to pick for optimal performance, nor align the buffers on pages 
(which grep is doing, for example, rather hairily).
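
The chunk-at-a-time variant can be sketched as below, assuming a hypothetical
helper sum_in_windows that maps one window of the file at a time. The name and
the window parameter are illustrative only. A window size still has to be
chosen, but it bounds address-space use (which also sidesteps the 32-bit limit)
rather than sizing a buffer the program must allocate and align itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sum all bytes of a file, mapping one window at a time. */
int sum_in_windows(const char *path, size_t window, uint64_t *sum)
{
    struct stat st;
    long pagesz = sysconf(_SC_PAGESIZE);
    int fd;

    assert(sum != NULL && window > 0 && pagesz > 0);
    /* Round the window up to whole pages so every mmap() offset is page-aligned. */
    window = (window + (size_t)pagesz - 1) / (size_t)pagesz * (size_t)pagesz;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    *sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        off_t left = st.st_size - off;
        size_t len = left < (off_t)window ? (size_t)left : window;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);

        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)    /* process this window in place */
            *sum += p[i];
        munmap(p, len);                     /* release it before mapping the next */
    }
    close(fd);
    return 0;
}
```

The kernel supplies page-aligned memory for each window, so there is no
malloc() and no manual alignment arithmetic beyond rounding the window size.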

= The downside is that touching an uncached page triggers a trap which may
= not be as efficient as reading a block of data through the filesystem
= interface, and I/O errors are delivered via signals (which may not be as
= easy to handle).

My point exactly. It does seem to be less efficient *at the moment* and I
am trying to have the kernel support for this cleaner method of reading 
*improved*. By convincing someone with a clue to do it, that is... :-)

= >Would you care to look at my program instead? Thanks:
= >
= >	http://aldan.algebra.com/mzip.c

I'm sorry, that should be  http://aldan.algebra.com/~mi/mzip.c -- I checked 
this time :-(

= I tried writing a program that just mmap'd my entire (2GB) test file
= and summed all the longwords in it.

The files I'm dealing with are database dumps -- 10-80GB :-) Maybe that's
what triggers some pessimal case?

Thanks! Yours,

	-mi