From: John Baldwin
To: src-committers@freebsd.org
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org
Date: Mon, 30 Jan 2012 15:12:30 -0500
Subject: Re: svn commit: r230782 - head/sys/kern

On Monday, January 30, 2012 2:35:16 pm John Baldwin wrote:
> Author: jhb
> Date: Mon Jan 30 19:35:15 2012
> New Revision: 230782
> URL: http://svn.freebsd.org/changeset/base/230782
>
> Log:
>   Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such
>   that instead of using direct I/O it allows read-ahead similar to
>   POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the
>   read(2) has completed to purge just-read data.  The write(2) path continues
>   to use direct I/O for POSIX_FADV_NOREUSE for now.  Note that NOREUSE works
>   optimally if an application reads and writes full fs blocks.

Oops, forgot:

Tested by:	jilles

The NOREUSE bits may still need further refinement.  For example, if we
allow something along the lines of
'POSIX_FADV_NOREUSE | POSIX_FADV_SEQUENTIAL', then we could change the
VOP_ADVISE() here to use 0 as the starting offset, which should do a
better job of not leaving data in RAM due to reading partial blocks.

Also, sequentially reading a file on unaligned block offsets with NOREUSE
can currently result in extraneous reads, and we could possibly alleviate
those by changing DONTNEED to only flush wholly-contained blocks rather
than wholly-contained pages from the backing VM object.  However, without
the previous change I suggested, that will exacerbate the problem of
NOREUSE not actually purging any data from RAM.

The problem with the | approach, though, is that it is not portable, so it
is not likely that portable programs like vlc will use it.  HP-UX had an
extended variant of fadvise() that allowed multiple policies to be set on
a range, apparently to handle exactly this case (sequential and noreuse).
The problem seems to be that noreuse is really orthogonal to the other
access-pattern hints (normal vs random vs sequential).

Finally, I've wondered if POSIX_FADV_SEQUENTIAL shouldn't just mandate the
maximum read-ahead and write-clustering rather than using the heuristics.
If we did that, it's not completely clear what the "right" thing to do
would be if an application calls posix_fadvise(POSIX_FADV_SEQUENTIAL)
followed by fcntl(F_READAHEAD) with a different size, especially given
that posix_fadvise() can theoretically apply to only a range of the file,
whereas F_READAHEAD applies globally to the file descriptor.

-- 
John Baldwin
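
For reference, here is a minimal userland sketch of the usage pattern the commit log describes as optimal for NOREUSE: hint once, then read the file front to back in whole blocks. It is only an illustration, not code from the commit. The helper name read_once_noreuse() is hypothetical, and it assumes that st_blksize is a reasonable stand-in for the filesystem block size, which is not guaranteed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600	/* expose posix_fadvise() on some libcs */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file once, sequentially, in block-sized
 * chunks after advising the kernel that the data will not be reused.
 * Returns the number of bytes read, or -1 on error.
 */
long long
read_once_noreuse(const char *path)
{
	struct stat sb;
	long long total = 0;
	size_t bufsize;
	ssize_t n;
	char *buf;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);
	if (fstat(fd, &sb) == -1) {
		close(fd);
		return (-1);
	}

	/* Assumption: st_blksize approximates the fs block size. */
	bufsize = sb.st_blksize > 0 ? (size_t)sb.st_blksize : 4096;
	buf = malloc(bufsize);
	if (buf == NULL) {
		close(fd);
		return (-1);
	}

	/*
	 * Offset 0 with length 0 covers the whole file.  The advice is
	 * only a hint, so a failure here is deliberately ignored.
	 */
	(void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

	/* Each read starts on a block boundary and spans one block. */
	while ((n = read(fd, buf, bufsize)) > 0)
		total += n;

	free(buf);
	close(fd);
	return (n == -1 ? -1 : total);
}
```

A caller would simply invoke read_once_noreuse("some/file") and check for -1. Because every read begins on a block boundary, this avoids the unaligned sequential case mentioned above; only the final short read at end of file can touch a partial block.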