Date:      Wed, 9 Nov 2011 17:47:57 -0800
From:      Peter Wemm <peter@wemm.org>
To:        Alfred Perlstein <alfred@freebsd.org>
Cc:        Bruce Cran <bruce@cran.org.uk>, Ed Schouten <ed@80386.nl>, Paul Saab <ps@mu.org>, Jilles Tjoelker <jilles@stack.nl>, arch@freebsd.org, freebsd-arch@freebsd.org
Subject:   Re: [PATCH] fadvise(2) system call
Message-ID:  <CAGE5yCp=qox=O0o=sadJYUqzPYOhz9LBXqaEmD037a4Fs5DdzA@mail.gmail.com>
In-Reply-To: <20111110012542.GA6110@elvis.mu.org>
References:  <201110281426.00013.jhb@freebsd.org> <4EB2C9DD.9090606@FreeBSD.org> <20111104160319.GD6110@elvis.mu.org> <201111080800.32717.jhb@freebsd.org> <4EBB104F.5010000@cran.org.uk> <CAMYpurzXM3Yeko_LxtsdKgPsGPKW75W2cUFUq59oSb=CcAqqMA@mail.gmail.com> <20111110012542.GA6110@elvis.mu.org>

On Wed, Nov 9, 2011 at 5:25 PM, Alfred Perlstein <alfred@freebsd.org> wrote:
> * Paul Saab <ps@mu.org> [111109 16:32] wrote:
>> On Wed, Nov 9, 2011 at 3:44 PM, Bruce Cran <bruce@cran.org.uk> wrote:
>> > On 08/11/2011 13:00, John Baldwin wrote:
>> >>
>> >> I think it would be fine to add flags to applications like 'tar' to allow
>> >> users to alter their behavior in specific use cases when it makes sense.
>> >> However, I think there are more workloads for 'tar' than the ones you are
>> >> thinking of and we should be hesitant to change applications to use
>> >> non-default settings.
>> >
>> > Someone's done that for GNU tar on Linux, adding a --no-oscache switch:
>> > http://www.mysqlperformanceblog.com/2010/04/02/fadvise-may-be-not-what-you-expect/
>>
>> So adding this support is good, but not for general purpose.  It's
>> really only good when you're pumping gigs of data through tar.  I did
>> this for libarchive (plus other work for O_DIRECT reading and
>> creating the archive) for copying large amounts of data without
>> impacting a running system. It worked great for this, but then it
>> absolutely fails when extracting a tar archive with millions of little
>> files because of all the sync operations.
>
> I've thought about this and it almost makes sense to have a secondary
> LRU that such pages would wind up in that is much smaller than the system
> one.  I'm pretty sure there are a number of papers on this, but I've not
> looked over them in a long while.

We actually do have fairly extensive anti-swamping mechanisms in
place, but they are in somewhat obscure places or are side effects of
other policies elsewhere.  For example, the vmio kva mapping space is
limited and the dirty fraction is enforced there.  This provides the
anti-swamping back pressure for file writes.

This of course is completely bypassed with mmap reads/writes.  That's
why writing to a huge mmap file hits like a truck, but doing write()
to the same file does not.
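
To make the two paths concrete, here is a minimal sketch of filling a
file through write() and through a shared mmap.  The function names and
the 64 KB chunk size are illustrative assumptions, not anything from
the patch.  It only shows where the paths differ; it does not measure
the back pressure, which depends on the kernel and the amount of memory.

```c
#define _XOPEN_SOURCE 700   /* assumption: ask for POSIX.1-2008 declarations */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Fill a file with len copies of byte using write(2).  Each call goes
 * through the buffer cache write path, where the kernel can throttle
 * the writer.  Returns 0 on success, -1 on error.
 */
int fill_with_write(const char *path, size_t len, unsigned char byte)
{
    char buf[65536];
    size_t done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(buf, byte, sizeof(buf));
    while (done < len) {
        size_t chunk = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);

        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd);
}

/*
 * Fill a file with len copies of byte by dirtying a shared mapping.
 * The stores are plain memory writes; no write(2) call is involved.
 * Returns 0 on success, -1 on error.
 */
int fill_with_mmap(const char *path, size_t len, unsigned char byte)
{
    void *p;
    int rc;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (len == 0)                       /* mmap rejects zero-length maps */
        return close(fd);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, byte, len);               /* dirties pages directly */
    rc = munmap(p, len);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Both functions leave identical file contents; the difference is only in
which kernel path sees the dirty data.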

>>
>> Anyway, this is a good option to enable and has very practical uses
>> out there, but it should be turned on with an option and not on by
>> default.
>
> What about the operation of just reading the tar archive itself?

Personally, I really don't want to blow away useful cache contents
with a tar file unless I explicitly say so.  I'm generally more
worried about keeping usefully cached random bits of files that are
scattered all over the drive than with a sequentially readable tar
file that is a best case scenario for file reads.

I'd rather read a 2GB file sequentially twice than to kick out 2GB of
random access data.

In any case, fadvise() is a tool that we should have.  Deciding on
tar(1) policy is an entirely separate thing.
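
As an illustration of keeping the tool separate from the policy, here
is a minimal sketch of an opt-in cache hint.  It assumes the POSIX
posix_fadvise() interface, which may differ from what the patch under
discussion provides; read_sequential() and its drop_cache flag are
hypothetical names.

```c
#define _XOPEN_SOURCE 700   /* assumption: ask for POSIX.1-2008 declarations */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the file at path from start to end and return the number of
 * bytes read, or -1 on error.  The cache is only hinted away when the
 * caller asks for it (drop_cache != 0), so the default behaviour
 * leaves existing cache policy alone.
 */
long long read_sequential(const char *path, int drop_cache)
{
    char buf[65536];
    long long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Advisory only: a failure here does not affect correctness. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    if (n < 0)
        total = -1;

    /* A length of 0 means "from offset to the end of the file". */
    if (drop_cache)
        (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    close(fd);
    return total;
}
```

An application would call read_sequential(path, 1) only when the user
passes an explicit switch, and read_sequential(path, 0) otherwise.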

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell


