Date:      Sun, 26 Mar 2006 07:13:52 +1100
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        stable@freebsd.org
Subject:   Re: Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
Message-ID:  <20060325201351.GH703@turion.vk2pj.dyndns.org>
In-Reply-To: <200603251829.k2PITH5D014732@apollo.backplane.com>
References:  <200603211607.30372.mi+mx@aldan.algebra.com> <200603231403.36136.mi+mx@aldan.algebra.com> <200603232048.k2NKm4QL067644@apollo.backplane.com> <200603231626.19102.mi+mx@aldan.algebra.com> <200603232316.k2NNGBka068754@apollo.backplane.com> <20060324084940.GA703@turion.vk2pj.dyndns.org> <200603241800.k2OI0KF8005579@apollo.backplane.com> <20060325094207.GD703@turion.vk2pj.dyndns.org> <200603251829.k2PITH5D014732@apollo.backplane.com>

On Sat, 2006-Mar-25 10:29:17 -0800, Matthew Dillon wrote:
>    Really odd.  Note that if your disk can only do 25 MBytes/sec, the
>    calculation is: 2052167894 / 25MB = ~80 seconds, not ~60 seconds 
>    as you would expect from your numbers.

systat was reporting 25-26 MB/sec.  dd'ing the underlying partition
gives 27 MB/sec (with 24 and 28 for the adjacent partitions).
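
Just to put numbers on it (taking 1 MB as 2^20 bytes), a completely
uncached sequential pass over the 2052167894-byte file should take
roughly

    2052167894 bytes / (25 * 2^20 bytes/sec) ~= 78 seconds
    2052167894 bytes / (27 * 2^20 bytes/sec) ~= 72 seconds

so a run that finishes much faster than that must be getting a fair
amount of its data from the cache.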

>    This type of situation *IS* possible as a side effect of other
>    heuristics.  It is particularly possible when you combine read() with
>    mmap because read() uses a different heuristic than mmap() to
>    implement the read-ahead.  There is also code in there which depresses
>    the page priority of 'old' already-read pages in the sequential case.
>    So, for example, if you do a linear grep of 2GB you might end up with
>    a cache state that looks like this:

If I've understood you correctly, this also implies that the timing
depends on the previous two scans, not just the previous scan.  I
didn't test all combinations of this but would have expected to see
two distinct sets of mmap/read timings - one for read/mmap/read and
one for mmap/mmap/read.
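
To make that concrete, here is a minimal sketch of the two access
methods being compared.  It is not the test program used earlier in
the thread; the buffer size, the page stride and the MADV_SEQUENTIAL
hint are just illustrative.  The point is only that the read(2) path
and the mmap(2) page-fault path each drive read-ahead through their
own heuristic:

/*
 * Minimal sketch of the two access methods under discussion; not the
 * test program from this thread.  Usage: seqscan read|mmap file
 */
#include <sys/mman.h>
#include <sys/stat.h>

#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static volatile unsigned long sum;      /* keep the loads from being optimized away */

int
main(int argc, char **argv)
{
    struct stat st;
    int fd;

    if (argc != 3)
        errx(1, "usage: seqscan read|mmap file");
    if ((fd = open(argv[2], O_RDONLY)) == -1 || fstat(fd, &st) == -1)
        err(1, "%s", argv[2]);

    if (strcmp(argv[1], "read") == 0) {
        /* read(2) path: read-ahead is driven by the sequential-access
         * heuristic in the filesystem read path. */
        static char buf[64 * 1024];
        ssize_t n;

        while ((n = read(fd, buf, sizeof(buf))) > 0)
            sum += (unsigned char)buf[0];
        if (n == -1)
            err(1, "read");
    } else {
        /* mmap(2) path: every touch is a page fault serviced by the VM
         * system, which applies its own, separate read-ahead logic. */
        long pagesize = sysconf(_SC_PAGESIZE);
        char *p;
        off_t off;

        p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            err(1, "mmap");
        madvise(p, st.st_size, MADV_SEQUENTIAL);        /* a hint only */
        for (off = 0; off < st.st_size; off += pagesize)
            sum += (unsigned char)p[off];
        munmap(p, st.st_size);
    }
    close(fd);
    return (0);
}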

>    I need to change it to randomly retain swaths of pages, the
>    idea being that it should take repeated runs to rebalance the VM cache
>    rather than allowing a single run to blow it out or allowing a
>    static set of pages to be retained indefinitely, which is what your
>    tests seem to show is occurring.

I don't think this sort of test is a clear indication that something
is wrong.  There's only one active process at any time and it's
performing a sequential read of a large dataset.  In this case,
evicting already-cached data to read new data is not necessarily
productive (a simple-minded algorithm will be evicting data that is
going to be accessed in the near future).
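
A toy simulation makes the point (sizes invented; this is not the
kernel's actual replacement algorithm): with repeated sequential
scans of a file larger than the cache, an evict-the-oldest policy
throws out exactly the pages the next scan will want, so every rescan
misses completely, whereas pinning some fixed or random subset of
pages across scans would hit on roughly cache-size/file-size of the
accesses.

#include <stdbool.h>
#include <stdio.h>

#define FILE_PAGES  2000        /* ~2 GB file in 1 MB "pages" (made up)   */
#define CACHE_PAGES  800        /* ~800 MB of usable page cache (made up) */

static bool incache[FILE_PAGES];        /* is page i resident?            */
static int  fifo[CACHE_PAGES];          /* resident pages, oldest first   */
static int  nres, oldest;

/*
 * Touch one page, evicting the oldest resident page when the cache is
 * full.  For this access pattern no rescan ever hits, so evict-oldest
 * and true LRU behave identically here.
 */
static bool
touch(int page)
{
    if (incache[page])
        return (true);
    if (nres == CACHE_PAGES) {
        incache[fifo[oldest]] = false;          /* evict oldest */
        fifo[oldest] = page;
        oldest = (oldest + 1) % CACHE_PAGES;
    } else {
        fifo[(oldest + nres) % CACHE_PAGES] = page;
        nres++;
    }
    incache[page] = true;
    return (false);
}

int
main(void)
{
    int scan, page, hits;

    for (scan = 1; scan <= 3; scan++) {
        hits = 0;
        for (page = 0; page < FILE_PAGES; page++)
            if (touch(page))
                hits++;
        printf("scan %d: %4d/%d pages already cached (%2.0f%%)\n",
            scan, hits, FILE_PAGES, 100.0 * hits / FILE_PAGES);
    }
    /*
     * Prints 0% for every scan: each rescan wipes out the tail of the
     * previous scan before reaching it.  If a fixed or random set of
     * CACHE_PAGES pages were pinned across scans instead, each rescan
     * would hit on about CACHE_PAGES/FILE_PAGES ~= 40% of the file.
     */
    return (0);
}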

Based on the timings, the mmap/read case manages to retain ~15% of the
file in cache.  Given the amount of RAM available, the theoretical
limit is about 40%, so this isn't too bad.  It would be nicer if both
read and mmap achieved this gain, irrespective of how the data had
been previously accessed.
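
In round numbers, if that 40% limit is simply the RAM available for
caching the file divided by the file size (and again taking 1 MB as
2^20 bytes):

    0.40 * 2052167894 bytes ~= 780 MB of cacheable RAM
    0.15 * 2052167894 bytes ~= 295 MB actually retained across a scan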

-- 
Peter Jeremy


