Date:      Sun, 05 Jul 2009 21:51:05 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: DFLTPHYS vs MAXPHYS
Message-ID:  <4A50F619.4020101@FreeBSD.org>
In-Reply-To: <20090706034250.C2240@besplex.bde.org>
References:  <4A4FAA2D.3020409@FreeBSD.org> <20090705100044.4053e2f9@ernst.jennejohn.org> <4A50667F.7080608@FreeBSD.org> <20090705223126.I42918@delplex.bde.org> <4A50BA9A.9080005@FreeBSD.org> <20090706005851.L1439@besplex.bde.org> <4A50DEE8.6080406@FreeBSD.org> <20090706034250.C2240@besplex.bde.org>

Bruce Evans wrote:
> On Sun, 5 Jul 2009, Alexander Motin wrote:
> 
>> Bruce Evans wrote:
>>> I was thinking more of transfers to userland.  Increasing user buffer
>>> sizes above about half the L2 cache size guarantees busting the L2
>>> cache, if the application actually looks at all of its data.  If the
>>> data is read using read(), then the L2 cache will be busted twice (or
>>> a bit less with nontemporal copying), first by copying out the data
>>> and then by looking at it.  If the data is read using mmap(), then the
>>> L2 cache will only be busted once.  This effect has always been very
>>> noticeable using dd.  Larger buffer sizes are also bad for latency.
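
[A minimal sketch of the two access patterns described above; this is an
illustration, not code from the thread. With read() the data is copied
into a user buffer and then scanned, touching the L2 cache twice; with
mmap() the application scans the page-cache pages in place. The file
name and buffer size are placeholders:]

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int
    main(void)
    {
        const char *path = "datafile";  /* placeholder name */
        size_t bufsz = 1024 * 1024;     /* > L2/2, so it busts the cache */
        unsigned long sum = 0;
        struct stat st;
        ssize_t i, n;
        off_t off;
        char *buf, *p;
        int fd;

        if ((fd = open(path, O_RDONLY)) == -1)
            err(1, "open");
        if (fstat(fd, &st) == -1)
            err(1, "fstat");
        if ((buf = malloc(bufsz)) == NULL)
            err(1, "malloc");

        /* Path 1: read() copies the data out, then the copy is scanned. */
        while ((n = read(fd, buf, bufsz)) > 0)
            for (i = 0; i < n; i++)
                sum += (unsigned char)buf[i];

        /* Path 2: mmap() scans the (now cached) pages in place. */
        p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            err(1, "mmap");
        for (off = 0; off < st.st_size; off++)
            sum += (unsigned char)p[off];

        printf("%lu\n", sum);           /* defeat dead-code elimination */
        return (0);
    }
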
>> ...
>> How can I reproduce that dd experiment? My system is running with a
>> MAXPHYS of 512K, and here is what I get:
> 
> I used a regular file with the same size as main memory (1G), and for
> today's test, not quite dd, but a program that throws away the data
> (so as to avoid the overhead of write syscalls) and prints status info
> in a more suitable form than even dd's ^T.
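
[Something along these lines, presumably; the actual program is not
shown in the thread, so this is a reconstruction, with an arbitrary
once-a-second status format and a default 64K block size:]

    #include <err.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        size_t bs = (argc > 2) ? (size_t)atol(argv[2]) : 65536;
        struct timespec t0, now, last;
        uintmax_t total = 0;
        char *buf;
        ssize_t n;
        int fd;

        if (argc < 2)
            errx(1, "usage: throwaway file [bs]");
        if ((fd = open(argv[1], O_RDONLY)) == -1)
            err(1, "open");
        if ((buf = malloc(bs)) == NULL)
            err(1, "malloc");

        clock_gettime(CLOCK_MONOTONIC, &t0);
        last = t0;
        /* Read and discard: no write syscalls, no output copying. */
        while ((n = read(fd, buf, bs)) > 0) {
            total += (uintmax_t)n;
            clock_gettime(CLOCK_MONOTONIC, &now);
            if (now.tv_sec > last.tv_sec) {     /* status once a second */
                double dt = (now.tv_sec - t0.tv_sec) +
                    (now.tv_nsec - t0.tv_nsec) / 1e9;
                fprintf(stderr, "\r%ju MB, %.1f MB/s ",
                    total >> 20, total / dt / 1048576.0);
                last = now;
            }
        }
        fprintf(stderr, "\n");
        return (n == -1);
    }

[Pointed at a disk device or a regular file, this produces the same kind
of numbers as the dd runs below, minus the write() side.]
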
> 
> Your results show that physio() behaves quite differently from
> copy-based reading of a regular file.  I see similar behaviour for
> input from a disk file.
> 
>> # dd if=/dev/ada0 of=/dev/null bs=512k count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 524288000 bytes transferred in 2.471564 secs (212128024 bytes/sec)
> 
> 512MB would be too small with buffering for a regular file, but should
> be OK with a disk file.
> 
>> # dd if=/dev/ada0 of=/dev/null bs=256k count=2000
>> 2000+0 records in
>> 2000+0 records out
>> 524288000 bytes transferred in 2.666643 secs (196609752 bytes/sec)
>> # dd if=/dev/ada0 of=/dev/null bs=128k count=4000
>> 4000+0 records in
>> 4000+0 records out
>> 524288000 bytes transferred in 2.759498 secs (189993969 bytes/sec)
>> # dd if=/dev/ada0 of=/dev/null bs=64k count=8000
>> 8000+0 records in
>> 8000+0 records out
>> 524288000 bytes transferred in 2.718900 secs (192830927 bytes/sec)
>>
>> CPU load instead grows from 10% at 512K to 15% at 64K. Maybe the
>> thrashing effect only becomes noticeable at block sizes comparable to
>> the cache size, but modern CPUs have megabytes of cache.
> 
> I used systat -v to estimate the load.  Its average jumps around more
> than I like, but I don't have anything better.  Sys time from dd and
> others is even more useless than it used to be, since lots of the i/o
> runs in threads and the system doesn't know how to charge the
> application for thread time.
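
[For reference, systat -v derives the idle percentage from the
kern.cp_time sysctl; a stripped-down version of that calculation (a
sketch, FreeBSD-specific, error checks omitted) looks like:]

    #include <sys/types.h>
    #include <sys/resource.h>
    #include <sys/sysctl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        long a[CPUSTATES], b[CPUSTATES], total = 0;
        size_t len;
        int i;

        len = sizeof(a);
        sysctlbyname("kern.cp_time", a, &len, NULL, 0);
        sleep(1);
        len = sizeof(b);
        sysctlbyname("kern.cp_time", b, &len, NULL, 0);

        /* %idle is the idle tick delta over the total tick delta. */
        for (i = 0; i < CPUSTATES; i++)
            total += b[i] - a[i];
        printf("%%idle: %.1f\n", total == 0 ? 0.0 :
            100.0 * (b[CP_IDLE] - a[CP_IDLE]) / total);
        return (0);
    }
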
> 
> My results (MAXPHYS is 64K, transfer rate 50MB/S, under FreeBSD-~5.2
> de-geomed):
> 
> regular file:
> 
> block size    %idle
> ----------    -----
> 1M            87
> 16K           91
> 4K            88 (?)
> 512           72 (?)
> 
> disk file:
> 
> block size    %idle
> ----------    -----
> 1M            96
> 64K           96
> 32K           93
> 16K           87
> 8K            82 (firmware can't keep up and rate drops to 37MB/S)
> 
> In the case of the regular file, almost all i/o is clustered, so the
> driver sees mainly the cluster size (driver max size of 64K before
> geom).  Upper layers then do a good job of only adding a few percent
> CPU when declustering to 16K fs-blocks.

In these tests you got almost only the negative side of the effect, as
you said, due to cache misses. Do you really have a CPU with such a
small L2 cache? Some kind of P3 or an old Celeron? But with a 64K
MAXPHYS you just didn't get any benefit from using a bigger block size.
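
[For what it's worth, the L2 boundary on a given machine can be located
empirically by timing sweeps over buffers of growing size: the bandwidth
figure drops once the working set stops fitting in cache. A rough
sketch, assuming a 64-byte cache line and an arbitrary ~256MB sweep
volume per size:]

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int
    main(void)
    {
        volatile unsigned char *buf;
        size_t sz, i, pass, passes;
        struct timespec t0, t1;
        unsigned long sum = 0;
        double dt;

        for (sz = 64 * 1024; sz <= 16 * 1024 * 1024; sz *= 2) {
            if ((buf = malloc(sz)) == NULL)
                return (1);
            memset((void *)buf, 1, sz);         /* fault the pages in */
            passes = (256 * 1024 * 1024) / sz;  /* ~256MB swept per size */
            clock_gettime(CLOCK_MONOTONIC, &t0);
            for (pass = 0; pass < passes; pass++)
                for (i = 0; i < sz; i += 64)    /* one read per cache line */
                    sum += buf[i];
            clock_gettime(CLOCK_MONOTONIC, &t1);
            dt = (t1.tv_sec - t0.tv_sec) +
                (t1.tv_nsec - t0.tv_nsec) / 1e9;
            printf("%6luK: %8.1f MB/s\n", (unsigned long)(sz / 1024),
                (double)passes * sz / dt / 1048576.0);
            free((void *)buf);
        }
        return (sum == 0);                      /* keep the loads alive */
    }
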

-- 
Alexander Motin


