Date:      Thu, 08 Mar 2007 05:41:09 +0100
From:      Fluffles <etc@fluffles.net>
To:        Artem Kuchin <matrix@itlegion.ru>
Cc:        freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject:   Re: Some Unix benchmarks for those who are interested
Message-ID:  <45EF93E5.50804@fluffles.net>
In-Reply-To: <007901c760fc$71e708a0$0c00a8c0@Artem>
References:  <20070306020826.GA18228@nowhere> <45ECF00D.3070101@samsco.org> <20070306050312.GA2437@nowhere> <008101c75fcc$210c74a0$0c00a8c0@Artem> <esk9vq$uhh$1@sea.gmane.org> <001a01c7601d$5d635ee0$0c00a8c0@Artem> <eskka8$adn$1@sea.gmane.org> <001801c7603a$5339e020$0c00a8c0@Artem> <eskpd1$sm4$1@sea.gmane.org> <20070307105144.1d4a382f@daydream.goid.lan> <002801c760e2$5cb5eb50$0c00a8c0@Artem> <esmvnp$khs$1@sea.gmane.org> <005b01c760e6$9a798bf0$0c00a8c0@Artem> <esn2s6$1i9$1@sea.gmane.org> <001601c760ee$f76fa300$0c00a8c0@Artem> <45EF2215.2080402@fluffles.net> <007901c760fc$71e708a0$0c00a8c0@Artem>

Artem Kuchin wrote:
>
> ----- Original Message ----- From: "Fluffles" <etc@fluffles.net>
>> If you use dd on the raw device (meaning no UFS/VFS) there is no
>> read-ahead. This means that the first of the following dd commands will
>> give a lower STR (sequential transfer rate) for reads than the second:
>>
>> no read-ahead:
>> dd if=/dev/mirror/data of=/dev/null bs=1m count=1000
>> read-ahead and multiple I/O queue depth:
>> dd if=/mounted/mirror/volume of=/dev/null bs=1m count=1000
>>
>> You can test read STR best with bonnie (see
>> /usr/ports/benchmarks/bonnie), or just with dd on a mounted volume. You
>> should mount with -o noatime to avoid useless writes during reading, or
>> use soft updates to prevent metadata from taking its toll on I/O
>> performance.
>>
>
> Totally disagree, for the following reasons:
> 1) Read-ahead is simply useless when stream-reading (sequentially) 1GB
> of data

I happen to have run a great number of benchmarks with various geom
layers (such as gstripe, gmirror, graid3 and graid5), and as far as I
recall the read speeds I got with dd (1GB transfer) were always lower
than a bonnie benchmark on a mounted (thus UFS/VFS) volume. Since I'm no
dev I cannot explain this with absolute certainty, but I would guess it
is due to the lack of read-ahead and an I/O queue depth of only 1 when
using dd. This did not occur on a plain disk without any geom layers
attached to it, though. Some benchmark output:


gstripe (4 disks on nVidia controller [Embedded], 128KB stripesize, Test System 1)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 DD benchmark(1GB)      Results in MB/s                 avg
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
4k      READ            47.1    46.9    46.8            46.9
        WRITE           40.9    40.9    41.0            40.9
16k     READ            92.7    92.8    92.6            92.7
        WRITE           76.3    76.1    76.2            76.2
64k     READ            120.8   120.6   120.6           120.6
        WRITE           96.1    96.2    96.1            96.1
128k    READ            123.0   122.8   122.8           122.8
        WRITE           96.3    96.4    96.2            96.3
1m      READ            122.7   122.9   122.6           122.7
        WRITE           89.4    89.4    89.4            89.4

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         4096 104288 90.4 237690 74.4 71008 22.0 87837 91.9 250858 44.6 114.8  0.7

Analysis: geom_stripe performs worse in a raw-device situation, but when
the UFS optimizations come into play the performance is more than doubled.
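
(For reference, the dd READ sweep behind these tables can be reproduced
with something like the sketch below; the device name is an assumption,
not necessarily the exact test setup, and the WRITE variant with
if=/dev/zero of=$DEV would of course destroy whatever is on the volume.)

#!/bin/sh
# Sketch of the 1GB dd READ sweep; /dev/stripe/data is an assumed name.
DEV=/dev/stripe/data
for spec in "4k 262144" "16k 65536" "64k 16384" "128k 8192" "1m 1024"; do
        set -- $spec
        echo "READ bs=$1"
        dd if=$DEV of=/dev/null bs=$1 count=$2
done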



geom_raid5 with 8 SATA disks (128KB stripe, graid5-tng, Test System 2)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 DD benchmark(1GB)      Results in MB/s                 avg
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
4k      READ            58.1    58.7    59.0            58.6
        WRITE           155.5   155.8   154.3           155.2
16k     READ            130.0   125.6   129.5           128.3
        WRITE           308.5   306.3   306.9           307.2
64k     READ            183.8   183.9   188.9           185.5
        WRITE           416.9   416.7   415.8           416.4
128k    READ            197.3   194.4   197.6           196.4
        WRITE           421.0   426.2   399.7           415.6
1m      READ            193.0   196.8   198.1           195.9
        WRITE           327.6   330.3   331.0           329.6

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         4096 137897 96.7 310917 76.7 65233 16.0 101410 95.8 407013 45.5 475.5  3.0

Analysis: as you can see, the read performance with dd is ~200MB/s while
bonnie gives us some ~400MB/s. The writes show the opposite pattern. This
is because geom_raid5 combines write I/O requests in order to avoid the
'raid5 write hole', and is thus able to reach *write* speeds of 400MB/s,
which is quite remarkable for software RAID5. Add the deeper I/O queue of
UFS (7) and the fact that UFS does not write strictly sequentially on the
medium (maximum number of blocks per cylinder), and the combining
algorithm gets more work to do, which drops the write performance from
400MB/s to ~300MB/s; still very good. The CPU was the bottleneck here.
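
(The bonnie line above comes from a run roughly like the following; the
device path, mountpoint and machine label are assumptions on my part:

mount -o noatime /dev/raid5/data /mnt
bonnie -d /mnt -s 4096 -m testsystem2

The -s 4096 matches the 4GB test size shown in the bonnie output.)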


> 2) atime is NOT updated when using dd on any device; atime is related
> to file/inode operations, which are not performed by dd

Well, I did give one dd command on a mounted volume; then it is related
to a file/inode, like this:
dd if=/mounted/mirror/volume of=/dev/null bs=1m count=1000

Then you are working at the UFS/VFS level, no?
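
To make that concrete: the raw form reads the GEOM provider directly,
while the mounted form should read a regular file that lives on the
mounted filesystem (the file path below is just a placeholder):

# GEOM level: no read-ahead, a single outstanding request
dd if=/dev/mirror/data of=/dev/null bs=1m count=1000
# UFS/VFS level: read a regular file on the mounted mirror
dd if=/mnt/mirror/testfile of=/dev/null bs=1m count=1000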

> 3) soft updates are also useless (not bad, not good) for long sequential
> reads

I agree, but because the volume is mounted and the normal mount utility
does not use the noatime option (which prevents access-time metadata
updates), each file read would result in an update of the UFS metadata
(the access time), although it may only be updated once in a while. Soft
Updates can help in this scenario if there are a lot of metadata updates,
by collecting them and applying them once every 28-30 seconds (the
default). Other than the metadata, SU does not help, agreed.
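
(In practice that means something like the following; the device node and
mountpoint are placeholders:

tunefs -n enable /dev/mirror/data    # enable Soft Updates on the unmounted filesystem
mount -o noatime /dev/mirror/data /mnt

newfs -U would also enable Soft Updates at filesystem creation time.)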

> basically, long sequential reads/writes ignore anything but the real
> drive speed (platter on the spindle) if they are performed long enough.

For parity RAID this is probably not true (for writing, anyway), but for
simple RAID levels or a plain disk, yes.

- Veronica


