From owner-freebsd-geom@FreeBSD.ORG Thu Mar 8 04:41:13 2007
Date: Thu, 08 Mar 2007 05:41:09 +0100
From: Fluffles <etc@fluffles.net>
To: Artem Kuchin
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: Some Unix benchmarks for those who are interested

Artem Kuchin wrote:
>
> ----- Original Message ----- From: "Fluffles"
>> If you use dd on the raw device (meaning no UFS/VFS) there is no
>> read-ahead. This means that the first dd command below will give a lower
>> STR read than the second:
>>
>> no read-ahead:
>> dd if=/dev/mirror/data of=/dev/null bs=1m count=1000
>> read-ahead and multiple I/O queue depth:
>> dd if=/mounted/mirror/volume of=/dev/null bs=1m count=1000
>>
>> You can test read STR best with bonnie (see
>> /usr/ports/benchmarks/bonnie), or just with dd on a mounted volume. You
>> should mount with -o noatime to avoid useless writes during reading, or
>> use Soft Updates to prevent metadata from taking its toll on I/O
>> performance.
>>
>
> Totally disagree, for the following reasons:
> 1) Read-ahead is simply useless when stream-reading (sequentially) 1GB
> of data

I happen to have run a great number of benchmarks with various GEOM layers
(gstripe, gmirror, graid3, graid5), and as far as I recall the read speeds
I got with dd (1GB transfer) were always lower than a bonnie benchmark on a
mounted (thus UFS/VFS) volume. Since I'm no developer I cannot explain this
with absolute certainty, but I would guess it is due to the lack of
read-ahead and an I/O queue depth of only 1 when using dd. This did not
happen on a plain disk, though, without any GEOM layers on top of it.
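For what it's worth, this is roughly how I set up the two cases myself (the
device name and the /mnt mount point are just placeholders here, adjust them
to your own setup):

# raw device: dd sees no filesystem, so no read-ahead and only one
# outstanding request at a time
dd if=/dev/mirror/data of=/dev/null bs=1m count=1000

# through UFS/VFS: write a test file, remount to push it out of the
# buffer cache, then read it back; cluster read-ahead (the vfs.read_max
# sysctl, if I remember right) kicks in on this path
mount -o noatime /dev/mirror/data /mnt
dd if=/dev/zero of=/mnt/testfile bs=1m count=1000
umount /mnt && mount -o noatime /dev/mirror/data /mnt
dd if=/mnt/testfile of=/dev/null bs=1m count=1000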
Some benchmark output:

gstripe (4 disks on nVidia controller [Embedded], 128KB stripesize, Test System 1)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
DD benchmark (1GB) - results in MB/s, three runs + average
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
4k    READ    47.1   46.9   46.8    46.9
      WRITE   40.9   40.9   41.0    40.9
16k   READ    92.7   92.8   92.6    92.7
      WRITE   76.3   76.1   76.2    76.2
64k   READ   120.8  120.6  120.6   120.6
      WRITE   96.1   96.2   96.1    96.1
128k  READ   123.0  122.8  122.8   122.8
      WRITE   96.3   96.4   96.2    96.3
1m    READ   122.7  122.9  122.6   122.7
      WRITE   89.4   89.4   89.4    89.4

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         4096 104288 90.4 237690 74.4 71008 22.0 87837 91.9 250858 44.6 114.8 0.7

Analysis: geom_stripe performs worse in the raw-device case, but once the UFS
optimizations come into play the performance is more than doubled.

geom_raid5 with 8 SATA disks (128KB stripe, graid5-tng, Test System 2)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
DD benchmark (1GB) - results in MB/s, three runs + average
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
4k    READ    58.1   58.7   59.0    58.6
      WRITE  155.5  155.8  154.3   155.2
16k   READ   130.0  125.6  129.5   128.3
      WRITE  308.5  306.3  306.9   307.2
64k   READ   183.8  183.9  188.9   185.5
      WRITE  416.9  416.7  415.8   416.4
128k  READ   197.3  194.4  197.6   196.4
      WRITE  421.0  426.2  399.7   415.6
1m    READ   193.0  196.8  198.1   195.9
      WRITE  327.6  330.3  331.0   329.6

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
         4096 137897 96.7 310917 76.7 65233 16.0 101410 95.8 407013 45.5 475.5 3.0

Analysis: as you can see, read performance with dd is ~200MB/s while bonnie
gives us some ~400MB/s. The writes go the other way around: roughly 400MB/s
raw versus ~300MB/s through UFS. geom_raid5 uses write I/O request combining
in order to avoid the 'raid5 write hole' and is thus able to get *write*
speeds of 400MB/s, which is quite remarkable for software RAID5. Add the
deeper I/O queue of UFS (7) and the fact that UFS does not write strictly
sequentially on the medium (it limits the number of blocks per cylinder
group), and the combining algorithm gets more work to do, which lowers write
performance from ~400MB/s to ~300MB/s; still very good. The CPU was the
bottleneck here.

> 2) atime is NOT updated when using dd on any device, atime is related
> to file/inode
> operations which are not performed by dd

Well, I did give one dd command on a mounted volume; that one does go
through a file/inode, like this:

dd if=/mounted/mirror/volume of=/dev/null bs=1m count=1000

Then you are working at the UFS/VFS level, no?

> 3) soft update are also useless (no bad, no good) for long sequential
> read

I agree, but because the volume is mounted and mount does not use the
noatime option by default (which prevents access-time metadata updates),
each file read results in an update of the UFS metadata (the access time),
although it may only be flushed once in a while. Soft Updates can help in
this scenario if there are a lot of metadata updates, by collecting them
and applying them once every 28-30 seconds (the default). Other than the
metadata, SU does not help, agreed. (I've put example commands for this at
the end of this mail.)

> basically, long sequential reads/writes ignore anything but real drive
> speed (plate on
> the spindle) if they are performed long enough.
For parity RAID this is probably not true (when writing, anyway), but for
the simple levels or a plain disk, yes.

- Veronica
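P.S. Since noatime and Soft Updates came up: this is roughly how I prepare
a volume before benchmarking (just a sketch; /dev/mirror/data and /mnt are
placeholder names, and tunefs wants the filesystem unmounted):

# enable Soft Updates on the (unmounted) filesystem
tunefs -n enable /dev/mirror/data

# mount without access-time updates, then run bonnie against it
mount -o noatime /dev/mirror/data /mnt
bonnie -d /mnt -s 4096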