From owner-freebsd-stable Thu Jul 25 10:15:35 2002
Date: Thu, 25 Jul 2002 10:15:16 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200207251715.g6PHFGDD034256@apollo.backplane.com>
To: Peter Jeremy
Cc: Andreas Koch, freebsd-stable@FreeBSD.ORG
Subject: Re: 4.6-RC: Glacial speed of dump backups
References: <20020606204948.GA4540@ultra4.eis.cs.tu-bs.de>
    <20020722081614.E367@gsmx07.alcatel.com.au>
    <20020722100408.GP26095@ultra4.eis.cs.tu-bs.de>
    <200207221943.g6MJhIBX054785@apollo.backplane.com>
    <20020725164416.A52778@gsmx07.alcatel.com.au>
Sender: owner-freebsd-stable@FreeBSD.ORG

Interesting.  What was the cache block size reported by dump?

If you have the time, it may be worth playing with the cache block
size.  First, on line 97 of cache.c change the 'nbytes > BlockSize'
to 'nbytes >= BlockSize':

    if (nbytes >= BlockSize ||
        ((offset ^ (offset + nbytes - 1)) & mask) != 0) {
            return(pread(fd, buf, nbytes, offset));
    }

Then play with the BLKFACTOR #define at the top.  Try values of 1 and 2
as well as the default of 4.
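The conditional above can be read as a standalone predicate.  What
follows is a minimal sketch, not the cache.c source: the helper name
bypasses_cache, its parameters, and the definition of mask as
~(block_size - 1) with a power-of-two block size are all assumptions
made for illustration.  It shows when a read would skip the cache and
go straight to pread(), with the last argument selecting the '>=' form
of the size test or the original '>' form.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of dump.  Returns 1 if a read of
 * nbytes at offset would bypass the cache, 0 if it is a cache
 * candidate.  Assumes block_size is a power of two and nbytes > 0.
 */
static int
bypasses_cache(uint64_t offset, size_t nbytes, size_t block_size,
    int bypass_whole_blocks)
{
    /* Assumption: mask keeps only the bits that select the block. */
    uint64_t mask = ~((uint64_t)block_size - 1);
    uint64_t last = offset + nbytes - 1;    /* last byte of the read */

    /* '>=' also rejects whole-block reads; '>' lets them be cached. */
    int too_big = bypass_whole_blocks ? (nbytes >= block_size)
                                      : (nbytes > block_size);

    /*
     * If the first and last byte differ in any block-selecting bit,
     * the read spans more than one cache block.
     */
    int crosses = ((offset ^ last) & mask) != 0;

    return (too_big || crosses);
}
```

With a 64 KB block (an illustrative value, not a figure from this
thread), an 8 KB read at offset 61440 spans two blocks and bypasses the
cache, while the same read at offset 57344 stays inside one block.  An
aligned 64 KB read is a cache candidate under the '>' test and bypasses
the cache under '>='.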

The NetBSD caching code appears to try to avoid caching whole blocks,
operating under the assumption that if a read for a whole block occurs,
dump is not likely to re-request the block.  Changing the conditional
above and setting BLKFACTOR to 1 in my code will mimic this behavior.
I haven't run any tests on this myself yet.

I'm not sure why dump failed w/ a 64MB cache.  I will investigate.

                                        -Matt
                                        Matthew Dillon

:>Here are the preliminary results when I test this dumping /usr
:>to /dev/null:
:>
:>  DUMP: finished in 140 seconds, throughput 6413 KBytes/sec (8 MB cache)
:>  DUMP: finished in 144 seconds, throughput 6235 KBytes/sec (4 MB cache)
:>  DUMP: finished in 234 seconds, throughput 3836 KBytes/sec (0 MB cache)
:
:I've also done some testing of your 2nd patchset with mixed results.
:In all cases, I'm dumping /usr, though it has different contents on
:each system.  I've included the relevant probe messages and a 'df -ki'.
:The first two systems were in multi-user but otherwise idle.  The 3rd
:system is doing a fair amount of network-related processing but has
:no local disk activity.
:
:System 1: A Compaq Armada 1592DT running -STABLE from early this week.
:CPU: Pentium/P55C (quarter-micron) (233.87-MHz 586-class CPU)
:real memory = 100663296 (98304K bytes)
:atapci0: port 0x1000-0x100f at device 20.0 on pci0
:atapci0: Busmastering DMA not supported
:ata0: at 0x1f0 irq 14 on atapci0
:ad0: 3102MB [6304/16/63] at ata0-master BIOSPIO
:acd0: CDROM at ata0-slave BIOSPIO
:
:/dev/ad0s1f  2818735 2537867   55370  98%  216823 138119  61%  /usr
:
:/usr contains XF86 3.3.6, a copy of the CVS repository and a checked
:out -stable tree as well as a built kernel, but nothing else in /usr/obj
:
:  DUMP: finished in 4298 seconds, throughput 633 KBytes/sec  /sbin/dump
:  DUMP: finished in 4304 seconds, throughput 632 KBytes/sec  dump -C 0
:  DUMP: finished in 4855 seconds, throughput 560 KBytes/sec  dump -C 4
:  DUMP: finished in 4477 seconds, throughput 607 KBytes/sec  dump -C 8
:
:System 2: A Dell GXi running 4.4-STABLE from mid December 2001.
:CPU: Pentium/P54C (132.95-MHz 586-class CPU)
:real memory = 100663296 (98304K bytes)
:atapci0: port 0xffa0-0xffaf at device 7.1 on pci0
:ata0: at 0x1f0 irq 14 on atapci0
:ad0: 2014MB [4092/16/63] at ata0-master WDMA2
:ad1: 1040MB [2114/16/63] at ata0-slave WDMA2
:
:/dev/ad0s1f  1562414  279462 1157960  19%   12007 183831   6%  /usr
:
:/usr is a pretty basic -STABLE.  No sources or X but a couple of ports.
:
:  DUMP: finished in 421 seconds, throughput 685 KBytes/sec
:  DUMP: finished in 423 seconds, throughput 681 KBytes/sec
:  DUMP: finished in 234 seconds, throughput 1232 KBytes/sec
:  DUMP: finished in 229 seconds, throughput 1259 KBytes/sec
:
:System 3: A Compaq Proliant P1850R running a 4.6-PRERELEASE from mid-May.
:CPU: Pentium III/Pentium III Xeon/Celeron (598.17-MHz 686-class CPU)
:real memory = 134201344 (131056K bytes)
:sym0: <875> port 0x2000-0x20ff mem 0xc6aff000-0xc6afffff,0xc6afaf00-0xc6afafff irq 10 at device 6.0 on pci0
:sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
:da0 at sym0 bus 0 target 0 lun 0
:da0: Fixed Direct Access SCSI-2 device
:da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit), Tagged Queueing Enabled
:da0: 8678MB (17773524 512 byte sectors: 255H 63S/T 1106C)
:
:/dev/da0s1f  8298907 7568247   66748  99%  530481 851917  38%  /usr
:
:/usr contains lots and lots of assorted small files.
:
:  DUMP: finished in 6739 seconds, throughput 1189 KBytes/sec
:  DUMP: finished in 6724 seconds, throughput 1192 KBytes/sec
:  DUMP: finished in 6535 seconds, throughput 1226 KBytes/sec
:
:
:Overall, the caching doesn't seem to help on either my laptop or the
:Proliant server.  In the case of the laptop, I presume the cost of
:PIO'ing the data from a fairly slow disk into the cache outweighs the
:gains from the improved access pattern.  I have no idea why the
:Proliant performs so badly - systat shows that the disk is averaging
:about 7MB/sec so it looks like the problem is that the cache is too
:small.  I've tried upping the cache to 64MB but dump then hangs in
:pass III.
:
:Peter