Message-ID: <3FA1745A.2090205@vicor.com>
Date: Thu, 30 Oct 2003 12:28:10 -0800
From: Ken Marx
To: Don Lewis
cc: freebsd-fs@FreeBSD.org
cc: gluk@ptci.ru
cc: julian@elischer.org
cc: mckusick@beastie.mckusick.com
In-Reply-To: <200310301928.h9UJSreF032920@gw.catspoiler.org>
Subject: Re: 4.8 ffs_dirpref problem

Don Lewis wrote:
> On 30 Oct, Ken Marx wrote:
>
>>
>>Don Lewis wrote:
>
> [snip]
>
>>>You might try the lightly tested patch below.  It tweaks the dirpref
>>>algorithm so that cylinder groups with free space >= 75% of the average
>>>free space and free inodes >= 75% of the average number of free inodes
>>>are candidates for allocating the directory.  It will not choose a
>>>cylinder group that does not have at least one free block and one free
>>>inode.
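The selection rule described in the quoted paragraph above can be sketched in C roughly as follows. This is only an illustration, not the patch itself: `struct cg_summary`, `pick_dir_cg`, and the `start` parameter are hypothetical names, and the real ffs_dirpref code works on the filesystem's own per-cylinder-group summary data and applies further criteria not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cylinder-group summary (not the kernel's own struct). */
struct cg_summary {
    uint64_t free_blocks;
    uint64_t free_inodes;
};

/*
 * Scan the cylinder groups starting at `start` (wrapping around) and
 * return the index of the first one that has
 *   - free blocks >= 75% of the average free blocks,
 *   - free inodes >= 75% of the average free inodes, and
 *   - at least one free block and one free inode.
 * Returns -1 if no group qualifies.
 */
int
pick_dir_cg(const struct cg_summary *cg, size_t ncg, size_t start)
{
    uint64_t total_blocks = 0, total_inodes = 0;
    uint64_t avg_blocks, avg_inodes;
    size_t i, c;

    assert(cg != NULL && ncg > 0 && start < ncg);

    for (i = 0; i < ncg; i++) {
        total_blocks += cg[i].free_blocks;
        total_inodes += cg[i].free_inodes;
    }
    avg_blocks = total_blocks / ncg;
    avg_inodes = total_inodes / ncg;

    for (i = 0; i < ncg; i++) {
        c = (start + i) % ncg;
        /* "x >= 75% of avg" written as 4*x >= 3*avg to stay in integers. */
        if (cg[c].free_blocks >= 1 && cg[c].free_inodes >= 1 &&
            4 * cg[c].free_blocks >= 3 * avg_blocks &&
            4 * cg[c].free_inodes >= 3 * avg_inodes)
            return (int)c;
    }
    return -1;
}
```

Taking the first qualifying group is a simplification made for the sketch; the point is only that both thresholds are relative to the filesystem-wide averages, so a nearly full filesystem still yields candidates.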
>>>
>>>It also decreases maxcontigdirs as the free space decreases so that a
>>>cluster of directories is less likely to cause the cylinder group to
>>>overflow.  I think it would be better to tune maxcontigdirs individually
>>>for each cylinder group, based on the free space in that cylinder group,
>>>but that is more complex ...
>
> [snip]
>
>>Anyway - I just tested your patch. Again, unloaded system, repeatedly
>>untarring a 1.5gb file, starting at 97% capacity, and:
>>
>>   tunefs: average file size: (-f)                            49152
>>   tunefs: average number of files in a directory: (-s)       1500
>>   ...
>>
>>Takes about 74 system secs per 1.5gb untar:
>>-------------------------------------------
>>/dev/da0s1e 558889580 497843972 16334442    97% 6858407 63316311   10%   /raid
>>      119.23 real         1.28 user        73.09 sys
>>/dev/da0s1e 558889580 499371100 14807314    97% 6879445 63295273   10%   /raid
>>      111.69 real         1.32 user        73.65 sys
>>/dev/da0s1e 558889580 500898228 13280186    97% 6900483 63274235   10%   /raid
>>      116.67 real         1.44 user        74.19 sys
>>/dev/da0s1e 558889580 502425356 11753058    98% 6921521 63253197   10%   /raid
>>      114.73 real         1.25 user        75.01 sys
>>/dev/da0s1e 558889580 503952484 10225930    98% 6942559 63232159   10%   /raid
>>      116.95 real         1.30 user        74.10 sys
>>/dev/da0s1e 558889580 505479614  8698800    98% 6963597 63211121   10%   /raid
>>      115.29 real         1.39 user        74.25 sys
>>/dev/da0s1e 558889580 507006742  7171672    99% 6984635 63190083   10%   /raid
>>      114.01 real         1.16 user        74.04 sys
>>/dev/da0s1e 558889580 508533870  5644544    99% 7005673 63169045   10%   /raid
>>      119.95 real         1.32 user        75.05 sys
>>/dev/da0s1e 558889580 510060998  4117416    99% 7026711 63148007   10%   /raid
>>      114.89 real         1.33 user        74.66 sys
>>/dev/da0s1e 558889580 511588126  2590288    99% 7047749 63126969   10%   /raid
>>      114.91 real         1.58 user        74.64 sys
>>/dev/da0s1e 558889580 513115254  1063160   100% 7068787 63105931   10%   /raid
>>tot:     1161.06 real        13.45 user       742.89 sys
>>
>>Compares pretty favorably to our naive, retro 4.4 dirpref hack
>>that averages in the mid-high 60's:
>>--------------------------------------------------------------------
>>/dev/da0s1e 558889580 497843952 16334462    97% 6858406 63316312   10%   /raid
>>      110.19 real         1.42 user        65.54 sys
>>/dev/da0s1e 558889580 499371080 14807334    97% 6879444 63295274   10%   /raid
>>      105.47 real         1.47 user        65.09 sys
>>/dev/da0s1e 558889580 500898208 13280206    97% 6900482 63274236   10%   /raid
>>      110.17 real         1.48 user        64.98 sys
>>/dev/da0s1e 558889580 502425336 11753078    98% 6921520 63253198   10%   /raid
>>      131.88 real         1.49 user        71.20 sys
>>/dev/da0s1e 558889580 503952464 10225950    98% 6942558 63232160   10%   /raid
>>      111.61 real         1.62 user        67.47 sys
>>/dev/da0s1e 558889580 505479594  8698820    98% 6963596 63211122   10%   /raid
>>      131.36 real         1.67 user        90.79 sys
>>/dev/da0s1e 558889580 507006722  7171692    99% 6984634 63190084   10%   /raid
>>      115.34 real         1.49 user        65.61 sys
>>/dev/da0s1e 558889580 508533850  5644564    99% 7005672 63169046   10%   /raid
>>      110.26 real         1.39 user        65.26 sys
>>/dev/da0s1e 558889580 510060978  4117436    99% 7026710 63148008   10%   /raid
>>      116.15 real         1.51 user        65.47 sys
>>/dev/da0s1e 558889580 511588106  2590308    99% 7047748 63126970   10%   /raid
>>      112.74 real         1.37 user        65.01 sys
>>/dev/da0s1e 558889580 513115234  1063180   100% 7068786 63105932   10%   /raid
>>tot:     1158.36 real        15.01 user       686.57 sys
>>
>>Without either, we'd expect timings of 5-20 minutes when things are
>>going poorly.
>>
>>Happy to test further if you have tweaks to your patch or
>>things you'd like us to test in particular. E.g., load,
>>newfs, etc.
>
>
> You might want to try your hash patch along with my patch to see if
> decreasing the maximum hash chain lengths makes a difference in system
> time.
>
>

Sorry - should have mentioned: Both tests included our hash patch.
I just re-ran with the hash stuff back to original. Sorry to say
there's no appreciable difference.
Using your dirpref patch, still approx. 75 sys sec/1.5gb:

    tot:     1185.54 real        14.15 user       747.43 sys

Perhaps the dirpref patch lowers the frequency of having to search so
much, and hence exercises the hashtable less.

Or I'm doing something lame.

k
-- 
Ken Marx, kmarx@vicor-nb.com
We have to move. We must not stand pat and achieve closure on customer's
customer.
  - http://www.bigshed.com/cgi-bin/speak.cgi