Date: Mon, 28 Mar 2005 16:35:06 +0100
From: David Malone
To: Scott
Cc: freebsd-fs@FreeBSD.org, Robert Watson
Subject: Re: UFS Subdirectory limit.
Message-ID: <20050328153506.GA198@walton.maths.tcd.ie>

Here are the benchmark results comparing a two-level scheme (which
I've labeled "sqrt") with a single directory with 150000 subdirectories
(which I've labeled "flat"). The benchmark is in 4 phases:

	mkdir) This builds the directory structure.
	write) This writes a small amount of data into 100000 files
		in a pseudo-random sequence of subdirectories.
	read) This reads back the data from each of the 100000 files
		(in the same order they were written).
	rm) This does an "rm -fr" of the whole tree.

I just used /usr/bin/time on each phase and synced out the data
between each phase. The results (averaged over 4 runs; see the end
of the mail for the output of ministat on the data):

              real time (s)         |      user time (s)      |       sys time (s)
      mkdir  write   read     rm | mkdir write  read    rm | mkdir write  read    rm
sqrt    499   4302   2409   1569 |  1.84  1.94  1.72  1.69 |  29.9  33.5  21.3 161.6
flat   1172   4318   2407   1717 |  1.47  1.62  1.52  1.66 |  26.1  33.5  20.6 158.1

So, it seems that while making the directory structure takes longer
for the flat method (roughly 2.3 times as long, and the rm is about
9% slower), there's no significant penalty in real time for using
it in the write and read phases. The user times are pretty irrelevant
(though the flat scheme is slightly faster, probably because some of
the phases don't do sqrts ;-).

Interestingly, the system times for the flat structure are never
worse than for the two-level structure, and for mkdir and read they
are actually *better*! I think this supports Don's suggestion that
the layout of data on the disk with very large directories is not
as good as it could be.

(The test was done on an amd64 machine with gobs of RAM. I used my
patch to get large directories, which saves a metadata op per mkdir
and rmdir, even in the sqrt case. I upped the amount of memory
available to dirhash, though it didn't actually use more than about
2.5MB during the benchmark. Maxvnodes is set to 100000, so 150K
dirs plus 100K files should be enough to make the name cache and
vnode cache work hard.)

	David.
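
For readers who want to picture the two layouts being compared, here is a minimal sketch of how an index might be mapped to a directory path in each scheme. The mail does not give the actual naming used in the benchmark, so the function names, the "d" prefix and the choice of ceil(sqrt(total)) as the fan-out are all assumptions made for illustration.

```python
import math


def flat_path(i):
    """Hypothetical flat layout: every subdirectory sits in one parent."""
    return f"d{i}"


def sqrt_path(i, total):
    """Hypothetical two-level ("sqrt") layout.

    Assumes the fan-out at each level is ceil(sqrt(total)), so that
    neither level holds much more than sqrt(total) entries.
    """
    width = math.isqrt(total - 1) + 1  # ceil(sqrt(total)) for total >= 1
    return f"d{i // width}/d{i % width}"


if __name__ == "__main__":
    total = 150000  # number of subdirectories quoted in the mail
    for i in (0, 1, 149999):
        print(i, flat_path(i), sqrt_path(i, total))
```

With 150000 leaf directories this assumed mapping gives a fan-out of 388, so no single directory in the two-level tree holds more than a few hundred entries, whereas the flat layout puts all 150000 in one.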

x sqrt-real-mkdir
+ flat-real-mkdir
    N           Min           Max        Median           Avg        Stddev
x   4        469.16        527.83        513.14      499.2525     26.256259
+   4       1157.18       1182.42       1176.14     1172.1225     10.737021
Difference at 95.0% confidence
	672.87 +/- 34.7068
	134.775% +/- 6.95175%
	(Student's t, pooled s = 20.0583)

x sqrt-real-write
+ flat-real-write
    N           Min           Max        Median           Avg        Stddev
x   4       4286.79       4317.17       4305.92     4302.0775     12.774457
+   4       4288.35       4339.77       4330.78      4318.195     22.606829
No difference proven at 95.0% confidence

x sqrt-real-read
+ flat-real-read
    N           Min           Max        Median           Avg        Stddev
x   4       2404.34        2417.4       2411.22     2409.3875     6.2196322
+   4       2396.65       2417.16       2411.18       2407.08     8.9677905
No difference proven at 95.0% confidence

x sqrt-real-rm
+ flat-real-rm
    N           Min           Max        Median           Avg        Stddev
x   4       1562.17       1578.65       1572.48     1568.9625     8.0307466
+   4       1707.86       1722.16       1721.65     1717.3925     6.6327841
Difference at 95.0% confidence
	148.43 +/- 12.7436
	9.46039% +/- 0.812231%
	(Student's t, pooled s = 7.36501)

x sqrt-user-mkdir
+ flat-user-mkdir
    N           Min           Max        Median           Avg        Stddev
x   4          1.81          1.87          1.85          1.84   0.025819889
+   4          1.32           1.6          1.48          1.47    0.11489125
Difference at 95.0% confidence
	-0.37 +/- 0.144075
	-20.1087% +/- 7.83019%
	(Student's t, pooled s = 0.0832666)

x sqrt-user-write
+ flat-user-write
    N           Min           Max        Median           Avg        Stddev
x   4          1.82          2.06          1.95         1.935   0.099498744
+   4          1.57          1.66          1.63          1.62   0.037416574
Difference at 95.0% confidence
	-0.315 +/- 0.13006
	-16.2791% +/- 6.72144%
	(Student's t, pooled s = 0.0751665)

x sqrt-user-read
+ flat-user-read
    N           Min           Max        Median           Avg        Stddev
x   4          1.56          1.81          1.77          1.72    0.11045361
+   4          1.33          1.71          1.57         1.515    0.16278821
No difference proven at 95.0% confidence

x sqrt-user-rm
+ flat-user-rm
    N           Min           Max        Median           Avg        Stddev
x   4           1.4          1.89          1.84         1.695    0.22218611
+   4          1.54           1.8          1.74         1.665    0.12476645
No difference proven at 95.0% confidence

x sqrt-sys-mkdir
+ flat-sys-mkdir
    N           Min           Max        Median           Avg        Stddev
x   4         29.62         30.15         30.06         29.89    0.25495098
+   4          25.7         26.84         26.07          26.1     0.5178803
Difference at 95.0% confidence
	-3.79 +/- 0.706247
	-12.6798% +/- 2.36282%
	(Student's t, pooled s = 0.408167)

x sqrt-sys-write
+ flat-sys-write
    N           Min           Max        Median           Avg        Stddev
x   4         33.21          33.8         33.57       33.5275    0.24281337
+   4         32.81         34.25         33.83        33.565    0.61846584
No difference proven at 95.0% confidence

x sqrt-sys-read
+ flat-sys-read
    N           Min           Max        Median           Avg        Stddev
x   4         20.94         21.97         21.27         21.33    0.44773504
+   4         20.33            21         20.71       20.6325    0.29033027
Difference at 95.0% confidence
	-0.6975 +/- 0.652893
	-3.27004% +/- 3.06092%
	(Student's t, pooled s = 0.377332)

x sqrt-sys-rm
+ flat-sys-rm
    N           Min           Max        Median           Avg        Stddev
x   4        138.41        175.09        168.91        161.61     16.115177
+   4        141.94        170.84         164.4       158.175     12.513687
No difference proven at 95.0% confidence
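
The "Difference at 95.0% confidence" lines can be checked by hand from the N, Avg and Stddev columns. The sketch below assumes the usual pooled-variance two-sample Student's t interval, which is consistent with the "pooled s" figures printed above; the function name is made up for illustration, and the critical value 2.447 (two-sided 95%, 6 degrees of freedom) is taken from a standard t-table rather than from the mail.

```python
import math

# Two-sided 95% critical value of Student's t for 6 degrees of freedom
# (n1 + n2 - 2 with four runs per layout). Assumption: value copied
# from a standard t-table, not stated in the mail.
T_CRIT_95_DF6 = 2.447


def pooled_difference(n1, avg1, sd1, n2, avg2, sd2, t_crit=T_CRIT_95_DF6):
    """Return (difference, margin, pooled_s) for two summarised samples."""
    # Pooled standard deviation, weighting each variance by its
    # degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    pooled_s = math.sqrt(pooled_var)
    # Difference of means (second sample minus first) and the
    # half-width of its confidence interval.
    diff = avg2 - avg1
    margin = t_crit * pooled_s * math.sqrt(1.0 / n1 + 1.0 / n2)
    return diff, margin, pooled_s


if __name__ == "__main__":
    # Summary figures for sqrt-real-mkdir (x) and flat-real-mkdir (+).
    diff, margin, pooled_s = pooled_difference(
        4, 499.2525, 26.256259, 4, 1172.1225, 10.737021)
    print(f"{diff:.2f} +/- {margin:.4f} (pooled s = {pooled_s:.4f})")
    print(f"{100 * diff / 499.2525:.3f}% +/- {100 * margin / 499.2525:.4f}%")
```

A difference counts as proven when its magnitude exceeds the margin, which is why the real-time mkdir and rm phases are reported as different while write and read are not.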