Date:      Mon, 28 Mar 2005 16:35:06 +0100
From:      David Malone <dwmalone@maths.tcd.ie>
To:        Scott <scottl@samsco.org>
Cc:        Robert Watson <rwatson@FreeBSD.org>
Subject:   Re: UFS Subdirectory limit.
Message-ID:  <20050328153506.GA198@walton.maths.tcd.ie>
In-Reply-To: <200503272145.aa71162@salmon.maths.tcd.ie>
References:  <4247D19F.6010502@samsco.org> <200503272145.aa71162@salmon.maths.tcd.ie>

Here are the benchmark results comparing a two-level scheme (which
I've labeled "sqrt") with a single directory with 150000 subdirectories
(which I've labeled "flat").

The benchmark is in 4 phases:

	mkdir) This builds the directory structure.
	write) This writes a small amount of data into 100000 files
	       in a pseudo-random sequence of subdirectories.
	read)  This reads back the data from each of the 100000
	       files (in the same order they were written).
	rm)    This does an "rm -fr" of the whole tree.
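
For anyone who wants to try something similar, here is a minimal sketch of
the four phases in Python. It is not the script used for the numbers in
this mail. The names subdir_path and run_phases are made up for
illustration, and the mapping used for the "sqrt" layout (about sqrt(n)
entries at each of two levels) is an assumption about how a two-level
scheme might be laid out.

```python
import math
import os
import random
import shutil
import tempfile
import time


def subdir_path(scheme, i, n):
    """Relative path of subdirectory number i (0 <= i < n) under a scheme."""
    if scheme == "flat":
        return str(i)                     # all n directories share one parent
    width = math.isqrt(n - 1) + 1         # ceil(sqrt(n)) entries per level (assumed)
    return os.path.join(str(i // width), str(i % width))


def sync_disks():
    """Flush dirty data between phases where the platform supports it."""
    if hasattr(os, "sync"):
        os.sync()


def run_phases(scheme, root, n_dirs, n_files, seed=0):
    """Run mkdir/write/read/rm under root; return wall-clock seconds per phase."""
    rng = random.Random(seed)
    # Pseudo-random sequence of target files, fixed up front so that the
    # read phase visits them in the same order as the write phase.
    targets = [
        os.path.join(root, subdir_path(scheme, rng.randrange(n_dirs), n_dirs), "f%d" % k)
        for k in range(n_files)
    ]
    timings = {}

    start = time.perf_counter()
    for i in range(n_dirs):
        os.makedirs(os.path.join(root, subdir_path(scheme, i, n_dirs)), exist_ok=True)
    timings["mkdir"] = time.perf_counter() - start
    sync_disks()

    start = time.perf_counter()
    for path in targets:
        with open(path, "w") as f:
            f.write("x")                  # a small amount of data per file
    timings["write"] = time.perf_counter() - start
    sync_disks()

    start = time.perf_counter()
    for path in targets:
        with open(path) as f:
            f.read()
    timings["read"] = time.perf_counter() - start
    sync_disks()

    start = time.perf_counter()
    shutil.rmtree(root)                   # stands in for "rm -fr" of the tree
    timings["rm"] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    for scheme in ("sqrt", "flat"):
        # Tiny sizes so the sketch runs quickly; the mail used 150000 and 100000.
        print(scheme, run_phases(scheme, os.path.join(base, scheme), 400, 200))
    shutil.rmtree(base)
```

This only measures wall-clock time from inside one process. Getting
separate user and system times, as /usr/bin/time reports, would mean
running each phase as its own command.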

I just used /usr/bin/time on each phase and synced out the data
between each phase. The results below are in seconds, averaged over
4 runs (see the end of the mail for the output of ministat on the
data).

           real time               user time               sys time
     mkdir write read   rm | mkdir write read   rm | mkdir write read    rm
sqrt  499   4302 2409 1569 |  1.84  1.94 1.72 1.69 |  29.9 33.5  21.3 161.6
flat 1172   4318 2407 1717 |  1.47  1.62 1.52 1.66 |  26.1 33.6  20.6 158.2

So, it seems that while making the directory structure takes more
than twice as long for the flat method, and removing it about 9%
longer, there's no significant penalty in real time for writing or
reading files in it. The user times are pretty irrelevant (though the
flat scheme is slightly faster, probably because some of the phases
don't do sqrts ;-).

Interestingly, the system times for the flat structure are actually
*better* than those for the two-level structure! I think this supports
Don's suggestion that the layout of data on the disk with very large
directories is not as good as it could be.

(The test was done on an amd64 machine with gobs of RAM. I used my
patch to get large directories, which saves a metadata op per mkdir
and rmdir, even in the sqrt case. I upped the amount of memory
available to dirhash, though it didn't actually use more than about
2.5MB during the benchmark. Maxvnodes is set to 100000, so 150K
dirs plus 100K files should be enough to make the name cache and
vnode cache work hard.)

	David.


x sqrt-real-mkdir
+ flat-real-mkdir
+--------------------------------------------------------------------------+
|                                                                        + |
|x x x x                                                               + ++|
||__AM_|                                                                |A||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4        469.16        527.83        513.14      499.2525     26.256259
+   4       1157.18       1182.42       1176.14     1172.1225     10.737021
Difference at 95.0% confidence
	672.87 +/- 34.7068
	134.775% +/- 6.95175%
	(Student's t, pooled s = 20.0583)
x sqrt-real-write
+ flat-real-write
+--------------------------------------------------------------------------+
|x +             x         x          +   x                 +            + |
|   |________|________A____M___________|___A________________M_____________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4       4286.79       4317.17       4305.92     4302.0775     12.774457
+   4       4288.35       4339.77       4330.78      4318.195     22.606829
No difference proven at 95.0% confidence
x sqrt-real-read
+ flat-real-read
+--------------------------------------------------------------------------+
|+                       +  xx                      *                    +x|
|     |_________________|_____________A_______A_____M_______________||     |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4       2404.34        2417.4       2411.22     2409.3875     6.2196322
+   4       2396.65       2417.16       2411.18       2407.08     8.9677905
No difference proven at 95.0% confidence
x sqrt-real-rm
+ flat-real-rm
+--------------------------------------------------------------------------+
| x                                                                      + |
| x   x  x                                                         +   + + |
||___AM_|                                                           |__A_M||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4       1562.17       1578.65       1572.48     1568.9625     8.0307466
+   4       1707.86       1722.16       1721.65     1717.3925     6.6327841
Difference at 95.0% confidence
	148.43 +/- 12.7436
	9.46039% +/- 0.812231%
	(Student's t, pooled s = 7.36501)
x sqrt-user-mkdir
+ flat-user-mkdir
+--------------------------------------------------------------------------+
|                     +                                                    |
|+                    +               +                           x  x x  x|
|     |______________AM_____________|                              |__AM_| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4          1.81          1.87          1.85          1.84   0.025819889
+   4          1.32           1.6          1.48          1.47    0.11489125
Difference at 95.0% confidence
	-0.37 +/- 0.144075
	-20.1087% +/- 7.83019%
	(Student's t, pooled s = 0.0832666)
x sqrt-user-write
+ flat-user-write
+--------------------------------------------------------------------------+
|+      + +   +                       x             x     x               x|
|  |____A_M___|                          |_____________A__M___________|    |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4          1.82          2.06          1.95         1.935   0.099498744
+   4          1.57          1.66          1.63          1.62   0.037416574
Difference at 95.0% confidence
	-0.315 +/- 0.13006
	-16.2791% +/- 6.72144%
	(Student's t, pooled s = 0.0751665)
x sqrt-user-read
+ flat-user-read
+--------------------------------------------------------------------------+
|+                 +               x+                   +    x   x     x   |
|   |_______________________A_______M_____|_________|_____A______M________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4          1.56          1.81          1.77          1.72    0.11045361
+   4          1.33          1.71          1.57         1.515    0.16278821
No difference proven at 95.0% confidence
x sqrt-user-rm
+ flat-user-rm
+--------------------------------------------------------------------------+
|x                   +    +         x            +       +     x      x    |
|          |_________|________________A____A_____M______|______M__________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4           1.4          1.89          1.84         1.695    0.22218611
+   4          1.54           1.8          1.74         1.665    0.12476645
No difference proven at 95.0% confidence
x sqrt-sys-mkdir
+ flat-sys-mkdir
+--------------------------------------------------------------------------+
|  ++    +           +                                            xx     xx|
||_______A________|                                               |___A__M||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4         29.62         30.15         30.06         29.89    0.25495098
+   4          25.7         26.84         26.07          26.1     0.5178803
Difference at 95.0% confidence
	-3.79 +/- 0.706247
	-12.6798% +/- 2.36282%
	(Student's t, pooled s = 0.408167)
x sqrt-sys-write
+ flat-sys-write
+--------------------------------------------------------------------------+
|+                   x       +       x  x          x +                    +|
|       |________________|___________A_AM_________|__M_________________|   |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4         33.21          33.8         33.57       33.5275    0.24281337
+   4         32.81         34.25         33.83        33.565    0.61846584
No difference proven at 95.0% confidence
x sqrt-sys-read
+ flat-sys-read
+--------------------------------------------------------------------------+
|+      +         +         x  +     x     x                              x|
| |___________A___M_______||_______________M__A__________________|         |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4         20.94         21.97         21.27         21.33    0.44773504
+   4         20.33            21         20.71       20.6325    0.29033027
Difference at 95.0% confidence
	-0.6975 +/- 0.652893
	-3.27004% +/- 3.06092%
	(Student's t, pooled s = 0.377332)
x sqrt-sys-rm
+ flat-sys-rm
+--------------------------------------------------------------------------+
|x      +                        +               *        x  +       x     |
|             |_______________________A_____A____M________M__|____________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   4        138.41        175.09        168.91        161.61     16.115177
+   4        141.94        170.84         164.4       158.175     12.513687
No difference proven at 95.0% confidence
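
The "Difference at 95.0% confidence" lines above can be checked by hand
from the N, Avg and Stddev columns. The sketch below does the pooled
Student's t arithmetic for the real-time mkdir comparison. The function
name pooled_difference is made up for illustration, and the critical
value 2.447 (two-sided 95%, 6 degrees of freedom) is hard-coded from a
standard t table rather than computed, so it only applies to two samples
of 4 runs each.

```python
import math

T_CRIT_95_DF6 = 2.447  # two-sided 95% Student's t critical value for 4 + 4 - 2 = 6 df


def pooled_difference(n1, mean1, sd1, n2, mean2, sd2, t_crit):
    """Return (diff, half_width, pooled_sd, diff_pct, half_width_pct).

    diff is mean2 - mean1, and the percentages are relative to mean1,
    which is how the figures in the output above come out.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)
    half_width = t_crit * pooled_sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    diff = mean2 - mean1
    return diff, half_width, pooled_sd, 100.0 * diff / mean1, 100.0 * half_width / mean1


if __name__ == "__main__":
    # sqrt-real-mkdir (x) versus flat-real-mkdir (+), values copied from above.
    diff, half, s, diff_pct, half_pct = pooled_difference(
        4, 499.2525, 26.256259, 4, 1172.1225, 10.737021, T_CRIT_95_DF6
    )
    print("%.2f +/- %.4f" % (diff, half))
    print("%.3f%% +/- %.3f%%" % (diff_pct, half_pct))
    print("pooled s = %.4f" % s)
    # A difference counts as proven when it exceeds the half-width.
    print("difference proven:", abs(diff) > half)
```

With these inputs the difference (672.87) is far larger than the
half-width (about 34.7), which is why that comparison is reported as a
difference, while for something like sqrt-sys-write the difference falls
inside the half-width and nothing is proven.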


