Date:      Thu, 15 Aug 2013 11:26:01 -0700
From:      Charles Swiger <cswiger@mac.com>
To:        aurfalien <aurfalien@gmail.com>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: copying millions of small files and millions of dirs
Message-ID:  <CC3CFFD3-6742-447B-AA5D-2A4F6C483883@mac.com>
In-Reply-To: <7E7AEB5A-7102-424E-8B1E-A33E0A2C8B2C@gmail.com>
References:  <7E7AEB5A-7102-424E-8B1E-A33E0A2C8B2C@gmail.com>

On Aug 15, 2013, at 11:13 AM, aurfalien <aurfalien@gmail.com> wrote:
> Is there a faster way to copy files over NFS?

Probably.

> Currently breaking up a simple rsync over 7 or so scripts which copies 22 dirs having ~500,000 dirs or files each.

There's a maximum useful concurrency which depends on how many disk
spindles and what flavor of RAID is in use; exceeding it will result in
thrashing the disks and heavily reducing throughput due to competing I/O
requests.  Try measuring aggregate performance when running fewer rsyncs
at once and see whether it improves.
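
Something along these lines (an untested sketch; the dir names, the
/src and nfs-server:/dst paths, and MAXJOBS are placeholders for your
setup) lets you time the same batch at different concurrency levels,
say 2 vs 4 vs 7:

    #!/bin/sh
    # Sketch: run the per-directory rsyncs a few at a time (in simple
    # batches rather than a rolling pool) and time the whole run, so the
    # aggregate throughput at 2, 4, 7 concurrent copies can be compared.
    DIRS="dir01 dir02 dir03 dir04 dir05 dir06 dir07"
    MAXJOBS=2

    time (
        n=0
        for d in $DIRS; do
            rsync -a "/src/$d/" "nfs-server:/dst/$d/" &
            n=$((n + 1))
            if [ "$n" -ge "$MAXJOBS" ]; then
                wait    # let the current batch finish before starting more
                n=0
            fi
        done
        wait
    )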

Of course, putting half a million files into a single directory level is
also a bad idea, even with dirhash support.  You'd do better to break
them up into subdirs containing fewer than ~10K files apiece.
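
For instance, something like this (untested sketch; /path/to/bigdir is a
placeholder, and it uses FreeBSD's md5(1) -- substitute md5sum on Linux)
shards a flat directory into 256 buckets of a few thousand files each:

    #!/bin/sh
    # Sketch: move each file into a subdirectory named after the first two
    # hex chars of the md5 of its filename, giving 256 roughly even buckets
    # (~2,000 files apiece for a 500,000-file directory).
    cd /path/to/bigdir || exit 1
    for f in *; do
        [ -f "$f" ] || continue
        h=$(printf '%s' "$f" | md5 | cut -c1-2)
        mkdir -p "$h"
        mv -- "$f" "$h/"
    done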

> Obviously reading all the meta data is a PITA.

Yes.

> Doin 10Gb/jumbos but in this case it don't make much of a hoot of a diff.

Yeah, probably not -- you're almost certainly I/O bound, not network
bound.
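
You can confirm that while a copy is running; if the disks sit near 100%
busy while the 10Gb link is mostly idle, more network bandwidth won't
help.  A rough sketch (ix0 is a guess at your interface name):

    gstat -a            # per-GEOM-provider %busy, ops/s, latency
    iostat -x -w 1      # extended per-device stats, 1-second intervals
    netstat -I ix0 -w 1 # per-second traffic on the 10Gb interface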

Regards,
-- 
-Chuck



