Date:      Tue, 20 Aug 2013 10:19:23 +0100
From:      Frank Leonhardt <frank2@fjl.co.uk>
To:        freebsd-questions@freebsd.org
Subject:   Re: copying milllions of small files and millions of dirs
Message-ID:  <5213349B.10908@fjl.co.uk>
In-Reply-To: <CALfReyeWxHjmqXhWiK4jbCvh3MktqKqnTBQjYgC0wDTgBcK5jg@mail.gmail.com>
References:  <7E7AEB5A-7102-424E-8B1E-A33E0A2C8B2C@gmail.com> <20130816064612.GH1190@petole.demisel.net> <1376934082.25499.11612497.1C73C726@webmail.messagingengine.com> <B629E9F8-C01A-4D09-8054-C63F69846F5C@gmail.com> <CALfReyeWxHjmqXhWiK4jbCvh3MktqKqnTBQjYgC0wDTgBcK5jg@mail.gmail.com>

On 20/08/2013 08:32, krad wrote:
> When I migrated a large mailspool in maildir format from the old NFS server
> to the new one in a previous job, I first generated a list of the top-level
> maildirs. I then generated the rsync commands, plus a few other bits and
> pieces, for each maildir to make a single transaction-like function. I then
> pumped all these auto-generated scripts into xjobs and ran them in parallel.
> This vastly sped up the process, as sequentially running the tree was far
> too slow. This was for about 15 million maildirs in a hashed structure, btw,
> so a fair amount of files.
>
>
> eg
>
> find /maildir -type d -maxdepth 4 | while read d
> do
> r=$(($RANDOM*$RANDOM))
> echo rsync -a $d/ /newpath/$d/ > /tmp/scripts/$r
> echo some other stuff >> /tmp/scripts/$r
> done
>
> ls /tmp/scripts/ | while read f
> do
> echo /tmp/scripts/$f
> done | xjobs -j 20
>

This isn't what I'd have expected, as running operations in parallel on 
mechanical drives would normally result in superfluous head movements 
and thus exacerbate the I/O bottleneck. The system must be optimising 
the requests from 20 parallel jobs better than I thought it would to 
climb out from that hole far enough to get a net benefit. Do you 
remember how any other approaches performed?
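
One way to check this would be to time the same copy at different levels 
of parallelism. What follows is only a minimal sketch, not the scripts 
described above: parallel_copy and its three arguments are made-up names, 
cp -Rp stands in for rsync so that nothing beyond the base system is 
needed, and only the directories directly under the source are copied 
(files sitting at the top level are skipped).

```sh
#!/bin/sh
# Hypothetical helper: copy every immediate subdirectory of SRC into DST,
# running at most JOBS copies at the same time.
# Assumption: cp -Rp is used here in place of rsync -a.
parallel_copy() {
    src=$1
    dst=$2
    jobs=$3

    mkdir -p "$dst" || return 1

    # find emits one NUL-terminated path per top-level directory;
    # xargs -P caps how many cp processes run concurrently.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} cp -Rp {} "$dst/"
}
```

Running it once with the job count set to 1 and once with 20, each time 
into an empty destination, and timing both runs should show whether the 
parallel version really wins on a given disk. The second run may be 
flattered by whatever the first one left in the cache, so the source 
would ideally be cold for both.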
