Date: Mon, 17 May 2021 11:57:58 -0400
From: mike tancsa <mike@sentex.net>
To: Alan Somers <asomers@freebsd.org>
Cc: freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: speeding up zfs send | recv (update)
Message-ID: <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>
In-Reply-To: <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com>
References: <866d6937-a4e8-bec3-d61b-07df3065fca9@sentex.net> <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com>
On 5/13/2021 11:37 AM, Alan Somers wrote:
> On Thu, May 13, 2021 at 8:45 AM mike tancsa <mike@sentex.net
> <mailto:mike@sentex.net>> wrote:
>
>> For offsite storage, I have been doing a zfs send across a 10G link and
>> noticed something I don't understand with respect to speed.  I have a
>
> Is this a high latency link?  ZFS send streams can be bursty.  Piping
> the stream through mbuffer helps with that.  Just google "zfs send
> mbuffer" for some examples.  And be aware that your speed may be
> limited by the sender.  Especially if those small files are randomly
> spread across the platter, your sending server's disks may be the
> limiting factor.  Use gstat to check.
> -Alan

Just a quick follow up.  I was doing some tests with just mbuffer, mbuffer
and ssh, and just ssh (aes128-gcm), with a compressed stream and a
non-compressed stream (zfs send vs zfs send -c).  Generally, I didn't find
too much of a difference.  I was testing on a production server that is
generally uniformly busy, so it won't be 100% reliable, but I think close
enough, as there is not much variance in the background load nor in the
results.

I tried this both with datasets that were backups of mail spools, so LOTS
of little files and big directories, as well as with zfs datasets holding a
few big files.  On the mail spool, just via mbuffer (no ssh involved at
all):

zfs send
summary:  514 GiByte in 1h 09min 35.9sec - average of  126 MiB/s

zfs send -c
summary:  418 GiByte in 1h 05min 58.5sec - average of  108 MiB/s

and the same dataset, sending just through OpenSSH, took 1h:06m (zfs send)
and 1h:01m (zfs send -c).

On the large dataset (large VMDK files), similar pattern.

I did find one interesting thing when I was testing with a smaller dataset
of just 12G.  As the server has 65G of RAM, 29G of it allocated to ARC,
sending a zfs stream with -c made a giant difference.  I guess there is
some efficiency in sending something that's already compressed in ARC?  Or
maybe it's just all cache effect.

Testing with a dataset of about 1TB of referenced data, using mbuffer with
and without ssh, and just ssh:

zfs send with mbuffer and ssh
summary:  772 GiByte in 51min 06.2sec - average of  258 MiB/s

zfs send -c
summary:  772 GiByte in 1h 22min 09.3sec - average of  160 MiB/s

And the same dataset just with ssh -- zfs send 53min and zfs send -c 55min.

and just mbuffer (no ssh)

summary:  772 GiByte in 56min 45.7sec - average of  232 MiB/s (zfs send -c)
summary: 1224 GiByte in 53min 20.4sec - average of  392 MiB/s (zfs send)

This seems to imply the disk is the bottleneck.  mbuffer doesn't seem to
make much of a difference either way; straight-up ssh looks to be fine /
best.  (Rough sketches of the pipelines I was comparing are at the end of
this message.)

Next step is to allocate a pair of SSDs as special allocation class vdevs
to see if that starts to make a difference for all that metadata.  I guess
I will have to re-send the datasets to make sure they make full use of the
special vdevs.

    ----Mike
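P.S. A rough sketch of the three send paths compared above -- the
hostnames, dataset/snapshot names, and buffer sizes below are placeholders
rather than the exact values used:

    # 1) plain ssh (aes128-gcm), uncompressed vs. compressed stream
    zfs send    tank/mail@snap | ssh -c aes128-gcm@openssh.com backuphost zfs recv -duF backup/mail
    zfs send -c tank/mail@snap | ssh -c aes128-gcm@openssh.com backuphost zfs recv -duF backup/mail

    # 2) mbuffer on both ends, carried over ssh
    zfs send tank/mail@snap | mbuffer -s 128k -m 1G | \
        ssh backuphost "mbuffer -s 128k -m 1G | zfs recv -duF backup/mail"

    # 3) mbuffer only, no ssh (mbuffer's own TCP mode)
    # on the receiver:
    mbuffer -I 9090 -s 128k -m 1G | zfs recv -duF backup/mail
    # on the sender:
    zfs send tank/mail@snap | mbuffer -O backuphost:9090 -s 128k -m 1G

The mbuffer-only variant avoids the ssh cipher overhead entirely, which is
why it was worth testing alongside the ssh runs.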
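To confirm the sending disks really are the limiting factor (per Alan's
gstat suggestion), watching them during a send with something like the
following should show whether they sit near 100% busy:

    # physical providers only, refreshed every second
    gstat -p -I 1s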
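And the special allocation class setup mentioned as the next step would
look roughly like this -- device names and the small-blocks cutoff are
examples only:

    # add a mirrored pair of SSDs as a special allocation class vdev
    zpool add tank special mirror /dev/ada4 /dev/ada5

    # optionally route small data blocks (not just metadata) to the SSDs
    zfs set special_small_blocks=32K tank/mail

Existing metadata stays on the original vdevs, which is why the datasets
would need to be re-sent (or rewritten) before the special vdev sees full
use.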