Date:      Mon, 17 May 2021 11:57:58 -0400
From:      mike tancsa <mike@sentex.net>
To:        Alan Somers <asomers@freebsd.org>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: speeding up zfs send | recv (update)
Message-ID:  <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>
In-Reply-To: <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com>
References:  <866d6937-a4e8-bec3-d61b-07df3065fca9@sentex.net> <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com>

On 5/13/2021 11:37 AM, Alan Somers wrote:
> On Thu, May 13, 2021 at 8:45 AM mike tancsa <mike@sentex.net
> <mailto:mike@sentex.net>> wrote:
>
>     For offsite storage, I have been doing a zfs send across a 10G
>     link and noticed something I don't understand with respect to
>     speed.  I have a
>
> Is this a high latency link?  ZFS send streams can be bursty.  Piping
> the stream through mbuffer helps with that.  Just google "zfs send
> mbuffer" for some examples.  And be aware that your speed may be
> limited by the sender.  Especially if those small files are randomly
> spread across the platter, your sending server's disks may be the
> limiting factor.  Use gstat to check.
> -Alan


Just a quick follow up.  I was doing some tests with just mbuffer,
mbuffer and ssh, and just ssh (aes128-gcm), with both a compressed and a
non-compressed stream (zfs send vs zfs send -c).  Generally, I didn't
find much of a difference.  I was testing on a production server that is
generally uniformly busy, so it won't be 100% reliable, but I think close
enough, as there is not much variance in the background load nor in the
results.
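
Roughly, the pipelines looked like the sketch below (pool, snapshot and
host names are made up for illustration, and the mbuffer buffer sizes
are just what I normally reach for, nothing tuned):

    # mbuffer only (no ssh): mbuffer-to-mbuffer over TCP, port picked arbitrarily
    # on the receiver:
    mbuffer -I 9090 -s 128k -m 1G | zfs recv -F backup/mailspool
    # on the sender:
    zfs send tank/mailspool@snap | mbuffer -O recvhost:9090 -s 128k -m 1G

    # mbuffer + ssh:
    zfs send tank/mailspool@snap | mbuffer -s 128k -m 1G | \
        ssh recvhost "zfs recv -F backup/mailspool"

    # the compressed runs are identical except for "zfs send -c"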

I tried this both with datasets that were backups of mailspools, so LOTS
of little files and big directories, as well as with zfs datasets with a
few big files.

On the mail spool, just via mbuffer (no ssh involved at all):

zfs send
summary:  514 GiByte in 1h 09min 35.9sec - average of 126 MiB/s
zfs send -c
summary:  418 GiByte in 1h 05min 58.5sec - average of 108 MiB/s

and the same dataset, sending just through OpenSSH, took 1h:06m (zfs
send) and 1h:01m (zfs send -c).
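
For the ssh-only runs the pipe is essentially just the following
(hypothetical names again; the cipher is forced on the command line,
aes128-gcm@openssh.com being the full OpenSSH name for the aes128-gcm I
mentioned):

    zfs send tank/mailspool@snap | \
        ssh -c aes128-gcm@openssh.com recvhost "zfs recv -F backup/mailspool"
    # compressed variant: same thing with "zfs send -c"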


On the large dataset (large VMDK files), similar pattern. I did find one
interesting thing when I was testing with a smaller dataset of just
12G.  As the server has 65G of RAM, 29 of which is allocated to ARC,
sending a zfs stream with -c made a giant difference. I guess there is
some efficiency in sending something that's already compressed in ARC?
Or maybe it's just all cache effect.
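
(For anyone wanting to check the cache angle themselves, this is roughly
how I read the ARC numbers on this box; the exact sysctl names may
differ between FreeBSD releases:)

    sysctl kstat.zfs.misc.arcstats.size   # current ARC size in bytes
    sysctl vfs.zfs.arc_max                # configured ARC ceiling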

Testing with a dataset of about 1TB of referenced data, using mbuffer
with and without ssh, and just ssh:

zfs send with mbuffer and ssh
summary:  772 GiByte in 51min 06.2sec - average of 258 MiB/s
zfs send -c
summary:  772 GiByte in 1h 22min 09.3sec - average of 160 MiB/s

And the same dataset just with ssh -- zfs send 53min and zfs send -c 55min

and just mbuffer (no ssh)

summary:  772 GiByte in 56min 45.7sec - average of 232 MiB/s (zfs send -c)
summary: 1224 GiByte in 53min 20.4sec - average of 392 MiB/s (zfs send)

This seems to imply the disk is the bottleneck. mbuffer doesn't seem to
make much of a difference either way.  Straight up ssh looks to be fine
/ best.
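
(The "disk is the bottleneck" read comes from watching the sending
pool's disks during the runs, along the lines of Alan's gstat
suggestion; something like the following, where -p limits the output to
physical providers:)

    gstat -p -I 1s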

Next step is to allocate a pair of SSDs as special allocation class
vdevs to see if it starts to make a difference for all that metadata. I
guess I will have to send/resend the datasets to make sure they make
full use of the special vdevs.
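
For anyone curious, the plan is roughly the sketch below (pool and
device names are placeholders, and the special vdev is mirrored since
losing it means losing the pool; the special_small_blocks bit is
optional and just pushes small file blocks onto the SSDs as well):

    zpool add tank special mirror ada4 ada5
    zfs set special_small_blocks=32K tank/mailspool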

    ----Mike




