Date:      Thu, 20 May 2021 11:25:30 +0100
From:      Steven Hartland <killing@multiplay.co.uk>
To:        mike tancsa <mike@sentex.net>
Cc:        Alan Somers <asomers@freebsd.org>, freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: speeding up zfs send | recv (update)
Message-ID:  <CAHEMsqYf4JOL22R3+13kqSOaDMydnpsq9Z2mR4EFBg7u78FjSQ@mail.gmail.com>
In-Reply-To: <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>
References:  <866d6937-a4e8-bec3-d61b-07df3065fca9@sentex.net> <CAOtMX2gifUmgqwSKpRGcfzCm_=BX_szNF1AF8WTMfAmbrJ5UWA@mail.gmail.com> <f6ea3387-faf8-4c63-d1e7-906fa397b00b@sentex.net>


What is your pool structure / disk types?

On Mon, 17 May 2021 at 16:58, mike tancsa <mike@sentex.net> wrote:

> On 5/13/2021 11:37 AM, Alan Somers wrote:
> > On Thu, May 13, 2021 at 8:45 AM mike tancsa <mike@sentex.net
> > <mailto:mike@sentex.net>> wrote:
> >
> >     For offsite storage, I have been doing a zfs send across a 10G
> >     link and
> >     noticed something I don't understand with respect to speed.  I have a
> >
> >
> > Is this a high latency link?  ZFS send streams can be bursty.  Piping
> > the stream through mbuffer helps with that.  Just google "zfs send
> > mbuffer" for some examples.  And be aware that your speed may be
> > limited by the sender.  Especially if those small files are randomly
> > spread across the platter, your sending server's disks may be the
> > limiting factor.  Use gstat to check.
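> >
> > For example, something along these lines (the host names, snapshot and
> > mbuffer buffer sizes below are only placeholders):
> >
> >     zfs send pool/dataset@snap \
> >       | mbuffer -s 128k -m 1G \
> >       | ssh backuphost "zfs recv -F backuppool/dataset"
> >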
> > -Alan
>
>
> Just a quick follow up.  I was doing some tests with just mbuffer,
> mbuffer and ssh, and just ssh (aes128-gcm), with a compressed stream
> and a non-compressed stream (zfs send vs zfs send -c).  Generally, I
> didn't find much of a difference.  I was testing on a production server
> that is generally uniformly busy, so it won't be 100% reliable, but I
> think it is close enough as there is not much variance in the background
> load nor in the results.
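>
> For reference, the three setups were along these lines (the buffer
> sizes, port and host names below are approximate, not the literal
> commands):
>
>     # mbuffer only, over a raw TCP connection
>     receiver:  mbuffer -I 9090 -m 1G | zfs recv -F pool/dest
>     sender:    zfs send pool/src@snap | mbuffer -O receiver:9090 -m 1G
>
>     # mbuffer + ssh
>     zfs send pool/src@snap | mbuffer -m 1G | \
>         ssh receiver "zfs recv -F pool/dest"
>
>     # ssh only, aes128-gcm
>     zfs send pool/src@snap | \
>         ssh -c aes128-gcm@openssh.com receiver "zfs recv -F pool/dest"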
>
> I tried this both with datasets that were backups of mail spools (so
> LOTS of little files and big directories) and with zfs datasets holding
> a few big files.
>
> On the mail spool, just via mbuffer (no ssh involved at all):
>
> zfs send
> summary:  514 GiByte in 1h 09min 35.9sec - average of  126 MiB/s
> zfs send -c
> summary:  418 GiByte in 1h 05min 58.5sec - average of  108 MiB/s
>
> The same dataset, sent just through OpenSSH, took 1h:06m (zfs send) and
> 1h:01m (zfs send -c).
>
>
> On the large dataset (large VMDK files), a similar pattern. I did find
> one interesting thing when I was testing with a smaller dataset of just
> 12G.  As the server has 65G of RAM, 29G of it allocated to ARC, sending
> a zfs stream with -c made a giant difference. I guess there is some
> efficiency in sending something that's already compressed in ARC? Or
> maybe it's just all a cache effect.
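>
> One way to sanity check that would be to compare the dataset's logical
> vs physical size and its compression ratio, since -c ships the on-disk
> compressed records as-is (pool/dataset below is a placeholder):
>
>     zfs get -o name,property,value compressratio,used,logicalused pool/dataset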
>
> Testing with a dataset of about 1TB of referenced data, using mbuffer
> with and without ssh, and just ssh:
>
> zfs send with mbuffer and ssh
> summary:  772 GiByte in 51min 06.2sec - average of  258 MiB/s
> zfs send -c
> summary:  772 GiByte in 1h 22min 09.3sec - average of  160 MiB/s
>
> And the same dataset just with ssh -- zfs send 53min and zfs send -c 55min
>
> and just mbuffer (no ssh)
>
> summary:  772 GiByte in 56min 45.7sec - average of  232 MiB/s (zfs send -c)
> summary: 1224 GiByte in 53min 20.4sec - average of  392 MiB/s (zfs send)
>
> This seems to imply the disk is the bottleneck. mbuffer doesn't seem to
> make much of a difference either way.  Straight-up ssh looks to be fine
> / best.
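>
> Per Alan's suggestion, an easy way to confirm that is to watch the
> sending disks while a send is running and see whether they sit near
> 100% busy, e.g. (the pool name is a placeholder):
>
>     gstat -p -I 1s
>     zpool iostat -v <pool> 5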
>
> Next step is to allocate a pair of SSDs as special allocation class
> vdevs to see if it starts to make a difference for all that metadata. I
> guess I will have to send/resend the datasets to make sure they make
> full use of the special vdevs.
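>
> Roughly, that would be something like the following (the pool, dataset
> and device names and the small-blocks cutoff below are placeholders):
>
>     zpool add tank special mirror ada4p1 ada5p1
>     zfs set special_small_blocks=16K tank/mailspool
>
> Existing blocks are not migrated to the special vdev, hence the
> send/resend.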
>
>     ----Mike
>
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
