From: mike tancsa <mike@sentex.net>
To: Alan Somers
Cc: freebsd-fs
Subject: Re: speeding up zfs send | recv (update)
Date: Mon, 17 May 2021 11:57:58 -0400

On 5/13/2021 11:37 AM, Alan Somers wrote:
> On Thu, May 13, 2021 at 8:45 AM mike tancsa wrote:
>
>     For offsite storage, I have been doing a zfs send across a 10G
>     link and noticed something I don't understand with respect to
>     speed.  I have a
>
> Is this a high latency link?  ZFS send streams can be bursty.  Piping
> the stream through mbuffer helps with that.  Just google "zfs send
> mbuffer" for some examples.  And be aware that your speed may be
> limited by the sender.  Especially if those small files are randomly
> spread across the platter, your sending server's disks may be the
> limiting factor.  Use gstat to check.
> -Alan

Just a quick follow up.  I was doing some tests with just mbuffer, mbuffer
and ssh, and just ssh (aes128-gcm), with a compressed and a non-compressed
stream (zfs send -c vs plain zfs send).  Generally, I didn't find much of a
difference.  I was testing on a production server that is generally
uniformly busy, so it won't be 100% reliable, but I think it is close
enough, as there is not much variance in the background load nor in the
results.
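For reference, the pipelines looked roughly like this (only a sketch -- the
pool/dataset, snapshot and host names, the port, and the mbuffer sizes
below are placeholders, not the exact ones I used):

  # plain ssh, aes128-gcm cipher
  zfs send -c tank/mail@snap | \
      ssh -c aes128-gcm@openssh.com backuphost "zfs recv -F backup/mail"

  # mbuffer on both ends, tunnelled over ssh
  zfs send -c tank/mail@snap | mbuffer -s 128k -m 1G | \
      ssh backuphost "mbuffer -s 128k -m 1G | zfs recv -F backup/mail"

  # mbuffer only, no ssh; the receiver runs something like
  #   mbuffer -s 128k -m 1G -I 9090 | zfs recv -F backup/mail
  zfs send -c tank/mail@snap | mbuffer -s 128k -m 1G -O backuphost:9090

  # gstat on the sending box shows whether the disks are the limiting factor
  gstat -p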
I tried this both with datasets that were backups of mail spools, so LOTS
of little files and big directories, as well as with zfs datasets holding a
few big files.  On the mail spool, just via mbuffer (no ssh involved at all):

zfs send
summary:  514 GiByte in 1h 09min 35.9sec - average of 126 MiB/s

zfs send -c
summary:  418 GiByte in 1h 05min 58.5sec - average of 108 MiB/s

The same dataset, sending just through OpenSSH, took 1h:06m (zfs send) and
1h:01m (zfs send -c).

On the large dataset (large VMDK files), similar pattern.  I did find one
interesting thing when testing with a smaller dataset of just 12G: as the
server has 65G of RAM, 29 of it allocated to ARC, sending a zfs stream with
-c made a giant difference.  I guess there is some efficiency in sending
something that is already compressed in ARC?  Or maybe it is just a cache
effect.

Testing with a dataset of about 1TB of referenced data, using mbuffer with
and without ssh, and just ssh:

zfs send with mbuffer and ssh
summary:  772 GiByte in 51min 06.2sec - average of 258 MiB/s

zfs send -c with mbuffer and ssh
summary:  772 GiByte in 1h 22min 09.3sec - average of 160 MiB/s

The same dataset just with ssh: zfs send 53min and zfs send -c 55min.

And just mbuffer (no ssh):
summary:  772 GiByte in 56min 45.7sec - average of 232 MiB/s (zfs send -c)
summary: 1224 GiByte in 53min 20.4sec - average of 392 MiB/s (zfs send)

This seems to imply the disk is the bottleneck.  mbuffer doesn't seem to
make much of a difference either way.  Straight up ssh looks to be fine /
best.

Next step is to allocate a pair of SSDs as special allocation class vdevs
to see if it starts to make a difference for all that metadata (roughly
along the lines of the sketch at the end of this message).  I guess I will
have to send/resend the datasets to make sure they make full use of the
special vdevs.

    ----Mike
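The special vdev sketch I have in mind, roughly (pool, dataset and device
names here are placeholders, not the real ones):

  # add a mirrored pair of SSDs as a special allocation class vdev
  zpool add tank special mirror ada4 ada5

  # optionally steer small data blocks onto the SSDs as well, per dataset
  zfs set special_small_blocks=32K tank/mailspool

  # only newly written metadata/data lands on the special vdev,
  # hence the plan to send/resend the datasets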