From: Paul Mather <paul@gromit.dlib.vt.edu>
Date: Mon, 7 Dec 2020 17:23:18 -0500
Subject: Re: effect of differing spindle speeds on prospective zfs vdevs
To: freebsd-questions@freebsd.org
Cc: tech-lists@zyxst.net

On Sat, 5 Dec 2020 19:16:33 +0000, tech-lists wrote:

> Hi,
>
> On Sat, Dec 05, 2020 at 08:51:08AM -0500, Paul Mather wrote:
>> IIRC, ZFS pools have a single ashift for the entire pool, so you should
>> set it to accommodate the 4096/4096 devices to avoid performance
>> degradation.  I believe it defaults to that now, and should auto-detect
>> anyway.  But, in a mixed setup of vdevs like you have, you should be
>> using ashift=12.
>>
>> I believe having ashift=9 on your mixed-drive setup would have the
>> biggest impact in terms of reducing performance.
>
> Part of my confusion about the ashift thing is I thought ashift=9 was
> for 512/512 logical/physical. Is this still the case?
>
> On a different machine, which has been running since FreeBSD 12 was
> -CURRENT, one of the disks in the array went bang. zdb shows ashift=9
> (as was the default when it was created). The only available
> replacement was an otherwise identical disk but 512 logical/4096
> physical. zpool status mildly warns about performance degradation
> like this:
>
>   ada2    ONLINE   0   0   0   block size: 512B configured, 4096B native
>
>  state: ONLINE
> status: One or more devices are configured to use a non-native block
>         size.  Expect reduced performance.
> action: Replace affected devices with devices that support the
>         configured block size, or migrate data to a properly
>         configured pool.
>
> The other part of my confusion is that I understood zfs to set its own
> blocksize on the fly.

You're correct in that ZFS has its own notion of a block size (the
"recordsize" property), but that is not the same as the block size
ashift is concerned with.  When "zpool" complains about a "non-native
block size" it is talking about the physical block size of the
underlying vdev.  That is the smallest unit of data that is read from
or written to the device.  (It also affects where partition boundaries
can be placed.)
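
(As an aside, in case it is useful: the commands below should show what
ashift a pool's vdevs actually ended up with and what sector sizes a
drive reports on FreeBSD.  The pool name "tank" and the device "ada2"
are only placeholders for illustration.)

  # Show the ashift actually in use by each vdev:
  zdb -C tank | grep ashift

  # Show the logical sector size ("sectorsize") and the physical sector
  # size ("stripesize") the drive advertises; a 512e drive typically
  # reports sectorsize 512 with stripesize 4096:
  diskinfo -v /dev/ada2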
When hard drives became larger, the number of bits used to address
logical blocks (LBAs) became insufficient to reference all the blocks
on the device.  One way around this, and to let devices store more data
overall, was to make the addressed blocks larger.  (Larger block sizes
are also good in that they require relatively less space for ECC data.)
Hence, the 4K "Advanced Format" drives arrived.  Before that, block
(a.k.a. sector) sizes for hard drives had typically been 512 bytes;
afterwards, they became 4096 bytes.

Some drives actually use 4096-byte sectors internally but advertise a
512-byte sector size to the outside world.  From a read standpoint this
doesn't create a problem.  It is when writing that you can incur
performance issues, because updating a 512-byte sector within a
4096-byte physical sector involves a read-modify-write operation: the
original 4096-byte contents must be read, the 512-byte subset updated,
and the new 4096-byte whole rewritten back to disk.  That is more work
than simply writing a 512-byte block as-is to a 512-byte sector.  (In
similar fashion, partitions not aligned on a 4K boundary can incur a
performance penalty on drives with 4096-byte physical sectors that
advertise as 512-byte.)

> (I guess there must be some performance degradation but it's not
> yet enough for me to notice. Or it might only be noticeable if low
> on space.)

ZFS does a lot of caching, and the ZIL "batches" writes, all of which
can ameliorate the effects of misaligned block sizes and partition
boundaries.  (Large sequential writes are best for performance,
especially on spinning disks, which pay penalties for head movement and
rotational delay.)  But if you have a write-intensive pool, you are
needlessly taking a performance hit by not using the correct ashift
and/or partition alignment.

BTW, low space mainly affects performance due to fragmentation.  That
is a different issue from mismatched block size (ashift).

When I replaced my ashift=9 512-byte drives I eventually recreated the
pool with ashift=12.  Using ashift=12 on pools with 512-byte-sector
drives does not incur any performance penalty, which is why ashift
defaults to 12 nowadays.  (I wouldn't be surprised if the default
changes to ashift=13 due to the prevalence of SSDs these days.)
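
(If you do end up rebuilding, or even just repartitioning a replacement
disk, the rough sketch below is one way to get both the partition
alignment and the ashift right on FreeBSD.  The device "ada3", the GPT
labels, and the pool name "tank" are placeholders; the
"zpool create -o ashift=12" form needs the newer OpenZFS-based ZFS,
while on the ZFS shipped with FreeBSD 12 and earlier the
min_auto_ashift sysctl on its own achieves the same thing for newly
created vdevs.)

  # Make sure ZFS never creates a new top-level vdev with ashift < 12:
  sysctl vfs.zfs.min_auto_ashift=12

  # Partition the disk with 1 MiB alignment (a multiple of 4K), so the
  # freebsd-zfs partition starts on a 4K boundary:
  gpart create -s gpt ada3
  gpart add -t freebsd-zfs -a 1m -l newdisk ada3

  # For a brand-new pool the ashift can also be forced explicitly at
  # creation time (OpenZFS syntax):
  zpool create -o ashift=12 tank mirror gpt/newdisk gpt/otherdisk

  # Verify what the vdevs actually got:
  zdb -C tank | grep ashift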
Cheers,

Paul.