Date:      Fri, 11 Sep 2015 09:29:53 -0453.75
From:      "William A. Mahaffey III" <wam@hiwaay.net>
Cc:        FreeBSD Questions !!!! <freebsd-questions@freebsd.org>
Subject:   Re: followup storage question
Message-ID:  <55F2E417.7040704@hiwaay.net>
In-Reply-To: <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>
References:  <55F2D086.6060509@hiwaay.net> <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>

On 09/11/15 09:23, Chad J. Milios wrote:
>> On Sep 11, 2015, at 8:59 AM, William A. Mahaffey III <wam@hiwaay.net> wrote:
>>
>>
>>
>> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE
>> illustrates using gnop to enforce 4K alignment of GPT partitions for
>> subsequent use by ZFS. However, the gpart commands also use the '-a 4k'
>> argument, which aligns partitions on 4K boundaries as I understand
>> things. Is the gnop command also necessary? TIA & have a nice weekend.
>>
>>
>> -- 
>>
>>     William A. Mahaffey III
> Yes, both are necessary. They handle two separate facets of the same
> underlying issue: the partition's alignment on the outer device, and
> the block size that the partition's device node reports to ZFS.
>
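
To make the first facet concrete, here is a minimal Python sketch of the
alignment arithmetic. It is only an illustration: the function name and the
512-byte logical sector size are assumptions of mine, not something taken
from the wiki page or from gpart itself.

```python
SECTOR_BYTES = 512  # assumed logical sector size for partition offsets


def is_aligned(start_sector: int, alignment_bytes: int = 4096,
               sector_bytes: int = SECTOR_BYTES) -> bool:
    """Return True if a partition starting at start_sector begins on an
    alignment_bytes boundary of the underlying device."""
    return (start_sector * sector_bytes) % alignment_bytes == 0


if __name__ == "__main__":
    # Sector 40 is 20480 bytes in, a multiple of 4096.
    print(is_aligned(40))
    # Sector 34 is 17408 bytes in, which is not a multiple of 4096.
    print(is_aligned(34))
```

This covers only where the partition starts. It says nothing about the block
size the device node reports to ZFS, which is the second facet.
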
> The latter can also be done a different way: later versions of FreeBSD
> have a sysctl, vfs.zfs.min_auto_ashift, which you can set to 12 for
> 4096-byte blocks or 9 for the default 512 bytes. (The ashift value is
> the power of 2 that gives the number of bytes in a block, so 2^12 =
> 4096.)
>
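
The relationship between ashift and block size can be sketched in a few
lines of Python. This is just the arithmetic described above; the helper
names are hypothetical and nothing here talks to ZFS or sysctl.

```python
def block_size_for_ashift(ashift: int) -> int:
    """Block size in bytes for a given ashift: 2 raised to ashift."""
    return 1 << ashift


def ashift_for_block_size(block_bytes: int) -> int:
    """Inverse mapping; only defined for power-of-two block sizes."""
    if block_bytes <= 0 or block_bytes & (block_bytes - 1):
        raise ValueError("block size must be a positive power of two")
    return block_bytes.bit_length() - 1


if __name__ == "__main__":
    for ashift in (9, 12):
        print(ashift, block_size_for_ashift(ashift))
```

So a drive with 4096-byte physical sectors corresponds to ashift 12, and
the 512-byte default corresponds to ashift 9.
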
> The old gnop way still works just fine, so I still use that method
> personally. It only has to be done when vdevs are added, created, or
> replaced* on the pool, not on every mount/import. After that, ZFS
> follows the formatting metadata it stamped on the vdev rather than what
> the device node's ioctls report, so it will always write the larger,
> correctly aligned blocks.
>
> (I'm not sure whether the same holds in the reverse direction, which is
> not a typical use: using gnop to present smaller blocks to ZFS than the
> device node reports, say because you are willing to accept some write
> amplification in exchange for storing lots of small
> files/directories/metadata more efficiently. I know min_auto_ashift
> won't help there, and in that case you may need to enable gnop on every
> import. I don't run any pools like that, but I know you can if space
> overhead is your concern. It would take some testing and actual
> measurement before I could confidently say gnop can be skipped after
> vdev initialization when going in that opposite direction. Maybe someone
> will chime in to let us know for sure. At any rate, gnop is by its
> nature just about the fastest and lightest GEOM class under the sun, and
> I believe you can keep thousands of instances running busily in
> production and see no noticeable overhead.)
>
> *Yes, mind the gnop or the ashift sysctl whenever replacing as well.
> ashift is a vdev property that is not copied as part of the data
> resilvering; ZFS decides it for each vdev independently, even though
> having mixed pools seems totally unintuitive. I've seen it forgotten at
> replace time. And when you do use gnop, it's sort of a pain to get
> gnop/ZFS to relinquish the vdev if you do an online replace and then
> want to clear off the gnop layer. I'd just leave it in place: on reboot
> it will disappear, and ZFS will pick up the real vdev and do what you
> want with it. There should be no problem with years of uptime in the
> meantime and then coming up slightly differently on the next boot,
> bypassing gnop and with the correct ashift throughout.


Excellent, clear as a bell :-). Thanks.


-- 

	William A. Mahaffey III

  ----------------------------------------------------------------------

	"The M1 Garand is without doubt the finest implement of war
	 ever devised by man."
                            -- Gen. George S. Patton Jr.




