Date:      Fri, 11 Sep 2015 10:16:46 -0400
From:      "Chad J. Milios" <milios@ccsys.com>
To:        "William A. Mahaffey III" <wam@hiwaay.net>
Cc:        FreeBSD Questions !!!! <freebsd-questions@freebsd.org>
Subject:   Re: followup storage question
Message-ID:  <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>
In-Reply-To: <55F2D086.6060509@hiwaay.net>
References:  <55F2D086.6060509@hiwaay.net>

> On Sep 11, 2015, at 8:59 AM, William A. Mahaffey III <wam@hiwaay.net> wrote:
>
> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE
> illustrates using gnop to enforce 4K alignment of gpt partitions for
> subsequent use by ZFS. However the gpart commands also use the '-a 4k'
> arguments, aligning partitions on 4k boundaries as I understand things.
> Is the gnop command also necessary ? TIA & have a nice weekend.
>
> --
>
>    William A. Mahaffey III

Yes, both are necessary: they handle two separate facets of the same
underlying issue. One facet is the partition's alignment on the outer
device (what gpart's '-a 4k' takes care of); the other is the block size
that the partition's device node reports to ZFS (what gnop takes care of).
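
To illustrate the first facet only, here is a minimal Python sketch that
checks whether a partition's starting offset falls on a 4K boundary. The
function name, the 512-byte logical sector size and the example start
sectors are assumptions made for illustration. It says nothing about the
second facet, the block size the device node reports to ZFS.

```python
def is_aligned(start_sector, sector_size=512, alignment=4096):
    """Return True if the partition's byte offset is a multiple of alignment."""
    return (start_sector * sector_size) % alignment == 0


# Hypothetical start sectors on a disk with 512-byte logical sectors.
print(is_aligned(34))  # 34 * 512 = 17408, not a multiple of 4096
print(is_aligned(40))  # 40 * 512 = 20480 = 5 * 4096
```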

The latter (the reported block size) can effectively be handled a
different way in later versions of FreeBSD: there is a sysctl,
vfs.zfs.min_auto_ashift, which you can set to 12 for 4096-byte blocks or
9 for the default 512 bytes. (The ashift value is the exponent applied
to 2 to get the number of bytes in a block, so 2^12 = 4096.)
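
The ashift arithmetic can be sketched in a few lines of Python. The
function names here are made up for illustration; only the relationship
block size = 2^ashift comes from the explanation above.

```python
def ashift_to_block_size(ashift):
    """Block size in bytes for a given ashift (2 raised to the ashift)."""
    return 1 << ashift


def block_size_to_ashift(block_size):
    """Inverse mapping; the block size must be a power of two."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of two")
    return block_size.bit_length() - 1


print(ashift_to_block_size(12))   # 4096
print(ashift_to_block_size(9))    # 512
print(block_size_to_ashift(4096)) # 12
```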

The old gnop way still works just fine, so personally I still use that
method. It only has to be done when vdevs are created, added or
replaced* on the pool, not on every mount/import. After that, ZFS goes by
the formatting metadata it stamped on the vdev rather than by what the
ioctls of the device node say, so it will always write larger, correctly
aligned blocks.

(I'm not sure the same holds in the reverse direction, which is not a
typical use, and I know min_auto_ashift won't help there. By the reverse
direction I mean using gnop to present smaller blocks to ZFS than the
device node reports, say because you are willing to accept a certain
amount of write amplification in exchange for less space overhead when
storing lots of small files/directories/metadata. In that case you may
need to set up the gnop every time. I don't run any pools like that, so
it would take some testing and actual measurement before I could
confidently say that gnop can be skipped after the vdev is initialized.
Maybe someone will chime in here to let us know for sure. At any rate,
gnop is by its nature just about the fastest and lightest geom class
under the sun, and I believe you can keep thousands of instances running
busily in production and see no noticeable overhead.)
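
As a rough illustration of the space overhead mentioned in the
parenthetical, here is a simplified Python sketch that rounds a block of
data up to a whole number of 2^ashift-byte sectors. It is only a model
under stated assumptions: it ignores compression, raidz parity and
padding, and metadata, and the function name and sizes are hypothetical.

```python
def allocated_bytes(data_bytes, ashift):
    """Round data_bytes up to a multiple of the 2**ashift sector size."""
    sector = 1 << ashift
    sectors_needed = -(-data_bytes // sector)  # ceiling division
    return sectors_needed * sector


# A hypothetical 1 KiB block under each ashift.
print(allocated_bytes(1024, 9))   # 1024
print(allocated_bytes(1024, 12))  # 4096
```

In this simplified model a 1 KiB block takes four times the space at
ashift 12 that it takes at ashift 9, which is the kind of overhead
someone with many small files might want to avoid.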

*Yes, mind the gnop or the sysctl for ashift whenever replacing as well.
Ashift is a vdev property that is not copied as part of the data
resilvering; ZFS decides it for each vdev independently, even though
having mixed pools seems totally unintuitive. I've seen it forgotten at
replace time. And when you do use gnop, it's sort of a pain to get
gnop/ZFS to relinquish the vdev if you do an online replace and then want
to clear off the gnop. I'd just leave it on there: upon reboot it'll
disappear and ZFS will pick up the real vdev and properly do what you
want with it. There should be no problem with years of uptime in the
meantime and then coming up slightly differently on the next boot,
bypassing gnop and with the correct ashift throughout.


