From owner-freebsd-questions@freebsd.org Fri Sep 11 14:24:26 2015
Subject: Re: followup storage question
References: <55F2D086.6060509@hiwaay.net> <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>
Cc: FreeBSD Questions !!!!
From: "William A. Mahaffey III"
Message-ID: <55F2E417.7040704@hiwaay.net>
Date: Fri, 11 Sep 2015 09:29:53 -0453.75
In-Reply-To: <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>

On 09/11/15 09:23, Chad J. Milios wrote:
>> On Sep 11, 2015, at 8:59 AM, William A.
>> Mahaffey III wrote:
>>
>>
>>
>> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE
>> illustrates using gnop to enforce 4K alignment of gpt partitions for
>> subsequent use by ZFS. However the gpart commands also use the '-a 4k'
>> arguments, aligning partitions on 4k boundaries as I understand things.
>> Is the gnop command also necessary ? TIA & have a nice weekend.
>>
>>
>> -- 
>>
>> William A. Mahaffey III
> Yes, handling separately both facets of the same underlying issue is
> necessary. Those facets being the partition's alignment upon the outer
> device and the partition's block size that the device node reports to ZFS.
>
> The latter can be done a different way, effectively, in later versions of
> FreeBSD there is a sysctl, vfs.zfs.min_auto_ashift which you can set to 12
> for 4096 byte blocks or 9 for the default 512 bytes. (The ashift value is
> the exponent over the number 2 to get the number of bytes in a block.)
>
> The old gnop way still works just fine so I still use that method,
> personally. This definitely only has to be done when vdev(s) are
> added/created/replaced* on the pool, not on every mount/import, by then
> ZFS clearly listens to the formatting metadata it stamped on the vdev
> instead of what the ioctls of the device node say and so will always write
> larger and correctly aligned blocks. (I'm not sure the reverse direction,
> not a typical use, if it holds true without gnop every time, and I know
> the min_auto_ashift won't help there, being if for some reason you intend
> gnop for simulating smaller blocks to ZFS from larger device node blocks,
> say you wanted to allow a certain amount of write amplification for more
> efficiently storing lots of small files/directories/metadata. In that case
> you may need to enable the gnop every time. I'm not sure because I don't
> run any pools that may but I know you can if you want for that reason,
> space overhead.
> It'd take some testing and actual measurement for me to confidently decide
> gnop can be subsequently skipped after the vdev initialization if going in
> that opposite direction was your goal. Maybe someone chimes in here to let
> us know for sure. At any rate, gnop is by its nature just about the
> fastest and lightest geom class under the sun and I believe you can keep
> running thousands of instances busily in production and see no noticeable
> overhead.)
>
> *Yes, mind the gnop or sysctl for ashift whenever replacing as well, it's
> a vdev property not copied as part of the data resilvering, it's decided
> by ZFS for each vdev independently even though having mixed pools seems
> totally unintuitive. I've seen where it's been forgotten at replace time.
> Then when you do use it, it's sort of a pain to get gnop/ZFS to relinquish
> the vdev if you do an online replace and then want to try to clear off the
> gnop mode. I'd just leave it on there and upon reboot it'll disappear and
> ZFS will pick up the real vdev and properly do what you want with it.
> There should be no problem with years of uptime in the meantime and then
> coming up slightly differently on next boot bypassing gnop and with all
> correct ashift.

Excellent, clear as a bell :-). Thanks.

-- 

	William A. Mahaffey III

 ----------------------------------------------------------------------

	"The M1 Garand is without doubt the finest implement of war
	 ever devised by man."
                           -- Gen. George S. Patton Jr.
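
For reference, the ashift arithmetic and the two approaches discussed in
this thread (the vfs.zfs.min_auto_ashift sysctl and the gnop shim) can be
sketched roughly as below. This is only a dry-run sketch under stated
assumptions: the pool name "tank" and the partition "ada0p3" are
hypothetical, the helper function names are made up for illustration, and
the FreeBSD commands are printed rather than executed, so nothing here
touches a disk. The export/destroy/import sequence is one common way to
drop the gnop layer once the vdev exists; the thread itself only suggests
leaving it until the next reboot.

```shell
#!/bin/sh
# ashift is the base-2 exponent of the block size: bytes = 2^ashift.
ashift_to_bytes() {
    echo $((1 << $1))
}

# Inverse: smallest ashift whose block size is at least the given byte count.
bytes_to_ashift() {
    bytes=$1
    shift_val=0
    while [ $((1 << shift_val)) -lt "$bytes" ]; do
        shift_val=$((shift_val + 1))
    done
    echo "$shift_val"
}

# Print (do not run) the sysctl approach available on later FreeBSD versions.
# Arguments: hypothetical pool name, hypothetical partition name.
print_sysctl_plan() {
    pool=$1
    part=$2
    echo "sysctl vfs.zfs.min_auto_ashift=$(bytes_to_ashift 4096)"
    echo "zpool create ${pool} ${part}"
}

# Print (do not run) the older gnop approach: present a 4096-byte-sector
# shim to ZFS when the vdev is created, then remove the shim afterwards.
print_gnop_plan() {
    pool=$1
    part=$2
    echo "gnop create -S 4096 ${part}"
    echo "zpool create ${pool} ${part}.nop"
    echo "zpool export ${pool}"
    echo "gnop destroy ${part}.nop"
    echo "zpool import ${pool}"
}

ashift_to_bytes 9     # prints 512
ashift_to_bytes 12    # prints 4096
print_gnop_plan tank ada0p3
```

Per the reply quoted above, either step matters only when a vdev is
created, added, or replaced; once ZFS has recorded the ashift on the vdev,
later imports use that recorded value.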