From: "Steven Hartland"
To: "Aristedes Maniatis", "Stefan Esser", "freebsd-stable"
References: <540FF3C4.6010305@ish.com.au> <54100258.2000505@freebsd.org> <5410F0B4.9040808@ish.com.au>
Subject: Re: getting to 4K disk blocks in ZFS
Date: Thu, 11 Sep 2014 02:22:14 +0100
List-Id: Production branch of FreeBSD source code

----- Original Message -----
From: "Aristedes Maniatis"
To: "Stefan Esser";
"freebsd-stable"
Sent: Thursday, September 11, 2014 1:45 AM
Subject: Re: getting to 4K disk blocks in ZFS

> Thanks Stefan and Peter for the highly informative posts.
>
> On 10/09/2014 5:48pm, Stefan Esser wrote:
>> ZFS uses variable block sizes by breaking down large blocks to smaller
>> fragments as suitable for the data to be stored. The largest block to
>> be used is configurable (128 KByte by default) and the smallest fragment
>> is the sector size (i.e. 512 or 4096 bytes), as configured by "ashift".
>
> So this means that the ZFS developers would need to effectively (re)fragment the entire pool if they wanted to develop a way to
> increase the ashift size. This sounds like something that isn't going to be solved in the near future (less than three years) if
> it is a similar technical problem to inserting another disk into an existing vdev.
>
> And that means that as it becomes harder to buy older 512 byte disks, everyone with a ZFS pool is going to be stuck with managing
> quite a lot of downtime as they upgrade. And even more pain if they boot off that pool.
>
>
> On 10/09/2014 4:51pm, Peter Wemm wrote:
>> For what its worth, in the freebsd.org cluster we automatically align
>> everything to a minimum of 4k, no matter what the actual drive is.
>>
>> We set: sysctl vfs.zfs.min_auto_ashift=12
>> (this saves a lot of messing around with gnop etc)
>>
>> and ensure all the gpt slices are 4k or better aligned.
>
> Should the FreeBSD project change this minimum in the next release?
> There seems to be no downside and a huge amount of pain for people
> who stumble along with the defaults not knowing what a mess they are
> creating to solve later.

The downside is wasted space, which can be significant, and hence when I
last suggested just this it was unfortunately rejected.

We still maintain a local patch to our source tree which does just this
because, as you've mentioned, we don't want the pain, so it's easier to
just run everything as 4k.

    Regards
    Steve
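
To put rough numbers on the "wasted space" and the "4k or better aligned" points above, here is a small Python sketch. It is only an illustration under simplifying assumptions: it treats every block as being rounded up to a whole number of 2**ashift-byte sectors and ignores compression, RAID-Z parity and padding, and metadata overhead. The function names allocated_size and is_aligned are made up for this example and are not part of ZFS or any FreeBSD tool.

```python
def allocated_size(logical_bytes: int, ashift: int) -> int:
    """Round a block up to a whole number of 2**ashift-byte sectors.

    Simplified model (assumption): no compression, RAID-Z or metadata.
    """
    sector = 1 << ashift
    # Ceiling division, then scale back up to bytes.
    return -(-logical_bytes // sector) * sector


def is_aligned(start_lba: int, lba_bytes: int = 512, alignment: int = 4096) -> bool:
    """Return True if a partition starting at start_lba sits on an alignment boundary."""
    return (start_lba * lba_bytes) % alignment == 0


if __name__ == "__main__":
    # Compare the space taken by a few block sizes at ashift=9 and ashift=12.
    for size in (700, 2048, 5000, 131072):
        a9 = allocated_size(size, 9)
        a12 = allocated_size(size, 12)
        print(f"{size:>7} B  ashift=9: {a9:>7}  ashift=12: {a12:>7}  extra: {a12 - a9}")

    # Check a few partition start sectors (512-byte LBAs) for 4k alignment.
    for lba in (34, 40, 2048):
        print(f"start LBA {lba}: 4k aligned = {is_aligned(lba)}")
```

In this model a 700-byte block takes 1024 bytes at ashift=9 but 4096 bytes at ashift=12, while a full 128 KByte block costs the same either way, so the overhead depends heavily on how many small blocks a pool holds.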