From: Don Lewis
Date: Mon, 13 Oct 2014 13:47:16 -0700 (PDT)
Subject: Re: getting to 4K disk blocks in ZFS
To: killing@multiplay.co.uk
Cc: freebsd-stable@FreeBSD.org, fullermd@over-yonder.net
Message-Id: <201410132047.s9DKlGxD030176@gw.catspoiler.org>
List-Id: Production branch of FreeBSD source code

On 13 Oct, Steven Hartland wrote:
> ----- Original Message -----
> From: "Matthew D. Fuller"
>
>> On Mon, Oct 13, 2014 at 11:48:27AM -0700 I heard the voice of
>> Darren Pilgrim, and lo! it spake thus:
>>>
>>> If the default is 4k and (for the limited time they're still common)
>>> you use true 512b disks, you can waste space.  Sure, but how much
>>> space?
>>
>> The median file in /usr/ports is 408 bytes.
>> Over 90% of the files are
>> under 2k, which means the wastage for them is over 100% (before
>> counting what gain compression might get).  A little offhand mathery
>> says it's about 78% extra overhead on the whole.
>>
>> And that includes the almost hundred megs (over 22% of the total size
>> of the FS) for the INDEX.db, plus the ~90 megs of the flat INDEX files
>> (another 20%).  If you pull those out, the overhead is 130%.
>>
>> (To be sure, relatively few people have ports trees eating most of
>> their space, but still; it's pretty pathological.  I for one did
>> decide some years back to always force 4k on any new FSen to make
>> future life simpler, accepting the bloat, but it's there.)
>
> And that's before you add the overhead if you're running RAIDZ...
>
> A good read on this is
> http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/

This is a timely subject.  I'm planning on moving my Cyrus IMAP mail
spool from a 4K/1K UFS filesystem to a three-drive raidz1.  It looks
like the UFS fragmentation overhead is about 2.4%.  ZFS ashift=12
increases that to about 17%.  Combine that with raidz and now the
overhead is about 40%.  Ouch!
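For anyone wanting to reproduce the arithmetic in this thread, here is a
rough Python sketch.  The block-rounding part is straightforward; the
raidz part follows the allocation rule described in the Delphix post
linked above (one parity sector per stripe's worth of data sectors, then
padding up to a multiple of nparity+1).  The function names and the
3-disk/ashift=12 defaults are illustrative, not anything from ZFS itself:

```python
import math

SECTOR = 4096  # ashift=12 means a 4K minimum allocation unit

def allocated(size, block=SECTOR):
    """Bytes actually consumed when a file is rounded up to whole blocks."""
    return math.ceil(size / block) * block

def overhead_pct(size, block=SECTOR):
    """Wasted space as a percentage of the file's own size."""
    return 100.0 * (allocated(size, block) - size) / size

def raidz_sectors(psize, ndisks=3, nparity=1, sector=SECTOR):
    """Sectors a raidz vdev spends on one logical block: data sectors,
    plus parity (one per ndisks-nparity data sectors), plus padding so
    the total is a multiple of nparity+1, per the rule described in the
    blog post above."""
    data = math.ceil(psize / sector)
    parity = math.ceil(data / (ndisks - nparity))
    total = data + parity
    pad = -total % (nparity + 1)
    return total + pad
```

On these assumptions, the 408-byte median ports file occupies one full
4K sector (roughly 900% overhead, hence "over 100%" for anything under
2k), and a single 4K block on a 3-disk raidz1 costs two sectors, i.e.
parity alone doubles the space for the smallest blocks.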