Date:      Mon, 13 Oct 2014 16:51:21 -0700
From:      Charles Swiger <cswiger@mac.com>
To:        Don Lewis <truckman@FreeBSD.org>
Cc:        freebsd-stable@FreeBSD.org, lyndon@orthanc.ca
Subject:   Re: getting to 4K disk blocks in ZFS
Message-ID:  <2272C292-3FAE-49CB-968A-E31A606EC77C@mac.com>
In-Reply-To: <201410132302.s9DN2F91030438@gw.catspoiler.org>
References:  <201410132302.s9DN2F91030438@gw.catspoiler.org>

On Oct 13, 2014, at 4:02 PM, Don Lewis <truckman@FreeBSD.org> wrote:
> On 13 Oct, Charles Swiger wrote:
>> Hi--
>> 
>> On Oct 13, 2014, at 2:25 PM, Lyndon Nerenberg <lyndon@orthanc.ca>
>> wrote:
>> [ ... ]
>>> On any real-world system where you're running ZFS, it's unlikely the
>>> 4K block overhead is really going to be an issue.  And the underlying
>>> disk hardware is moving to 4K physical sectors, anyway.  Sooner or
>>> later you're just going to have to suck it up.
>> 
>> Or SSDs, which currently have anywhere from 2KB to 16KB "sectors".
> 
> Which is even worse because you're more likely to care about wasted
> space because of the much higher cost per byte.

Yep.  Mail spools see a surprising amount of disk activity, and while SSDs
do wonderfully on the read side and for seek times, lots of small writes
aren't something they handle very well.

>> I suspect that MIX -- http://en.wikipedia.org/wiki/MIX_%28Email%29 --
>> will gain in popularity.  Big messages are kept one per file, just as
>> Maildir does, but MIX also does a pretty good job of conserving inodes
>> (or equivalent) and minimizing wasted space from intrinsic
>> fragmentation due to filesystem blocksize by aggregating small
>> messages together.
> 
> Interesting, but it would be nice to have a more generic solution that
> could be used to solve the equivalent problem with /usr/ports and
> similar sorts of things.  For instance, it looks like /usr/src expands
> by quite a bit on an ashift=12 raidz1, though not quite as much as my
> mail spool.

For read-only data, it's easy enough to keep the tree in a tarball, ISO
image, or similar and let libarchive / bsdtar / mount -t cd9660 deal with it.
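
To illustrate the read-only case, here is a minimal sketch that uses Python's
standard tarfile module as a stand-in for libarchive / bsdtar.  The helper
names (build_archive, read_member) and the sample path are hypothetical; the
point is only that a single file can be read out of the archive without
unpacking the whole tree onto the filesystem.

```python
import io
import tarfile


def build_archive(files):
    """Pack a {name: bytes} mapping into an in-memory, uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(archive_bytes, name):
    """Return the contents of one archive member without extracting to disk."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        member = tar.extractfile(name)
        if member is None:
            raise ValueError(f"{name} is not a regular file in the archive")
        return member.read()


if __name__ == "__main__":
    # Hypothetical sample content standing in for a small file in a ports tree.
    archive = build_archive({"ports/Makefile": b"SUBDIR += mail\n"})
    print(read_member(archive, "ports/Makefile"))
```

The many small files then cost the filesystem one large object rather than
one minimum-size allocation each.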

As soon as you start trying to update pieces of that content, though, it
turns immediately back into the same hard problem.  Any storage medium
that imposes a minimum physical sector size is going to demand that the
filesystem on top of it honor that size, or else suffer expensive
read-modify-write cycles whenever the logical sector size is smaller than
the physical sector size.
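
The arithmetic behind both complaints can be sketched roughly as follows.
This is a deliberately simplified model, not how ZFS actually allocates: it
assumes every file is rounded up to a whole number of fixed-size blocks, and
it ignores compression, metadata, and raidz parity and padding.  The function
names and the sample file sizes are made up for illustration.

```python
def allocated_bytes(file_size, block_size):
    """Space consumed when a file is rounded up to whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def slack_bytes(file_sizes, block_size):
    """Total space lost to rounding across a set of files."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


def needs_read_modify_write(offset, length, physical_sector):
    """True if a write covers only part of at least one physical sector.

    A partially covered sector must be read, patched, and written back.
    """
    return offset % physical_sector != 0 or length % physical_sector != 0


if __name__ == "__main__":
    sizes = [100, 5000]  # hypothetical small files, in bytes
    print(slack_bytes(sizes, 512))
    print(slack_bytes(sizes, 4096))
    # A 512-byte logical write landing on a 4096-byte physical sector:
    print(needs_read_modify_write(512, 512, 4096))
```

In this toy model the two sample files waste 532 bytes with 512-byte blocks
and 7188 bytes with 4096-byte blocks, and the 512-byte write to a 4K-sector
device triggers a read-modify-write cycle.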

Regards,
-- 
-Chuck



