Date:      Mon, 29 Sep 2008 01:12:08 -0400
From:      "Zaphod Beeblebrox" <zbeeble@gmail.com>
To:        "Jeremy Chadwick" <koitsu@freebsd.org>
Cc:        Derek Kuliński <takeda@takeda.tk>, freebsd-stable@freebsd.org, Clint Olsen <clint.olsen@gmail.com>, pjd@freebsd.org
Subject:   Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
Message-ID:  <5f67a8c40809282212s27298b52p50935ae1976d5df4@mail.gmail.com>
In-Reply-To: <20080929040025.GA97332@icarus.home.lan>
References:  <20080921213426.GA13923@0lsen.net> <20080921215930.GA25826@0lsen.net> <20080921220720.GA9847@icarus.home.lan> <249873145.20080926213341@takeda.tk> <20080927051413.GA42700@icarus.home.lan> <765067435.20080926223557@takeda.tk> <20080927064417.GA43638@icarus.home.lan> <588787159.20080927003750@takeda.tk> <5f67a8c40809282030l7888d942q548d570cd0b33be9@mail.gmail.com> <20080929040025.GA97332@icarus.home.lan>

On Mon, Sep 29, 2008 at 12:00 AM, Jeremy Chadwick <koitsu@freebsd.org> wrote:

> On Sun, Sep 28, 2008 at 11:30:01PM -0400, Zaphod Beeblebrox wrote:
>
> > However, as a core general purpose filesystem, it seems to have flaws,
> > not the least of which is a re-separation of file cache and memory
> > cache.  This virtually doesn't matter for a fileserver, but is
> > generally important in a general purpose local filesystem.  ZFS also
> > has a transactional nature --- which probably, again, works well in a
> > fileserver, but I find (as a local filesystem) it introduces
> > unpredictable delays as the buffer fills up and then gets flushed en
> > masse.
>
> I'm curious to know how Solaris deals with these problems, since the
> default filesystem (AFAIK) in OpenSolaris is now ZFS.  CC'ing pjd@ who
> might have some insight there.


I certainly am not implying that it won't work as a local filesystem, simply
that this design choice may not be ideal for completely generalized local
workloads --- the same workloads that drove UN*X in general toward unified
buffer caches, which appear to have been implemented independently by every
major UN*X vendor... Solaris may even have been the first.

The ARC is separate from the general VM cache in Solaris, too, IIRC.
Solaris' UFS still uses a unified cache.

Most of the problems where ZFS runs the machine out of kernel memory (or
fights with other filesystems for memory, etc.) are due to the effects of
its non-unified cache.  Solaris and new patches to FreeBSD seem to make
this play better, but the fundamental reason for unifying the filesystem
and memory caches was the payoff that local applications' memory and file
usage would balance out better if the buffering of files and memory came
not just from the same pool of memory but was in fact the "same thing".
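
As an aside, you can watch that tug-of-war from userland.  A minimal
sketch in C (assuming the kstat.zfs.misc.arcstats.size and vfs.zfs.arc_max
sysctl nodes that the FreeBSD ZFS port exposes) to compare the ARC's
current footprint against its configured ceiling:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdint.h>
#include <stdio.h>

/* Read a 64-bit sysctl node by name; returns 0 on failure. */
static uint64_t
get_u64_sysctl(const char *name)
{
	uint64_t val = 0;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) == -1) {
		perror(name);
		return (0);
	}
	return (val);
}

int
main(void)
{
	/* Current ARC footprint vs. its configured ceiling. */
	uint64_t arc_size = get_u64_sysctl("kstat.zfs.misc.arcstats.size");
	uint64_t arc_max = get_u64_sysctl("vfs.zfs.arc_max");

	printf("ARC size: %ju MB (max %ju MB)\n",
	    (uintmax_t)(arc_size >> 20), (uintmax_t)(arc_max >> 20));
	return (0);
}

Lowering vfs.zfs.arc_max in /boot/loader.conf is the usual knob when the
ARC crowds everything else out.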

Historically, you had the file cache being a fixed percentage of memory
(say 10%).  The next innovation (I seem to remember my HP-UX 9 workstation
doing this) was to have the division of memory between the file and memory
caches move dynamically.  This was better but still non-optimal, and it is
the state of affairs now with ZFS too.  Unified caches sprang up in UN*X
derivatives shortly thereafter... where caching a file and caching memory
were one and the same.  This is where UFS sits.

Expanding on my post, if the job is to serve network disk, the
dynamic-division and unified-cache strategies probably don't make much
difference.  The "Thumper" offering from Sun gives you 48 SATA disks, two
dual-core Opterons, and 16 GB of memory.  The obvious intention is that
most of that 16 GB ends up, in the end, as cache for the files (all in 4U
and all externally accessible --- very cool, BTW).

But a general-purpose machine is executing many of those libraries and
binaries, and mmap()ing many of those files... both operations where the
unified strategy was designed to win.
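
To make that concrete, here is a minimal sketch of the mmap() path the
unified cache is built for: map a file, fault it in, then ask mincore()
which pages ended up resident.  With a unified cache those resident pages
*are* the file cache; there is no second copy.  (The libc path is just a
placeholder; error handling is abbreviated.)

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	/* Placeholder file; any large read-only file will do. */
	int fd = open("/lib/libc.so.7", O_RDONLY);
	struct stat sb;

	if (fd == -1 || fstat(fd, &sb) == -1) {
		perror("open/fstat");
		return (1);
	}

	/* Map the file; under a unified cache these pages live in the
	 * same pool as anonymous memory, not a separate file cache. */
	char *p = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return (1);
	}

	/* Touch every page to fault the file in. */
	long pagesz = sysconf(_SC_PAGESIZE);
	volatile char sum = 0;
	for (off_t off = 0; off < sb.st_size; off += pagesz)
		sum += p[off];

	/* Ask the VM which of the mapping's pages are resident. */
	size_t npages = (sb.st_size + pagesz - 1) / pagesz;
	char *vec = malloc(npages);
	if (vec != NULL && mincore(p, sb.st_size, vec) == 0) {
		size_t resident = 0;
		for (size_t i = 0; i < npages; i++)
			if (vec[i] & MINCORE_INCORE)
				resident++;
		printf("%zu of %zu pages resident\n", resident, npages);
	}

	free(vec);
	munmap(p, sb.st_size);
	close(fd);
	return (0);
}

Run it against a binary you just executed and most of those pages will
typically be resident before the loop even touches them, courtesy of the
same cache that exec() filled.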


