Date:      Tue, 30 Mar 1999 18:36:20 -0500 (EST)
From:      "John S. Dyson" <toor@dyson.iquest.net>
To:        tlambert@primenet.com (Terry Lambert)
Cc:        tlambert@primenet.com, unknown@riverstyx.net, dyson@iquest.net, freebsd-chat@FreeBSD.ORG
Subject:   Re: Linux vs. FreeBSD: The Storage Wars
Message-ID:  <199903302336.SAA16847@dyson.iquest.net>
In-Reply-To: <199903302207.PAA05079@usr04.primenet.com> from Terry Lambert at "Mar 30, 99 10:07:43 pm"

> > > Why doesn't FreeBSD FS stacking work?
> >
> > It never did, and there hasn't been much demand.
> 
> Hey, speak for yourself.  I've gone so far as to approach John Heidemann
> about rereleasing his code donated to the CSRG under the GPL for a
> Linux implementation (yes, I'm deadly serious).
> 
> John's stuff worked before it was damaged into inoperability, and
> it currently works fine on BSDI.
>
With coherency?  It sure wasn't coherent (per the original 4.4 code.)

> 
> 
> > It actually would be worthwhile to totally remove the stacking, or fix
> > it with a VM approach.  It is totally wrong to use buffer/VP approach,
> > but there are those who advocate it (too many people are "bp" heads --
> > bp's are good only for I/O, not object or caching representation.)
> 
> It is wrong to think of vnodes as caching objects instead of backing
> objects.
>
I agree with that, but I didn't say anything contrary to that.  I do
claim that the concept of 'bp' caching is the wrong concept.

> 
> Yes, I know all of the unified VM and buffer cache centric arguments
> in favor of this, but the point of having a well defined framework
> and API is the ability to share FS code with other OS's.
>
I agree that the framework should be better defined, once it works.  The
current framework, if defined, will still not work.  The current framework,
with all of Terry's fixes that have been proposed to me, will still not
work.  Many of the important problems are orthogonal to what is being
fixed.

>
>  And not all
> other OS's have unified VM and buffer caches.
>
Most OSes (for servers/workstations) competing with FreeBSD that don't have
merged caches are missing a major capability.  (IMO, are broken.)

>
>  Implementation of a
> common API must take into account the lowest common denominator, or
> you will be creating a FreeBSD specific API that is not generally useful.
> 
My concept can support that, and has been well thought out.  However,
it is frightening that someone might still stay with the old, broken 'bp'
approach that is guaranteed to limit flexibility and capability.  A 'bp'
is too concrete, but so is a 'vnode' as it is today.  Given a choice, I
would keep the concept of a 'vnode' though.  (At least the notion of
a 'vnode' can be abstracted such that the current usage isn't damaging.)

> 
> > FS stacking will not help to gain commercial work, but properly working
> > reasonably sized file I/O does.
> 
> It is (or should be, since the announcement on these lists last week)
> well known that Veritas is porting to Linux.
>
Good for them.

> 
> This code would work in FreeBSD as well, if the Linux and FreeBSD
> VFS stacking frameworks were identical API's.
> 
Just don't adopt a broken scheme that doesn't support networked layers.

>
> Because the API's are not identical (in fact, both are sufficiently
> fluid and architecturally damaged as to render them nearly useless),
> this work is bound to be another checkmark in the Linux column that
> will remain absent from the FreeBSD column.
>
If I ever get to implement an API, it will be very clean and not
limited by the type of scheme that FreeBSD has.

> 
> > >  Why was LFS broken, and then not fixed?
> >
> > It was always broken, and has always been basically a festering mess.
> 
> I think Margo Seltzer would take some issue with this.  I would trust
> her authority as an FS expert above that of anyone in the core team;
> after all, file systems are her life's work.
>
Don't set up a disagreement between me and her.  All it takes is
a competent programmer to see that the code is a mess.  Maybe some
of her ideas are great, but the implementation is pretty messy.
It even does copying to build the segments, and is limited by the
way it uses the buffer cache :-(.  Hack alert.  (I have written
my share of messy code, so it takes one to know one :-)).

A complete reimplementation of what exists, plus an intelligent
cleaner would be necessary to bring it up to the quality of FFS.
With the work of re-creating it, and implementing a really good
cleaner (that might do 'fsck' in background), it would end up being
at best, very slightly better than ffs with softupdates.  In most
areas, there would be no gain.

You seem to be confusing an idea with its implementation.  It is the
implementation that I think is inadequate.  It doesn't take an expert
to see that the code is "rough."  (I can only assume that you haven't
looked at or worked on the code, because you don't know about all of
its problems.)

If I had my druthers, I would prefer a journaled filesystem, and leave
softupdates for most of the applications that a LFS would be useful
for.

> 
> > It is totally wrong to implement a bp
> > based LFS anyway, note the hacks in vfs_bio to support that travesty.
> 
> With respect, these are historical artifacts that also applied to
> the FFS of the same code vintage, and which predate the unification
> of the VM and buffer cache code.  This is a case of failure to cross
> "T"'s and dot "I"'s during the VM and buffer cache unification wherein
> the equivalent FFS issues *were* addressed.  Code does not mutate.  If
> code stops working, it is a failure in maintenance, not a failure of
> the code (presuming it worked beforehand -- and LFS did; it merely
> lacked a cleaner process to deal with issues like garbage collection
> and fragmentation -- issues addressed in later versions of Margo's code).
>
I kept the code somewhat compatible with the original vfs_bio for the
reason that there is a chance to grab technology from other versions of
*BSD.  For example, if I had made a 100% break, the chance of applying
other *BSD softupdates would have been much smaller.  There were some subtle
incompatibilities, but much of it seemed to be the fact that FreeBSD doesn't
need 'bp's for caching.  (People kept on forgetting the fact that, logically,
FreeBSD doesn't need 'bp's to support caching -- they are for I/O.)
Converting FFS would take me a few hours (okay, a couple of days.)

> >
> > Which version? and please PR it.  I have *never* seen it in person recently,
> > and locally hacked kernels can cause unexpected brokenness.  The problem
> > of modified programs has been fixed a long time ago.  Also, it has taken
> > awhile to find someone competent to work on the VM/VFS code.  There
> > is a possibility now, but most of the people with the "balls" to work
> > on the code with commit frenzies, are often not careful enough to do so.
> 
> I believe Matt has much of this in hand.
>
Actually certain others seem to really understand the code.  The most
recent problem was only a few days ago, and just running silly benchmarks
would have caught that problem. 

>
> I had to find explicit
> demonstration cases for the people who didn't feel like bothering to
> try to follow my theoretical arguments, and refused to work from
> anything but concrete examples.
>
Your 'theoretical' arguments are wrong often enough to make them
"unuseful."  It is like the boy who cried 'wolf': sometimes people just
cannot accept the statements without verification.

> >
> > A lot of work is done privately.  Research != papers, there is NO advantage
> > for a FreeBSD team member to give away the mechanisms for FreeBSD's behavior.
> 
> Malarkey.  What do you care if the software running the ATM machine and
> using the correct algorithm is FreeBSD, or some other software using
> the correct algorithm?  The point of the exercise is to increase overall
> correctness in the world.
> 
The point of the exercise is to improve FreeBSD, and not damage it.  Things
that seem to be partially understood can be exposed.  There are things that
are just not in the interest of FreeBSD to give away.

> 
> Obscurity hurts everyone.  The obscurity of the VM algorithms (not to
> pick favorites, but the VM system is one place where complexity was
> allowed to grow in FreeBSD unshackled by the "we must understand this
> if you do" mentality) was, in fact, damaging to Matt's ability to
> contribute.
>
Matt's ability to contribute was damaged by his inability to listen.  If
the tools available aren't used, but are made available, and mistakes are
being made, then the problem is with the contributor.

There are things about the VM code that aren't written down, but the info
is available.  There are bits and pieces that I won't give away, but nothing
to do with the FreeBSD code itself.

John


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message



