Date:      Sun, 1 Dec 2013 15:19:50 -0800
From:      Jordan Hubbard <jordan.hubbard@gmail.com>
To:        Lionel Cons <lionelcons1972@gmail.com>
Cc:        Rick Macklem <rmacklem@uoguelph.ca>, Cedric Blancher <cedric.blancher@gmail.com>, Freebsd hackers list <freebsd-hackers@freebsd.org>, Richard Yao <ryao@gentoo.org>, Pedro Giffuni <pfg@freebsd.org>
Subject:   Extended Attributes and how to avoid them (was Re: O_XATTR support in FreeBSD?)
Message-ID:  <92F46317-D62D-4E19-B687-2A392309A244@mail.turbofuzz.com>
In-Reply-To: <CAPJSo4WvVpjUGkcOFcX19x%2BYBDp3eaf_j=UuoT7epoYmUCcWJQ@mail.gmail.com>
References:  <BC41DB59-5868-432D-9452-00F420934E12@mail.turbofuzz.com> <718836647.19911209.1385302696963.JavaMail.root@uoguelph.ca> <CALXu0UfEQD2y6m5irGQRms=6bY8H854v0Wu9_96JpL4w6wntcg@mail.gmail.com> <706707CA-BD52-4814-BCCE-EB044B062BA6@kientzle.com> <CAPJSo4WvVpjUGkcOFcX19x%2BYBDp3eaf_j=UuoT7epoYmUCcWJQ@mail.gmail.com>


On Dec 1, 2013, at 2:05 PM, Lionel Cons <lionelcons1972@gmail.com> wrote:

> But this discussion is *not* about extended attributes, this
> discussion is about Alternate Data Streams. Unfortunately the O_XATTR
> discussion somehow started to cover the Linux "extended attribute
> system", which is utterly useless in the intended use cases (as said,
> no access through normal POSIX read(), write(), mmap(), no unlimited
> size, no sparse data support (aka SEEK_HOLE, SEEK_DATA) etc etc).

I think this discussion doesn't really *know* what it's about, frankly,
because there are so many possible avenues to choose from! :-)

As we saw earlier, there is apparently some interest in supporting ADS
for Windows clients, though the question of how to actually add that
support seems primarily the job of Samba (or whatever BSD-licensed
equivalent someday emerges), so there's really not much to discuss there
from FreeBSD's perspective since FreeBSD itself has little to say on the
subject.  If native CIFS support ever becomes a possibility, I'm sure it
will come up again!

Then there's the whole topic of EAs (and I don't know who said Linux EAs
represented some sort of gold standard - I certainly didn't) and what
the intended use cases are.  Let's stick with the intended (and
citable) use cases, if you please, because a lot of academic debate over
the years about "how EAs should work" has been, to be perfectly honest,
ultimately *pointless*.  Academically speaking, there's nothing you can
do with an EA that you can't conceptually do just as well, if not
better, with a detached attribute database because academics don't have
to worry about their EAs working anywhere outside a laboratory setting!

It's the *pragmatic* discussions and clearly defined use cases that
carry more weight (if not ALL the weight) - that's where you get into
real-world concerns about EAs: how to keep them from parting company
with their associated files, how to serialize and back them up, what
clients are *actually* going to use them and what APIs they need, etc.

Since you brought up POSIX APIs, let's talk about that for a second.
I've worked with EAs "in the field", as it were, a lot (a LOT) and no
one during my long history with them has ever demanded the ability to
call read() or write() on an EA, to mmap() one, or to store sparse data
in one.  I would love to know which apps actually need to do that (and
why), because other than "unlimited size", none of those demands have
ever hit any bug database I've had access to.  I'm also generally not
one to throw marketing numbers around in a technical conversation, but
with 72 million seats and over 1 million applications (and by all means
fact-check those numbers), if the ability to use EAs in that fashion
were truly necessary, I suspect I would have heard that early and often.
If anything, the trend has been in the other direction - people want a
simple file property getting/setting API that maybe uses EAs under the
covers or maybe doesn't; all they know is that they can hand the API
a file handle (or path) and a dictionary and The Right Thing happens for
storing the EAs, the converse also being true for getting them.  EAs
just are not first-class filesystem citizens and, frankly, they don't
really need to be in order to be "useful enough" for those situations
where an application or bit of OS middleware really needs a way of
storing some extended metadata for a file in a filesystem-neutral
fashion (and we've already covered the network filesystem and archiver
scenarios which make that important).
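
To make the "hand the API a path and a dictionary" idea concrete, here
is a minimal sketch in Python.  Everything in it is an assumption made
for illustration, not an existing FreeBSD or Apple interface: the
function names set_properties() and get_properties(), the "user.prop."
attribute prefix, and the ".props.json" sidecar file are all
hypothetical.  It uses os.setxattr() and friends where Python exposes
them (Linux only, as far as I know) and otherwise falls back to a
detached JSON store, so the caller never learns which backend was used.

```python
import json
import os

# Hypothetical attribute namespace and sidecar suffix, chosen for this sketch.
PREFIX = "user.prop."
SIDECAR_SUFFIX = ".props.json"


def _load_sidecar(path):
    """Return the JSON-encoded properties kept in the sidecar file, if any."""
    try:
        with open(path + SIDECAR_SUFFIX, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def set_properties(path, props):
    """Store a dictionary of properties for `path`.

    Tries extended attributes first; if the platform or filesystem does
    not support them, merges the values into a sidecar JSON file instead.
    Returns the name of the backend that was used.
    """
    encoded = {key: json.dumps(value) for key, value in props.items()}

    if hasattr(os, "setxattr"):
        try:
            for key, value in encoded.items():
                os.setxattr(path, PREFIX + key, value.encode("utf-8"))
            return "xattr"
        except OSError:
            pass  # EAs unavailable here; fall back to the sidecar store

    stored = _load_sidecar(path)
    stored.update(encoded)
    with open(path + SIDECAR_SUFFIX, "w", encoding="utf-8") as f:
        json.dump(stored, f)
    return "sidecar"


def get_properties(path):
    """Return all properties stored for `path`, whichever backend holds them."""
    result = {key: json.loads(value)
              for key, value in _load_sidecar(path).items()}

    if hasattr(os, "listxattr"):
        try:
            for name in os.listxattr(path):
                if name.startswith(PREFIX):
                    raw = os.getxattr(path, name)
                    result[name[len(PREFIX):]] = json.loads(raw.decode("utf-8"))
        except OSError:
            pass  # no EA support on this filesystem; sidecar data stands
    return result
```

A caller would write set_properties("report.txt", {"author": "jkh"})
and later read the dictionary back with get_properties("report.txt").
Note that the sidecar fallback has exactly the weakness discussed
above: copy or rename the file without its sidecar and the two part
company, which is the practical argument for keeping the metadata in
EAs when the filesystem offers them.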

I'll opine that if FreeBSD really wants to support EAs in a "useful
enough" way, then the best way of doing so is to stay focused on the
pragmatic "these are our use cases, and we are not afraid to describe
them in detail!" side of the street because, as I said, the academic
discussions generally don't lead anywhere but in circles.  A pragmatic
approach will, conversely, lead to doing just the basic minimums and not
wasting time implementing anything that won't actually be needed in
real-world scenarios.

Heck, if we really want to get all academic about something here, let's
forget about EAs and ADS as comparatively uninteresting technologies
from the '90s and start talking instead about file object stores that
are far more flexible than what we have now!

I don't want to have my filesystem view be necessarily hierarchical
(that should be a policy decision, not intrinsic to the filesystem
itself).  I don't want any process to necessarily be able to see any
part of the file object space save that which I explicitly grant it or
its children.  I don't want to have to think about where a file object
lives - I'd like it to be able to move around (memory, on-disk, "the
cloud", etc) purely in response to how "hot" it is without me having to
know or care about anything other than the object changing out from
under me (which should also be an intrinsic part of the filesystem
access APIs).  I want file objects to be able to have arbitrary
properties of any type or size, and able to reference other file
objects, such that I don't have to keep side-stores around everywhere to
facilitate a lot of basic operations (like searching) that should be
intrinsic to the object store, or at least handled by a first class OS
service with the ability to be co-resident with it so things like
indexing are actually *efficient*.

Can we have that discussion instead?  It would be more fun. :-)

- Jordan



