Date:      Mon, 7 Mar 2016 21:39:07 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Ken Merry <ken@freebsd.org>
Cc:        Robert Watson <rwatson@FreeBSD.org>, Julian Elischer <julian@FreeBSD.ORG>,  fs@freebsd.org, scsi@freebsd.org
Subject:   Re: FUSE extended attribute patches available
Message-ID:  <436595384.8930140.1457404747058.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <BBF1EEE5-A6A9-46A0-B5E5-9FFD90631636@freebsd.org>
References:  <CD5FCB90-1952-4014-BBE0-1BFF1EF85E17@freebsd.org> <800018199.6694281.1457233600357.JavaMail.zimbra@uoguelph.ca> <56DD2AB6.1030407@freebsd.org> <6AF0FC23-CC34-43EA-A008-9FB82FB21558@FreeBSD.org> <BBF1EEE5-A6A9-46A0-B5E5-9FFD90631636@freebsd.org>

Ken Merry wrote:
>
>
> > On Mar 7, 2016, at 2:59 AM, Robert Watson <rwatson@FreeBSD.org> wrote:
> >
> > FreeBSD and Linux’s extended-attribute models were inherited from IRIX,
> > as they were introduced to solve the same problems: a place to store
> > metadata such as ACLs, MAC labels, capability masks, etc. IRIX had three
> > namespaces: one each for “user”, “root”, and “secure”, reflecting whether
> > or not they were managed by the file owner (or permissions), the
> > privileged root user, or part of the TCB protection mechanism (e.g., for
> > integrity labels).
> >
> > These extended attributes should not be confused with the filesystem
> > feature of the same name in NFSv4, which is sometimes known by the name
> > “file fork” or “data streams”. EAs in IRIX/FreeBSD/Linux/HPFS/etc. are
> > pairs of names and values intended to be written atomically or
> > updated in place specifically for (shortish) metadata such as ACLs,
> > rather than being complete separate data spaces for I/O (e.g., that could
> > be memory mapped).
>
> It would be nice to have NFSv4 / Solaris style alternate data streams.  ZFS
> handles them already, but I suppose it would take more work to support them
> in UFS.
>
When this was discussed previously, Jordan Hubbard pointed out that most of
the work is making sure the userland utilities (like backup utilities...)
know about them and what to do with them.

I am not familiar with the userland issue, but if that were resolved,
personally, I don't think a lack of support in UFS would be a showstopper.
(Assuming that it would be an addition and not a replacement for extended
attributes.)

I do recall that someone at CERN is adamant about this. (I think they create
file forks with Gbytes of data and can't live without them.)

> > In FreeBSD’s design, we incorporated the disjoint namespace model,
> > providing USER and SYSTEM, the former being managed by the file owner
> > (and those given suitable permission), and the latter being used for TCB
> > mechanisms such as the implementations of MAC labels, ACLs, etc.
> >
> > In Linux, they adopted a more free-form mechanism based on a single
> > combined namespace with a prefix -- e.g., user.FOO, and system.BAR. Over
> > time it looks like that namespace has been expanded in various
> > filesystem-specific ways. We also have room to expand our namespace, but
> > from the description below, it’s not clear quite what the right mechanism
> > is.
> >
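To make the contrast between the two models concrete, here is a minimal
sketch in C of how a Linux-style combined name such as user.FOO could be
split into a FreeBSD-style (namespace, name) pair. The enum and the function
name ea_split_linux_name are made up for illustration and only stand in for
the real EXTATTR_NAMESPACE_* constants. Names with any other prefix are
returned untouched and flagged as unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for FreeBSD's EXTATTR_NAMESPACE_* constants. */
enum ea_namespace { EA_NS_UNKNOWN = 0, EA_NS_USER, EA_NS_SYSTEM };

/*
 * Split a Linux-style "prefix.name" attribute name.
 * Returns the namespace and stores a pointer to the bare name in *name.
 * If the prefix is not recognized (or nothing follows it), returns
 * EA_NS_UNKNOWN and leaves *name pointing at the full, unmodified string.
 */
static enum ea_namespace
ea_split_linux_name(const char *full, const char **name)
{
        static const struct {
                const char *prefix;
                enum ea_namespace ns;
        } map[] = {
                { "user.", EA_NS_USER },
                { "system.", EA_NS_SYSTEM },
        };
        size_t i;

        assert(full != NULL && name != NULL);
        for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
                size_t plen = strlen(map[i].prefix);

                if (strncmp(full, map[i].prefix, plen) == 0 &&
                    full[plen] != '\0') {
                        *name = full + plen;    /* bare name after the dot */
                        return (map[i].ns);
                }
        }
        *name = full;
        return (EA_NS_UNKNOWN);
}
```

A name like trusted.x falls into the unknown case, which is exactly the kind
of attribute that does not fit a fixed USER/SYSTEM split.
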
> > One path would be to introduce a new namespace for filesystem-specific
> > attributes -- e.g., EXTATTR_NAMESPACE_FS?
> >
> > But I think the key question here is whether the existing namespaces can
> > provide the semantics you need. If not, then we likely need a new
> > namespace. But then we get the question as to who controls use of the
> > namespace. Certainly “the filesystem” is one option, but then you will
> > get inconsistency in approaches between filesystems and applications --
> > across various dimensions including protection (who can read/modify
> > them?), allocation (who decides what names should be used for what?), and
> > semantics (what applications can use them, and who backs them up?).
> >
> > For example: who should be responsible for backing up those attributes?
> > For ‘system’ attributes in FreeBSD, it is assumed that backup tools will
> > be aware of the services layered over the attributes -- e.g., that they
> > will back up ACLs using the ACL API, rather than backing up the binary
> > EAs holding the ACLs. For ‘user’ attributes, it is assumed that backup
> > tools (e.g., tar) must explicitly preserve them, since they are
> > user-defined and user-managed. For filesystem-specific attributes, some
> > other choice will need to be made -- perhaps filesystem-specific backup
> > tools need to know about them?
> >
> > Note that in the Linux EA model, ACLs are actually accessed via the EA
> > system calls, whereas in FreeBSD, ACLs are a first-class citizen in the
> > system-call API/ABI, and so user applications don’t treat them as EAs. We
> > made that choice as filesystems may choose themselves not to represent
> > ACLs as EAs, and they have real semantics visible to the VFS layer. In
> > Linux, I believe they chose to pass them via EAs to narrow the
> > system-call interface for filesystem metadata. Both are legitimate
> > choices, but this could also trigger discussions about whether new
> > attributes are best accessed via the EA interface, or new system calls.
> > For filesystem-specific attributes, EAs are likely the better way to go.
>
> It may be that for at least the purposes of FUSE, we can adequately live
> under the USER namespace.  That would allow for arbitrary namespaces that
> Linux-centric filesystems create without significant churn in FreeBSD to
> support it.
>
> And of course this is only for the front/top end of a FUSE filesystem.  What
> the filesystem actually does with the extended attributes that the user sets
> on top is another question altogether.  In the case of IBM’s LTFS, it stores
> extended attributes (without the “user.” prefix) in the LTFS index, which is
> an XML file that resides on tape.  For other filesystems, the answer could
> also vary significantly.  A few that I examined in sysutils/fusefs* used
> extended attributes on the backend (usually on a backing filesystem) under
> Linux only, but not on the front (user facing) end.
>
> In order to make arbitrary namespaces in FUSE work in FreeBSD under the user
> namespace, we’ll have to do what Rick was talking about and just not include
> the namespace as a prefix when we get/set attributes.  This will allow using
> any sort of namespace or attribute name that the FUSE filesystem wants to
> use.
>
> The impact of this, from a porting standpoint, is that the FUSE filesystems
> will have to know that on FreeBSD, they cannot/should not expect to see the
> “user.” namespace prefix, but they might see other namespace prefixes.
>
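As a rough sketch of the two policies being weighed here (not the code from
either patch), the function below builds the name that would be handed to the
FUSE daemon. The enum, the function name fuse_xattr_name, the prepend flag
and the errno values are all assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for FreeBSD's EXTATTR_NAMESPACE_* constants. */
enum ea_namespace { EA_NS_USER = 1, EA_NS_SYSTEM = 2 };

/*
 * Build the attribute name passed to the FUSE filesystem.
 * prepend == 0: pass the USER name through unchanged (the approach
 *               discussed in this thread).
 * prepend != 0: put "user." in front, Linux style.
 * SYSTEM is refused in both cases; EPERM is an assumed error value.
 * Returns 0 on success or an errno value.
 */
static int
fuse_xattr_name(enum ea_namespace ns, const char *name, int prepend,
    char *buf, size_t buflen)
{
        int n;

        assert(name != NULL && buf != NULL);
        if (ns != EA_NS_USER)
                return (EPERM);
        n = snprintf(buf, buflen, "%s%s", prepend ? "user." : "", name);
        if (n < 0 || (size_t)n >= buflen)
                return (ENAMETOOLONG);  /* result did not fit in buf */
        return (0);
}
```

With prepend off, a USER attribute called trusted.foo reaches the daemon
unchanged. With it on, the same name turns into user.trusted.foo, which is
why always prefixing would hide the other namespaces a filesystem may use.
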
> I took a look at the way LTFS and Gluster work with respect to extended
> attributes with MacOS, since it seems that is how MacOS works, and it’s less
> obvious to me what is going on with Gluster.  They’ve got this function:
>
> #ifdef GF_DARWIN_HOST_OS
> static int
> set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
> {
>         if (strcmp (str, "none") == 0)
>                 priv->xattr_user_namespace = XATTR_NONE;
>         else if (strcmp (str, "strip") == 0)
>                 priv->xattr_user_namespace = XATTR_STRIP;
>         else if (strcmp (str, "append") == 0)
>                 priv->xattr_user_namespace = XATTR_APPEND;
>         else if (strcmp (str, "both") == 0)
>                 priv->xattr_user_namespace = XATTR_BOTH;
>         else
>                 return -1;
>         return 0;
> }
> #endif
>
> Although it’s not clear that they do anything with values other than
> XATTR_STRIP.
>
> With LTFS, since they either assume a “user.” prefix on Linux, or no prefix
> on Windows and MacOS X, it’s more straightforward.
>
> Ken
>
>
> >
> > Robert
> >
> >> On 7 Mar 2016, at 07:16, Julian Elischer <julian@FreeBSD.ORG> wrote:
> >>
> >> On 5/03/2016 7:06 PM, Rick Macklem wrote:
> >>> Ken Merry wrote:
> >>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
> >>>> extended attributes:
> >> oh showing off your masochistic side eh?
> >>
> >>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
> >>>>
> >> I spent an hour beating my head against fuse yesterday.
> >> then realised that it's an old version on our product. We really have to
> >> get off 8.0
> >> (hopefully a matter of weeks now to a 10.x switch)
> >> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS)
> >> to hire.
> >>
> >>
> >>> The only bit of code I have that might be useful for this patch is:
> >>>           case FUSE_GETXATTR:
> >>>           case FUSE_LISTXATTR:
> >>> !                 /*
> >>> !                  * These can have varying response lengths, and 0 length
> >>> !                  * isn't necessarily invalid.
> >>> !                  */
> >>> !                 err = 0;
> >>> *** I came up with this:
> >>>                 fgin = (struct fuse_getxattr_in *)
> >>>                     ((char *)ftick->tk_ms_fiov.base +
> >>>                      sizeof(struct fuse_in_header));
> >>>                 if (fgin->size == 0)
> >>>                         err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
> >>>                             EINVAL;
> >>>                 else
> >>>                         err = (blen <= fgin->size) ? 0 : EINVAL;
> >>>                   break;
> >>> I think I got the size check right?
> >>>
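The size check above can be restated as a small standalone function, which
makes it easier to exercise outside the kernel. This is only a sketch:
xattr_reply_check and getxattr_out_mirror are hypothetical names, and the
struct is a local copy of the two 32-bit fields that the FUSE protocol's
struct fuse_getxattr_out carries. It assumes, as the snippet does, that a
request size of 0 is a size probe answered with that struct, and that any
other reply is attribute data no longer than the requested size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct fuse_getxattr_out: a size field plus padding. */
struct getxattr_out_mirror {
        uint32_t size;
        uint32_t padding;
};

static_assert(sizeof(struct getxattr_out_mirror) == 8,
    "mirror struct is expected to be two 32-bit fields");

/*
 * requested: the size field of the GETXATTR/LISTXATTR request
 *            (0 means "tell me how much space I need").
 * blen:      length of the reply body that came back.
 * Returns 0 if the reply length is plausible, EINVAL otherwise.
 */
static int
xattr_reply_check(uint32_t requested, size_t blen)
{
        if (requested == 0)
                return (blen == sizeof(struct getxattr_out_mirror) ?
                    0 : EINVAL);
        return (blen <= requested ? 0 : EINVAL);
}
```

Note that a zero-length body is accepted whenever the request size is
nonzero, which matches the comment that a 0 length reply is not necessarily
invalid.
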
> >>> The big question is...
> >>> What to do with the NAMESPACE?
> >>> - My code fails for SYSTEM and does USER without prepending "user.".
> >>>  (That seemed to be what rwatson@ felt was reasonable. I thought our
> >>>   discussion was on a mailing list, but I can't find it.)
> >>>  I've cc'd him. Maybe he can comment again.
> >> Is there a standard for extended attributes I should know about?
> >> It seems to me that it's a bit like the wild west.
> >> Extended attributes seem to be "every OS for himself".
> >>
> >>>
> >>> - If you stick with prepending "user." or "system." there needs to be
> >>>  some way to bypass this so that attributes that don't start in "user."
> >>>  or "system." can be accessed. I've seen "trusted." and "glusterfs."
> >>>  on GlusterFS.
> >>>  --> Maybe a new namespace called something like "nil" that just bypasses
> >>>      any USER or SYSTEM checks?
> >>>
> >>> rick
> >>>
> >>>> The patch implements the get/set/delete/list extended attribute
> >>>> methods.  The listing code also converts extended attribute lists from
> >>>> the Linux/FUSE format to the FreeBSD format.  For example:
> >>>>
> >>>> # touch foo
> >>>> # ls -la foo
> >>>> -rwxrwxrwx  1 root  wheel  0 Feb 29 21:40 foo
> >>>> # lsextattr user foo
> >>>> foo
> >>>> # setextattr user testattr1 "12345678" foo
> >>>> # lsextattr user foo
> >>>> foo     testattr1
> >>>> # getextattr user testattr1 foo
> >>>> foo     12345678
> >>>> # setextattr user testattr2 "87654321" foo
> >>>> # lsextattr user foo
> >>>> foo     testattr2       testattr1
> >>>> # rmextattr user testattr1 foo
> >>>> # lsextattr user foo
> >>>> foo     testattr2
> >>>> # getextattr user testattr1 foo
> >>>> getextattr: foo: failed: Attribute not found
> >>>> # getextattr user testattr2 foo
> >>>> foo     87654321
> >>>>
> >>>>
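For readers unfamiliar with the two list formats: Linux/FUSE returns
attribute names as NUL-terminated strings packed back to back, while
FreeBSD's extattr_list_file(2) returns each name preceded by a one-byte
length, with no terminator. A minimal sketch of that conversion follows. It
is not the code from the patch; xattr_list_to_freebsd is a hypothetical
helper, the error values are assumptions, and it does no namespace prefix
handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a Linux/FUSE-style list (NUL-terminated names, back to back)
 * into a FreeBSD-style list (one length byte, then the name, no NUL).
 * On success returns 0 and stores the number of bytes written in *outlen.
 * Returns EINVAL for a malformed or unrepresentable name and ERANGE if
 * dst is too small.
 */
static int
xattr_list_to_freebsd(const char *src, size_t srclen, unsigned char *dst,
    size_t dstlen, size_t *outlen)
{
        size_t in = 0, out = 0;

        assert(src != NULL && dst != NULL && outlen != NULL);
        while (in < srclen) {
                const char *end = memchr(src + in, '\0', srclen - in);
                size_t nlen;

                if (end == NULL)
                        return (EINVAL);        /* last name not terminated */
                nlen = (size_t)(end - (src + in));
                if (nlen == 0 || nlen > 255)
                        return (EINVAL);        /* must fit one length byte */
                if (dstlen - out < nlen + 1)
                        return (ERANGE);
                dst[out++] = (unsigned char)nlen;
                memcpy(dst + out, src + in, nlen);
                out += nlen;
                in += nlen + 1;                 /* skip the name and its NUL */
        }
        *outlen = out;
        return (0);
}
```

For the two attributes in the session above, the input bytes
"testattr2\0testattr1\0" would come out as a length byte of 9 followed by
testattr2, then another 9 followed by testattr1.
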
> >>>> Just to be clear on what this does, it only provides extended attribute
> >>>> support to FreeBSD applications if the underlying FUSE filesystem
> >>>> implements FUSE extended attribute support.  Many FUSE filesystems
> >>>> don’t support the extended attribute VFS operations.
> >>>>
> >>>> I have tested this out on IBM’s LTFS implementation, but I have not yet
> >>>> found another FUSE filesystem that supports extended attributes.  If
> >>>> anyone knows of one, please let me know so I can try it out.  (I looked
> >>>> through a number of the filesystems in sysutils/fusefs* in the ports
> >>>> tree.)
> >>>>
> >>>> Any feedback is welcome.  I’m planning to check this into FreeBSD/head
> >>>> in the next week or so.
> >>>>
> >>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD.  It
> >>>> works in the standard FUSE mode, and you can also link it into an
> >>>> application as a library if you don’t want to incur the overhead of
> >>>> running through FUSE.  I haven’t gotten around to packaging it up to go
> >>>> out for testing / review.
> >>>>
> >>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer
> >>>> tape drives, and wants to try it out, let me know.  I’ll send you the
> >>>> code when I’ve got it at least somewhat ready.  This is IBM-specific,
> >>>> and won’t work on HP tape drives.
> >>>>
> >>>> Ken
> >>>> --
> >>>> Ken Merry
> >>>> ken@FreeBSD.ORG
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> freebsd-fs@freebsd.org mailing list
> >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >>>
> >>>
> >>
> >
>
>
>
> --
> Ken Merry
> ken@FreeBSD.ORG
>


