From: Ken Merry
Date: Mon, 7 Mar 2016 17:28:16 -0500
Subject: Re: FUSE extended attribute patches available
Cc: Julian Elischer, Rick Macklem, fs@freebsd.org, scsi@freebsd.org
To: Robert Watson

> On Mar 7, 2016, at 2:59 AM, Robert Watson wrote:
>
> FreeBSD and Linux’s extended-attribute models were inherited from IRIX, as they were introduced to solve the same problems: a place to store metadata such as ACLs, MAC labels, capability masks, etc. IRIX had three namespaces: one each for “user”, “root”, and “secure”, reflecting whether they were managed by the file owner (or permissions), the privileged root user, or part of the TCB protection mechanism (e.g., for integrity labels).
>
> These extended attributes should not be confused with the filesystem feature of the same name in NFSv4, which is sometimes known by the name “file fork” or “data streams”. EAs in IRIX/FreeBSD/Linux/HPFS/etc are tuple pairs of names and values intended to be written atomically or updated in place specifically for (shortish) metadata such as ACLs, rather than being complete separate data spaces for I/O (e.g., that could be memory mapped).

It would be nice to have NFSv4 / Solaris style alternate data streams. ZFS handles them already, but I suppose it would take more work to support them in UFS.

> In FreeBSD’s design, we incorporated the disjoint namespace model, providing USER and SYSTEM, the former being managed by the file owner (and those given suitable permission), and the latter being used for TCB mechanisms such as the implementations of MAC labels, ACLs, etc.
>
> In Linux, they adopted a more free-form mechanism based on a single combined namespace with a prefix - e.g., user.FOO, and system.BAR. Over time it looks like that namespace has been expanded in various filesystem-specific ways. We also have room to expand our namespace, but from the description below, it’s not clear quite what the right mechanism is.
>
> One path would be to introduce a new namespace for filesystem-specific attributes - e.g., EXTATTR_NAMESPACE_FS?
>
> But I think the key question here is whether the existing namespaces can provide the semantics you need. If not, then we likely need a new namespace. But then we get the question as to who controls use of the namespace. Certainly “the filesystem” is one option, but then you will get inconsistency in approaches between filesystems and applications - across various dimensions including protection (who can read/modify them?), allocation (who decides what names should be used for what?), and semantics (what applications can use them, and who backs them up?).
>
> For example: who should be responsible for backing up those attributes? For ‘system’ attributes in FreeBSD, it is assumed that backup tools will be aware of the services layered over the attributes - e.g., that they will back up ACLs using the ACL API, rather than backing up the binary EAs holding the ACLs. For ‘user’ attributes, it is assumed that backup tools (e.g., tar) must explicitly preserve them, since they are user-defined and user-managed.
> For filesystem-specific attributes, some other choice will need to be made - perhaps filesystem-specific backup tools need to know about them?
>
> Note that in the Linux EA model, ACLs are actually accessed via the EA system calls, whereas in FreeBSD, ACLs are a first-class citizen in the system-call API/ABI, and so user applications don’t treat them as EAs. We made that choice as filesystems may choose themselves not to represent ACLs as EAs, and they have real semantics visible to the VFS layer. In Linux, I believe they chose to pass them via EAs to narrow the system-call interface for filesystem metadata. Both are legitimate choices, but this could also trigger discussions about whether new attributes are best accessed via the EA interface, or new system calls. For filesystem-specific attributes, EAs are likely the better way to go.

It may be that for at least the purposes of FUSE, we can adequately live under the USER namespace. That would allow for arbitrary namespaces that Linux-centric filesystems create without significant churn in FreeBSD to support it.

And of course this is only for the front/top end of a FUSE filesystem. What the filesystem actually does with the extended attributes that the user sets on top is another question altogether. In the case of IBM’s LTFS, it stores extended attributes (without the “user.” prefix) in the LTFS index, which is an XML file that resides on tape. For other filesystems, the answer could also vary significantly. A few that I examined in sysutils/fusefs* used extended attributes on the backend (usually on a backing filesystem) under Linux only, but not on the front (user facing) end.

In order to make arbitrary namespaces in FUSE work in FreeBSD under the user namespace, we’ll have to do what Rick was talking about and just not include the namespace as a prefix when we get/set attributes.
This will allow using any sort of namespace or attribute name that the FUSE filesystem wants to use.

The impact of this, from a porting standpoint, is that the FUSE filesystems will have to know that on FreeBSD, they cannot/should not expect to see the “user.” namespace prefix, but they might see other namespace prefixes.

I took a look at the way LTFS and Gluster handle extended attributes on MacOS, since it seems MacOS works the same way (no “user.” prefix), and it’s less obvious to me what is going on with Gluster. They’ve got this function:

#ifdef GF_DARWIN_HOST_OS
static int
set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
{
        if (strcmp (str, "none") == 0)
                priv->xattr_user_namespace = XATTR_NONE;
        else if (strcmp (str, "strip") == 0)
                priv->xattr_user_namespace = XATTR_STRIP;
        else if (strcmp (str, "append") == 0)
                priv->xattr_user_namespace = XATTR_APPEND;
        else if (strcmp (str, "both") == 0)
                priv->xattr_user_namespace = XATTR_BOTH;
        else
                return -1;
        return 0;
}
#endif

Although it’s not clear that they do anything with values other than XATTR_STRIP.

With LTFS, since they either assume a “user.” prefix on Linux, or no prefix on Windows and MacOS X, it’s more straightforward.

Ken

>
> Robert
>
>> On 7 Mar 2016, at 07:16, Julian Elischer wrote:
>>
>> On 5/03/2016 7:06 PM, Rick Macklem wrote:
>>> Ken Merry wrote:
>>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
>>>> extended attributes:
>> oh showing off your masochistic side eh?
>>
>>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>>>>
>> I spent an hour beating my head against fuse yesterday.
>> then realised that it's an old version on our product.
>> We really have to get off 8.0
>> (hopefully a matter of weeks now to a 10.x switch)
>> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS) to hire.
>>
>>
>>> The only bit of code I have that might be useful for this patch is:
>>>         case FUSE_GETXATTR:
>>>         case FUSE_LISTXATTR:
>>> !               /*
>>> !                * These can have varying response lengths, and 0 length
>>> !                * isn't necessarily invalid.
>>> !                */
>>> !               err = 0;
>>> *** I came up with this:
>>>                 fgin = (struct fuse_getxattr_in *)
>>>                     ((char *)ftick->tk_ms_fiov.base +
>>>                     sizeof(struct fuse_in_header));
>>>                 if (fgin->size == 0)
>>>                         err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
>>>                             EINVAL;
>>>                 else
>>>                         err = (blen <= fgin->size) ? 0 : EINVAL;
>>>                 break;
>>> I think I got the size check right?
>>>
>>> The big question is...
>>> What to do with the NAMESPACE?
>>> - My code fails for SYSTEM and does USER without prepending "user.".
>>>   (That seemed to be what rwatson@ felt was reasonable. I thought our
>>>   discussion was on a mailing list, but I can't find it.)
>>>   I've cc'd him. Maybe he can comment again.
>> Is there a standard for extended attributes I should know about?
>> It seems to me that it's a bit like the wild west.
>> Extended attributes seem to be "every OS for himself".
>>
>>>
>>> - If you stick with prepending "user." or "system." there needs to be
>>>   some way to bypass this so that attributes that don't start in "user."
>>>   or "system." can be accessed. I've seen "trusted." and "glusterfs."
>>>   on GlusterFS.
>>>   --> Maybe a new namespace called something like "nil" that just bypasses
>>>       any USER or SYSTEM checks?
>>>
>>> rick
>>>
>>>> The patch implements the get/set/delete/list extended attribute methods. The
>>>> listing code also converts extended attribute lists from the Linux/FUSE
>>>> format to the FreeBSD format.
>>>> For example:
>>>>
>>>> # touch foo
>>>> # ls -la foo
>>>> -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
>>>> # lsextattr user foo
>>>> foo
>>>> # setextattr user testattr1 "12345678" foo
>>>> # lsextattr user foo
>>>> foo testattr1
>>>> # getextattr user testattr1 foo
>>>> foo 12345678
>>>> # setextattr user testattr2 "87654321" foo
>>>> # lsextattr user foo
>>>> foo testattr2 testattr1
>>>> # rmextattr user testattr1 foo
>>>> # lsextattr user foo
>>>> foo testattr2
>>>> # getextattr user testattr1 foo
>>>> getextattr: foo: failed: Attribute not found
>>>> # getextattr user testattr2 foo
>>>> foo 87654321
>>>>
>>>>
>>>> Just to be clear on what this does, it only provides extended attribute
>>>> support to FreeBSD applications if the underlying FUSE filesystem implements
>>>> FUSE extended attribute support. Many FUSE filesystems don’t support the
>>>> extended attribute VFS operations.
>>>>
>>>> I have tested this out on IBM’s LTFS implementation, but I have not yet found
>>>> another FUSE filesystem that supports extended attributes. If anyone knows
>>>> of one, please let me know so I can try it out. (I looked through a number
>>>> of the filesystems in sysutils/fusefs* in the ports tree.)
>>>>
>>>> Any feedback is welcome. I’m planning to check this into FreeBSD/head in the
>>>> next week or so.
>>>>
>>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD. It works
>>>> in the standard FUSE mode, and you can also link it into an application as a
>>>> library if you don’t want to incur the overhead of running through FUSE. I
>>>> haven’t gotten around to packaging it up to go out for testing / review.
>>>>
>>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer tape
>>>> drives, and wants to try it out, let me know. I’ll send you the code when
>>>> I’ve got it at least somewhat ready.
>>>> This is IBM-specific, and won’t work
>>>> on HP tape drives.
>>>>
>>>> Ken
>>>> --
>>>> Ken Merry
>>>> ken@FreeBSD.ORG
>>>>
>>>> _______________________________________________
>>>> freebsd-fs@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

-- 
Ken Merry
ken@FreeBSD.ORG
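
On the list-format conversion mentioned in the original announcement: Linux/FUSE returns attribute names as NUL-terminated strings packed back to back, while FreeBSD's extattr_list_file(2) returns each name preceded by a one-byte length, with no terminating NUL. Below is a minimal sketch of that conversion in C. It is not the code from the patch. The function name fuse_list_to_bsd, its signature, and the choice to strip a given prefix while passing other names through unchanged are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * Convert a Linux/FUSE-style attribute list (NUL-terminated names packed
 * back to back) into a FreeBSD-style list (one length byte followed by the
 * name, no NUL).  If `prefix` is non-NULL and a name starts with it, the
 * prefix is stripped; other names are copied unchanged (an assumption).
 *
 * Returns the number of bytes written to `out`, or -1 if the input is
 * malformed, a name is empty or longer than 255 bytes after stripping,
 * or `out` is too small.
 */
static long
fuse_list_to_bsd(const char *in, size_t inlen, const char *prefix,
    unsigned char *out, size_t outlen)
{
	size_t plen = (prefix != NULL) ? strlen(prefix) : 0;
	size_t ipos = 0, opos = 0;

	assert(in != NULL || inlen == 0);
	assert(out != NULL || outlen == 0);

	while (ipos < inlen) {
		const char *name = in + ipos;
		/* Look for the terminating NUL inside the remaining input. */
		const char *end = memchr(name, '\0', inlen - ipos);
		size_t nlen;

		if (end == NULL)
			return (-1);	/* last name is not NUL-terminated */
		nlen = (size_t)(end - name);
		ipos += nlen + 1;	/* skip the name and its NUL */

		/* Drop the namespace prefix when it is present. */
		if (plen > 0 && nlen >= plen &&
		    memcmp(name, prefix, plen) == 0) {
			name += plen;
			nlen -= plen;
		}
		if (nlen == 0 || nlen > 255)
			return (-1);	/* does not fit a one-byte length */
		if (outlen - opos < nlen + 1)
			return (-1);	/* output buffer too small */

		out[opos++] = (unsigned char)nlen;
		memcpy(out + opos, name, nlen);
		opos += nlen;
	}
	return ((long)opos);
}
```

Given the input "user.testattr1\0user.testattr2\0" and the prefix "user.", this writes 20 bytes: a length byte of 9 followed by testattr1, then the same for testattr2. A name such as "trusted.x" does not match the prefix and is copied whole, which is one possible reading of "they might see other namespace prefixes" above.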