Date: Tue, 26 Nov 2013 10:51:55 +0100
In-Reply-To: <718836647.19911209.1385302696963.JavaMail.root@uoguelph.ca>
Subject: Re: O_XATTR support in FreeBSD?
From: Cedric Blancher
To: Freebsd hackers list
Cc: Rick Macklem, Richard Yao, Pedro Giffuni, Jordan Hubbard

On 24 November 2013 15:18, Rick Macklem wrote:
> Jordan Hubbard wrote:
>>
>> On Nov 23, 2013, at 2:53 PM, Rick Macklem wrote:
>>
>> > Interestingly, FreeBSD has a VOP_OPENEXTATTR() but no syscall
>> > that uses it nor support for it in ZFS. (I'm just guessing it
>> > was intended for an openat(2) syscall at some time?)
> Oops, my mistake. Robert has clarified what the VOP_OPENEXTATTR()
> is used for.
>
> Therefore, there doesn't appear to be any support for subfiles/fork files
> in the VFS (VOP_xxx()) or syscalls.
>
>> > Btw Cedric, if you had mentioned "subfiles" or "fork files" in your
>> > subject line, you might have gotten a better answer. I, for one,
>> > didn't know what O_XATTR is. I also always get confused w.r.t. what
>> > to call these beasts. (NFSv4 calls them named attributes.)
>> >
>> > Btw, apps can use extended attributes (the limited sized
>> > atomically stored/read kind). They aren't just for
>> > storing ACLs.
>>
>> Sigh. Extended Attributes. :-/
>>
>> I guess I'll raise my head in this discussion. They've certainly
>> been the bane of my existence for long enough!
>>
>> First, supporting EAs properly really involves multiple levels of the
>> Unix command and library stack.
>>
>> The filesystem can support them natively, sure, but that's actually
>> somewhat optional since you can always (cough cough) stick them in a
>> side-store if the rest of the stack cooperates.
>> That's where the awesome AppleDouble files came from ("._weirdfile"
>> corresponding to "weirdfile") which remain useful even after
>> filesystems like HFS/ZFS/UFS became EA-aware natively because
>> there's always those foreign data stores to talk to (some early
>> AFP/CIFS/NFS mount, for example) and the fact that you still need
>> to *serialize* the dang things into tar / cpio / zip / ??? files as
>> well as across network replication with tools like rsync. What good
>> is an EA, much less an ACL that's been stored in an EA, if it gets
>> stripped off the first time you tar up a directory and extract it
>> somewhere else?
>>
>> So I wouldn't start with NFSv4 or ZFS if I was asking the question.
>> I would start with libc and ask if it had anything similar to
>> copyfile(3) so that the tools above it could start actually
>> supporting those attributes on a *practical* basis! :)
>>
> Righto, w.r.t. the client side. For the NFSv4 server side, it
> would be what the VFS and file systems support.
>
> Just to clarify, I wasn't interested in doing this. It was Cedric
> who asked about them. My impression is that with the Linux "marketshare",
> apps will tend to use the atomic limited size ones already supported
> by FreeBSD. (NFSv4.0 and 4.1 don't support these, but there is discussion
> w.r.t. adding support for them in a future minor revision of the
> protocol, possibly 4.2. The named attributes in NFSv4 expect the
> subfile/forkfile model and do not support atomic writing of an
> entire extended attribute.)

Some clarifications:
1. You do not need more syscalls. Solaris uses the plain openat()
syscall for this: passing the O_XATTR flag along with the normal
open()/openat() flags opens a named attribute. read(), write(),
mmap() etc. then work on it, too.
2. The filesystem implementation should be quite easy too. For example,
Solaris UFS seems to use the special *file* name '/@/' (or part of a
file name, I don't know) to describe that this is a named attribute.
3. We use NFSv4 attributes (as opened with O_XATTR, accessed with
openat(), read(), write() like normal files, and most importantly with
unlimited size) quite extensively, and if you look around you see that
institutions like CERN, software like SAMBA or SAP+OracleDB, and much
code from NIH use this stuff.
4. Usage even seems to be common enough that the latest ksh version
(ksh93v-) includes special support for O_XATTR attributes (like cd -@
filename or a builtin virtual filesystem (like ksh's
/dev/tcp/host/portnumber) so that you can use cat

Institute Pasteur
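To make point 1 concrete, here is a rough sketch of the Solaris-style
access pattern: open the file, then openat() relative to it with O_XATTR
set. It is written in Python only to keep it short, and it assumes the
platform's Python exposes the flag as os.O_XATTR; where it does not, the
sketch refuses to run rather than guess at an equivalent. The helper
name open_named_attribute is made up for illustration and is not an
existing API.

```python
import os


def open_named_attribute(path, attr_name, flags=os.O_RDONLY, mode=0o644):
    """Open the named attribute attr_name of path and return a descriptor.

    Hypothetical helper: it assumes a platform whose open flags include
    O_XATTR (Solaris-style named attributes). Elsewhere it raises
    NotImplementedError instead of emulating anything.
    """
    o_xattr = getattr(os, "O_XATTR", None)
    if o_xattr is None:
        raise NotImplementedError("this platform's os module has no O_XATTR")

    # First open the file that owns the attribute ...
    file_fd = os.open(path, os.O_RDONLY)
    try:
        # ... then open the attribute relative to it. Passing dir_fd makes
        # os.open() go through openat(2); no extra syscall is involved.
        return os.open(attr_name, flags | o_xattr, mode, dir_fd=file_fd)
    finally:
        # The attribute descriptor stays valid after the file one is closed.
        os.close(file_fd)
```

On a system with O_XATTR, something like
open_named_attribute("report.txt", "comment", os.O_RDWR | os.O_CREAT)
(both names are placeholders) would return a descriptor that os.read(),
os.write() and os.close() treat like any other file.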