Date: Wed, 23 Apr 2003 23:33:53 -0400
From: Allan Fields <fist@afields.ca>
To: Wout Mertens <wmertens@cisco.com>
Cc: fist@ground.cs.columbia.edu
Subject: Re: [FiST] Re: Overlayfs for FiST?
Message-ID: <20030424033353.GA4596@afields.ca>
In-Reply-To: <Pine.GSO.4.53.0304232315330.9711@bru-cse-075.cisco.com>
References: <20030424001211.GB15070@gatekeeper.gatekeeper>

Hi,

In my opinion... (O.K. I'll take a dive into this.)

This all sounds a lot like (Free)BSD's unionfs. I have tried using
unionfs for various tasks, including some related to security. It works
quite well, but I noticed there are definitely complicated cases where a
complex hierarchy of overlays would be required for overlays to become
practical. UnionFS seems a successful proof of concept that demonstrates
how overlays can work in simple situations, and how namespaces can be
joined using vnode stacking.

Doing overlays in FiST would be great: hopefully then, as the BSD
templates further mature, an overlayfs could be used in multiplatform
environments, much as cryptfs could be. My concern is that code quality
on any one target platform should not be sacrificed for greater
portability. FiST can likely produce very efficient code, if the ports
are closely worked into the source bases of the target operating systems.

I've been on-and-off trying the FreeBSD templates, and it is very good to
see the progress thus far. I spent a few nights 1-2 months ago trying to
get the make files to integrate with the BSD makefiles under 5.0-RELEASE.
Some progress, but nothing usable yet. The 4.x-RELEASE templates are
better off since the last update.

Another issue with overlays is implementing the mechanisms to migrate
changes between different working copies, or layers. At that point, it
would even seem to be related to revisioning. Here, authority and
immutability would also seem to be applicable, in that the filestore
could, depending on a trust assessment (for instance by ACL), assign
different authority to an update.

Some of my ideas:

- union mounts should have a filtering mask, which determines which
  layer changes are applied to. Parameters could be anything which FiST
  potentially allows: filenames, userid, timestamp, size, accesses, ACL,
  etc.

- union mounts should allow 3 or more filesystems to be stacked

  <above> 1 2 3 ...
  <below> (coalesced into one namespace)

  where they can occupy different sections of the namespace, for
  instance, and should allow dynamic configuration of the stack after the
  initial union mount point is set. This would imply that there can be a
  relative weight to a store both in retrieval and storage. (Here is
  where fan-outs would come into play, I would imagine. With the
  introduction in FreeBSD of GEOM, which I haven't had the chance to
  fully explore, at least at the device level, things are getting more
  advanced.)

Let me know if these are the types of things you had in mind.

If these concepts are one and the same from the standpoint of theory,
I'm not certain which terminology I would prefer. The term union seems
to suggest that the namespaces are being brought together to create a
combined system composed of numerous member filesystems, while the term
overlay seems to suggest that sections of the filesystem are being
ignored, and emphasizes the intersection in, and overriding factors of,
the combined system instead. Perhaps both apply.

One thing is for sure, this problem goes deeper than simple overlays. It
has to do with more than combining two sources when exploring the roots
of the issue. Not to diminish the role of the join in namespace.

On Wed, Apr 23, 2003 at 23:50:33PM +0200, Wout Mertens wrote:
> Hi Erez,
>
> On Wed, 23 Apr 2003, Erez Zadok wrote:
>
> > I'll CC the fist list (where this message is suitable) for other people's
> > comments.
> >
> > In message <Pine.GSO.4.53.0304230843220.9711@bru-cse-075.cisco.com>, Wout Mertens writes:
> > > Hi there,
> > >
> > > I'm trying to boot a Linux 2.4 thin client from a readonly nfs root, and I
> > > keep going back to my childhood dream, a filesystem that you can overlay
> > > over another filesystem and that keeps the changes you make to it.

The idea of using NFS for both the base image and the overlay is a good
example of the types of applications possible.
To what extent this intersects with existing network filesystems might
be of interest.

> > >
> > > The idea would be that the filesystem would keep track of additions,
> > > renames, deletions, permissions and so forth, but not touching the
> > > filesystem below it. If this is then done with a tmpfs backing store, you
> > > get a nonpersistent fs.
> > >
> > > Right now I solve the problem by copying all files to tmpfs, but this is
> > > wasteful.
> > >
> > > So I was wondering if you have implementation hints, maybe you considered
> > > the same things, or you have a half-finished .fist file lying around...
> > >
> > > Thanks!
> > >
> > > Wout.
> >
> > So if I understand you right, you want the f/s to read from one source, but
> > when writing, it should write to another location, right?
> >
> > Do you just want to keep the latest update to files that have been modified,
> > or a historical detailed log of all activity (perhaps one that can be rolled
> > back). The latter of course is more complex.
> >
> > Once a file is modified and written, what happens if you try to re-read it?
> > Do you get the original unmodified version, or the one just written? The
> > latter is a special case of a write-through cachefs (such as Solaris's) but
> > one which doesn't write through any changes.
>
> I want to be able to perform all file operations on files in a certain
> filesystem, where the changes are kept somewhere else. In my specific
> case, I start fresh and I'll throw away the changes afterwards, but they
> could also be kept. Activity log is not necessary in my case.
>
> It is related to a write-through cachefs, but an important difference is
> that deletions, attr changes, etc. should also be handled. cachefs is much
> simpler to implement, I would think.
>
> Maybe I'm being too complicated and the best way would be to just keep the
> block level changes on the raw device but not apply them, but then that
> wouldn't work for nfs, my goal filesystem, which has no raw device.

Makes me think of revisioning in the filesystem. Even if you didn't
"commit" changes to a file, they could still exist under a different
revision name. There has been, and will be, much conversation on this
topic (inevitably), as it becomes apparent at various stages that it was
both a good thing and a bad thing that a VMS-like model wasn't adopted.

> Besides, then you wouldn't be able to see what the changes were, useful
> for sandboxed stuff. (Although you should need per-user-visible mounts as
> well then)

One less exciting application is the idea of using multiple layers of
storage to maintain working copies of data. For instance, using an
overlay for special-purpose attribute/meta-data directories and files to
avoid filesystem pollution.

It's frustrating to have to clean up after tools that leave their messy
attribute directories all over a filesystem. The only other ways to
eliminate them are to set very restrictive permissions and risk breaking
something, use a special copy of the data in an isolated location
(sandbox), or spend the time fixing each potential offender. Luckily,
build tools and source repositories avoid this type of problem where
possible by placing their working files in an object tree in the first
place.

The concept of an attribute itself is, to me, somewhat risky, since if
abused, it's no longer an attribute. What constitutes meta-data anyway?
I don't accept the "dot directories are out, so don't worry about them"
dictum. It's been abused, so that .dir has lost its meaning almost
entirely: just look at your home directory to find out why.

I have for example:
.kde/
.w3m/
.netscape/
.procmail/
.emacs/
almost none of this is "special data"!

alias ls='ls -a'

Under FreeBSD, for instance, netatalk is one package that spews random
directories to back proprietary Mac attributes, and assumes that since
the attributes are in .thing directories, it's OK to add them at any
point.

> > It seems to me that one way or another, you'll be needing a fan-out file
> > stackable system: one that can have one branch that it treats as read-only
> > (say, nfs), and another branch that's a writable directory (perhaps even a
> > local disk based f/s).
>
> I agree. The hard part is deciding on a nice way of keeping the changes.
> Possibly something with a subdirectory per type of change, with regular
> files replacing/adding to the original filesystem just being in the
> corresponding directory on the backing store.
>
> Original
> |--a/
> |  `--b.txt
> `--c/
>    |--d/
>    `--e.txt
>
> Backing Store
> |--a/
> |  |--b.txt (newer file)
> |  `--f.txt
> |--c/
> |  `--.deletions
> |     `--d/
> |     `--e.txt (0-length file)
> `--g.txt

Reminds me of "white-out" entries in BSD. There is a good paper on the
union fs by Jan-Simon Pendry, and of course McKusick's book covers this
as well.

>
> Overlay
> |--a/
> |  |--b.txt (newer file)
> |  `--f.txt
> |--c/
> `--g.txt
>
> > True fan-out file system support in fist has been on my todo list for
> > several years. It's not an easy task: many OS design assumptions are easily
> > broken and have to be addressed. We recently did a prototype two-branch
> > read-only unionfs (fan-out) in linux 2.4; we hope to polish it up, add full
> > write support, multiple branches, and more, then make it available by
> > summer's end.
>
> Looking forward to that :)
>
> Cheers,
>
> Wout.
> _______________________________________________
> FiST mailing list
> FiST@lists.cs.columbia.edu
> http://lists.cs.columbia.edu/mailman/listinfo/fist

Allan Fields
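
The Original / Backing Store / Overlay trees quoted above amount to a
simple merge rule, which might be sketched roughly as follows. This is
only an illustration in Python over in-memory dicts (a dict is a
directory, a string is file contents), not FiST code or anything from
unionfs; the function name overlay() and the dict representation are
assumptions made for the example, and ".deletions" is taken from the
quoted layout.

```python
# Hypothetical sketch: merge a read-only "lower" tree with an "upper"
# backing store that records changes, using the ".deletions" directory
# convention from the quoted example. dict = directory, str = file.
DELETIONS = ".deletions"


def overlay(lower, upper):
    """Return the merged view of `lower` with the changes in `upper`."""
    # Names listed under .deletions hide the matching lower entries.
    deleted = set(upper.get(DELETIONS, {}))
    merged = {name: node for name, node in lower.items()
              if name not in deleted}
    for name, node in upper.items():
        if name == DELETIONS:
            continue  # bookkeeping only, never visible in the overlay
        below = merged.get(name)
        if isinstance(node, dict):
            # Directory in the upper layer: merge with the lower
            # directory if one is still visible, else start empty.
            merged[name] = overlay(below if isinstance(below, dict) else {},
                                   node)
        else:
            # File in the upper layer replaces or adds to the lower one.
            merged[name] = node
    return merged


# The trees from the quoted message, with placeholder file contents.
original = {"a": {"b.txt": "old"}, "c": {"d": {}, "e.txt": "e"}}
backing = {
    "a": {"b.txt": "new", "f.txt": "f"},
    "c": {DELETIONS: {"d": {}, "e.txt": ""}},
    "g.txt": "g",
}
view = overlay(original, backing)
```

With these inputs, view has a/ holding the newer b.txt plus f.txt, an
empty c/, and g.txt, matching the Overlay tree in the quote. Renames,
permission changes and stacks of more than two layers are left out of
the sketch.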