Date:      Sat, 04 May 2002 07:30:30 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Eric Jacobs <eaja@erols.com>
Cc:        fs@freebsd.org, Bakul Shah <bakul@bitblocks.com>
Subject:   Re: Filesystem
Message-ID:  <3CD3F086.9F400956@mindspring.com>
References:  <200205032031.QAA24496@repulse.cnchost.com> <PstOfc.3cd34a56.c67ea6@localhost>

Eric Jacobs wrote:
> > Plan9 does ".." right.  The same can be done in Unix by
> > storing the rooted path in the kernel for a process's
> > current working directory and by following some path rewrite
> > rules:
> >
> >     <prefix>/<component>/..          == <prefix>
> >     <prefix>/<component>/../<suffix> == <prefix>/<suffix>
> >     /../<suffix>                     == /<suffix>
> 
> Those rules aren't valid on syntax alone. You would have to know
> which components are symbolic links. And once you take symbolic
> links into account, you have essentially what namei does anyway.
> 
> I think what Terry Lambert was saying was that since hard-linking
> directories isn't allowed anyway, there's no need to refcount them,
> except for the subdirectory counting tricks.

I meant that ".." being treated as a link is useful, because the
link count itself can be useful information.  However, the
tradeoff is that it limits the number of subdirectories to what
the link count field can hold.

The tradeoff in the other direction is that you have to be
prepared to descend into the directory.  This isn't really that
big a deal these days, now that there is an attribute bit in the
directory entry itself indicating that the entry is a directory,
so it's possible to both avoid the stat and still get the
information, even if the link count is such that it "indicates"
there are no subdirectories.

Basically, some software will have to be hacked to traverse a
directory for subdirectories, instead of just stat'ing the
parent inode, and only traversing if the link count was > 2.
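
A minimal sketch of the two approaches, in Python for brevity.  The
function names are hypothetical.  os.scandir() takes the entry type
from the directory entry when the filesystem supplies it, and falls
back to a stat when it does not.

```python
import os

def has_subdirs_hint(path):
    # Classic shortcut: a link count of 2 means only "." and the entry
    # in the parent, hence no subdirectories.  Only meaningful on
    # filesystems that count each child's ".." as a link.
    return os.stat(path).st_nlink > 2

def list_subdirs(path):
    # Scan the directory and rely on the per-entry type information
    # instead of the link count.
    with os.scandir(path) as entries:
        return sorted(e.name for e in entries
                      if e.is_dir(follow_symlinks=False))
```

Software that used the hint to skip the scan would instead call the
scan unconditionally when the link count cannot be trusted.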

The disallowing of hard links on directories was actually my
suggestion from ~1994, on the basis of working around POSIX time
update requirements for hosted file services.  If you pretend
that directories are special, and that they aren't files, you can
escape from a number of time updates that would otherwise be a
"SHALL update" vs. a "SHALL mark for update".  Hard links on
directories also fail to maintain parent/child relationships
properly.  Without such links, you are guaranteed that you can
cache the parent in the child inode, which can let you further
speed reverse traversal.  Since it was only ever an option for
root, it's really no big loss.
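
For comparison, here is roughly what reverse traversal costs when
the parent is not cached: a sketch of the classic getcwd-style walk,
in Python for brevity.  The name path_below and the idea of stopping
at a caller-supplied ancestor are assumptions made for illustration.
Each step has to scan the parent directory to find the name of the
child, which is the work a cached parent pointer would shortcut.

```python
import os

def _ident(st):
    # A (device, inode) pair identifies a file on a running system.
    return (st.st_dev, st.st_ino)

def path_below(top, leaf):
    """Recover leaf's path relative to its ancestor top by walking ".."."""
    top_id = _ident(os.stat(top))
    cur, cur_st = leaf, os.stat(leaf)
    parts = []
    while _ident(cur_st) != top_id:
        parent = os.path.join(cur, "..")
        parent_st = os.stat(parent)
        if _ident(parent_st) == _ident(cur_st):
            raise ValueError("reached the root without meeting top")
        # No cached name: scan the parent for the entry naming us.
        with os.scandir(parent) as entries:
            for e in entries:
                if _ident(e.stat(follow_symlinks=False)) == _ident(cur_st):
                    parts.append(e.name)
                    break
            else:
                raise FileNotFoundError("entry vanished during the walk")
        cur, cur_st = parent, parent_st
    return "/".join(reversed(parts))
```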


> > You would also have to deal with middle directories being
> > renamed, filesystems being forcibly unmounted and so on.
> >
> > Not storing the entire path for cwd may have been the right
> > decision for '70s but not since then....
> 
> The entire path is stored indirectly via the VFS name cache, so
> getcwd() works _even_ for filesystems which do not implement "..".
> Implementing ".." at the VFS level would be just as simple. Probably
> the only reason it isn't is because it has been traditionally handled
> at the FS level.

The cache implementation LRU's entries out, so the path is not
guaranteed to still be there when you ask for it.

Saving the path on open works where not saving it fails only
because leaf nodes of type file don't maintain proper parent
pointers.

The implementation at the VFS level should be handled by having
real vnodes/inodes for hard links.  Maintaining the link-to-link
relationship would require some additional overhead, but it's
minor.  Doing this would also allow you to store the parent inode
of any inode... and since non-leaf inodes are always guaranteed
to be directories, the recoverability of any open file's path to
the root is guaranteed.  If 128 bytes (a full on-disk inode) per
link is too large a stretch, it can be done with smaller "link
nodes", but the net effect is the
same: by moving the link out to an abstract FS artifact, rather
than an artifact of a count and a directory entry, you gain a lot
of benefit.
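
A toy in-memory model of the "link node" idea, in Python for
brevity.  Every name here (Inode, LinkNode, make_link, path_of) is
hypothetical and nothing in it reflects an actual VFS structure.  It
only shows that once each link is its own object with a parent
pointer, the path to the root can be rebuilt from any link, and the
link count falls out as the number of link nodes.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Inode:
    number: int
    is_dir: bool
    links: list = field(default_factory=list)  # every LinkNode naming us

@dataclass(eq=False)
class LinkNode:
    name: str
    parent: Optional["LinkNode"]  # link of the containing directory
    target: Inode

def make_link(name, parent, target):
    # Create a link object and register it with the inode it names.
    link = LinkNode(name, parent, target)
    target.links.append(link)
    return link

def path_of(link):
    # Follow parent pointers up to the root link (parent is None).
    parts = []
    while link.parent is not None:
        parts.append(link.name)
        link = link.parent
    return "/" + "/".join(reversed(parts))
```

An open file would hold a reference to the link it was opened
through, so two hard links to one inode yield two distinct paths.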


> > > In any case, it's still an incredibly bad idea to have even a tenth of
> > > that many objects in a single directory, period.
> >
> > IMHO it is a bad idea to not have evolved directories to use a B-tree
> > representation (at least when the number of entries exceeds some
> > threshold).  Implement mechanisms and leave policies to the users!
> 
> If you can handle access considerations yourself, one creative solution
> might be to use getfh(2) and fhopen(2) and store the file handles however
> you want. This bypasses the kernel lookup entirely.

I mentioned this, as a means of getting a flat (inode) name space.

The only real problem with this (and it's a doozy!) is that the
fsck process expects to have a real directory from which it can
derive the reference count, or the inode is considered "lost" and
will end up in "lost+found" on the next fsck, as an FS inconsistency.
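
A toy sketch of that check, in Python for brevity.  The function name
and the dictionary layout are assumptions made for illustration, not
how fsck is written.  It counts references by walking directory
entries, then flags allocated inodes that nothing names.

```python
from collections import Counter

def find_lost_inodes(allocated, directories):
    """allocated:   {inode: recorded link count}
    directories: {directory inode: {entry name: inode}}
    Returns (lost inodes, {inode: (recorded, counted)} mismatches)."""
    refs = Counter()
    for entries in directories.values():
        for ino in entries.values():
            refs[ino] += 1
    # Allocated but named by no directory entry: bound for lost+found.
    lost = sorted(i for i in allocated if refs[i] == 0)
    # Named, but the recorded count disagrees with what was counted.
    wrong = {i: (allocated[i], refs[i]) for i in allocated
             if refs[i] != 0 and refs[i] != allocated[i]}
    return lost, wrong
```

An inode reachable only through a stored file handle has no
directory entry, so it shows up in the "lost" list.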

-- Terry
