Date:      Tue, 07 Aug 2001 01:08:19 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Michael Reifenberger <root@nihil.plaut.de>
Cc:        FreeBSD-Current <current@FreeBSD.ORG>, fs@FreeBSD.ORG
Subject:   Re: Linux ls fails on DEVFS /dev
Message-ID:  <3B6FA1F3.2563C15C@mindspring.com>
References:  <20010805104350.A1188-100000@nihil>

Michael Reifenberger wrote:
> linux ls fails on DEVFS /dev because linux_getdents fails because
> linux_getdents uses VOP_READDIR( ..., &ncookies, &cookies ) instead of
> VOP_READDIR( ..., NULL, NULL ) because it seems to need the offsets for
> linux_dirent and sizeof(dirent) != sizeof(linux_dirent)...
> 
> If I eliminate the usage of cookies, then a ls on at least
> a cd9660 mounted dir fails with not finding all direntries.
> 
> So the question is if all filesystems are expected to implement
> the cookies != NULL case?

The problem is that the interface is broken by design.
For it to be correct, it actually needs to be split into
two pieces: one to snapshot the directory entry block,
and a second one to do the copy out from the on disk
format to the "wire format", which is the NFS externalized
version of the structure.  Cookies came in when the on disk
directory entry structure changed from the representation
that was historically used for NFS to its current form;
they basically exist to provide glue between the internal
and external representations.

Basically, cookies assume that the client of their services
will be the NFS server.  They are actually a kludge, and
there is a better way to do the same thing, which avoids
the problem, at least as much as it is possible to avoid
the problem, if the directory is changing out from under
your server (or in this case, the Linux consumer).


> BTW:
> Why doesn't a call to fstat on a directory set a st_blksize != 0?
> Do directories have no preferred blocksize?

Directories aren't files, per se.  Directory entries are
stored in physical disk blocks, to ensure atomicity of the
directory operations.

That said, this does seem to be a compatibility issue with
the Linux ABI (see below) that should be addressed in the
Linux ABI implementation, and not the FreeBSD generic stat
implementation.


> I ask because getdents(2) explicitly states one
> should use stat(2) to get the minimum buffersize...

By "getdents(2)", I assume that you are talking about the
Linux system call man page, not the FreeBSD one.

The correct thing to do is to use opendir/readdir/closedir,
and not call "getdents(2)" directly yourself.
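
As a minimal sketch of the portable route, the following walks
a directory with opendir/readdir/closedir and never touches
getdents(2) or a caller-chosen buffer size.  The function name
count_entries is made up for this illustration.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Count the entries in a directory using only the portable
 * opendir/readdir/closedir interface.  Returns -1 on error.
 * (count_entries is a hypothetical helper name.)
 */
long count_entries(const char *path)
{
    assert(path != NULL);

    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    long n = 0;
    int err;
    for (;;) {
        errno = 0;              /* distinguish end-of-directory from error */
        if (readdir(d) == NULL)
            break;
        n++;
    }
    err = errno;                /* nonzero only if readdir failed */

    closedir(d);
    return err != 0 ? -1 : n;
}
```

The library picks its own buffer size internally, so the
st_blksize question does not arise for the caller.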

In general, as a compatibility hedge, I suppose we could
make the stat call behave for directories as it does on
Linux.

In reality, the buffer size should be large, since the
standard directory reading code caches a "snapshot" of the
directory blocks, externalized into the "neutral" format.

For NFS clients of the VFS, and for the Linux system call
code, which is also a VFS client, the correct thing to do
is probably to return a large number, and then "short
change" the result by backing off on the copy out, if it
can't be done on a full directory entry block boundary
internal to the FS.

This is true because the cookies are really there to permit
an arbitrary restart on a non-directory block boundary.  You
could achieve the same thing for NFS by traversing the block
entries to the entry following the offset; if the offset was
not on an entry boundary, then you back up one entry (i.e. you
are trying to restart based on a "snapshot" that has changed
out from under you).  It's up to the client to perform
duplicate suppression.
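
The traversal described above can be sketched roughly as
follows.  This is an illustration only: struct rec_hdr is a
simplified, hypothetical record header (not the real struct
dirent), and restart_offset is a made-up name.  Walking to the
entry following the offset and backing up one entry when the
offset is not on a boundary gives the same result as returning
the start of the record that contains the offset, which is
what the loop does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified directory record header; the name follows it. */
struct rec_hdr {
    uint32_t fileno;
    uint16_t reclen;    /* whole record length, header included */
    uint8_t  namlen;
    uint8_t  type;
};

/*
 * Find where to resume inside one directory block without cookies.
 * If `want` is on a record boundary, resume there.  If it falls inside
 * a record (the block changed since the last read), resume at the start
 * of that record; the caller may then see a duplicate and must
 * suppress it.  Returns the block length if `want` is past the last
 * record.
 */
size_t restart_offset(const unsigned char *blk, size_t blklen, size_t want)
{
    assert(blk != NULL);

    size_t off = 0;
    while (off + sizeof(struct rec_hdr) <= blklen) {
        struct rec_hdr h;
        memcpy(&h, blk + off, sizeof h);

        /* Stop on a record that is malformed or overruns the block. */
        if (h.reclen < sizeof h || h.reclen > blklen - off)
            break;

        /* `want` lies in [off, off + reclen): resume at this record. */
        if (want < off + h.reclen)
            return off;

        off += h.reclen;
    }
    return off;
}
```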

If you did the "short change" trick, then you would always
be guaranteed that you could copy out one or more full
directory entry blocks, and stop on an alignment boundary,
which would eliminate the need to restart in the middle of
a block, which would mean that "NULL, NULL" would be the
correct thing to pass.
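
A rough sketch of the "short change" decision, under the
assumption that the caller already knows how many bytes each
directory block occupies once its entries are converted to the
external format (the converted size differs from the on disk
size, since sizeof(dirent) != sizeof(linux_dirent)).  The names
copyout_plan and plan_copyout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one "short changed" copy out (hypothetical type). */
struct copyout_plan {
    size_t nblocks;     /* whole directory blocks to copy out */
    size_t out_bytes;   /* bytes of converted entries produced */
};

/*
 * converted_len[i] is the size of directory block i after conversion
 * to the external entry format.  Take as many leading whole blocks as
 * fit in the caller's buffer and stop before the first one that does
 * not, so the next read always starts on a block boundary.
 */
struct copyout_plan plan_copyout(const size_t *converted_len,
                                 size_t nblocks, size_t bufsize)
{
    assert(converted_len != NULL || nblocks == 0);

    struct copyout_plan p = {0, 0};
    for (size_t i = 0; i < nblocks; i++) {
        /* out_bytes never exceeds bufsize, so this cannot underflow. */
        if (converted_len[i] > bufsize - p.out_bytes)
            break;
        p.out_bytes += converted_len[i];
        p.nblocks++;
    }
    return p;
}
```

A result with nblocks == 0 means the buffer cannot hold even
one converted block, and a mid-block restart would be needed
after all.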

The only place this would fail unexpectedly is when the
buffer passed in to the "linux_getdents(2)" call was too
small to hold all the entries that could occur in a single
FreeBSD directory block.  Your code would have to be very
badly behaved (intentionally so) to do that.  Still, you
can fake up the restart by copying out a FreeBSD-normal
block into a transition buffer, and then traversing from
there to do the restart (the other trick mentioned above).
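
One way the transition buffer idea might look, assuming the
directory block has already been copied into a staging buffer
and that records carry their own length.  struct stage_rec and
drain_staged are hypothetical names, and the record layout is
simplified for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed header of a staged record; the name follows it. */
struct stage_rec {
    uint32_t fileno;
    uint16_t reclen;    /* whole record length, header included */
    uint16_t namlen;
};

/*
 * Copy whole records from the staged block into a (possibly small)
 * caller buffer, starting at record offset `start`.  Stops at the first
 * record that does not fit or looks malformed.  Stores the offset to
 * resume from in *next and returns the number of bytes copied.
 */
size_t drain_staged(const unsigned char *stage, size_t stagelen,
                    size_t start, unsigned char *out, size_t outlen,
                    size_t *next)
{
    assert(stage != NULL && out != NULL && next != NULL);

    size_t off = start, used = 0;
    while (off + sizeof(struct stage_rec) <= stagelen) {
        struct stage_rec h;
        memcpy(&h, stage + off, sizeof h);

        if (h.reclen < sizeof h || h.reclen > stagelen - off)
            break;                      /* malformed or overrunning record */
        if (h.reclen > outlen - used)
            break;                      /* does not fit: resume here later */

        memcpy(out + used, stage + off, h.reclen);
        used += h.reclen;
        off += h.reclen;
    }
    *next = off;
    return used;
}
```

The caller keeps calling with the returned *next until it
reaches the end of the staged block, then stages the next one.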

This would probably be best, since it would avoid the
problem recurring with any future FS's.

-- Terry
