Date: Fri, 18 Oct 1996 12:04:06 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: michaelh@cet.co.jp
Cc: karl@Mcs.Net, freebsd-hackers@FreeBSD.org
Subject: Re: NFS node: disappearing directory
Message-ID: <199610181904.MAA01751@phaeton.artisoft.com>
In-Reply-To: <Pine.SV4.3.93.961018145633.1217B-100000@parkplace.cet.co.jp> from "Michael Hancock" at Oct 18, 96 03:07:13 pm

Ah, another question requiring more than a 10 line answer.  8-(.

> > > But 3) says it does get reloaded.
> >
> > Sometimes.  But if go to the "up-level" when it happens and do a "ls", you
> > get a VERY short list (~10% of what's really there - right about 200
> > entries)
>
> Umm.  Is John around?  What kind of memory does the result of readdir go
> into?

Depends on the FS.  For NFS, "bogus cookie handling memory which was
allocated for fear the user buffer would be too small to return the
data".

The problem is cookie related.  The fix is to get rid of the cookie
code.

The cookie code was introduced because the on disk directory structure
and the exported directory structure for NFS are no longer the same
(the NFS standard did not get changed to accommodate BSD).  In other
words, struct direct != struct dirent.

In FFS, the exported directory structure is the same one that is
returned via the system call interface; in other words, it matches
the on disk structure of the default FS.

Because the NFS structure is a different size, there is no way to know
if the interface will have a buffer large enough to deal with the data
returned.

In general, we can consider the NFS server a consumer of the VFS
interface.  Similarly, system calls are a consumer of the VFS
interface.

The generic solution is therefore to pass back a directory block
reference from the FS, and then *use an FS specific VOP_ call to
translate the buffer contents on demand into the consumer buffer
format*.

For UFS, this would be a null op on the buffer.  For NFS, this would
be a page allocate and a data copy, and since memory access is
significantly faster than network access, this would not impose too
much overhead (the cookie crap adds copy overhead anyway).

For the generic "restart", the entry offset is passed in, and the
block is traversed by the underlying FS.
If it gets to an entry whose offset matches the one passed in, then
that entry is just returned; if it goes past it, then an FS dependent
action is taken:

o	for pre-compacting FS's, the entry is assumed to have been
	compacted, and the entry prior to the entry following the
	restart point is returned, on the assumption that entries
	were moved down in the block.

o	for post-compacting sparse directory block FS's (like ffs),
	the entry is assumed to have been deleted, and is returned.
	This is legal because the getdents() call is assumed to work
	on a "snapshot" of the directory, not the actual directory
	structure.

For what it's worth, the cookie code presumes a copy, which no longer
takes place in the unified cache case, so that it can reference it out
of the VM instead of out of the (potentially volatile) buffer cache.

This implementation was discussed in great detail by myself and Doug
Rabson about 18 months ago on the -current list.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
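
For illustration only, here is a rough C sketch of the "restart"
traversal described in the message.  Nothing in it is real kernel
code: struct sk_entry, enum sk_policy and sk_restart() are
hypothetical names made up for this sketch, and the entry layout is
deliberately simpler than struct direct.  In particular, resuming at
the entry the walk has just reached in the post-compacting case is one
reading of "is returned", not something the message spells out.

```c
#include <stdint.h>

/* Hypothetical, simplified directory entry (NOT the real struct direct). */
struct sk_entry {
	uint32_t offset;	/* byte offset of this entry in the block */
	char	 name[16];	/* entry name, unused by the walk itself */
};

/* Hypothetical tag for how the underlying FS treats removed entries. */
enum sk_policy {
	SK_PRE_COMPACTING,	/* entries get moved down in the block */
	SK_POST_COMPACTING	/* sparse blocks, e.g. ffs */
};

/*
 * Walk the block looking for the restart offset `want`.
 * Returns the index of the entry to resume from, or -1 if the walk
 * reaches the end of the block without matching or passing `want`.
 */
int
sk_restart(const struct sk_entry *blk, int n, uint32_t want,
    enum sk_policy policy)
{
	int i;

	for (i = 0; i < n; i++) {
		if (blk[i].offset == want)
			return (i);	/* exact match: resume here */
		if (blk[i].offset > want) {
			/* We went past the restart point. */
			if (policy == SK_PRE_COMPACTING) {
				/* Assume entries moved down: back up one. */
				return (i > 0 ? i - 1 : 0);
			}
			/* Assume the entry was deleted (assumption:
			 * resume at the entry we just reached). */
			return (i);
		}
	}
	return (-1);
}
```

With a block whose entries sit at offsets 0, 24 and 48, a restart
offset of 24 resumes at the second entry under either policy, while a
stale offset of 12 resumes at the first entry for a pre-compacting FS
and at the second for a post-compacting one.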