From: Rick Macklem
To: Jilles Tjoelker
Cc: Julian Elischer, freebsd-current@freebsd.org, John Baldwin
Date: Fri, 24 Apr 2015 21:39:31 -0400 (EDT)
Subject: Re: readdir/telldir/seekdir problem (i think)
Message-ID: <326462676.25571625.1429925971889.JavaMail.root@uoguelph.ca>
In-Reply-To: <20150424215249.GA96554@stack.nl>

Jilles
Tjoelker wrote:
> On Fri, Apr 24, 2015 at 04:28:12PM -0400, John Baldwin wrote:
> > Yes, this isn't at all safe. There's no guarantee whatsoever that
> > the offset on the directory fd that isn't something returned by
> > getdirentries has any meaning. In particular, the size of the
> > directory entry in a random filesystem might be a different size
> > than the structure returned by getdirentries (since it converts
> > things into a FS-independent format).
> >
> > This might work for UFS by accident, but this is probably why ZFS
> > doesn't work.
> >
> > However, this might be properly fixed by the thing that ino64 is
> > doing where each directory entry returned by getdirentries gives
> > you a seek offset that you _can_ directly seek to (as opposed to
> > seeking to the start of the block and then walking forward N
> > entries until you get an inter-block entry that is the same).
>
> The ino64 branch only reserves space for d_off and does not use it in
> any way. This is appropriate since actually using d_off is a major
> feature addition.
>
Well, at some point ino64 will need to define a new getdirentries(2)
syscall, and I believe this new syscall can have different/additional
arguments.

I'd suggest that the new getdirentries(2) syscall should return a flag
to indicate that the underlying file system is filling in d_off. Then
the libc functions can use d_off if it is available. (They will still
need to "work" at least as well as they do now if the file system
doesn't support d_off. The old getdirentries(2) syscall will keep
returning the old/current "struct dirent", which doesn't have the
field anyhow.)

Another bit of fun is that the argument for seekdir()/telldir() is a
long, which ends up being 32 bits on some arches. d_off is 64 bits,
since that is what some file systems require. Maybe the library code
can only use d_off if it is a 64-bit arch and the file system is
filling it in. (Or maybe the library can keep track of 32<->64bit
mappings for the offsets.
I haven't looked at the libc functions for a while, so I can't
remember what they keep track of.)

rick

> A proper d_off would still be useful even if UFS's readdir keeps
> masking off the offset so a directory read always starts at the
> beginning of a 512-byte directory block, since this allows more
> distinct offset values than safely using getdirentries()'s *basep.
> With d_off, one outer loop must read at least one directory block to
> avoid spinning indefinitely, while using getdirentries()'s *basep
> requires reading the whole getdirentries() buffer.
>
> Some Linux filesystems go further and provide a unique d_off for each
> entry.
>
> Another idea would be to store the last d_ino instead of dd_loc into
> the struct ddloc. On seekdir(), this would seek to loc_seek as before
> and skip entries until that d_ino is found, or to the start of the
> buffer if not found (and possibly return some entries again that
> should not be returned, but Samba copes with that).
>
> --
> Jilles Tjoelker
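
For what it's worth, the "32<->64bit mappings" idea mentioned above could
be sketched roughly as follows. This is only an illustration and not what
libc does: the names cookie_map, cookie_from_off(), off_from_cookie() and
COOKIE_MAX are all made up for the example, and the fixed-size table is an
assumption to keep it short. The point is just that telldir() could hand
out a small index that fits in a 32-bit long, while the library remembers
the 64-bit d_off that goes with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COOKIE_MAX 64	/* hypothetical fixed capacity, for the sketch only */

/* Hypothetical per-directory table of 64-bit offsets handed out so far. */
struct cookie_map {
	uint64_t off[COOKIE_MAX];	/* 64-bit file system offsets */
	size_t	 used;			/* number of slots in use */
};

/*
 * Return a small cookie (an index that fits in a 32-bit long) for a
 * 64-bit offset, reusing an existing slot if the offset was seen before.
 * Returns -1 if the table is full.
 */
static long
cookie_from_off(struct cookie_map *m, uint64_t off)
{
	size_t i;

	assert(m != NULL);
	for (i = 0; i < m->used; i++)
		if (m->off[i] == off)
			return ((long)i);
	if (m->used == COOKIE_MAX)
		return (-1);
	m->off[m->used] = off;
	return ((long)m->used++);
}

/*
 * Translate a cookie back into the 64-bit offset it stands for.
 * Returns 0 on success, -1 if the cookie was never handed out.
 */
static int
off_from_cookie(const struct cookie_map *m, long cookie, uint64_t *off)
{
	assert(m != NULL && off != NULL);
	if (cookie < 0 || (size_t)cookie >= m->used)
		return (-1);
	*off = m->off[cookie];
	return (0);
}
```

In this sketch a telldir()-like function would call cookie_from_off() with
the entry's d_off and return the result, and a seekdir()-like function
would call off_from_cookie() to recover the offset to seek to. A real
version would need to decide when entries can be freed and what to do
when the table fills up.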