From owner-freebsd-current@FreeBSD.ORG Mon Apr 27 20:03:11 2015 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2B1F2E12; Mon, 27 Apr 2015 20:03:11 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id B21EB15C7; Mon, 27 Apr 2015 20:03:10 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CsBABElT5V/95baINcg19cBYMVwy6BSAqFNk4CgW4TAQEBAQEBAYEKhCEBAQQBAQEgBCcgCxsYAgINGQIpAQkmBggHBAEYBASICg2zT5QEAQEBAQYBAQEBAQEcgSGJFYEChCwHAQEcATMHgmiBRQWFLJAqhAiED5QCI4IHHIFtIjEHewkXIoEAAQEB X-IronPort-AV: E=Sophos;i="5.11,659,1422939600"; d="scan'208";a="206445322" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 27 Apr 2015 16:03:08 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 4D4BCB3F5F; Mon, 27 Apr 2015 16:03:08 -0400 (EDT) Date: Mon, 27 Apr 2015 16:03:08 -0400 (EDT) From: Rick Macklem To: Julian Elischer Cc: freebsd-current@freebsd.org, John Baldwin Message-ID: <1101073752.26759547.1430164988301.JavaMail.root@uoguelph.ca> In-Reply-To: <553E676D.1020902@freebsd.org> Subject: Re: readdir/telldir/seekdir problem (i think) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.11] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Apr 2015 20:03:11 -0000 Julian Elischer wrote: > On 4/25/15 4:28 AM, John Baldwin wrote: > > On Saturday, April 25, 2015 02:36:24 AM Julian Elischer wrote: > >> On 4/25/15 1:30 AM, Julian Elischer wrote: > >>> On 4/24/15 10:59 PM, John Baldwin wrote: > >>>> Index: head/lib/libc/gen/telldir.c > >>>> =================================================================== > >>>> --- head/lib/libc/gen/telldir.c (revision 281929) > >>>> +++ head/lib/libc/gen/telldir.c (working copy) > >>>> @@ -101,8 +101,10 @@ > >>>> return; > >>>> if (lp->loc_loc == dirp->dd_loc && lp->loc_seek == > >>>> dirp->dd_seek) > >>>> return; > >>>> - (void) lseek(dirp->dd_fd, (off_t)lp->loc_seek, > >>>> SEEK_SET); > >>>> - dirp->dd_seek = lp->loc_seek; > >>>> + if (lp->loc_seek != dirp->dd_seek) { > >>>> + (void) lseek(dirp->dd_fd, (off_t)lp->loc_seek, > >>>> SEEK_SET); > >>>> + dirp->dd_seek = lp->loc_seek; > >>>> + } > >>> yes I did that yesterday but it still fails when you transition > >>> blocks.. (badly). > >>> > >>> I also tried bigger blocks.. also fails (eventually) > >>> > >>> I did find a way to make it work... you had to seek back > >>> to the first block you deleted on each set.. > >>> then work forward from there again.. unfortunately since > >>> I'm trying to make a microsoft program not fail (via samba) > >>> I have no control over how it does things and seekdir doesn't > >>> know what was deleted anyway... (so the fix is fine for the > >>> test program but not for real life) > >>> > >>> I think I can make the BSD one act like the linux one by changing > >>> the lseek being done to use the offset (loc) plus the buffer seek > >>> address of the target, instead of just going for the buffer base > >>> and > >>> stepping forward through the entries.. > >>> > >>> maybe tomorrow. > >>> > >> The following conditional code makes ours behave the same as the > >> linux > >> one. > >> it breaks several 'rules' but works where ours is clean but > >> fails.. > >> as Rick said.. "maybe that's what we should do too." > >> > >> > >> this is at the end of seekdir() > >> > >> > >> The new code does what linux does.. and shouldn't work.. but does > >> // at least in the limited conditions I need it to. > >> // We'll probably need to do this at work...: > >> > >> > >> The original code is what we have now, but gets mightily confused > >> sometimes. > >> // This is clean(er) but fails in specific > >> situations(when > >> doing commands > >> // from Microft windows, via samba). > >> > >> > >> root@vps1:/tmp # diff -u dir.c.orig dir.c > >> --- dir.c.orig 2015-04-24 11:29:36.855317000 -0700 > >> +++ dir.c 2015-04-24 11:15:49.058500000 -0700 > >> @@ -1105,6 +1105,13 @@ > >> dirp->dd_loc = lp->loc_loc; > >> return; > >> } > >> +#ifdef GLIBC_SEEK > >> + (void) lseek(dirp->dd_fd, (off_t)lp->loc_seek + lp->loc_loc, > >> SEEK_SET); > >> + dirp->dd_seek = lp->loc_seek + lp->loc_loc; > >> + dirp->dd_loc = 0; > >> + lp->loc_seek = dirp->dd_seek; > >> + lp->loc_loc = 0; > >> +#else > >> (void) lseek(dirp->dd_fd, (off_t)lp->loc_seek, SEEK_SET); > >> dirp->dd_seek = lp->loc_seek; > >> dirp->dd_loc = 0; > >> @@ -1114,6 +1121,7 @@ > >> if (dp == NULL) > >> break; > >> } > >> +#endif > >> } > > Yes, this isn't at all safe. There's no guarantee whatsoever that > > the offset on the directory fd that isn't something returned by > > getdirentries has any meaning. In particular, the size of the > > directory entry in a random filesystem might be a different size > > than the structure returned by getdirentries (since it converts > > things into a FS-independent format). > > > > This might work for UFS by accident, but this is probably why ZFS > > doesn't work. > > > > However, this might be properly fixed by the thing that ino64 is > > doing where each directory entry returned by getdirentries gives > > you a seek offset that you _can_ directly seek to (as opposed to > > seeking to the start of the block and then walking forward N > > entries until you get an inter-block entry that is the same). > I just made the stunning discovery that our seekdir/readdir/telldir > code in libc works with > FreeBSD 8.0. > so maybe the problem is that the kernel changed it's behaviour, and > no-one thought to fix libc.. > > (at least it works on one of our 8.0 base appliances.. I'll do more > testing tomorrow.. it's past midnight.) > I suspect that pre-r252438 systems work better for UFS than r252438 or later. That patch changed ufs_readdir() so that it no longer returned the on-disk directory structure. (Among other things, it added code that skipped over d_ino == 0 entries.) As such, r252438 and later systems have UFS where the "logical" offset of a directory entry returned by getdirentries() isn't the same as the "physical" offset for it in the on-disk directory. Having said the above, I have two somewhat inconsistent thoughts: 1 - As jhb has explained, the libc functions aren't safe for telldir()/seekdir() when entries are added/deleted. It just happens that UFS might work ok (and is more likely to work ok when "logical offset" == "physical offset"). 2 - I'm not sure r252438 was a good idea (at least the part that skips invalid d_ino == 0 entries) because I don't think making "logical offset" != "physical offset" is a good idea, if there isn't a good reason to need to do so. I think it is hard to argue that r252438 broke the libc functions. It just happens that cases that aren't guaranteed to work happens to work without r252438. I also think that the use of d_off (or d_cookie, if you prefer that name), which would be the "physical offset" of the next directory entry is the best bet for fixing this, in general. (By in general, I mean for all file systems.) But this will require a new getdirentries(2) syscall and libc functions that know how to use it. rick > > > > > > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to > "freebsd-current-unsubscribe@freebsd.org" >