From: Jilles Tjoelker
Date: Fri, 24 Apr 2015 23:52:49 +0200
To: John Baldwin
Cc: Julian Elischer, freebsd-current@freebsd.org
Subject: Re: readdir/telldir/seekdir problem (i think)
Message-ID: <20150424215249.GA96554@stack.nl>
In-Reply-To: <4718551.Y2ZnMk6NSM@ralph.baldwin.cx>

On Fri, Apr 24, 2015 at 04:28:12PM -0400, John Baldwin wrote:
> Yes, this isn't at all safe.
> There's no guarantee whatsoever that
> the offset on the directory fd that isn't something returned by
> getdirentries has any meaning. In particular, the size of the
> directory entry in a random filesystem might be a different size
> than the structure returned by getdirentries (since it converts
> things into a FS-independent format).

> This might work for UFS by accident, but this is probably why ZFS
> doesn't work.

> However, this might be properly fixed by the thing that ino64 is
> doing where each directory entry returned by getdirentries gives
> you a seek offset that you _can_ directly seek to (as opposed to
> seeking to the start of the block and then walking forward N
> entries until you get an inter-block entry that is the same).

The ino64 branch only reserves space for d_off and does not use it in
any way. This is appropriate since actually using d_off is a major
feature addition.

A proper d_off would still be useful even if UFS's readdir keeps masking
off the offset so a directory read always starts at the beginning of a
512-byte directory block, since this allows more distinct offset values
than safely using getdirentries()'s *basep. With d_off, one outer loop
must read at least one directory block to avoid spinning indefinitely,
while using getdirentries()'s *basep requires reading the whole
getdirentries() buffer.

Some Linux filesystems go further and provide a unique d_off for each
entry.

Another idea would be to store the last d_ino instead of dd_loc into the
struct ddloc. On seekdir(), this would seek to loc_seek as before and
skip entries until that d_ino is found, or to the start of the buffer if
not found (and possibly return some entries again that should not be
returned, but Samba copes with that).

-- 
Jilles Tjoelker
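
To make the d_ino idea above a bit more concrete, here is a minimal sketch
against an in-memory model, not the actual libc code. Everything named
fake_* is hypothetical: fake_dir stands in for a DIR plus its buffer,
fake_loc for a struct ddloc that records the last d_ino rather than dd_loc,
and a slot value of 0 is assumed to mean an unused (deleted) entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a directory stream; not the libc DIR. */
struct fake_dir {
    uint64_t *slots;     /* inode number per slot, 0 = unused slot (assumption) */
    size_t    count;     /* total number of slots */
    size_t    bufsz;     /* slots fetched per simulated buffer fill */
    size_t    buf_start; /* slot index where the current buffer starts */
    size_t    pos;       /* next slot to examine */
    uint64_t  last_ino;  /* last inode returned from this buffer, 0 if none */
};

/* Hypothetical cookie: buffer start plus last d_ino, instead of dd_loc. */
struct fake_loc {
    size_t   loc_seek;
    uint64_t last_ino;
};

/* Return the next inode number, or 0 at end of directory. */
static uint64_t fake_readdir(struct fake_dir *d)
{
    while (d->pos < d->count) {
        if (d->pos >= d->buf_start + d->bufsz) {
            /* Current buffer exhausted: move on to the next one. */
            d->buf_start += d->bufsz;
            d->last_ino = 0;
        }
        uint64_t ino = d->slots[d->pos++];
        if (ino != 0) {
            d->last_ino = ino;
            return ino;
        }
    }
    return 0;
}

/* Record where we are: which buffer, and which entry we last returned. */
static struct fake_loc fake_telldir(const struct fake_dir *d)
{
    struct fake_loc loc = { d->buf_start, d->last_ino };
    return loc;
}

/*
 * Seek to loc_seek as before, then skip entries until last_ino is found.
 * If it is not found, stay at the start of the buffer, which may return
 * some entries a second time.
 */
static void fake_seekdir(struct fake_dir *d, struct fake_loc loc)
{
    assert(loc.loc_seek <= d->count);

    size_t end = loc.loc_seek + d->bufsz;
    if (end > d->count)
        end = d->count;

    d->buf_start = loc.loc_seek;
    d->pos = loc.loc_seek;      /* fallback position */
    d->last_ino = 0;

    if (loc.last_ino == 0)
        return;                 /* nothing had been returned from this buffer */

    for (size_t i = loc.loc_seek; i < end; i++) {
        if (d->slots[i] == loc.last_ino) {
            d->pos = i + 1;     /* resume right after the remembered entry */
            d->last_ino = loc.last_ino;
            return;
        }
    }
}
```

With slots {10, 11, 12, 13, 14, 15} and bufsz 4, calling fake_telldir()
after reading 12 and later fake_seekdir() to that cookie makes the next
fake_readdir() return 13. If entry 12 is removed in between (slot set to 0),
the search fails and reading resumes at 10, i.e. entries are repeated rather
than skipped, which is the trade-off described above.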