Date: Fri, 21 Nov 2014 17:57:54 +0200
From: Konstantin Belousov
To: Rick Macklem
Cc: FreeBSD Filesystems
Subject: Re: RFC: patch to make d_fileno 64bits
Message-ID: <20141121155754.GN17068@kib.kiev.ua>
In-Reply-To: <539201047.4538834.1416539954794.JavaMail.root@uoguelph.ca>

On Thu, Nov 20, 2014 at 10:19:14PM -0500, Rick Macklem wrote:
> The attached patch covers the basics of a way to
> convert the d_fileno field of "struct dirent" to
> 64bits. This patch is incomplete and won't even
> build, but I thought I'd post it in case anyone
> wanted to take a look and comment on the approach
> it uses.
>
> - renames the old/current one "struct dirent32"
> - changes d_fileno to 64bits and adds a 64bit
>   d_off field for the offset of the underlying
>   file system
> - defines a new VOP_READDIR() that will return
>   the new "struct dirent" that is used as the
>   default one for a new getdirentries(2).
> - the old/current getdirentries(2) uses the old
>   VOP_READDIR32() by default.
>
> For the case of a file system that supports both
> the new and old VOP_READDIR(), they are used by
> the corresponding new and old getdirentries(2)
> syscalls.
>
> For a file system that only supports one of
> the VOP_READDIR()s, the "struct dirent32"
> is copied to "struct dirent" (or vice versa).
>
> At this point, all file systems would support
> the old VOP_READDIR() and I think the new
> VOP_READDIR() can easily be added for NFS,
> ZFS. (OpenBSD already has UFS code for
> essentially a new struct dirent and hopefully
> that code could be ported easily, too.)
>
> Anyhow, any comments on this approach? rick

I do not think we need to have in-kernel compatibility shims. The work,
big but relatively trivial, is to convert the filesystems to use the new
ino_t, even if the on-disk structures still use 32-bit inode numbers.

The really problematic part of this change is the usermode ABI breakage.
The struct dirent is only the start of the whole issue. ino_t is embedded
into more structures which are part of the contract, e.g. struct stat.
We have to provide new syscalls which accept or return the affected
structures.

And then there are libraries which embed ino_t into their ABI. An
immediate example is fts(3) in libc; look at FTSENT.fts_ino. Even after
the base system is fixed by properly providing compat shims and symbol
versions for the affected libraries, we get the same problem with
binaries that are not from base.

In summary, fixing ino_t in the kernel is not too hard compared with the
ABI issues which must be solved in usermode.
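
To make the ABI point concrete, here is a minimal C sketch. The struct
names (demo_dirent32, demo_dirent64), the DEMO_NAMELEN constant, and both
helper functions are hypothetical and exist only for illustration; they
are not the layouts or code from the patch or from the FreeBSD headers.
The sketch shows that widening the file number moves d_name and changes
the record size, that a 32-to-64 copy always succeeds, and that the
reverse copy has to reject file numbers that do not fit in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NAMELEN 32	/* hypothetical; not the real name length limit */

/* Hypothetical "old" record with a 32-bit file number. */
struct demo_dirent32 {
	uint32_t d_fileno;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
	char     d_name[DEMO_NAMELEN];
};

/* Hypothetical "new" record: 64-bit file number plus a 64-bit offset. */
struct demo_dirent64 {
	uint64_t d_fileno;
	uint64_t d_off;
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
	char     d_name[DEMO_NAMELEN];
};

/* Returns nonzero if the two layouts disagree on size or on where d_name is. */
int
demo_layout_differs(void)
{
	return (sizeof(struct demo_dirent32) != sizeof(struct demo_dirent64) ||
	    offsetof(struct demo_dirent32, d_name) !=
	    offsetof(struct demo_dirent64, d_name));
}

/* Old -> new: every 32-bit file number fits, so this cannot fail. */
void
demo_widen(const struct demo_dirent32 *src, struct demo_dirent64 *dst)
{
	assert(src != NULL && dst != NULL);
	memset(dst, 0, sizeof(*dst));	/* d_off stays 0: the old record has none */
	dst->d_fileno = src->d_fileno;
	dst->d_reclen = (uint16_t)sizeof(*dst);
	dst->d_type = src->d_type;
	dst->d_namlen = src->d_namlen;
	memcpy(dst->d_name, src->d_name, sizeof(dst->d_name));
}

/*
 * New -> old: returns 0 on success, -1 if the file number needs more than
 * 32 bits. Returning -1 is an assumption of this sketch, not a statement
 * about what a real shim does. d_off is dropped.
 */
int
demo_narrow(const struct demo_dirent64 *src, struct demo_dirent32 *dst)
{
	assert(src != NULL && dst != NULL);
	if (src->d_fileno > UINT32_MAX)
		return (-1);
	memset(dst, 0, sizeof(*dst));
	dst->d_fileno = (uint32_t)src->d_fileno;
	dst->d_reclen = (uint16_t)sizeof(*dst);
	dst->d_type = src->d_type;
	dst->d_namlen = src->d_namlen;
	memcpy(dst->d_name, src->d_name, sizeof(dst->d_name));
	return (0);
}
```

A binary compiled against the old layout keeps reading d_name at the old
offset, which is why the old syscalls and the versioned library symbols
have to keep handing out the old record. What a real shim should do with a
file number that does not fit is a policy choice this sketch does not
settle.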