Date: Wed, 24 May 2017 20:49:04 -0700
From: Conrad Meyer <cem@freebsd.org>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: freebsd-current <freebsd-current@freebsd.org>, freebsd-fs@freebsd.org
Subject: Re: 64-bit inodes (ino64) Status Update and Call for Testing
Message-ID: <CAG6CVpU848QQziB=F4wFs7_ZmMXw5Ph7PHGS7VDP2_aQdY4SpA@mail.gmail.com>
In-Reply-To: <20170521121456.GA21613@stack.nl>
References: <20170420194314.GI1788@kib.kiev.ua> <20170521121456.GA21613@stack.nl>

Hi Jilles,

Thanks for bringing this up. And of course, thanks to kib@ for including
the d_namlen size bump and for his work in driving the rest of this change
through to completion.

On Sun, May 21, 2017 at 5:14 AM, Jilles Tjoelker <jilles@stack.nl> wrote:
> We have another type in this area which is too small in some situations:
> uint8_t for struct dirent.d_namlen. For filesystems that store filenames
> as upto 255 UTF-16 code units, the name to be stored in d_name may be
> upto 765 bytes long in UTF-8. This was reported in PR 204643. The code
> currently handles this by returning the short (8.3) name, but this name
> may not be present or usable, leaving the file inaccessible.

We've been working to add such support to our FreeBSD-derivative product.
A big piece of it is expanding d_namlen out to 16 bits. We've also been
trying to divorce system-wide constants like MAXNAMLEN / NAME_MAX and
MAXPATHLEN / PATH_MAX from filesystem-specific limitations (UFS' limit of
255 bytes), and to push that upstream when possible, e.g., r313475,
r316509.

Bumping d_namlen in FreeBSD reduces the amount of ABI breakage we have to
introduce in our product relative to FreeBSD, and leaves open the
possibility of supporting 255-Unicode-character filesystems natively in
FreeBSD down the road.

> Actually allowing longer names seems too complicated to add to the ino64
> change, but changing d_namlen to uint16_t (using d_pad0 space) and
> skipping entries with d_namlen > 255 in libc may be helpful.
>
> Note that applications using the deprecated readdir_r() will not be able
> to read such long names, since the API does not allow specifying that a
> larger buffer has been provided. (This could be avoided by making struct
> dirent.d_name 766 bytes long instead of 256.)

We're looking at 255 Unicode code points, which can be 4 bytes apiece in
UTF-8, or potentially 1020 bytes.
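
For anyone who wants to check the two worst-case figures in this thread
(765 bytes for 255 UTF-16 code units, 1020 bytes for 255 code points),
here is a rough Python sketch. The helper names and the sample characters
are made up for illustration; none of this is FreeBSD code, and it only
measures encoded lengths, not anything a filesystem actually does.

```python
# Illustrative sketch only: worst-case UTF-8 sizes for a 255-unit name.
# LIMIT, utf8_len and utf16_units are hypothetical names for this example.

LIMIT = 255            # per-name limit discussed in the thread
UINT8_MAX = 0xFF       # largest value a uint8_t d_namlen can hold
UINT16_MAX = 0xFFFF    # largest value a uint16_t d_namlen can hold


def utf8_len(name: str) -> int:
    """Bytes needed to hold `name` as UTF-8 (without a NUL terminator)."""
    return len(name.encode("utf-8"))


def utf16_units(name: str) -> int:
    """Number of 16-bit UTF-16 code units needed for `name`."""
    return len(name.encode("utf-16-le")) // 2


# Case 1: 255 UTF-16 code units. The UTF-8 worst case is a BMP character
# that takes 3 bytes, e.g. U+20AC (euro sign): 1 UTF-16 unit, 3 UTF-8 bytes.
bmp_name = "\u20ac" * LIMIT

# Case 2: 255 Unicode code points. The UTF-8 worst case is a character
# outside the BMP, e.g. U+1F600: 4 UTF-8 bytes (and 2 UTF-16 units).
astral_name = "\U0001F600" * LIMIT

for label, name in (("255 UTF-16 units", bmp_name),
                    ("255 code points", astral_name)):
    n = utf8_len(name)
    print(label, "->", n, "UTF-8 bytes;",
          "fits uint8_t:", n <= UINT8_MAX,
          "fits uint16_t:", n <= UINT16_MAX)
```

Neither length fits in an 8-bit d_namlen, and both fit comfortably in
16 bits. Note that the second name is 510 UTF-16 code units, so it would
not fit a filesystem whose limit is counted in UTF-16 units.
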
> Unfortunately, the existence of readdir_r() also prevents changing
> struct dirent.d_name to the more correct flexible array.

Best,
Conrad