FreeBSD Mail Archives

Date:      Wed, 24 May 2017 20:49:04 -0700
From:      Conrad Meyer <cem@freebsd.org>
To:        Jilles Tjoelker <jilles@stack.nl>
Cc:        freebsd-current <freebsd-current@freebsd.org>, freebsd-fs@freebsd.org
Subject:   Re: 64-bit inodes (ino64) Status Update and Call for Testing
Message-ID:  <CAG6CVpU848QQziB=F4wFs7_ZmMXw5Ph7PHGS7VDP2_aQdY4SpA@mail.gmail.com>
In-Reply-To: <20170521121456.GA21613@stack.nl>
References:  <20170420194314.GI1788@kib.kiev.ua> <20170521121456.GA21613@stack.nl>

Hi Jilles,

Thanks for bringing this up.  And of course, thanks to kib@ for
including the d_namlen size bump and for his work in driving the rest
of this change through to completion.

On Sun, May 21, 2017 at 5:14 AM, Jilles Tjoelker <jilles@stack.nl> wrote:
> We have another type in this area which is too small in some situations:
> uint8_t for struct dirent.d_namlen. For filesystems that store filenames
> as up to 255 UTF-16 code units, the name to be stored in d_name may be
> up to 765 bytes long in UTF-8. This was reported in PR 204643. The code
> currently handles this by returning the short (8.3) name, but this name
> may not be present or usable, leaving the file inaccessible.

We've been working to add such support to our FreeBSD-derivative
product.  A big piece of it is expanding d_namlen out to 16 bits.
We've also been trying to decouple system-wide constants like MAXNAMLEN
/ NAME_MAX and MAXPATHLEN / PATH_MAX from filesystem-specific
limitations (such as UFS's limit of 255 bytes), and to push that work
upstream when possible, e.g., r313475 and r316509.

Bumping d_namlen in FreeBSD reduces the amount of ABI breakage we have
to introduce in our product relative to FreeBSD, and leaves open the
possibility of supporting 255-Unicode-character filesystems natively
in FreeBSD down the road.
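
To illustrate why the width of d_namlen matters, here is a minimal
sketch in C.  The struct below is hypothetical and is NOT the real
FreeBSD struct dirent layout; the names sketch_dirent,
sketch_fits_legacy_buffer, LEGACY_NAME_MAX and SKETCH_NAME_MAX are
assumptions made up for this example.  It only shows that a 16-bit
length field can describe names longer than 255 bytes, and how a
consumer limited to 255-byte names might detect entries it cannot
handle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for this sketch only. */
#define LEGACY_NAME_MAX 255   /* the traditional 255-byte name limit */
#define SKETCH_NAME_MAX 1020  /* 255 code points * 4 UTF-8 bytes each */

/* Hypothetical directory entry; NOT the real FreeBSD struct dirent. */
struct sketch_dirent {
    uint64_t d_fileno;                    /* 64-bit inode number */
    uint16_t d_reclen;                    /* length of this record */
    uint8_t  d_type;                      /* file type */
    uint16_t d_namlen;                    /* widened from uint8_t */
    char     d_name[SKETCH_NAME_MAX + 1]; /* NUL-terminated name */
};

/*
 * Returns true when the entry's name fits in a legacy 255-byte
 * buffer (plus NUL), false when a legacy consumer would have to
 * skip the entry.
 */
bool sketch_fits_legacy_buffer(const struct sketch_dirent *de)
{
    return de->d_namlen <= LEGACY_NAME_MAX;
}
```

With an 8-bit d_namlen, a length such as 765 could not be stored at
all; with 16 bits it can, and callers that only cope with 255-byte
names can check the field and skip the entry.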

> Actually allowing longer names seems too complicated to add to the ino64
> change, but changing d_namlen to uint16_t (using d_pad0 space) and
> skipping entries with d_namlen > 255 in libc may be helpful.
>
> Note that applications using the deprecated readdir_r() will not be able
> to read such long names, since the API does not allow specifying that a
> larger buffer has been provided. (This could be avoided by making struct
> dirent.d_name 766 bytes long instead of 256.)

We're looking at 255 Unicode code points, each of which can take up to
4 bytes in UTF-8, so potentially 1020 bytes.
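
The two worst-case figures in this thread (765 and 1020 bytes) come
from simple arithmetic, sketched below in C.  The function names are
made up for this example.  The sketch assumes standard UTF-8: a single
UTF-16 code unit expands to at most 3 bytes (a surrogate pair is 2
units for 4 bytes, so it is never worse per unit), while a single code
point expands to at most 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed to encode one Unicode code point. */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp < 0x10000)
        return 3;
    return 4; /* U+10000 .. U+10FFFF */
}

/* Worst case for a name limited to `units` UTF-16 code units. */
size_t utf8_worst_case_from_utf16_units(size_t units)
{
    return units * 3; /* largest BMP code point, U+FFFF, takes 3 bytes */
}

/* Worst case for a name limited to `cps` Unicode code points. */
size_t utf8_worst_case_from_code_points(size_t cps)
{
    return cps * 4; /* largest code point, U+10FFFF, takes 4 bytes */
}
```

For a limit of 255, the first bound gives 765 bytes and the second
gives 1020 bytes, in both cases before adding the terminating NUL.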

> Unfortunately, the existence of readdir_r() also prevents changing
> struct dirent.d_name to the more correct flexible array.

Best,
Conrad

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAG6CVpU848QQziB=F4wFs7_ZmMXw5Ph7PHGS7VDP2_aQdY4SpA>
