Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 25 Jan 2014 06:00:08 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bryan Drewery <bryan@shatow.net>
Cc:        bapt@FreeBSD.org, freebsd-standards@FreeBSD.org, Jilles Tjoelker <jilles@stack.nl>
Subject:   Re: closedir(3) handling NULL
Message-ID:  <20140125041504.Y986@besplex.bde.org>
In-Reply-To: <20140124165509.GA73838@admin.xzibition.com>
References:  <20140124014105.GC37334@admin.xzibition.com> <20140124132435.GA90996@stack.nl> <20140124165509.GA73838@admin.xzibition.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 24 Jan 2014, Bryan Drewery wrote:

> On Fri, Jan 24, 2014 at 02:24:35PM +0100, Jilles Tjoelker wrote:
>> On Thu, Jan 23, 2014 at 07:41:05PM -0600, Bryan Drewery wrote:
>>> I found that Linux handles closedir(NULL) fine and returns EINVAL. POSIX
>>> [1] specifies that EBADF should be returned if "The dirp argument does
>>> not refer to an open directory stream"
>>
>>> [1] http://pubs.opengroup.org/onlinepubs/009696799/functions/closedir.html

Bug in POSIX.  It actually doesn't say this, but says in a confusing way
that "among the undefined behaviours that occurs if the dirp argument does
not refer to an open directory stream, the implementation is permitted but
not required to return -1 and set errno to EBADF".  Since the undefined
behaviour includes doing this, the part about returning EBADF says nothing
at all.  Similarly for returning NULL.

>>> I've updated fdclosedir(3) as well to behave the same.
>>
>>> I'll also update the manpage if there is no objection.
>>
>> If you do this, it is to improve compatibility with poorly written
>> software and not for POSIX compliance. POSIX only permits passing null
>> pointers where explicitly specified (e.g. time()); otherwise, passing a
>> null pointer is undefined behaviour like passing any argument outside
>> the required domain.

fclose() is a better example than time().  POSIX follows C for fclose()
and thus has fewer design and wording bugs for fclose() than for closedir().
Its actual wording for closedir() is (from the 2001 draft7; nothing has
changed):

% 7035 NAME
% 7036           closedir - close a directory stream
% 
% 7037 SYNOPSIS
% 7038           #include <dirent.h>
% 7039           int closedir(DIR *dirp);
% 
% 7040 DESCRIPTION
% 7041           The closedir( ) function shall close the directory stream referred to by the argument dirp. Upon

Already the behaviour is undefined, just like for fclose(), if the arg is
not a stream.

% 7042           return, the value of dirp may no longer point to an accessible object of the type DIR. If a file

Note that in POSIX, "may" is a (poorly chosen) technical term.  It
means that the implementation is permitted to, at its option, implement
the indicated behaviour, but portable applications can not depend on
this behaviour.

% 7043           descriptor is used to implement type DIR, that file descriptor shall be closed.
% 
% 7044 RETURN VALUE
% 7045           Upon successful completion, closedir( ) shall return 0; otherwise, -1 shall be returned and errno
% 7046           set to indicate the error.
% 
% 7047 ERRORS
% 7048           The closedir( ) function may fail if:
% 7049           [EBADF]             The dirp argument does not refer to an open directory stream.

This "may" is even more confusing.  Since a previous closedir() _may_ have
turned dirp into garbage, EBADF need not happen naturally due to close()
on an already-closed descriptor pointed to by a closed dirp.  The
implementation is permitted to fake this, but the behaviour is already
undefined in this case so no permission is needed.  I don't see how
EBADF can ever happen naturally if dirp is a directory stream -- directory
streams must be open, so they cannot have a closed file descriptor in them
and closing of an open file descriptor cannoth produce EBADF.

> I do think that improving portability is important. Even against sloppy
> coding. Applications developed for Linux are fine passing NULL to closedir(3),
> which leads to a style of coding that does not reveal itself to be a
> problem on FreeBSD until an edge case comes up.

This unimproves portability.  FreeBSD intentionally does the opposite for
fclose(): from fclose(3):

@ NOTES
@      The fclose() function does not handle NULL arguments; they will result in
@      a segmentation violation.  This is intentional - it makes it easier to
@      make sure programs written under FreeBSD are bug free.  This behaviour is
@      an implementation detail, and programs should not rely upon it.

It would be good to do the same thing for garbage stream pointers.

I once implemented the opposite of this for stdio (all entry points except
macros IIRC).  Garbage stream pointers can be detected by keeping a list
of valid ones and checking that the supposed stream arg is in the list.
Use a hash to search efficiently.  The implementation used a hash value
pointed to by the stream arg (fp->_fp_hash).  This was on hardware that
couldn't trap when following a garbage pointer.  If the hardware traps,
then this would require either a trap handler or caclulation of a hash
value from the value of fp.  After detecting a garbage pointer, I did the
wrong thing of making the stdio functions fail instead of SIGSEGVing.
Streams that are valid but wrong cannot be detected.  The most common
error is probably a closed stream.  fp->_fp_hash and even fp->_fp_fileno
could be kept valid for closed streams using type-stable memory.

> This is the situation to led to me find this. A mountpoint disappeared
> and some code written for Linux, that ported to FreeBSD without changes,
> segfaulted in closedir(3).

FreeBSD was perfect :-).

>> On another note, I don't understand when the condition
>> [EBADF] The dirp argument does not refer to an open directory stream.
>> can actually apply. As far as I understand, only valid open directory
>> streams (opendir() has been called and closedir() has not been called)
>> may be passed to closedir() and other functions.

Yes, I couldn't see how this could happen either.  Maybe some choices in
the "may"s makes it possible.

>> If the pointer is not a
>> null pointer, detecting whether it refers to a valid open directory
>> stream is not possible using common methods (you could maintain a list
>> of directory streams but doing that is against the spirit of C and I am
>> quite sure that POSIX did not intend to require implementations to do
>> that).

When I did this, it was for a 16-bit system.  stdio was about 1/10 as
large as in FreeBSD and libc was about 1/100 as large.  So space
considerations don't prevent doing this.  You can also do a simple
check using magic numbers that detects most cases.

> I was wondering that as well and whether EBADF really made sense, sicne
> it was not validating the pointer was from opendir(3).
>
>> In the current code in FreeBSD, [EBADF] may also happen if an
>> application closes a directory stream's file descriptor from under it
>> and no other file descriptor is opened in its place.

Ah, that is a reason.  revoke() by the sysadmin is an even more legitimate
reason (maybe the fd is still valid them, but it is worse than useless).

>> In some cases like the file_name argument to realpath(), NULL is
>> specified to cause an [EINVAL] return; I think we are stuck with that
>> but should not add more such cases. Note that this [EINVAL] return also
>> means that realpath(NULL, p) is valid to do and should not trigger
>> undefined behaviour detectors..

It's OK to have an extension to make null pointers give defined behaviour.
Requiring an error for null pointers pevents such extensions.  I find it
hard to remember that fclose(NULL) isn't already an extension (giving
fcloseall()).

> I'm not clear where you stand on this. Is EINVAL more proper or EBADF,
> or are you against the change all together?
>
> I find this sort of seat belt good for portability and of little harm.

% diff --git lib/libc/gen/closedir.c lib/libc/gen/closedir.c
% index 88ded37..d7a5bdb 100644
% --- lib/libc/gen/closedir.c
% +++ lib/libc/gen/closedir.c
% @@ -53,6 +53,11 @@ fdclosedir(DIR *dirp)
%  {
%  	int fd;
% 
% +	if (dirp == NULL) {
% +		errno = EBADF;
% +		return (-1);
% +	}
% +

Style bug (extra blank line).

%  	if (__isthreaded)
%  		_pthread_mutex_lock(&dirp->dd_lock);

Example of normal style (no extra blank line after an if statement).

Extra blank lines are especially not needed after return statements since
return statements obviously don't fall through.

%  	fd = dirp->dd_fd;
% @@ -71,6 +76,10 @@ fdclosedir(DIR *dirp)
%  int
%  closedir(DIR *dirp)
%  {
% +	int fd;
% +
% +	if ((fd = fdclosedir(dirp)) == -1)
% +		return (-1);
%

Style bug (extra blank line).

There is no man page for fdclosedir(3).  It is in directory(3), but someone
forgot to add fdclosedir.3 to MLINKS.  fdopendir.3 is unsorted in MLINKS
instead of missing.

% -	return (_close(fdclosedir(dirp)));
% +	return (_close(fd));
%  }

This is necessary iff you allow fdclosedir() to return on null pointers.

Among the dirent functions, only closedir() and fdclosedir() "work" on
null pointers, and this is not documented.

More POSIX wording:

% 186              may
% 187                     Describes a feature or behavior that is optional for an implementation that conforms to
% 188                     IEEE Std 1003.1-200x. An application should not rely on the existence of the feature or
% 189                     behavior. An application that relies on such a feature or behavior cannot be assured to be
% 190                     portable across conforming implementations.
% 191                     To avoid ambiguity, the opposite of may is expressed as need not, instead of may not.

In non-technical English, "may" means either permission or happenstance.
It is ambigous so it should never be used.  It is often used.  It is most
often used for happenstance when "might" should be used.  POSIX uses it
for a special type of permission.  I don't know if POSIX requires the
implementation to document its choice of option.  FreeBSD's closedir(3)
documents that the dirp is a pointer that becomes free on both successful
completion (it uses unnecessarily complex sentences which make it less
than clear that freeing occurs in the failure case).

% 10718              The fclose( ) function shall cause the stream pointed to by stream to be flushed and the associated
% 10719              file to be closed. Any unwritten buffered data for the stream shall be written to the file; any
% 
% 10732 ERRORS
% 10733              The fclose( ) function shall fail if:
% ...
% 10736 CX           [EBADF]                The file descriptor underlying stream is not valid.

I don't know how the fd can be invalid for a (necessarily valid) stream.
Maybe because the fd for a stdio stream is not private, and POSIX actually
allows closing it directly.  At least this says "shall fail" instead of
"may fail".  I think the "may fail" for closedir() is just buggy wording.
The "may" is for the implementation not being required to use fd's at all.
But when it uses them, errors from them should be "shall fail" like they
are for fclose().

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140125041504.Y986>