Date:      Tue, 21 Nov 2000 19:36:39 -0500
From:      Garance A Drosihn <drosih@rpi.edu>
To:        Robert Nordier <rnordier@nordier.com>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: fclose vs ferror (from libc/getcap)
Message-ID:  <p04330104b640abfaf13d@[128.113.24.47]>
In-Reply-To: <200011201020.MAA23349@siri.nordier.com>
References:  <200011201020.MAA23349@siri.nordier.com>

At 12:19 PM +0200 11/20/00, Robert Nordier wrote:
>Garance A Drosihn wrote:
>  > [...]  The basic problem is some code which does:
>  >
>  >      (void)fclose(pfp);
>  >      if (ferror(pfp)) {
>  >              ...do stuff...
>  >      }
>  >
>  > I find it surprising that the above works under FreeBSD.
>  > [...]
>
>Bear in mind that ferror is one of a number of stdio functions that
>are often implemented as macros.  For instance: ferror and (say)
>getc might look like this:
>
>     #define ferror(f) ((f)->fl & _FERR)
>     #define getc(f) ((f)->p < (f)->xr ? *(f)->p++ : fgetc(f))
>
>Given the intentionally minimalist way those functions are written,
>to do any consistent and obligatory sanity-checking on (FILE *) would
>cause a big change in the actual code generated, and the amount of
>code generated.

Hmm.  I can see that it would add some overhead.  This may seem
paradoxical, but I'm not quite as concerned about getc as I am
about ferror.  If you're calling getc on a closed stream, then
you're almost always going to get into some obvious trouble, and
right in that section of code.  The thing with ferror is that it
will generally "work" after the fclose, although the value it
returns might not be the right (pre-close) value.
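
A minimal sketch of the ordering that avoids the problem: read
the error flag while the stream is still open, then close it and
check the result of fclose as well.  The helper name
close_checked is hypothetical (it is not part of stdio or of the
getcap code), and the -1/0 return convention is just an
assumption for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * close_checked() is a hypothetical helper, not part of stdio.
 * Returns 0 if the stream saw no error and closed cleanly,
 * -1 otherwise.
 */
int close_checked(FILE *fp)
{
    int failed;

    assert(fp != NULL);
    failed = ferror(fp) != 0;   /* ask while the stream is still open */
    if (fclose(fp) == EOF)      /* the final flush can fail too */
        failed = 1;
    return failed ? -1 : 0;     /* fp must not be used after this */
}
```

With something like this, the quoted fragment would become
"if (close_checked(pfp) != 0) { ...do stuff... }".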

And before I go on with the rest of my response, let me note that
I realize that my own (personal) headaches with this are due to
linux's implementation of ferror.  It's just that this was in code
from freebsd that I was trying to run on linux, and I was surprised
that this fclose/ferror combination was not a problem caught on
freebsd.

>I think the best way to do what you want is to create a separate
>debugging library.

This is great if you already know where the problem is.  In my
case, I started out with a problem that was nowhere near the
code which had the above bug in it.  Initially the process was
dying in a call to inet_aton().  I commented that out (as a
debugging measure, since I knew what the result of that call
would be), and then the process would die in some other library
routine that I forget right now.  I commented out the call to
THAT routine, and finally the process started to die on a call
to fopen.  Not an fopen anywhere near the above fclose, but at
least this indicated the problem might have something to do with
io-streams.  To make debugging matters worse, I am debugging
this in a daemon process, and the process would sometimes work
perfectly fine, while other times it would simply disappear.

For those and other reasons, I had spent a few hours debugging
the problem before I had any idea the problem was specific to
routines for streamed IO.

[note that the debugging library does not do any good unless one
knows to #undef those macros, so that strategy does not work
until after someone has already figured out where the problem is]

>The point about
>
>   	 (void)fclose(pfp);
>   	 if (ferror(pfp)) {
>   		 ...do stuff...
>   	 }
>
>is that it's a silly thing to do deliberately, but if I was
>porting some hairy old C code I'd tend to expect it to work.
>C is not a language in which you go out of your way to prevent
>people making mistakes.

I would not expect it to work.  This has nothing to do with
the C language, it has to do with fclose.  Fclose gets rid
of the descriptor.  In my own code, I usually follow 'fclose(fp)'
with 'fp = NULL', because that stream is GONE.  I do realize that
this code does seem to work on several operating systems, but it
also causes dramatic problems with linux.  Given the description
of fclose, I'd say it is the code which is wrong.
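
The 'fclose(fp)' followed by 'fp = NULL' habit can be wrapped up
so the two steps cannot drift apart.  This is only a sketch; the
name fclose_and_clear is hypothetical.  Passing NULL to ferror
afterwards is still not valid, but on most systems it tends to
fault right at that call instead of quietly reading a dead
structure.

```c
#include <assert.h>
#include <stdio.h>

/*
 * fclose_and_clear() is a hypothetical helper, not part of stdio.
 * It closes *fpp (if it is open) and sets the caller's pointer to
 * NULL.  Returns the fclose result, or 0 if *fpp was already NULL.
 */
int fclose_and_clear(FILE **fpp)
{
    int rc = 0;

    assert(fpp != NULL);
    if (*fpp != NULL) {
        rc = fclose(*fpp);
        *fpp = NULL;            /* the stream is GONE */
    }
    return rc;
}
```

Usage is 'fclose_and_clear(&fp);' in place of 'fclose(fp);'.
Calling it a second time on the same variable is harmless.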

The "single Unix spec" says:
      After the call to fclose(), any use of stream causes
      undefined behavior.

FreeBSD's own man page for fclose says:
      [fclose returns 0 or EOF].  In either case, no further
      access to the stream is possible.

Neither of those indicates that anyone should "expect it to
work", no matter what language they are programming in.  There
is nothing in the description for ferror which implies that it
is some magical exception to the above rules.

I might grudgingly admit that it would be a shame to
increase the overhead of macro-ized 'ferror' calls, and I'm
not sure of any good way to avoid that.  But I see no reason
that this code should be "expected" to work.  And if it does
not work, then I'd rather it failed right at the call to
ferror, and not some indeterminate place later on.  (obviously
this is the big problem with the implementation of ferror on
linux, in that it doesn't fail or die at the ferror call, it
dies much much later due to some corrupted data structure).

Perhaps it would be helpful to add some sanity checking in the
subroutine-ized version of ferror, in lib/libc/stdio/ferror.c ?
(although I realize that wouldn't have really helped me at all)
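
To make the idea concrete, here is a rough sketch of that kind
of sanity check done outside libc, as a set of wrappers.  Every
dbg_* name and the fixed table size are assumptions made up for
this example; none of this is FreeBSD's or linux's actual stdio
code.  The wrappers remember the addresses of closed streams and
assert if one is handed to ferror (or closed twice).  It only
catches calls that go through the wrappers, and it technically
inspects a pointer value after the stream is gone, so treat it
as a debugging aid and nothing more.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define DBG_MAX_CLOSED 64       /* arbitrary size for this sketch */

static uintptr_t dbg_closed[DBG_MAX_CLOSED];
static size_t dbg_nclosed;

/* Return 1 if fp was closed via dbg_fclose and not reopened since. */
int dbg_is_closed(const FILE *fp)
{
    for (size_t i = 0; i < dbg_nclosed; i++)
        if (dbg_closed[i] == (uintptr_t)fp)
            return 1;
    return 0;
}

/*
 * fopen wrapper: the library may reuse the address of an earlier,
 * closed stream, so drop that address from the table.
 */
FILE *dbg_fopen(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);

    for (size_t i = 0; fp != NULL && i < dbg_nclosed; i++) {
        if (dbg_closed[i] == (uintptr_t)fp) {
            dbg_closed[i] = dbg_closed[--dbg_nclosed];
            break;
        }
    }
    return fp;
}

/* fclose wrapper: record the address before the stream goes away. */
int dbg_fclose(FILE *fp)
{
    assert(!dbg_is_closed(fp) && "fclose on a closed stream");
    if (dbg_nclosed < DBG_MAX_CLOSED)
        dbg_closed[dbg_nclosed++] = (uintptr_t)fp;
    return fclose(fp);
}

/* ferror wrapper: fail right here, not somewhere much later. */
int dbg_ferror(FILE *fp)
{
    assert(!dbg_is_closed(fp) && "ferror on a closed stream");
    return ferror(fp);
}
```

With these in place, the fclose-then-ferror sequence trips the
assertion in dbg_ferror at the point of the mistake.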

Really I should be bugging someone in linux-land about this,
as it is their implementation which causes such painfully
obscure problems.  It's just that I don't see myself as a
linux developer...   :)

Still, I guess I should go bug them now.
-- 

---
Garance Alistair Drosehn            =   gad@eclipse.acs.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu

