Date:      Wed, 01 Aug 2012 17:57:31 +0800
From:      David Xu <listlog2011@gmail.com>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, arch@FreeBSD.org, David Xu <davidxu@FreeBSD.org>
Subject:   Re: short read/write and error code
Message-ID:  <5018FD8B.8090309@gmail.com>
In-Reply-To: <20120801183240.K1291@besplex.bde.org>
References:  <5018992C.8000207@freebsd.org> <20120801071934.GJ2676@deviant.kiev.zoral.com.ua> <20120801183240.K1291@besplex.bde.org>

On 2012/8/1 17:23, Bruce Evans wrote:
> On Wed, 1 Aug 2012, Konstantin Belousov wrote:
>
>> On Wed, Aug 01, 2012 at 10:49:16AM +0800, David Xu wrote:
>>> POSIX requires write() to return the number of bytes actually written;
>>> the same rule applies to read().
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, write() [XSI]   and pwrite() shall
>>>> return the number of bytes actually written to the file associated
>>>> with fildes. This number shall never be greater than nbyte.
>>>> Otherwise, -1 shall be returned and errno set to indicate the error.
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
>>>> RETURN VALUE
>>>>
>>>> Upon successful completion, read() [XSI]   and pread() shall return
>>>> a non-negative integer indicating the number of bytes actually read.
>>>> Otherwise, the functions shall return -1 and set errno to indicate
>>>> the error.
>> Note that the wording is only about successful return, not about the case
>> when an error occurred. I do think that if fo_read() returned an error, and
>> the error is not of the kind 'interruption', then the error shall be
>> returned as is.
>
> That is clearly not what is intended.  write() is unusable if it won't
> tell you how many bytes it wrote.  According to your interpretation,
> recalcitrantix would conform to POSIX if all its writes wrote whatever
> they could and then returned -1 after detecting the error EPOSIXFUZZY.
>
> The usability is specified for signals.  From an old POSIX draft:
>
> % 51235              If write( ) is interrupted by a signal before it writes any data, it shall return -1 with errno set to
> % 51236              [EINTR].
> % 51237              If write( ) is interrupted by a signal after it successfully writes some data, it shall return the
> % 51238              number of bytes written.
>
> POSIX formally defines "Successfully Transferred", mainly for aio.  I
> couldn't find any formal definition of "successfully writes", but clearly
> it is nonsense for a write to be unsuccessful if a reader on the local
> system or on an external system has successfully read some of the data
> written by the write.
>
> FreeBSD does try to convert EINTR to 0 after some data has been written,
> to conform to the above.  SIGPIPE should cause EINTR to be returned to
> dofilewrite(), so there should be no problem for SIGPIPE.  But we were
> reminded of this old FreeBSD bug by problems with SIGPIPE.
>
> POSIX contradicts itself by disallowing successful completion if _any_
> error is detected:
>
> % 435              RETURN VALUE
> % 436                        This section indicates the possible return values, if any.
> % 437                        If the implementation can detect errors, ``successful completion'' means that no error
> % 438                        has been detected during execution of the function. If the implementation does detect
>
> Recalcitrantix has 2 versions according to which of these contradictions
> has precedence.  In one version, writes do as much as possible before
> returning -1/EPOSIXFUZZY, as above.  In the other version, this still
> happens for most writes.  But ones that are interrupted by a signal after
> having written some data return the number of bytes written, according to
> the "shall" for the interrupted case.  Perhaps there are some other weird
> cases where writes are required to work :-).
>
Thanks, I didn't even know there was such a chapter in the POSIX document.

>>> I have following patch to fix our code to be compatible with POSIX:
>> ...
>>
>>> -current only resets the error code to zero for a short write when the
>>> code is ERESTART, EINTR or EWOULDBLOCK.
>>> But this is incorrect, at least for pipes: when EPIPE is returned,
>>> some bytes may have already been written. For a named pipe, I may not
>>> care whether a reader has disappeared, because a new reader can come in
>>> and talk with the writer again, so I need to know how many bytes have
>>> been written. The same applies to the reader: I don't care that the
>>> writer is gone, since it can come in again and talk with the reader.
>>> So I suggest removing the surplus code in -current's dofilewrite() and
>>> dofileread().
>> Then fix the pipe code, and not introduce the behaviour change for all
>> file types ?
>
> Because returning the error to userland breaks all file types that
> want to return a short i/o (mainly special files whose i/o cannot be
> backed out of).  They are just detecting and returning an error as a
> courtesy to upper layers, and to simplify the implementation.  The
> syscall API doesn't permit returning both the error code (the reason
> for the short i/o) and the short count, so the error code must be
> cleared to allow the short count to be returned.
>
> Bruce
>
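To make the two policies concrete, here is a minimal sketch in C of the
decision taken once the lower layer has returned. It is a userland model,
not the actual kernel code: struct io_result, finish_current() and
finish_proposed() are hypothetical names used only for illustration.
finish_current() models what this thread says -current does (clear the
error after a partial transfer only for ERESTART, EINTR or EWOULDBLOCK);
finish_proposed() models the suggested change (clear any error once some
bytes have been transferred).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of what the syscall hands back to userland. */
struct io_result {
	ssize_t	retval;		/* byte count, or -1 on error */
	int	error;		/* errno value, 0 if none */
};

/* "Interruption" style errors, as listed in this thread. */
static int
is_interruption(int error)
{
	if (error == EINTR || error == EWOULDBLOCK)
		return (1);
#ifdef ERESTART		/* not visible to userland on every system */
	if (error == ERESTART)
		return (1);
#endif
	return (0);
}

/* nbyte: bytes requested; resid: bytes left untransferred. */
static struct io_result
finish(int error, size_t nbyte, size_t resid, int clear_any)
{
	struct io_result r;
	size_t done;

	assert(resid <= nbyte);
	done = nbyte - resid;
	/* A partial transfer hides the error so the count can be returned. */
	if (error != 0 && done > 0 && (clear_any || is_interruption(error)))
		error = 0;
	r.error = error;
	r.retval = (error != 0) ? -1 : (ssize_t)done;
	return (r);
}

/* Behaviour of -current as described in this thread. */
static struct io_result
finish_current(int error, size_t nbyte, size_t resid)
{
	return (finish(error, nbyte, resid, 0));
}

/* Behaviour under the suggested change. */
static struct io_result
finish_proposed(int error, size_t nbyte, size_t resid)
{
	return (finish(error, nbyte, resid, 1));
}
```

With EPIPE after 60 of 100 bytes, finish_current() yields -1 and the
count is lost, while finish_proposed() yields 60. When nothing was
transferred, both still report the error.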
dofileread() and dofilewrite() are rather annoying.
dofileread() discards data already in hand.
dofilewrite() does not tell you that it has written some data to the media,
which might already have caused some side effect. :-(
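
From the caller's side, the usual loop looks roughly like the sketch
below. write_all() is a hypothetical helper, not an existing API; it
assumes a blocking descriptor and simply retries on EINTR. It only works
if every write() reports how much it consumed: when a call returns -1
after a partial transfer, *done no longer matches what reached the media.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until len bytes are out.
 * Returns len on success.  On failure returns -1 with errno set and
 * *done holding the bytes that earlier calls reported as written.
 */
static ssize_t
write_all(int fd, const void *buf, size_t len, size_t *done)
{
	const char *p = buf;
	ssize_t n;

	assert(done != NULL);
	*done = 0;
	while (*done < len) {
		n = write(fd, p + *done, len - *done);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* nothing written, retry */
			return (-1);		/* *done may undercount here */
		}
		*done += (size_t)n;		/* short write: advance and loop */
	}
	return ((ssize_t)len);
}
```

A caller that resends from *done after a failure will duplicate any
bytes the failing call had already transferred, which is the side effect
complained about above.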




