Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 25 Aug 2009 04:07:11 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Jilles Tjoelker <jilles@stack.nl>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Konstantin Belousov <kib@FreeBSD.org>
Subject:   Re: svn commit: r196460 - head/sys/kern
Message-ID:  <20090825031355.B729@besplex.bde.org>
In-Reply-To: <20090823213759.GA55039@stack.nl>
References:  <200908231244.n7NCiFgc061095@svn.freebsd.org> <20090823213759.GA55039@stack.nl>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 23 Aug 2009, Jilles Tjoelker wrote:

> I think poll on fifos should instead be fixed by closing the
> half-connection corresponding to writing from fi_readsock to
> fi_writesock. I have tried this out, see attached patch. With the patch,
> pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
> an error in fifo case 6b caused by our distinction between new and old
> readers.

This sort of worked for me, but has several problems:

% Index: sys/fs/fifofs/fifo_vnops.c
% ===================================================================
% --- sys/fs/fifofs/fifo_vnops.c	(revision 196459)
% +++ sys/fs/fifofs/fifo_vnops.c	(working copy)
% @@ -193,6 +193,10 @@
%  			goto fail2;
%  		fip->fi_writesock = wso;
%  		error = soconnect2(wso, rso);
% +		if (error == 0)
% +			error = soshutdown(rso, SHUT_WR);
% +		if (error == 0)
% +			error = soshutdown(wso, SHUT_RD);
%  		if (error) {
%  			(void)soclose(wso);
%  fail2:

The second soshutdown() is only harmful.  I see the following state changes
- the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
   and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
- the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
   the first soshutdown() it leaves both flags clear in both directions).
   This clobbers the desired state.  The failure shows in just one of my
   uncommitted regression tests (when there is a writer and there was
   a reader, poll() returns POLLOUT for the writer, but should return
   POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
   POLLHUP for writers).

After removing the second soshutdown() and fixing a spurious POLLIN (see
below), all my tests pass.

Elsewhere, fifo_vnops.c hacks on SBS_CANT*MORE directly.  Perhaps it should
call soshutdown(), or if the direct access there is safe then it is probably
safe above.

% Index: sys/kern/uipc_socket.c
% ===================================================================
% --- sys/kern/uipc_socket.c	(revision 196469)
% +++ sys/kern/uipc_socket.c	(working copy)
% @@ -2898,11 +2898,13 @@
%  		if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
%  			revents |= events & (POLLPRI | POLLRDBAND);
% 
% -	if ((events & POLLINIGNEOF) == 0)
% -		if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
% -			revents |= POLLHUP;
% -	if (so->so_snd.sb_state & SBS_CANTSENDMORE)
% -		revents |= POLLHUP;
% +	if ((events & POLLINIGNEOF) == 0) {

Old problems become larger:

I don't like POLLINIGNEOF (for input) affecting POLLHUP for output.  This
seems to cause no problems for fifos, at least when the kernel sets
POLLINIGNEOF, but it is hard to understand even why it doesn't cause
problems, and kib@ wants POLLINIGNEOF to remain user-settable, so the
complications might remain exported to userland for for fifos and
sockets, where they are harder to document and understand.

Note that fifo_poll_f() always calls here twice, first for the read
socket, then for the write socket.  For the first call, the kernel
sets POLLINIGNEOF according to its idea of the stickyness of EOF on
input.  For the second call, the kernel never sets POLLINIGNEOF.  Thus
this code is always executed (unless the user sets POLLINIGNEOF) and
since returning POLLHUP is independent of the `events' arg, it is hard
to understand how setting POLLINIGNEOF for the first call has any
effect...

% +		if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {

...well, it has an effect since the first call is on a different socket.
The soshutdown()s make the states involved clearer:
- SBS_CANTRCVMORE is always set for wso, so we always get here for the
   second call, and the resulting POLLHUP is purely for the write direction.
- SBS_CANTSENDMORE is aways set for rso, so if we get here for the first
   call then we will set POLLHUP, and this POLLHUP is purely for the read
   direction.
Then fifo_poll_f() will combine the POLLHUPs, so POLLHUP becomes impure,
but POLLHUP remains useful.

% +			revents |= events & (POLLIN | POLLRDNORM);

This gives spurious POLLINs when POLLHUP is also returned, and thus defeats
the point of soreadable_data() being different from soreadable().  Tests
6a-6d show the spurious POLLIN.  I don't understand how tests 6a and 6c-6d
passed for you.

% +			if (so->so_snd.sb_state & SBS_CANTSENDMORE)
% +				revents |= POLLHUP;

Tests 6a-6d pass with the above 3 lines changed to:

 			if (so->so_snd.sb_state & SBS_CANTSENDMORE)
 				revents |= POLLHUP;
 			else
 				revents |= events & (POLLIN | POLLRDNORM);

Returning POLLIN will cause poll() to not block on input descriptors, but
this seems to be as correct as possible since there really is a read-EOF.
Applications just can't see this EOF as POLLHUP -- they will see POLLIN
and have to try to read(), and then interpret read() returning 0 as meaning
EOF.  Device-independent applications must do precisely this anyway since
the input descriptor might be a regular file and there is no POLLHUP for
regular files.

% +		}
% +	}
% 
%  	if (revents == 0) {
%  		if (events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND)) {

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090825031355.B729>