From owner-freebsd-net@FreeBSD.ORG Fri Sep 2 20:26:22 2011 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A3141065672; Fri, 2 Sep 2011 20:26:22 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id 32DA28FC08; Fri, 2 Sep 2011 20:26:21 +0000 (UTC) Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au (c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p82KQJi0032442 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 3 Sep 2011 06:26:20 +1000 Date: Sat, 3 Sep 2011 06:26:19 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Andriy Gapon In-Reply-To: <4E6123F4.4010209@FreeBSD.org> Message-ID: <20110903053813.V2093@besplex.bde.org> References: <4E60A1B8.7080607@FreeBSD.org> <20110902104018.GA12845@stack.nl> <20110903015445.J957@besplex.bde.org> <4E6123F4.4010209@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@FreeBSD.org, freebsd-standards@FreeBSD.org, Jilles Tjoelker Subject: Re: POLLHUP on never connected socket X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Sep 2011 20:26:22 -0000 On Fri, 2 Sep 2011, Andriy Gapon wrote: > on 02/09/2011 20:36 Bruce Evans said the following: >> It seems likely that there was a transient connection to cause this >> problem. Do we know for sure? > > Not sure if this what you are asking but when I reproduced the problem it was by > connecting to a local port where no one listened for sure. Yes, that's what I'm asking. It should be simpler to fix if it is just a bug and doesn't involve reader/writer races. The code for setting POLLHUP for sockets is simple: from uipc_socket.c: % if ((events & POLLINIGNEOF) == 0) { % if (so->so_rcv.sb_state & SBS_CANTRCVMORE) { % revents |= events & (POLLIN | POLLRDNORM); % if (so->so_snd.sb_state & SBS_CANTSENDMORE) % revents |= POLLHUP; % } % } So POLLHUP tracks SBS_CANTRCVMORE && SBS_CANTSENDMORE exactly, and POLLHUP is never set without POLLIN. But in my version: % if ((events & POLLINIGNEOF) == 0) { % if (so->so_rcv.sb_state & SBS_CANTRCVMORE) { % if (so->so_snd.sb_state & SBS_CANTSENDMORE) % revents |= POLLHUP; % else % revents |= events & (POLLIN | POLLRDNORM); % } % } POLLIN is not set by hangup, bit actually means that there is (non-null) input available. jilles referred to this bug in -current. POLLINIGNEOF is an old attempt to fix this. It gives the caller more control. It works poorly, especially since most callers don't know that it exists. Perhaps SBS_CANTRCVMORE && SBS_CANTSENDMORE condition is too raw/low-level even for sockets. Fifos need more to get the behaviour that they want (as similar as possible to Linux). I quote my hackish version since the version in -current doesn't have any comments and my version does even more: >From fifo_open(), for readers: % mtx_lock(&fifo_mtx); % if (ap->a_mode & FREAD) { % fip->fi_readers++; % if (fip->fi_readers == 1) { % SOCKBUF_LOCK(&fip->fi_writesock->so_snd); % fip->fi_writesock->so_snd.sb_state &= ~SBS_CANTSENDMORE; % SOCKBUF_UNLOCK(&fip->fi_writesock->so_snd); % if (fip->fi_writers > 0) { % wakeup(&fip->fi_writers); % sowwakeup(fip->fi_writesock); % } % } % fdp = td->td_proc->p_fd; % FILEDESC_LOCK(fdp); % fp = fget_locked(fdp, ap->a_fdidx); % /* Abuse f_msgcount as a generation count. */ % fp->f_msgcount = fip->fi_wgen - fip->fi_writers; -current has the above line, and fixes the hack using a new field f_seqcount. % /* % * XXX quick fix: keep any POLLHUP condition sticky if we % * now have more than 1 reader, by clobbering f_msgcount % * here so that POLLINIGNEOF will not be set in % * fifo_poll_f(). (The above subtraction of fi_writers % * works similarly but is probably unnecessary since if % * there is a writer then there won't be a POLLHUP % * condition.) % * % * XXX: the generation counts are probably unnecessary now % * that this is broken to POSIX spec. They are to help % * make any POLLHUP condition sticky only for old readers. % * % * XXX: does this hack work right for a new reader that will % * block? % */ % fp->f_msgcount -= fip->fi_readers - 1; -current doesn't have the above line or the big comment. I forget if this is for more compatibility with Linux, or more correctness. I think the committed regression tests assert the behaviour give by this version, so they will certainly fail for -current. I'm surer of this for the spurious POLLIN with POLLHUP. % FILEDESC_UNLOCK(fdp); % } % int % fifo_poll_f(struct file *fp, int events, struct ucred *cred, struct thread *td) % { % struct file filetmp; % struct fifoinfo *fip; % int levents, revents = 0; % % fip = fp->f_vnode->v_fifoinfo; % levents = events & (POLLIN | POLLPRI | POLLRDNORM | POLLRDBAND); % if (levents) { % filetmp.f_data = fip->fi_readsock; % filetmp.f_cred = fp->f_cred; % if (fp->f_msgcount == fip->fi_wgen) % levents |= POLLINIGNEOF; -current has this (with s/msgcount/seqcount/). This is the only place that seqcount is used. I thought I removed all uses of POLLINIGNEOF in the kernel. % revents |= soo_poll(&filetmp, levents, cred, td); % } % levents = events & (POLLOUT | POLLWRNORM | POLLWRBAND); % if (levents) { % filetmp.f_data = fip->fi_writesock; % filetmp.f_cred = fp->f_cred; % revents |= soo_poll(&filetmp, levents, cred, td); Note that fifo polling calls soo_poll() twice, and it only uses POLLINIGNEOF for the second call. Thus the combination always sets POLLHUP if there is a disconnection. But the second call doesn't ask for POLLIN, so in -current there is no spurious POLLIN to echo POLLHUP in the case where POLLINIGNEOF is used, but there is if it is not. -current is subtly different. Where the above masks out POLLINIGNEOF in the user `events' in the first assignment to levents, -current keeps it. But both versions mask it for the second assignment. Thus my version always ignores the user POLLINIGNEOF for polling of fifos, but -current keeps it partially. -current thus sees POLLHUP and fails to completely ignore EOF even if the user passes POLLINIGNEOF, iff the user events also has one of the output flags set. The user would have to call poll() twice, first with only input flags set and then with output flags set get unmixed behaviour, but this won't work since either of the calls may block and the output check will never ignore EOF. % } % return (revents); % } Bruce