Date:      Thu, 26 Jan 2017 17:41:17 -0800
From:      Mark Johnston <markj@FreeBSD.org>
To:        Gleb Smirnoff <glebius@FreeBSD.org>
Cc:        jch@FreeBSD.org, hiren@FreeBSD.org, Jason Eggleston <jeggleston@llnw.com>, rrs@FreeBSD.org, jtl@FreeBSD.org, net@FreeBSD.org
Subject:   Re: listening sockets as non sockets
Message-ID:  <20170127014117.GA90480@raichu>
In-Reply-To: <20170127005251.GM2611@FreeBSD.org>
References:  <20170127005251.GM2611@FreeBSD.org>

On Thu, Jan 26, 2017 at 04:52:51PM -0800, Gleb Smirnoff wrote:
>   Hi guys,
> 
>   as some of you already heard, I'm trying to separate listening sockets
> into a new file descriptor type. If we look into the current struct socket,
> we see that some functional fields belong to normal data flow sockets,
> and others belong to listening sockets. They are never used simultaneously.
> Now, if we look at the socket API, we see that once a socket has been
> transformed into a listening socket, only 3 regular syscalls may be called:
> listen(2), accept(2) and close(2), and only a subset of ioctl() and
> setsockopt() parameters is accepted. A listening socket cannot be closed from
> the protocol side, only from the user side. So a listening socket is so
> different from a dataflow socket that separating them looks like the
> architecturally right thing to do.
> 
> The benefits are:
> 
> 1) Nicer code (I hope).
> 2) Smaller 'struct socket'.
> 3) Having two different locks for socket and solisten, we can try to get rid
>    of ACCEPT_LOCK global lock.
> 
> The patch is in a very pre-alpha state. It has been run only in my bhyve VM.
> 
> It passes regression tests from tools/regression/sockets and tests/sys,
> including the race tests, and including accept filter ones.

I haven't yet looked much at the diff, so sorry in advance if this
question is inappropriate.

One problem I've fought a couple of times (with InfiniBand SDP and unix
sockets) is a race between accept(2) and a concurrent close of the
listening socket. Right now, this problem has to be handled in the
domain-specific code (see r303855 for instance), and it's generally
awkward to do so. Does your work address this intrinsic race in any way?

FWIW, I have a basic test case for unix sockets here, though I believe
it's been incorporated into stress2:
https://people.freebsd.org/~markj/unix_socket_detach.c
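
For anyone who just wants to see the shape of the race from userland,
here is a minimal sketch. It is NOT the contents of the file above; the
names race_stats, racer and race_accept_close() are made up for
illustration. It queues one connection on a non-blocking unix listening
socket and then lets accept(2) and close(2) run concurrently from two
threads. It only counts which call won; it does not claim to hit any
particular kernel code path.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical outcome counters, for illustration only. */
struct race_stats {
	int accepted;	/* accept(2) returned a new descriptor */
	int refused;	/* accept(2) failed, e.g. because close(2) ran first */
};

/* Shared between the two racing threads. */
struct racer {
	int lfd;	/* listening descriptor, read by both threads */
	int result;	/* return value of accept(2), written by acceptor */
};

static void *
acceptor(void *arg)
{
	struct racer *r = arg;

	r->result = accept(r->lfd, NULL, NULL);
	return (NULL);
}

static void *
closer(void *arg)
{
	struct racer *r = arg;

	close(r->lfd);
	return (NULL);
}

/* Race accept(2) against close(2) "iterations" times; 0 on success. */
int
race_accept_close(const char *path, int iterations, struct race_stats *st)
{
	struct sockaddr_un sun;
	struct racer r;
	pthread_t ta, tc;
	int cfd, i;

	assert(iterations >= 0);
	if (strlen(path) >= sizeof(sun.sun_path))
		return (-1);
	memset(&sun, 0, sizeof(sun));
	sun.sun_family = AF_UNIX;
	strcpy(sun.sun_path, path);
	st->accepted = st->refused = 0;

	for (i = 0; i < iterations; i++) {
		(void)unlink(path);
		if ((r.lfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
			return (-1);
		/* Non-blocking, so accept(2) can never sleep in this test. */
		if (fcntl(r.lfd, F_SETFL, O_NONBLOCK) < 0 ||
		    bind(r.lfd, (struct sockaddr *)&sun, sizeof(sun)) < 0 ||
		    listen(r.lfd, 1) < 0) {
			close(r.lfd);
			return (-1);
		}
		/* Queue one connection before the race starts. */
		if ((cfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0 ||
		    connect(cfd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
			if (cfd >= 0)
				close(cfd);
			close(r.lfd);
			return (-1);
		}
		r.result = -1;
		if (pthread_create(&ta, NULL, acceptor, &r) != 0) {
			close(cfd);
			close(r.lfd);
			return (-1);
		}
		if (pthread_create(&tc, NULL, closer, &r) != 0) {
			pthread_join(ta, NULL);
			if (r.result >= 0)
				close(r.result);
			close(cfd);
			close(r.lfd);
			return (-1);
		}
		pthread_join(ta, NULL);
		pthread_join(tc, NULL);
		if (r.result >= 0) {
			st->accepted++;
			close(r.result);
		} else
			st->refused++;
		close(cfd);
	}
	(void)unlink(path);
	return (0);
}
```

A caller would pass a scratch path such as /tmp/accept_race.<pid> and a
large iteration count, then look at how the two counters split. Build
with -pthread.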

> 
> For TCP it passes basic functionality testing, but likely there are still races
> remaining after ACCEPT_LOCK removal.
> 
> For SCTP the patch is not finished yet. The tricky thing with SCTP is that it
> can un-listen a listening socket back to a normal socket by doing listen(fd, 0)
> on it. My patch has an API for that. I started working on SCTP, but temporarily
> put this problem aside. It looks solvable, but I don't know yet how to test
> it. Better to see results with TCP first.
> 
> I've put the current snapshot on Phab, so that you can view it there. The
> snapshot patch is also attached to this email.
> 
> https://reviews.freebsd.org/D9356
> 
> At this moment I'd like to start doing some testing (and doing polishing
> in parallel), and here I seek your help. Those who run FreeBSD at
> very high connection rates and observe contention on the global accept
> mutex: is anybody willing to collaborate with me on this?


