From owner-svn-src-all@FreeBSD.ORG Mon Jan 13 21:29:36 2014
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 5C3C8670;
	Mon, 13 Jan 2014 21:29:36 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id 428A719C1;
	Mon, 13 Jan 2014 21:29:36 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
	by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DLTaPE095209;
	Mon, 13 Jan 2014 21:29:36 GMT (envelope-from jhb@svn.freebsd.org)
Received: (from jhb@localhost)
	by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DLTZ5e095202;
	Mon, 13 Jan 2014 21:29:35 GMT (envelope-from jhb@svn.freebsd.org)
Message-Id: <201401132129.s0DLTZ5e095202@svn.freebsd.org>
From: John Baldwin
Date: Mon, 13 Jan 2014 21:29:35 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
	svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260606 - in stable/8: share/man/man4 sys/kern
	tools/regression/sockets/unix_seqpacket
	tools/regression/sockets/unix_seqpacket_exercise usr.bin/netstat
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)"
X-List-Received-Date: Mon, 13 Jan 2014 21:29:36 -0000

Author: jhb
Date: Mon Jan 13 21:29:34 2014
New Revision: 260606
URL: http://svnweb.freebsd.org/changeset/base/260606

Log:
  MFC 197775,197777-197779,197781,197794,243152,243313,255478:
  First cut at implementing SOCK_SEQPACKET support for UNIX (local) domain
  sockets.  This allows for reliable bi-directional datagram communication
  over UNIX domain sockets, in contrast to SOCK_DGRAM (M:N, unreliable) or
  SOCK_STREAM (bi-directional bytestream).  Largely, this reuses existing
  UNIX domain socket code.  This allows applications that require
  record-oriented semantics to get them reliably via local IPC.

Added:
  stable/8/tools/regression/sockets/unix_seqpacket/
     - copied from r197781, head/tools/regression/sockets/unix_seqpacket/
     - copied from r197781, head/tools/regression/sockets/unix_seqpacket_exercise/
Directory Properties:
  stable/8/tools/regression/sockets/unix_seqpacket_exercise/   (props changed)
Modified:
  stable/8/share/man/man4/unix.4
  stable/8/sys/kern/uipc_usrreq.c
  stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c
  stable/8/usr.bin/netstat/main.c   (contents, props changed)
  stable/8/usr.bin/netstat/netstat.h   (contents, props changed)
  stable/8/usr.bin/netstat/unix.c   (contents, props changed)
Directory Properties:
  stable/8/share/man/man4/   (props changed)
  stable/8/sys/   (props changed)
  stable/8/sys/kern/   (props changed)
  stable/8/tools/regression/sockets/   (props changed)
  stable/8/usr.bin/netstat/   (props changed)

Modified: stable/8/share/man/man4/unix.4
==============================================================================
--- stable/8/share/man/man4/unix.4	Mon Jan 13 20:55:15 2014	(r260605)
+++ stable/8/share/man/man4/unix.4	Mon Jan 13 21:29:34 2014	(r260606)
@@ -32,7 +32,7 @@
 .\" @(#)unix.4 8.1 (Berkeley) 6/9/93
 .\" $FreeBSD$
 .\"
-.Dd July 15, 2001
+.Dd October 5, 2009
 .Dt UNIX 4
 .Os
 .Sh NAME
@@ -52,7 +52,8 @@ mechanisms.
 The
 .Ux Ns -domain
 family supports the
-.Dv SOCK_STREAM
+.Dv SOCK_STREAM ,
+.Dv SOCK_SEQPACKET ,
 and
 .Dv SOCK_DGRAM
 socket types and uses
@@ -127,11 +128,14 @@ The
 .Ux Ns -domain
 protocol family is comprised of simple
 transport protocols that support the
-.Dv SOCK_STREAM
+.Dv SOCK_STREAM ,
+.Dv SOCK_SEQPACKET ,
 and
 .Dv SOCK_DGRAM
 abstractions.
 .Dv SOCK_STREAM
+and
+.Dv SOCK_SEQPACKET
 sockets also support the communication of
 .Ux
 file descriptors through the use of the
@@ -206,8 +210,9 @@ and tested with
 .Xr getsockopt 2 :
 .Bl -tag -width ".Dv LOCAL_CONNWAIT"
 .It Dv LOCAL_CREDS
-This option may be enabled on a
-.Dv SOCK_DGRAM
+This option may be enabled on
+.Dv SOCK_DGRAM ,
+.Dv SOCK_SEQPACKET ,
 or a
 .Dv SOCK_STREAM
 socket.

Modified: stable/8/sys/kern/uipc_usrreq.c
==============================================================================
--- stable/8/sys/kern/uipc_usrreq.c	Mon Jan 13 20:55:15 2014	(r260605)
+++ stable/8/sys/kern/uipc_usrreq.c	Mon Jan 13 21:29:34 2014	(r260606)
@@ -50,7 +50,8 @@
  * garbage collector to find and tear down cycles of disconnected sockets.
  *
  * TODO:
- *	SEQPACKET, RDM
+ *	RDM
+ *	distinguish datagram size limits from flow control limits in SEQPACKET
  *	rethink name space problems
  *	need a proper out-of-band
  */
@@ -113,6 +114,7 @@ static ino_t unp_ino; /* Prototype for
 static int unp_rights; /* (g) File descriptors in flight. */
 static struct unp_head unp_shead; /* (l) List of stream sockets. */
 static struct unp_head unp_dhead; /* (l) List of datagram sockets. */
+static struct unp_head unp_sphead; /* (l) List of seqpacket sockets. */
 
 struct unp_defer {
 	SLIST_ENTRY(unp_defer) ud_link;
@@ -154,10 +156,14 @@ static u_long unpst_sendspace = PIPSIZ;
 static u_long unpst_recvspace = PIPSIZ;
 static u_long unpdg_sendspace = 2*1024; /* really max datagram size */
 static u_long unpdg_recvspace = 4*1024;
+static u_long unpsp_sendspace = PIPSIZ; /* really max datagram size */
+static u_long unpsp_recvspace = PIPSIZ;
 
 SYSCTL_NODE(_net, PF_LOCAL, local, CTLFLAG_RW, 0, "Local domain");
 SYSCTL_NODE(_net_local, SOCK_STREAM, stream, CTLFLAG_RW, 0, "SOCK_STREAM");
 SYSCTL_NODE(_net_local, SOCK_DGRAM, dgram, CTLFLAG_RW, 0, "SOCK_DGRAM");
+SYSCTL_NODE(_net_local, SOCK_SEQPACKET, seqpacket, CTLFLAG_RW, 0,
+    "SOCK_SEQPACKET");
 
 SYSCTL_ULONG(_net_local_stream, OID_AUTO, sendspace, CTLFLAG_RW,
 	   &unpst_sendspace, 0, "Default stream send space.");
@@ -167,6 +173,10 @@ SYSCTL_ULONG(_net_local_dgram, OID_AUTO,
 	   &unpdg_sendspace, 0, "Default datagram send space.");
 SYSCTL_ULONG(_net_local_dgram, OID_AUTO, recvspace, CTLFLAG_RW,
 	   &unpdg_recvspace, 0, "Default datagram receive space.");
+SYSCTL_ULONG(_net_local_seqpacket, OID_AUTO, maxseqpacket, CTLFLAG_RW,
+	   &unpsp_sendspace, 0, "Default seqpacket send space.");
+SYSCTL_ULONG(_net_local_seqpacket, OID_AUTO, recvspace, CTLFLAG_RW,
+	   &unpsp_recvspace, 0, "Default seqpacket receive space.");
 SYSCTL_INT(_net_local, OID_AUTO, inflight, CTLFLAG_RD, &unp_rights, 0,
     "File descriptors in flight.");
 SYSCTL_INT(_net_local, OID_AUTO, deferred, CTLFLAG_RD,
@@ -282,6 +292,7 @@ static void unp_process_defers(void * __
  */
 static struct domain localdomain;
 static struct pr_usrreqs uipc_usrreqs_dgram, uipc_usrreqs_stream;
+static struct pr_usrreqs uipc_usrreqs_seqpacket;
 static struct protosw localsw[] = {
 {
 	.pr_type = SOCK_STREAM,
@@ -296,6 +307,20 @@ static struct protosw localsw[] = {
 	.pr_flags = PR_ATOMIC|PR_ADDR|PR_RIGHTS,
 	.pr_usrreqs = &uipc_usrreqs_dgram
 },
+{
+	.pr_type = SOCK_SEQPACKET,
+	.pr_domain = &localdomain,
+
+	/*
+	 * XXXRW: For now, PR_ADDR because soreceive will bump into them
+	 * due to our use of sbappendaddr. A new sbappend variants is needed
+	 * that supports both atomic record writes and control data.
+	 */
+	.pr_flags = PR_ADDR|PR_ATOMIC|PR_CONNREQUIRED|PR_WANTRCVD|
+	    PR_RIGHTS,
+	.pr_ctloutput = &uipc_ctloutput,
+	.pr_usrreqs = &uipc_usrreqs_seqpacket,
+},
 };
 
 static struct domain localdomain = {
@@ -378,6 +403,11 @@ uipc_attach(struct socket *so, int proto
 		recvspace = unpdg_recvspace;
 		break;
 
+	case SOCK_SEQPACKET:
+		sendspace = unpsp_sendspace;
+		recvspace = unpsp_recvspace;
+		break;
+
 	default:
 		panic("uipc_attach");
 	}
@@ -397,8 +427,22 @@ uipc_attach(struct socket *so, int proto
 	UNP_LIST_LOCK();
 	unp->unp_gencnt = ++unp_gencnt;
 	unp_count++;
-	LIST_INSERT_HEAD(so->so_type == SOCK_DGRAM ? &unp_dhead : &unp_shead,
-	    unp, unp_link);
+	switch (so->so_type) {
+	case SOCK_STREAM:
+		LIST_INSERT_HEAD(&unp_shead, unp, unp_link);
+		break;
+
+	case SOCK_DGRAM:
+		LIST_INSERT_HEAD(&unp_dhead, unp, unp_link);
+		break;
+
+	case SOCK_SEQPACKET:
+		LIST_INSERT_HEAD(&unp_sphead, unp, unp_link);
+		break;
+
+	default:
+		panic("uipc_attach");
+	}
 	UNP_LIST_UNLOCK();
 
 	return (0);
@@ -732,11 +776,8 @@ uipc_rcvd(struct socket *so, int flags)
 	unp = sotounpcb(so);
 	KASSERT(unp != NULL, ("uipc_rcvd: unp == NULL"));
 
-	if (so->so_type == SOCK_DGRAM)
-		panic("uipc_rcvd DGRAM?");
-
-	if (so->so_type != SOCK_STREAM)
-		panic("uipc_rcvd unknown socktype");
+	if (so->so_type != SOCK_STREAM && so->so_type != SOCK_SEQPACKET)
+		panic("uipc_rcvd socktype %d", so->so_type);
 
 	/*
 	 * Adjust backpressure on sender and wakeup any waiting to write.
@@ -851,6 +892,7 @@ uipc_send(struct socket *so, int flags,
 		break;
 	}
 
+	case SOCK_SEQPACKET:
 	case SOCK_STREAM:
 		if ((so->so_state & SS_ISCONNECTED) == 0) {
 			if (nam != NULL) {
@@ -893,7 +935,8 @@ uipc_send(struct socket *so, int flags,
 		SOCKBUF_LOCK(&so2->so_rcv);
 		if (unp2->unp_flags & UNP_WANTCRED) {
 			/*
-			 * Credentials are passed only once on SOCK_STREAM.
+			 * Credentials are passed only once on SOCK_STREAM
+			 * and SOCK_SEQPACKET.
 			 */
 			unp2->unp_flags &= ~UNP_WANTCRED;
 			control = unp_addsockcred(td, control);
@@ -902,11 +945,33 @@ uipc_send(struct socket *so, int flags,
 		 * Send to paired receive port, and then reduce send buffer
 		 * hiwater marks to maintain backpressure. Wake up readers.
 		 */
-		if (control != NULL) {
-			if (sbappendcontrol_locked(&so2->so_rcv, m, control))
+		switch (so->so_type) {
+		case SOCK_STREAM:
+			if (control != NULL) {
+				if (sbappendcontrol_locked(&so2->so_rcv, m,
+				    control))
+					control = NULL;
+			} else
+				sbappend_locked(&so2->so_rcv, m);
+			break;
+
+		case SOCK_SEQPACKET: {
+			const struct sockaddr *from;
+
+			from = &sun_noname;
+			if (sbappendaddr_locked(&so2->so_rcv, from, m,
+			    control))
 				control = NULL;
-		} else
-			sbappend_locked(&so2->so_rcv, m);
+			break;
+			}
+		}
+
+		/*
+		 * XXXRW: While fine for SOCK_STREAM, this conflates maximum
+		 * datagram size and back-pressure for SOCK_SEQPACKET, which
+		 * can lead to undesired return of EMSGSIZE on send instead
+		 * of more desirable blocking.
+		 */
 		mbcnt_delta = so2->so_rcv.sb_mbcnt - unp2->unp_mbcnt;
 		unp2->unp_mbcnt = so2->so_rcv.sb_mbcnt;
 		sbcc = so2->so_rcv.sb_cc;
@@ -969,7 +1034,8 @@ uipc_sense(struct socket *so, struct sta
 	UNP_LINK_RLOCK();
 	UNP_PCB_LOCK(unp);
 	unp2 = unp->unp_conn;
-	if (so->so_type == SOCK_STREAM && unp2 != NULL) {
+	if ((so->so_type == SOCK_STREAM || so->so_type == SOCK_SEQPACKET) &&
+	    unp2 != NULL) {
 		so2 = unp2->unp_socket;
 		sb->st_blksize += so2->so_rcv.sb_cc;
 	}
@@ -1039,6 +1105,26 @@ static struct pr_usrreqs uipc_usrreqs_dg
 	.pru_close = uipc_close,
 };
 
+static struct pr_usrreqs uipc_usrreqs_seqpacket = {
+	.pru_abort = uipc_abort,
+	.pru_accept = uipc_accept,
+	.pru_attach = uipc_attach,
+	.pru_bind = uipc_bind,
+	.pru_connect = uipc_connect,
+	.pru_connect2 = uipc_connect2,
+	.pru_detach = uipc_detach,
+	.pru_disconnect = uipc_disconnect,
+	.pru_listen = uipc_listen,
+	.pru_peeraddr = uipc_peeraddr,
+	.pru_rcvd = uipc_rcvd,
+	.pru_send = uipc_send,
+	.pru_sense = uipc_sense,
+	.pru_shutdown = uipc_shutdown,
+	.pru_sockaddr = uipc_sockaddr,
+	.pru_soreceive = soreceive_generic, /* XXX: or...? */
+	.pru_close = uipc_close,
+};
+
 static struct pr_usrreqs uipc_usrreqs_stream = {
 	.pru_abort = uipc_abort,
 	.pru_accept = uipc_accept,
@@ -1340,6 +1426,7 @@ unp_connect2(struct socket *so, struct s
 		break;
 
 	case SOCK_STREAM:
+	case SOCK_SEQPACKET:
 		unp2->unp_conn = unp;
 		if (req == PRU_CONNECT &&
 		    ((unp->unp_flags | unp2->unp_flags) & UNP_CONNWAIT))
@@ -1377,6 +1464,7 @@ unp_disconnect(struct unpcb *unp, struct
 		break;
 
 	case SOCK_STREAM:
+	case SOCK_SEQPACKET:
 		soisdisconnected(unp->unp_socket);
 		unp2->unp_conn = NULL;
 		soisdisconnected(unp2->unp_socket);
@@ -1402,7 +1490,22 @@ unp_pcblist(SYSCTL_HANDLER_ARGS)
 	struct unp_head *head;
 	struct xunpcb *xu;
 
-	head = ((intptr_t)arg1 == SOCK_DGRAM ? &unp_dhead : &unp_shead);
+	switch ((intptr_t)arg1) {
+	case SOCK_STREAM:
+		head = &unp_shead;
+		break;
+
+	case SOCK_DGRAM:
+		head = &unp_dhead;
+		break;
+
+	case SOCK_SEQPACKET:
+		head = &unp_sphead;
+		break;
+
+	default:
+		panic("unp_pcblist: arg1 %d", (int)(intptr_t)arg1);
+	}
 
 	/*
 	 * The process of preparing the PCB list is too time-consuming and
@@ -1515,6 +1618,9 @@ SYSCTL_PROC(_net_local_dgram, OID_AUTO,
 SYSCTL_PROC(_net_local_stream, OID_AUTO, pcblist, CTLFLAG_RD,
 	    (caddr_t)(long)SOCK_STREAM, 0, unp_pcblist, "S,xunpcb",
 	    "List of active local stream sockets");
+SYSCTL_PROC(_net_local_seqpacket, OID_AUTO, pcblist, CTLFLAG_RD,
+	    (caddr_t)(long)SOCK_SEQPACKET, 0, unp_pcblist, "S,xunpcb",
+	    "List of active local seqpacket sockets");
 
 static void
 unp_shutdown(struct unpcb *unp)
@@ -1526,7 +1632,8 @@ unp_shutdown(struct unpcb *unp)
 	UNP_PCB_LOCK_ASSERT(unp);
 
 	unp2 = unp->unp_conn;
-	if (unp->unp_socket->so_type == SOCK_STREAM && unp2 != NULL) {
+	if ((unp->unp_socket->so_type == SOCK_STREAM ||
+	    (unp->unp_socket->so_type == SOCK_SEQPACKET)) && unp2 != NULL) {
 		so = unp2->unp_socket;
 		if (so != NULL)
 			socantrcvmore(so);
@@ -1692,6 +1799,7 @@ unp_init(void)
 	    NULL, EVENTHANDLER_PRI_ANY);
 	LIST_INIT(&unp_dhead);
 	LIST_INIT(&unp_shead);
+	LIST_INIT(&unp_sphead);
 	SLIST_INIT(&unp_defers);
 	TASK_INIT(&unp_gc_task, 0, unp_gc, NULL);
 	TASK_INIT(&unp_defer_task, 0, unp_process_defers, NULL);
@@ -2065,7 +2173,8 @@ SYSCTL_INT(_net_local, OID_AUTO, taskcou
 static void
 unp_gc(__unused void *arg, int pending)
 {
-	struct unp_head *heads[] = { &unp_dhead, &unp_shead, NULL };
+	struct unp_head *heads[] = { &unp_dhead, &unp_shead, &unp_sphead,
+	    NULL };
 	struct unp_head **head;
 	struct file *f, **unref;
 	struct unpcb *unp;

Modified: stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c
==============================================================================
--- head/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c	Mon Oct 5 15:27:01 2009	(r197781)
+++ stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c	Mon Jan 13 21:29:34 2014	(r260606)
@@ -50,21 +50,21 @@ __FBSDID("$FreeBSD$");
 #define SEQPACKET_SNDBUF (131072-16)
 
 #define FAILERR(str) err(-1, "%s: %s", __func__, str)
-#define FAILNERR(str, n) err(-1, "%s %d: %s", __func__, n, str)
-#define FAILNMERR(str, n, m) err(-1, "%s %d %d: %s", __func__, n, m, str)
+#define FAILNERR(str, n) err(-1, "%s %zd: %s", __func__, n, str)
+#define FAILNMERR(str, n, m) err(-1, "%s %zd %d: %s", __func__, n, m, str)
 #define FAILERRX(str) errx(-1, "%s: %s", __func__, str)
-#define FAILNERRX(str, n) errx(-1, "%s %d: %s", __func__, n, str)
-#define FAILNMERRX(str, n, m) errx(-1, "%s %d %d: %s", __func__, n, m, str)
+#define FAILNERRX(str, n) errx(-1, "%s %zd: %s", __func__, n, str)
+#define FAILNMERRX(str, n, m) errx(-1, "%s %zd %d: %s", __func__, n, m, str)
 
 static int ann = 0;
 
 #define ANN() (ann ? warnx("%s: start", __func__) : 0)
-#define ANNN(n) (ann ? warnx("%s %d: start", __func__, (n)) : 0)
-#define ANNNM(n, m) (ann ? warnx("%s %d %d: start", __func__, (n), (m)) : 0)
+#define ANNN(n) (ann ? warnx("%s %zd: start", __func__, (n)) : 0)
+#define ANNNM(n, m) (ann ? warnx("%s %zd %d: start", __func__, (n), (m)):0)
 
 #define OK() warnx("%s: ok", __func__)
-#define OKN(n) warnx("%s %d: ok", __func__, (n))
-#define OKNM(n, m) warnx("%s %d %d: ok", __func__, (n), (m))
+#define OKN(n) warnx("%s %zd: ok", __func__, (n))
+#define OKNM(n, m) warnx("%s %zd %d: ok", __func__, (n), (m))
 
 #ifdef SO_NOSIGPIPE
 #define NEW_SOCKET(s) do { \
@@ -168,7 +168,7 @@ server(int s_listen)
 				break;
 			}
 			if (ssize_send != ssize_recv)
-				warnx("server: recv %d sent %d",
+				warnx("server: recv %zd sent %zd",
 				    ssize_recv, ssize_send);
 		} while (1);
 		close(s_accept);

Modified: stable/8/usr.bin/netstat/main.c
==============================================================================
--- stable/8/usr.bin/netstat/main.c	Mon Jan 13 20:55:15 2014	(r260605)
+++ stable/8/usr.bin/netstat/main.c	Mon Jan 13 21:29:34 2014	(r260606)
@@ -186,6 +186,8 @@ static struct nlist nl[] = {
 	{ .n_name = "_mfctablesize" },
 #define N_ARPSTAT 55
 	{ .n_name = "_arpstat" },
+#define N_UNP_SPHEAD 56
+	{ .n_name = "unp_sphead" },
 	{ .n_name = NULL },
 };
 
@@ -627,7 +629,8 @@ main(int argc, char *argv[])
 #endif /* NETGRAPH */
 	if ((af == AF_UNIX || af == AF_UNSPEC) && !sflag)
 		unixpr(nl[N_UNP_COUNT].n_value, nl[N_UNP_GENCNT].n_value,
-		    nl[N_UNP_DHEAD].n_value, nl[N_UNP_SHEAD].n_value);
+		    nl[N_UNP_DHEAD].n_value, nl[N_UNP_SHEAD].n_value,
+		    nl[N_UNP_SPHEAD].n_value);
 	exit(0);
 }
 

Modified: stable/8/usr.bin/netstat/netstat.h
==============================================================================
--- stable/8/usr.bin/netstat/netstat.h	Mon Jan 13 20:55:15 2014	(r260605)
+++ stable/8/usr.bin/netstat/netstat.h	Mon Jan 13 21:29:34 2014	(r260606)
@@ -158,7 +158,7 @@ void ddp_stats(u_long, const char *, int
 void netgraphprotopr(u_long, const char *, int, int);
 #endif
 
-void unixpr(u_long, u_long, u_long, u_long);
+void unixpr(u_long, u_long, u_long, u_long, u_long);
 
 void esis_stats(u_long, const char *, int, int);
 void clnp_stats(u_long, const char *, int, int);

Modified: stable/8/usr.bin/netstat/unix.c
==============================================================================
--- stable/8/usr.bin/netstat/unix.c	Mon Jan 13 20:55:15 2014	(r260605)
+++ stable/8/usr.bin/netstat/unix.c	Mon Jan 13 21:29:34 2014	(r260606)
@@ -193,21 +193,37 @@ fail:
 }
 
 void
-unixpr(u_long count_off, u_long gencnt_off, u_long dhead_off, u_long shead_off)
+unixpr(u_long count_off, u_long gencnt_off, u_long dhead_off, u_long shead_off,
+    u_long sphead_off)
 {
 	char *buf;
 	int ret, type;
 	struct xsocket *so;
 	struct xunpgen *xug, *oxug;
 	struct xunpcb *xunp;
+	u_long head_off;
 
 	for (type = SOCK_STREAM; type <= SOCK_SEQPACKET; type++) {
 		if (live)
 			ret = pcblist_sysctl(type, &buf);
-		else
-			ret = pcblist_kvm(count_off, gencnt_off,
-			    type == SOCK_STREAM ? shead_off :
-			    (type == SOCK_DGRAM ? dhead_off : 0), &buf);
+		else {
+			head_off = 0;
+			switch (type) {
+			case SOCK_STREAM:
+				head_off = shead_off;
+				break;
+
+			case SOCK_DGRAM:
+				head_off = dhead_off;
+				break;
+
+			case SOCK_SEQPACKET:
+				head_off = sphead_off;
+				break;
+			}
+			ret = pcblist_kvm(count_off, gencnt_off, head_off,
+			    &buf);
+		}
 		if (ret == -1)
 			continue;
 		if (ret < 0)