From: Robert Watson
Date: Sun, 18 Jun 2006 02:01:01 +0100 (BST)
To: arch@FreeBSD.org
Subject: Proposal: add pru_close protosw method, refactor abort/detach
Message-ID: <20060618014337.V67789@fledge.watson.org>

Over the past 6-12 months, I've spent quite a bit of time working with our socket and protocol locking and reference models. These efforts have been aimed at several things:

- Reducing the scope of the pcbinfo locks.

- Evaluating and exploring the balance between overhead and lock contention.

- Sanitizing the socket/protocol life cycle and reference model.

- Identifying socket locking "loose ends".

In my most recent work, I've created several branches in Perforce that use significant variations on socket locking to explore these issues:

- Default CVS model, which uses different locks horizontally and vertically in the stack for the notion of a single "connection": for example, socket buffer send lock, socket buffer receive lock, and inpcb lock for the tcpcb.

- "resock", a branch in which the send and receive socket buffer locks are combined into a single simple socket mutex, intended to evaluate the impact of potentially increased contention vs. reduced complexity and overhead.

- "resock_vertical", an extension to resock in which I vertically coalesce the protocol and socket locks, reducing potential parallelism between layers in return for simplifying the locking strategy and significantly reducing the number of lock operations. For example, to add data to a socket buffer, TCP no longer needs to acquire socket buffer locks, since the PCB lock is already held, and the socket->protocol layer boundary can be transitioned without dropping locks for lock order reasons.

I'm not yet at the point where I can seriously start to evaluate the performance and trade-offs, as I'm still working through the paperwork of getting resock_vertical up and running. One of the most significant barriers I ran into was the shutdown phase of the socket, in which the socket code and protocol code try to decide whether or not the socket has to be torn down. This is logic which I have already worked to formalize, document, and clean up as part of the earlier sockref work. The primary issue with loaning locks is the teardown, as you can't free the lock until the teardown has basically completed.
What particularly complicates matters is that in the current world order, pru_detach may be called more than once, as the protocol may decide at the last moment that it wishes to preserve the socket buffer because it may need to continue transmitting after close (such as on TCP socket close after send on a slow network).

Attached is a patch that attempts to further rationalize tear-down. Specifically, it refactors pru_detach (disconnect and conditionally free) and pru_abort (disconnect abruptly and free) into three protocol switch functions:

pru_close: socket has been closed and a sensible shutdown without data loss is desired.

pru_abort: socket is being aborted, generally due to insufficient queue space in a listen socket, or close of a listen socket while connections are waiting to be accepted: close abruptly and potentially with data loss.

pru_detach: teardown is now unconditional -- both the protocol and socket are done.

With these changes, pru_attach is an unconditional constructor for the socket, and pru_detach is an unconditional destructor for the socket, meaning that the protocol code can now use it to know that the socket is done using the mutex loaned from the protocol. This also has the side effect of moving us closer to using a pure reference count on sockets, rather than flags and counted references, and supports an eventual notion of separating the socket close from completion of operations on the socket, which is an issue for threaded applications when one thread closes a socket while another thread is blocked in I/O on the socket.

The changes require protocol implementors to distinguish close and detach. While this is generally clarifying for some protocols (such as TCP, where the logic becomes much clearer), for others it has never been clear how exactly close worked, and it does not become clearer. This patch is not yet ready for commit, and I have flagged some of these cases with XXXRW.
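
As a rough illustration of the close/abort/detach split described above, here is a minimal userspace sketch. It is not kernel code: every name in it (demo_socket, demo_usrreqs, demo_soclose, and so on) is hypothetical, the "protocol" is a toy that only counts bytes, and the extra reference a protocol might hold while draining data is modelled as a plain integer count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct demo_socket;

struct demo_usrreqs {
	int	(*attach)(struct demo_socket *);	/* unconditional constructor */
	void	(*close)(struct demo_socket *);		/* orderly shutdown, keep data */
	void	(*abort)(struct demo_socket *);		/* abrupt shutdown, may lose data */
	void	(*detach)(struct demo_socket *);	/* unconditional destructor */
};

struct demo_pcb {
	int	queued;		/* bytes waiting to be sent */
	int	sent;		/* bytes delivered by an orderly close */
	int	dropped;	/* bytes discarded by an abort */
};

struct demo_socket {
	const struct demo_usrreqs *ops;
	struct demo_pcb	*pcb;
	int		 refs;	/* plain reference count */
};

int demo_detach_calls;		/* lets a caller observe the destructor */

int demo_attach(struct demo_socket *so)
{
	so->pcb = calloc(1, sizeof(*so->pcb));
	return (so->pcb == NULL ? -1 : 0);
}

void demo_close(struct demo_socket *so)
{
	/* Orderly: pretend the queued data drains; protocol state is kept. */
	so->pcb->sent += so->pcb->queued;
	so->pcb->queued = 0;
}

void demo_abort(struct demo_socket *so)
{
	/* Abrupt: queued data is thrown away; protocol state is still kept. */
	so->pcb->dropped += so->pcb->queued;
	so->pcb->queued = 0;
}

void demo_detach(struct demo_socket *so)
{
	/* Runs exactly once, when the last reference goes away. */
	free(so->pcb);
	so->pcb = NULL;
	demo_detach_calls++;
}

const struct demo_usrreqs demo_ops = {
	demo_attach, demo_close, demo_abort, demo_detach
};

struct demo_socket *demo_socreate(const struct demo_usrreqs *ops)
{
	struct demo_socket *so = calloc(1, sizeof(*so));

	if (so == NULL)
		return (NULL);
	so->ops = ops;
	so->refs = 1;
	if (ops->attach(so) != 0) {
		free(so);
		return (NULL);
	}
	return (so);
}

/* Drop one reference; the destructor runs only when none remain. */
void demo_sorele(struct demo_socket *so)
{
	assert(so->refs > 0);
	if (--so->refs > 0)
		return;
	so->ops->detach(so);
	free(so);
}

void demo_soclose(struct demo_socket *so)
{
	so->ops->close(so);	/* state transition only */
	demo_sorele(so);	/* freeing is a separate step */
}

void demo_soabort(struct demo_socket *so)
{
	so->ops->abort(so);
	demo_sorele(so);
}
```

In this model, neither close nor abort frees anything; they only change protocol state. If something still holds a reference (for example, a protocol that has more data to send), the control block survives the close, and detach runs once when the final reference is released.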

In one case, I've replaced a call to sofree() with sodealloc(), following an attach failure: my belief is that if attach of a protocol to a socket "fails", then we should not detach it. This was true previously when the socket was created by sonewconn() (accepted socket), but not for socreate() (result of socket() or socketpair()).

What I'm looking for:

- yay/nay on this approach, and the general change in protosw behavior. It does touch all protocols, but I think it makes things generally more sensible. This doesn't just support the resock_vertical exploration, but also moves towards a more typical constructor/destructor model and avoids combining protocol state transitions with socket state freeing.

- Review of the details of the patch. In particular, help deciding if my splitting up of events between pru_abort, pru_close, and pru_detach for each protocol is right.

FYI, I have done some basic performance measurement, and am unable to measure any performance difference before/after these changes in my test environment, which is a high transaction/second web client/server setup involving one socket open/close per transaction, so about 15,000 socket open/close events per second.
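
The attach-failure change mentioned above (sodealloc() rather than sofree() when pru_attach fails) can be pictured with a small userspace sketch. This is an assumption-laden toy, not the kernel path: the toy_* names are hypothetical, and the destructor's assert() merely stands in for the kind of KASSERT(pcb != NULL) checks that the patch's detach routines perform.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; the names only echo the kernel's. */
struct toy_socket {
	void	*pcb;		/* protocol state, set by a successful attach */
};

int toy_detach_calls;		/* counts destructor invocations */

/* Constructor that can be told to fail, leaving so->pcb NULL. */
int toy_attach(struct toy_socket *so, int fail)
{
	if (fail)
		return (-1);
	so->pcb = malloc(1);
	return (so->pcb == NULL ? -1 : 0);
}

/* Destructor: it expects a fully attached socket. */
void toy_detach(struct toy_socket *so)
{
	assert(so->pcb != NULL);
	free(so->pcb);
	so->pcb = NULL;
	toy_detach_calls++;
}

/* Release only the socket's own memory; the protocol is not involved. */
void toy_sodealloc(struct toy_socket *so)
{
	free(so);
}

/*
 * Create a socket.  If attach fails, the socket is simply deallocated;
 * toy_detach() is never reached, so it never sees a half-built socket.
 */
int toy_socreate(struct toy_socket **aso, int fail)
{
	struct toy_socket *so = calloc(1, sizeof(*so));

	if (so == NULL)
		return (-1);
	if (toy_attach(so, fail) != 0) {
		toy_sodealloc(so);
		return (-1);
	}
	*aso = so;
	return (0);
}

/* Normal teardown for a fully constructed socket. */
void toy_sofree(struct toy_socket *so)
{
	toy_detach(so);
	toy_sodealloc(so);
}
```

The point of the sketch is only the pairing: a destructor runs if and only if its constructor succeeded, which lets the destructor assert that protocol state exists instead of checking for a partially built socket.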
Robert N M Watson Computer Laboratory University of Cambridge --- //depot/vendor/freebsd/src/sys/kern/uipc_domain.c 2006/04/21 10:18:17 +++ //depot/user/rwatson/sockref/src/sys/kern/uipc_domain.c 2006/06/14 11:09:35 @@ -100,7 +100,8 @@ .pru_sosend = pru_sosend_notsupp, .pru_soreceive = pru_soreceive_notsupp, .pru_sopoll = pru_sopoll_notsupp, - .pru_sosetlabel = pru_sosetlabel_null + .pru_sosetlabel = pru_sosetlabel_null, + .pru_close = pru_close_notsupp, }; static void @@ -126,6 +127,7 @@ DEFAULT(pu->pru_soreceive, soreceive); DEFAULT(pu->pru_sopoll, sopoll); DEFAULT(pu->pru_sosetlabel, pru_sosetlabel_null); + DEFAULT(pu->pru_close, pru_close_notsupp); #undef DEFAULT if (pr->pr_init) (*pr->pr_init)(); --- //depot/vendor/freebsd/src/sys/kern/uipc_socket.c 2006/06/10 14:35:55 +++ //depot/user/rwatson/sockref/src/sys/kern/uipc_socket.c 2006/06/16 21:16:17 @@ -353,10 +353,7 @@ so->so_count = 1; error = (*prp->pr_usrreqs->pru_attach)(so, proto, td); if (error) { - ACCEPT_LOCK(); - SOCK_LOCK(so); - so->so_state |= SS_NOFDREF; - sorele(so); + sodealloc(so); return (error); } *aso = so; @@ -583,6 +580,9 @@ * socantsendmore_locked() drops the socket buffer mutex so that it * can safely perform wakeups. Re-acquire the mutex before * continuing. + * + * XXXRW: Why do we do wakeups here? At this point we should just + * be tearing down. 
*/ socantsendmore_locked(so); SOCKBUF_LOCK(&so->so_snd); @@ -592,6 +592,7 @@ sorflush(so); knlist_destroy(&so->so_rcv.sb_sel.si_note); knlist_destroy(&so->so_snd.sb_sel.si_note); + (*so->so_proto->pr_usrreqs->pru_detach)(so); sodealloc(so); } @@ -656,7 +657,7 @@ } drop: - (*so->so_proto->pr_usrreqs->pru_detach)(so); + (*so->so_proto->pr_usrreqs->pru_close)(so); ACCEPT_LOCK(); SOCK_LOCK(so); KASSERT((so->so_state & SS_NOFDREF) == 0, ("soclose: NOFDREF")); @@ -666,9 +667,9 @@ } /* - * soabort() allows the socket code or protocol code to detach a socket that - * has been in an incomplete or completed listen queue, but has not yet been - * accepted. + * soabort() is used to abruptly tear down a connection, such as when a + * resource limit is reached (listen queue depth exceeded), or if a listen + * socket is closed while there are sockets waiting to be accepted. * * This interface is tricky, because it is called on an unreferenced socket, * and must be called only by a thread that has actually removed the socket @@ -678,9 +679,6 @@ * with any socket locks held. Protocols do call it while holding their own * recursible protocol mutexes, but this is something that should be subject * to review in the future. - * - * XXXRW: Why do we maintain a distinction between pru_abort() and - * pru_detach()? */ void soabort(so) --- //depot/vendor/freebsd/src/sys/kern/uipc_socket2.c 2006/06/17 22:50:57 +++ //depot/user/rwatson/sockref/src/sys/kern/uipc_socket2.c 2006/06/18 00:05:41 @@ -1271,6 +1271,12 @@ } +void +pru_close_notsupp(struct socket *so) +{ + +} + /* * Make a copy of a sockaddr in a malloced buffer of type M_SONAME. 
*/ --- //depot/vendor/freebsd/src/sys/kern/uipc_usrreq.c 2006/06/16 22:16:23 +++ //depot/user/rwatson/sockref/src/sys/kern/uipc_usrreq.c 2006/06/16 22:21:44 @@ -149,8 +149,7 @@ KASSERT(unp != NULL, ("uipc_abort: unp == NULL")); UNP_LOCK(); unp_drop(unp, ECONNABORTED); - unp_detach(unp); - UNP_UNLOCK_ASSERT(); + UNP_UNLOCK(); } static int @@ -213,6 +212,21 @@ return (error); } +/* + * XXXRW: Should also unbind? + */ +static void +uipc_close(struct socket *so) +{ + struct unpcb *unp; + + unp = sotounpcb(so); + KASSERT(unp != NULL, ("uipc_close: unp == NULL")); + UNP_LOCK(); + unp_disconnect(unp); + UNP_UNLOCK(); +} + int uipc_connect2(struct socket *so1, struct socket *so2) { @@ -560,6 +574,7 @@ .pru_sosend = sosend, .pru_soreceive = soreceive, .pru_sopoll = sopoll, + .pru_close = uipc_close, }; int --- //depot/vendor/freebsd/src/sys/net/raw_usrreq.c 2006/06/02 10:03:02 +++ //depot/user/rwatson/sockref/src/sys/net/raw_usrreq.c 2006/06/14 22:56:14 @@ -146,7 +146,16 @@ KASSERT(rp != NULL, ("raw_uabort: rp == NULL")); raw_disconnect(rp); soisdisconnected(so); - raw_detach(rp); +} + +static void +raw_uclose(struct socket *so) +{ + struct rawcb *rp = sotorawcb(so); + + KASSERT(rp != NULL, ("raw_uclose: rp == NULL")); + raw_disconnect(rp); + soisdisconnected(so); } /* pru_accept is EOPNOTSUPP */ @@ -295,4 +304,5 @@ .pru_send = raw_usend, .pru_shutdown = raw_ushutdown, .pru_sockaddr = raw_usockaddr, + .pru_close = raw_uclose, }; --- //depot/vendor/freebsd/src/sys/net/rtsock.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/net/rtsock.c 2006/06/14 22:56:14 @@ -144,6 +144,13 @@ raw_usrreqs.pru_abort(so); } +static void +rts_close(struct socket *so) +{ + + raw_usrreqs.pru_close(so); +} + /* pru_accept is EOPNOTSUPP */ static int @@ -292,6 +299,7 @@ .pru_send = rts_send, .pru_shutdown = rts_shutdown, .pru_sockaddr = rts_sockaddr, + .pru_close = rts_close, }; /*ARGSUSED*/ --- //depot/vendor/freebsd/src/sys/netatalk/ddp_usrreq.c 2006/04/01 16:55:45 +++
//depot/user/rwatson/sockref/src/sys/netatalk/ddp_usrreq.c 2006/06/14 22:56:14 @@ -202,6 +202,10 @@ return (error); } +/* + * XXXRW: This is never called because we only invoke abort on stream + * protocols. + */ static void ddp_abort(struct socket *so) { @@ -210,10 +214,22 @@ ddp = sotoddpcb(so); KASSERT(ddp != NULL, ("ddp_abort: ddp == NULL")); - DDP_LIST_XLOCK(); + DDP_LOCK(ddp); + at_pcbdisconnect(ddp); + DDP_UNLOCK(ddp); +} + +static void +ddp_close(struct socket *so) +{ + struct ddpcb *ddp; + + ddp = sotoddpcb(so); + KASSERT(ddp != NULL, ("ddp_close: ddp == NULL")); + DDP_LOCK(ddp); - at_pcbdetach(so, ddp); - DDP_LIST_XUNLOCK(); + at_pcbdisconnect(ddp); + DDP_UNLOCK(ddp); } void @@ -276,4 +292,5 @@ .pru_send = ddp_send, .pru_shutdown = ddp_shutdown, .pru_sockaddr = at_setsockaddr, + .pru_close = ddp_close, }; --- //depot/vendor/freebsd/src/sys/netatm/atm_aal5.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netatm/atm_aal5.c 2006/06/17 02:11:38 @@ -88,6 +88,7 @@ Atm_attributes *, void **); static void atm_aal5_cpcs_data(void *, KBuffer *); static caddr_t atm_aal5_getname(void *); +static void atm_aal5_close(struct socket *); /* @@ -108,6 +109,7 @@ .pru_sense = atm_aal5_sense, .pru_shutdown = atm_aal5_shutdown, .pru_sockaddr = atm_aal5_sockaddr, + .pru_close = atm_aal5_close, }; /* @@ -565,8 +567,19 @@ { ATM_INTRO_NOERR("abort"); + (void)atm_sock_disconnect(so); so->so_error = ECONNABORTED; - atm_sock_detach(so); + + ATM_OUTRO_NOERR(); +} + +static void +atm_aal5_close(so) + struct socket *so; +{ + ATM_INTRO_NOERR("close"); + + (void)atm_sock_disconnect(so); ATM_OUTRO_NOERR(); } --- //depot/vendor/freebsd/src/sys/netatm/atm_usrreq.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netatm/atm_usrreq.c 2006/06/17 02:11:38 @@ -79,6 +79,7 @@ .pru_sosend = NULL, .pru_soreceive = NULL, .pru_sopoll = NULL, + .pru_close = atm_proto_notsupp5, }; --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/include/ng_btsocket_hci_raw.h 2006/04/01 
15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/include/ng_btsocket_hci_raw.h 2006/06/14 22:56:14 @@ -67,6 +67,7 @@ void ng_btsocket_hci_raw_init (void); void ng_btsocket_hci_raw_abort (struct socket *); +void ng_btsocket_hci_raw_close (struct socket *); int ng_btsocket_hci_raw_attach (struct socket *, int, struct thread *); int ng_btsocket_hci_raw_bind (struct socket *, struct sockaddr *, struct thread *); --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/include/ng_btsocket_l2cap.h 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/include/ng_btsocket_l2cap.h 2006/06/14 22:56:14 @@ -93,6 +93,7 @@ void ng_btsocket_l2cap_raw_init (void); void ng_btsocket_l2cap_raw_abort (struct socket *); +void ng_btsocket_l2cap_raw_close (struct socket *); int ng_btsocket_l2cap_raw_attach (struct socket *, int, struct thread *); int ng_btsocket_l2cap_raw_bind (struct socket *, struct sockaddr *, struct thread *); @@ -184,6 +185,7 @@ void ng_btsocket_l2cap_init (void); void ng_btsocket_l2cap_abort (struct socket *); +void ng_btsocket_l2cap_close (struct socket *); int ng_btsocket_l2cap_accept (struct socket *, struct sockaddr **); int ng_btsocket_l2cap_attach (struct socket *, int, struct thread *); int ng_btsocket_l2cap_bind (struct socket *, struct sockaddr *, --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/include/ng_btsocket_rfcomm.h 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/include/ng_btsocket_rfcomm.h 2006/06/14 22:56:14 @@ -315,6 +315,7 @@ void ng_btsocket_rfcomm_init (void); void ng_btsocket_rfcomm_abort (struct socket *); +void ng_btsocket_rfcomm_close (struct socket *); int ng_btsocket_rfcomm_accept (struct socket *, struct sockaddr **); int ng_btsocket_rfcomm_attach (struct socket *, int, struct thread *); int ng_btsocket_rfcomm_bind (struct socket *, struct sockaddr *, --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/socket/ng_btsocket.c 2005/11/09 13:31:29 +++ 
//depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/socket/ng_btsocket.c 2006/06/14 22:56:14 @@ -74,6 +74,7 @@ .pru_send = ng_btsocket_hci_raw_send, .pru_shutdown = NULL, .pru_sockaddr = ng_btsocket_hci_raw_sockaddr, + .pru_close = ng_btsocket_hci_raw_close, }; /* @@ -92,6 +93,7 @@ .pru_send = ng_btsocket_l2cap_raw_send, .pru_shutdown = NULL, .pru_sockaddr = ng_btsocket_l2cap_raw_sockaddr, + .pru_close = ng_btsocket_l2cap_raw_close, }; /* @@ -112,6 +114,7 @@ .pru_send = ng_btsocket_l2cap_send, .pru_shutdown = NULL, .pru_sockaddr = ng_btsocket_l2cap_sockaddr, + .pru_close = ng_btsocket_l2cap_close, }; /* @@ -132,6 +135,7 @@ .pru_send = ng_btsocket_rfcomm_send, .pru_shutdown = NULL, .pru_sockaddr = ng_btsocket_rfcomm_sockaddr, + .pru_close = ng_btsocket_rfcomm_close, }; /* --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/socket/ng_btsocket_hci_raw.c 2006/05/17 00:16:08 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/socket/ng_btsocket_hci_raw.c 2006/06/14 22:56:14 @@ -876,9 +876,13 @@ void ng_btsocket_hci_raw_abort(struct socket *so) { - ng_btsocket_hci_raw_detach(so); } /* ng_btsocket_hci_raw_abort */ +void +ng_btsocket_hci_raw_close(struct socket *so) +{ +} /* ng_btsocket_hci_raw_close */ + /* * Create new raw HCI socket */ --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/socket/ng_btsocket_l2cap.c 2006/06/14 22:56:14 @@ -1917,9 +1917,16 @@ { so->so_error = ECONNABORTED; - ng_btsocket_l2cap_detach(so); + (void)ng_btsocket_l2cap_disconnect(so); } /* ng_btsocket_l2cap_abort */ +void +ng_btsocket_l2cap_close(struct socket *so) +{ + + (void)ng_btsocket_l2cap_disconnect(so); +} /* ng_btsocket_l2cap_close */ + /* * Accept connection on socket. Nothing to do here, socket must be connected * and ready, so just return peer address and be done with it. 
--- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/socket/ng_btsocket_l2cap_raw.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/socket/ng_btsocket_l2cap_raw.c 2006/06/14 22:56:14 @@ -575,9 +575,17 @@ void ng_btsocket_l2cap_raw_abort(struct socket *so) { - ng_btsocket_l2cap_raw_detach(so); + + (void)ng_btsocket_l2cap_raw_disconnect(so); } /* ng_btsocket_l2cap_raw_abort */ +void +ng_btsocket_l2cap_raw_close(struct socket *so) +{ + + (void)ng_btsocket_l2cap_raw_disconnect(so); +} /* ng_btsocket_l2cap_raw_close */ + /* * Create and attach new socket */ --- //depot/vendor/freebsd/src/sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netgraph/bluetooth/socket/ng_btsocket_rfcomm.c 2006/06/14 22:56:14 @@ -346,10 +346,17 @@ void ng_btsocket_rfcomm_abort(struct socket *so) { + so->so_error = ECONNABORTED; + (void)ng_btsocket_rfcomm_disconnect(so); +} /* ng_btsocket_rfcomm_abort */ - ng_btsocket_rfcomm_detach(so); -} /* ng_btsocket_rfcomm_abort */ +void +ng_btsocket_rfcomm_close(struct socket *so) +{ + + (void)ng_btsocket_rfcomm_disconnect(so); +} /* ng_btsocket_rfcomm_close */ /* * Accept connection on socket. Nothing to do here, socket must be connected --- //depot/vendor/freebsd/src/sys/netgraph/ng_socket.c 2006/06/13 21:38:14 +++ //depot/user/rwatson/sockref/src/sys/netgraph/ng_socket.c 2006/06/14 22:56:14 @@ -1087,6 +1087,8 @@ } /* * Control and data socket type descriptors + * + * XXXRW: Perhaps _close should do something? 
*/ static struct pr_usrreqs ngc_usrreqs = { @@ -1100,6 +1102,7 @@ .pru_send = ngc_send, .pru_shutdown = NULL, .pru_sockaddr = ng_setsockaddr, + .pru_close = NULL, }; static struct pr_usrreqs ngd_usrreqs = { @@ -1113,6 +1116,7 @@ .pru_send = ngd_send, .pru_shutdown = NULL, .pru_sockaddr = ng_setsockaddr, + .pru_close = NULL, }; /* --- //depot/vendor/freebsd/src/sys/netinet/ip_divert.c 2006/04/21 10:18:17 +++ //depot/user/rwatson/sockref/src/sys/netinet/ip_divert.c 2006/06/14 23:41:26 @@ -427,6 +427,24 @@ } static void +div_close(struct socket *so) +{ +#if 0 + /* + * XXXRW: Tear-down to do here? + */ + struct inpcb *inp; + + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("div_close: inp == NULL")); + INP_INFO_WLOCK(&divcbinfo); + INP_LOCK(inp); + INP_UNLOCK(inp); + INP_INFO_WUNLOCK(&divcbinfo); +#endif +} + +static void div_detach(struct socket *so) { struct inpcb *inp; @@ -634,7 +652,8 @@ .pru_send = div_send, .pru_shutdown = div_shutdown, .pru_sockaddr = div_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = div_close, }; struct protosw div_protosw = { --- //depot/vendor/freebsd/src/sys/netinet/raw_ip.c 2006/05/21 19:30:34 +++ //depot/user/rwatson/sockref/src/sys/netinet/raw_ip.c 2006/06/14 23:41:26 @@ -615,46 +615,65 @@ } static void -rip_pcbdetach(struct socket *so, struct inpcb *inp) +rip_detach(struct socket *so) { + struct inpcb *inp; - INP_INFO_WLOCK_ASSERT(&ripcbinfo); - INP_LOCK_ASSERT(inp); - - if (so == ip_mrouter && ip_mrouter_done) - ip_mrouter_done(); - if (ip_rsvp_force_done) - ip_rsvp_force_done(so); - if (so == ip_rsvpd) - ip_rsvp_done(); + /* XXXRW: Should assert detachment. 
*/ + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("rip_detach: inp == NULL")); + INP_INFO_WLOCK(&ripcbinfo); + INP_LOCK(inp); in_pcbdetach(inp); in_pcbfree(inp); + INP_INFO_WUNLOCK(&ripcbinfo); } static void -rip_detach(struct socket *so) +rip_abort(struct socket *so) { struct inpcb *inp; inp = sotoinpcb(so); - KASSERT(inp != NULL, ("rip_detach: inp == NULL")); + KASSERT(inp != NULL, ("rip_abort: inp == NULL")); + INP_INFO_WLOCK(&ripcbinfo); INP_LOCK(inp); - rip_pcbdetach(so, inp); + if (so == ip_mrouter && ip_mrouter_done) + ip_mrouter_done(); + if (ip_rsvp_force_done) + ip_rsvp_force_done(so); + if (so == ip_rsvpd) + ip_rsvp_done(); + inp->inp_faddr.s_addr = INADDR_ANY; + SOCK_LOCK(so); + so->so_state &= ~SS_ISCONNECTED; + SOCK_UNLOCK(so); + INP_UNLOCK(inp); INP_INFO_WUNLOCK(&ripcbinfo); } static void -rip_abort(struct socket *so) +rip_close(struct socket *so) { struct inpcb *inp; inp = sotoinpcb(so); - KASSERT(inp != NULL, ("rip_abort: inp == NULL")); + KASSERT(inp != NULL, ("rip_close: inp == NULL")); + INP_INFO_WLOCK(&ripcbinfo); INP_LOCK(inp); - soisdisconnected(so); - rip_pcbdetach(so, inp); + if (so == ip_mrouter && ip_mrouter_done) + ip_mrouter_done(); + if (ip_rsvp_force_done) + ip_rsvp_force_done(so); + if (so == ip_rsvpd) + ip_rsvp_done(); + inp->inp_faddr.s_addr = INADDR_ANY; + SOCK_LOCK(so); + so->so_state &= ~SS_ISCONNECTED; + SOCK_UNLOCK(so); + INP_UNLOCK(inp); INP_INFO_WUNLOCK(&ripcbinfo); } @@ -902,5 +921,6 @@ .pru_send = rip_send, .pru_shutdown = rip_shutdown, .pru_sockaddr = rip_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = rip_close, }; --- //depot/vendor/freebsd/src/sys/netinet/tcp_subr.c 2006/04/25 11:43:02 +++ //depot/user/rwatson/sockref/src/sys/netinet/tcp_subr.c 2006/06/15 21:14:18 @@ -800,18 +800,7 @@ KASSERT(so->so_state & SS_PROTOREF, ("tcp_close: !SS_PROTOREF")); inp->inp_vflag &= ~INP_SOCKREF; - tcp_discardcb(tp); -#ifdef INET6 - if (inp->inp_vflag & INP_IPV6PROTO) { - 
in6_pcbdetach(inp); - in6_pcbfree(inp); - } else { -#endif - in_pcbdetach(inp); - in_pcbfree(inp); -#ifdef INET6 - } -#endif + INP_UNLOCK(inp); ACCEPT_LOCK(); SOCK_LOCK(so); so->so_state &= ~SS_PROTOREF; @@ -1777,12 +1766,6 @@ KASSERT(so->so_state & SS_PROTOREF, ("tcp_twstart: !SS_PROTOREF")); inp->inp_vflag &= ~INP_SOCKREF; -#ifdef INET6 - if (inp->inp_vflag & INP_IPV6PROTO) - in6_pcbdetach(inp); - else -#endif - in_pcbdetach(inp); INP_UNLOCK(inp); ACCEPT_LOCK(); SOCK_LOCK(so); @@ -1835,12 +1818,11 @@ /* * At this point, we are in one of two situations: * - * (1) We have no socket, just an inpcb<->twtcp pair. Release it all - * after validating. + * (1) We have no socket, just an inpcb<->twtcp pair. We can free + * all state. * - * (2) We have a socket, which we may or may now own the reference - * for. If we own the reference, release all the state after - * validating. If not, leave it for the socket close to clean up. + * (2) We have a socket -- if we own a reference, release it and + * notify the socket layer. */ inp = tw->tw_inpcb; KASSERT((inp->inp_vflag & INP_TIMEWAIT), ("tcp_twclose: !timewait")); @@ -1855,22 +1837,15 @@ so = inp->inp_socket; if (so != NULL) { + /* + * If there's a socket, handle two cases: first, we own a + * strong reference, which we will now release, or we don't + * in which case another reference exists (XXXRW: think + * about this more), and we don't need to take action. + */ if (inp->inp_vflag & INP_SOCKREF) { - /* - * If a socket is present, and we own the only - * reference, we need to tear down the socket and the - * inpcb. 
- */ inp->inp_vflag &= ~INP_SOCKREF; -#ifdef INET6 - if (inp->inp_vflag & INP_IPV6PROTO) { - in6_pcbdetach(inp); - in6_pcbfree(inp); - } else { - in_pcbdetach(inp); - in_pcbfree(inp); - } -#endif + INP_UNLOCK(inp); ACCEPT_LOCK(); SOCK_LOCK(so); KASSERT(so->so_state & SS_PROTOREF, --- //depot/vendor/freebsd/src/sys/netinet/tcp_usrreq.c 2006/06/04 09:36:00 +++ //depot/user/rwatson/sockref/src/sys/netinet/tcp_usrreq.c 2006/06/15 21:30:27 @@ -137,12 +137,10 @@ } /* - * tcp_detach() releases any protocol state that can be reasonably released - * when a socket shutdown is requested, and is a shared code path for - * tcp_usr_detach() and tcp_usr_abort(), the two socket close entry points. - * - * Accepts pcbinfo, inpcb locked, will unlock the inpcb (if needed) on - * return. + * tcp_detach is called when the socket layer loses its final reference + * to the socket, be it a file descriptor reference, a reference from TCP, + * etc. At this point, there is only one case in which we will keep around + * inpcb state: time wait. */ static void tcp_detach(struct socket *so, struct inpcb *inp) @@ -158,19 +156,24 @@ KASSERT(so->so_pcb == inp, ("tcp_detach: so_pcb != inp")); KASSERT(inp->inp_socket == so, ("tcp_detach: inp_socket != so")); + tp = intotcpcb(inp); + if (inp->inp_vflag & INP_TIMEWAIT) { + /* + * There are two cases to handle: one in which the time wait + * state is being discarded (INP_DROPPED), and one in which + * this connection will remain in timewait. In the former, + * it is time to discard all state (except tcptw, which has + * already been discarded by the timewait close code, which + * should be further up the call stack somewhere). In the + * latter case, we detach from the socket, but leave the pcb + * present until timewait ends. + * + * XXXRW: Would it be cleaner to free the tcptw here? 
+ */ if (inp->inp_vflag & INP_DROPPED) { - /* - * Connection was in time wait and has been dropped; - * the calling path is either via tcp_twclose(), or - * as a result of an eventual soclose() after - * tcp_twclose() has been called. In either case, - * tcp_twclose() has detached the tcptw from the - * inpcb, so we just detach and free the inpcb. - * - * XXXRW: Would it be cleaner to free the tcptw - * here? - */ + KASSERT(tp == NULL, ("tcp_detach: INP_TIMEWAIT && " + "INP_DROPPED && tp != NULL")); #ifdef INET6 if (isipv6) { in6_pcbdetach(inp); @@ -183,11 +186,6 @@ } #endif } else { - /* - * Connection is in time wait and has not yet been - * dropped; allow the socket to be discarded, but - * need to keep inpcb until end of time wait. - */ #ifdef INET6 if (isipv6) in6_pcbdetach(inp); @@ -198,20 +196,21 @@ } } else { /* - * If not in timewait, there are two possible paths. First, - * the TCP connection is either embryonic or done, in which - * case we tear down all state. Second, it may still be - * active, in which case we acquire a reference to the socket - * and will free it later when TCP is done. + * If the connection is not in timewait, we consider two + * two conditions: one in which no further processing is + * necessary (dropped || embryonic), and one in which TCP is + * not yet done, but no longer requires the socket, so the + * pcb will persist for the time being. + * + * XXXRW: Does the second case still occur? 
*/ - tp = intotcpcb(inp); if (inp->inp_vflag & INP_DROPPED || tp->t_state < TCPS_SYN_SENT) { tcp_discardcb(tp); #ifdef INET6 if (isipv6) { - in_pcbdetach(inp); - in_pcbfree(inp); + in6_pcbdetach(inp); + in6_pcbfree(inp); } else { #endif in_pcbdetach(inp); @@ -220,11 +219,12 @@ } #endif } else { - SOCK_LOCK(so); - so->so_state |= SS_PROTOREF; - SOCK_UNLOCK(so); - inp->inp_vflag |= INP_SOCKREF; - INP_UNLOCK(inp); +#ifdef INET6 + if (isipv6) + in6_pcbdetach(inp); + else +#endif + in_pcbdetach(inp); } } } @@ -251,15 +251,6 @@ ("tcp_usr_detach: inp_socket == NULL")); TCPDEBUG1(); - /* - * First, if we still have full TCP state, and we're not dropped, - * initiate a disconnect. - */ - if (!(inp->inp_vflag & INP_TIMEWAIT) && - !(inp->inp_vflag & INP_DROPPED)) { - tp = intotcpcb(inp); - tcp_disconnect(tp); - } tcp_detach(so, inp); tp = NULL; TCPDEBUG2(PRU_DETACH); @@ -928,16 +919,16 @@ } /* - * Abort the TCP. - * - * First, drop the connection. Then collect state if possible. + * Abort the TCP. Drop the connection abruptly. */ static void tcp_usr_abort(struct socket *so) { struct inpcb *inp; struct tcpcb *tp; +#if 0 TCPDEBUG0; +#endif inp = sotoinpcb(so); KASSERT(inp != NULL, ("tcp_usr_abort: inp == NULL")); @@ -946,20 +937,75 @@ INP_LOCK(inp); KASSERT(inp->inp_socket != NULL, ("tcp_usr_abort: inp_socket == NULL")); +#if 0 TCPDEBUG1(); +#endif /* - * First, if we still have full TCP state, and we're not dropped, - * drop. + * If we still have full TCP state, and we're not dropped, drop. */ if (!(inp->inp_vflag & INP_TIMEWAIT) && !(inp->inp_vflag & INP_DROPPED)) { tp = intotcpcb(inp); tcp_drop(tp, ECONNABORTED); } - tcp_detach(so, inp); - tp = NULL; - TCPDEBUG2(PRU_DETACH); + if (!(inp->inp_vflag & INP_DROPPED)) { + SOCK_LOCK(so); + so->so_state |= SS_PROTOREF; + SOCK_UNLOCK(so); + inp->inp_vflag |= INP_SOCKREF; + } +#if 0 + tp = intotcpcb(inp); + TCPDEBUG2(PRU_ABORT); +#endif + INP_UNLOCK(inp); + INP_INFO_WUNLOCK(&tcbinfo); +} + +/* + * TCP socket is closed. 
Start friendly disconnect. + */ +static void +tcp_usr_close(struct socket *so) +{ + struct inpcb *inp; + struct tcpcb *tp; +#if 0 + TCPDEBUG0; +#endif + + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("tcp_usr_close: inp == NULL")); + + INP_INFO_WLOCK(&tcbinfo); + INP_LOCK(inp); + KASSERT(inp->inp_socket != NULL, + ("tcp_usr_close: inp_socket == NULL")); +#if 0 + TCPDEBUG1(); +#endif + + /* + * If we still have full TCP state, and we're not dropped, initiate + * a disconnect. + */ + if (!(inp->inp_vflag & INP_TIMEWAIT) && + !(inp->inp_vflag & INP_DROPPED)) { + tp = intotcpcb(inp); + tcp_disconnect(tp); + } + if (!(inp->inp_vflag & INP_DROPPED)) { + SOCK_LOCK(so); + so->so_state |= SS_PROTOREF; + SOCK_UNLOCK(so); + inp->inp_vflag |= INP_SOCKREF; + } +#if 0 + tp = intotcpcb(inp); + TCPDEBUG2(PRU_CLOSE); +#endif + INP_UNLOCK(inp); INP_INFO_WUNLOCK(&tcbinfo); } @@ -1021,7 +1067,8 @@ .pru_send = tcp_usr_send, .pru_shutdown = tcp_usr_shutdown, .pru_sockaddr = tcp_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = tcp_usr_close, }; #ifdef INET6 @@ -1041,7 +1088,8 @@ .pru_send = tcp_usr_send, .pru_shutdown = tcp_usr_shutdown, .pru_sockaddr = in6_mapped_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = tcp_usr_close, }; #endif /* INET6 */ --- //depot/vendor/freebsd/src/sys/netinet/udp_usrreq.c 2006/06/03 19:30:35 +++ //depot/user/rwatson/sockref/src/sys/netinet/udp_usrreq.c 2006/06/14 23:41:26 @@ -948,9 +948,12 @@ KASSERT(inp != NULL, ("udp_abort: inp == NULL")); INP_INFO_WLOCK(&udbinfo); INP_LOCK(inp); - soisdisconnected(so); - in_pcbdetach(inp); - in_pcbfree(inp); + if (inp->inp_faddr.s_addr != INADDR_ANY) { + in_pcbdisconnect(inp); + inp->inp_laddr.s_addr = INADDR_ANY; + soisdisconnected(so); + } + INP_UNLOCK(inp); INP_INFO_WUNLOCK(&udbinfo); } @@ -997,6 +1000,24 @@ return error; } +static void +udp_close(struct socket *so) +{ + struct inpcb *inp; + + inp = sotoinpcb(so); + 
KASSERT(inp != NULL, ("udp_close: inp == NULL")); + INP_INFO_WLOCK(&udbinfo); + INP_LOCK(inp); + if (inp->inp_faddr.s_addr != INADDR_ANY) { + in_pcbdisconnect(inp); + inp->inp_laddr.s_addr = INADDR_ANY; + soisdisconnected(so); + } + INP_UNLOCK(inp); + INP_INFO_WUNLOCK(&udbinfo); +} + static int udp_connect(struct socket *so, struct sockaddr *nam, struct thread *td) { @@ -1029,6 +1050,7 @@ { struct inpcb *inp; + /* XXXRW: Should assert detach has taken place. */ inp = sotoinpcb(so); KASSERT(inp != NULL, ("udp_detach: inp == NULL")); INP_INFO_WLOCK(&udbinfo); @@ -1121,5 +1143,6 @@ .pru_sosend = sosend_dgram, .pru_shutdown = udp_shutdown, .pru_sockaddr = udp_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = udp_close, }; --- //depot/vendor/freebsd/src/sys/netinet6/raw_ip6.c 2006/04/12 03:11:12 +++ //depot/user/rwatson/sockref/src/sys/netinet6/raw_ip6.c 2006/06/14 23:41:26 @@ -587,25 +587,58 @@ inp = sotoinpcb(so); KASSERT(inp != NULL, ("rip6_detach: inp == NULL")); - /* xxx: RSVP */ - if (so == ip6_mrouter) - ip6_mrouter_done(); + INP_INFO_WLOCK(&ripcbinfo); + INP_LOCK(inp); if (inp->in6p_icmp6filt) { FREE(inp->in6p_icmp6filt, M_PCB); inp->in6p_icmp6filt = NULL; } - INP_INFO_WLOCK(&ripcbinfo); - INP_LOCK(inp); in6_pcbdetach(inp); in6_pcbfree(inp); INP_INFO_WUNLOCK(&ripcbinfo); } +/* XXXRW: This can't ever be called. */ static void rip6_abort(struct socket *so) { + struct inpcb *inp; + + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("rip6_abort: inp == NULL")); + + /* xxx: RSVP */ + if (so == ip6_mrouter) + ip6_mrouter_done(); +#if 0 + /* XXXRW: Some work to do here? 
*/ + INP_INFO_WLOCK(&ripcbinfo); + INP_LOCK(inp); + INP_UNLOCK(inp); + INP_INFO_WUNLOCK(&ripcbinfo); +#endif + soisdisconnected(so); +} + +static void +rip6_close(struct socket *so) +{ + struct inpcb *inp; + + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("rip6_close: inp == NULL")); + + /* xxx: RSVP */ + if (so == ip6_mrouter) + ip6_mrouter_done(); +#if 0 + /* XXXRW: Some work to do here? */ + INP_INFO_WLOCK(&ripcbinfo); + INP_LOCK(inp); + INP_UNLOCK(inp); + INP_INFO_WUNLOCK(&ripcbinfo); +#endif soisdisconnected(so); - rip6_detach(so); } static int @@ -795,4 +828,5 @@ .pru_send = rip6_send, .pru_shutdown = rip6_shutdown, .pru_sockaddr = in6_setsockaddr, + .pru_close = rip6_close, }; --- //depot/vendor/freebsd/src/sys/netinet6/udp6_usrreq.c 2006/05/20 13:30:39 +++ //depot/user/rwatson/sockref/src/sys/netinet6/udp6_usrreq.c 2006/06/14 23:41:26 @@ -480,11 +480,24 @@ inp = sotoinpcb(so); KASSERT(inp != NULL, ("udp6_abort: inp == NULL")); +#ifdef INET + if (inp->inp_vflag & INP_IPV4) { + struct pr_usrreqs *pru; + + pru = inetsw[ip_protox[IPPROTO_UDP]].pr_usrreqs; + (*pru->pru_abort)(so); + return; + } +#endif + INP_INFO_WLOCK(&udbinfo); INP_LOCK(inp); - soisdisconnected(so); - in6_pcbdetach(inp); - in6_pcbfree(inp); + if (!IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_faddr)) { + in6_pcbdisconnect(inp); + inp->in6p_laddr = in6addr_any; + soisdisconnected(so); + } + INP_UNLOCK(inp); INP_INFO_WUNLOCK(&udbinfo); } @@ -566,6 +579,34 @@ return error; } +static void +udp6_close(struct socket *so) +{ + struct inpcb *inp; + + inp = sotoinpcb(so); + KASSERT(inp != NULL, ("udp6_close: inp == NULL")); + +#ifdef INET + if (inp->inp_vflag & INP_IPV4) { + struct pr_usrreqs *pru; + + pru = inetsw[ip_protox[IPPROTO_UDP]].pr_usrreqs; + (*pru->pru_disconnect)(so); + return; + } +#endif + INP_INFO_WLOCK(&udbinfo); + INP_LOCK(inp); + if (!IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_faddr)) { + in6_pcbdisconnect(inp); + inp->in6p_laddr = in6addr_any; + soisdisconnected(so); + } + INP_UNLOCK(inp); + 
INP_INFO_WUNLOCK(&udbinfo); +} + static int udp6_connect(struct socket *so, struct sockaddr *nam, struct thread *td) { @@ -756,5 +797,6 @@ .pru_send = udp6_send, .pru_shutdown = udp_shutdown, .pru_sockaddr = in6_mapped_sockaddr, - .pru_sosetlabel = in_pcbsosetlabel + .pru_sosetlabel = in_pcbsosetlabel, + .pru_close = udp6_close }; --- //depot/vendor/freebsd/src/sys/netipsec/keysock.c 2006/04/04 10:13:28 +++ //depot/user/rwatson/sockref/src/sys/netipsec/keysock.c 2006/06/16 22:41:14 @@ -439,6 +439,17 @@ } /* + * key_close() + * derived from net/rtsock.c:rts_close(). + */ +static void +key_close(struct socket *so) +{ + + raw_usrreqs.pru_close(so); +} + +/* * key_connect() * derived from net/rtsock.c:rts_connect() */ @@ -553,6 +564,7 @@ .pru_send = key_send, .pru_shutdown = key_shutdown, .pru_sockaddr = key_sockaddr, + .pru_close = key_close, }; /* sysctl */ --- //depot/vendor/freebsd/src/sys/netipx/ipx_usrreq.c 2006/04/11 23:20:47 +++ //depot/user/rwatson/sockref/src/sys/netipx/ipx_usrreq.c 2006/06/16 22:59:58 @@ -88,6 +88,7 @@ static int ipx_shutdown(struct socket *so); static int ripx_attach(struct socket *so, int proto, struct thread *td); static int ipx_output(struct ipxpcb *ipxp, struct mbuf *m0); +static void ipx_usr_close(struct socket *so); struct pr_usrreqs ipx_usrreqs = { .pru_abort = ipx_usr_abort, @@ -101,6 +102,7 @@ .pru_send = ipx_send, .pru_shutdown = ipx_shutdown, .pru_sockaddr = ipx_sockaddr, + .pru_close = ipx_usr_close, }; struct pr_usrreqs ripx_usrreqs = { @@ -115,6 +117,7 @@ .pru_send = ipx_send, .pru_shutdown = ipx_shutdown, .pru_sockaddr = ipx_sockaddr, + .pru_close = ipx_usr_close, }; /* @@ -432,14 +435,8 @@ ipx_usr_abort(so) struct socket *so; { - struct ipxpcb *ipxp = sotoipxpcb(so); - KASSERT(ipxp != NULL, ("ipx_usr_abort: ipxp == NULL")); - IPX_LIST_LOCK(); - IPX_LOCK(ipxp); - ipx_pcbdetach(ipxp); - ipx_pcbfree(ipxp); - IPX_LIST_UNLOCK(); + /* XXXRW: Possibly ipx_disconnect() here? 
*/ soisdisconnected(so); } @@ -482,6 +479,15 @@ return (error); } +static void +ipx_usr_close(so) + struct socket *so; +{ + + /* XXXRW: Possibly ipx_disconnect() here? */ + soisdisconnected(so); +} + static int ipx_connect(so, nam, td) struct socket *so; @@ -513,6 +519,7 @@ { struct ipxpcb *ipxp = sotoipxpcb(so); + /* XXXRW: Should assert detached. */ KASSERT(ipxp != NULL, ("ipx_detach: ipxp == NULL")); IPX_LIST_LOCK(); IPX_LOCK(ipxp); --- //depot/vendor/freebsd/src/sys/netipx/spx_usrreq.c 2006/04/01 15:45:41 +++ //depot/user/rwatson/sockref/src/sys/netipx/spx_usrreq.c 2006/06/16 22:59:58 @@ -101,6 +101,7 @@ static int spx_accept(struct socket *so, struct sockaddr **nam); static int spx_attach(struct socket *so, int proto, struct thread *td); static int spx_bind(struct socket *so, struct sockaddr *nam, struct thread *td); +static void spx_usr_close(struct socket *so); static int spx_connect(struct socket *so, struct sockaddr *nam, struct thread *td); static void spx_detach(struct socket *so); @@ -131,6 +132,7 @@ .pru_send = spx_send, .pru_shutdown = spx_shutdown, .pru_sockaddr = ipx_sockaddr, + .pru_close = spx_usr_close, }; struct pr_usrreqs spx_usrreq_sps = { @@ -149,6 +151,7 @@ .pru_send = spx_send, .pru_shutdown = spx_shutdown, .pru_sockaddr = ipx_sockaddr, + .pru_close = spx_usr_close, }; void @@ -1320,9 +1323,7 @@ IPX_LIST_LOCK(); IPX_LOCK(ipxp); spx_drop(cb, ECONNABORTED); - spx_pcbdetach(ipxp); - ipx_pcbdetach(ipxp); - ipx_pcbfree(ipxp); + IPX_UNLOCK(ipxp); IPX_LIST_UNLOCK(); } @@ -1459,6 +1460,28 @@ return (error); } +static void +spx_usr_close(struct socket *so) +{ + struct ipxpcb *ipxp; + struct spxpcb *cb; + + ipxp = sotoipxpcb(so); + KASSERT(ipxp != NULL, ("spx_usr_close: ipxp == NULL")); + + cb = ipxtospxpcb(ipxp); + KASSERT(cb != NULL, ("spx_usr_close: cb == NULL")); + + IPX_LIST_LOCK(); + IPX_LOCK(ipxp); + if (cb->s_state > TCPS_LISTEN) + spx_disconnect(cb); + else + spx_close(cb); + IPX_UNLOCK(ipxp); + IPX_LIST_UNLOCK(); +} + /* * Initiate 
connection to peer. Enter SYN_SENT state, and mark socket as * connecting. Start keep-alive timer, setup prototype header, send initial @@ -1518,6 +1541,9 @@ struct ipxpcb *ipxp; struct spxpcb *cb; + /* + * XXXRW: Should assert appropriately detached. + */ ipxp = sotoipxpcb(so); KASSERT(ipxp != NULL, ("spx_detach: ipxp == NULL")); @@ -1526,12 +1552,7 @@ IPX_LIST_LOCK(); IPX_LOCK(ipxp); - if (cb->s_state > TCPS_LISTEN) - spx_disconnect(cb); - else - spx_close(cb); spx_pcbdetach(ipxp); - ipx_pcbdetach(ipxp); ipx_pcbfree(ipxp); IPX_LIST_UNLOCK(); } --- //depot/vendor/freebsd/src/sys/netkey/keysock.c 2006/04/01 15:57:25 +++ //depot/user/rwatson/sockref/src/sys/netkey/keysock.c 2006/06/14 22:56:14 @@ -348,6 +348,17 @@ } /* + * key_close() + * derived from net/rtsock.c:rts_close() + */ +static void +key_close(struct socket *so) +{ + + raw_usrreqs.pru_close(so); +} + +/* * key_connect() * derived from net/rtsock.c:rts_connect() */ @@ -460,6 +471,7 @@ .pru_send = key_send, .pru_shutdown = key_shutdown, .pru_sockaddr = key_sockaddr, + .pru_close = key_close, }; /* sysctl */ --- //depot/vendor/freebsd/src/sys/netnatm/natm.c 2006/04/23 16:35:19 +++ //depot/user/rwatson/sockref/src/sys/netnatm/natm.c 2006/06/14 22:56:14 @@ -336,7 +336,12 @@ natm_usr_abort(struct socket *so) { - natm_usr_detach(so); +} + +static void +natm_usr_close(struct socket *so) +{ + } static int @@ -366,6 +371,7 @@ .pru_send = natm_usr_send, .pru_shutdown = natm_usr_shutdown, .pru_sockaddr = natm_usr_sockaddr, + .pru_close = natm_usr_close, }; /* --- //depot/vendor/freebsd/src/sys/sys/protosw.h 2006/06/16 22:36:03 +++ //depot/user/rwatson/sockref/src/sys/sys/protosw.h 2006/06/16 22:37:21 @@ -170,7 +170,8 @@ #define PRU_PROTOSEND 21 /* send to below */ /* end for protocol's internal use */ #define PRU_SEND_EOF 22 /* send and close */ -#define PRU_NREQ 22 +#define PRU_CLOSE 23 /* socket close */ +#define PRU_NREQ 23 #ifdef PRUREQUESTS const char *prurequests[] = { @@ -179,8 +180,8 @@ "RCVD", "SEND", 
"ABORT", "CONTROL", "SENSE", "RCVOOB", "SENDOOB", "SOCKADDR", "PEERADDR", "CONNECT2", "FASTTIMO", "SLOWTIMO", - "PROTORCV", "PROTOSEND", - "SEND_EOF", + "PROTORCV", "PROTOSEND", "SEND_EOF", "SETLABEL", + "CLOSE", }; #endif @@ -244,6 +245,7 @@ int (*pru_sopoll)(struct socket *so, int events, struct ucred *cred, struct thread *td); void (*pru_sosetlabel)(struct socket *so); + void (*pru_close)(struct socket *so); }; /* @@ -279,6 +281,7 @@ int pru_sopoll_notsupp(struct socket *so, int events, struct ucred *cred, struct thread *td); void pru_sosetlabel_null(struct socket *so); +void pru_close_notsupp(struct socket *so); #endif /* _KERNEL */ From owner-freebsd-arch@FreeBSD.ORG Sun Jun 18 08:34:46 2006 Return-Path: X-Original-To: arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CDCFE16A479 for ; Sun, 18 Jun 2006 08:34:46 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id 88CB143D45 for ; Sun, 18 Jun 2006 08:34:46 +0000 (GMT) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 2417E46C38 for ; Sun, 18 Jun 2006 04:34:46 -0400 (EDT) Date: Sun, 18 Jun 2006 09:34:46 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: arch@FreeBSD.org In-Reply-To: <20060618014337.V67789@fledge.watson.org> Message-ID: <20060618093149.W99683@fledge.watson.org> References: <20060618014337.V67789@fledge.watson.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Subject: Re: Proposal: add pru_close protosw method, refactor abort/detach X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Sun, 18 Jun 2006 08:34:46 -0000

On Sun, 18 Jun 2006, Robert Watson wrote:

> Attached is a patch that attempts to further rationalize tear-down.
> Specifically, it refactors pru_detach (disconnect and conditionally free)
> and pru_abort (disconnect abruptly and free) into three protocol switch
> functions:
>
> pru_close: socket has been closed and a sensible shutdown without data loss
> is desired.
>
> pru_abort: socket is being aborted, generally due to insufficient queue
> space in a listen socket, or close of a listen socket while connections are
> waiting to be accepted: close abruptly and potentially with data loss.
>
> pru_detach: teardown is now unconditional -- both the protocol and socket
> are done.

I realized, of course, that I had omitted to describe the specific
chicken-and-egg problem that kicked this off. If the protocol lends a lock to
the socket layer to cover the socket reference count, then the protocol
detach must occur after the last possible use of that lock. In the current
world order, the lock was used several times after the call to detach --
possibly before a second call to detach, for example. The new arrangement
basically guarantees that the socket is done calling into the protocol, and
done using the potentially lent lock, before it calls a now unconditional
detach.

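The ordering constraint above can be illustrated with a small userland toy
model. This is only a sketch under stated assumptions, not code from the
patch: every toy_* name is hypothetical, the "lock" is a plain flag rather
than a kernel mutex, and the socket life cycle is reduced to one reference
count. What it tries to show is that close and abort leave the protocol
state (and so the lent lock) allocated, and that detach runs only after the
final unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical protocol control block: owns the lock the socket borrows. */
struct toy_pcb {
	bool locked;		/* stand-in for a real mutex */
	bool connected;
	bool data_lost;		/* set when the disconnect was abrupt */
};

struct toy_socket {
	struct toy_pcb *pcb;	/* NULL once the protocol has detached */
	int refs;		/* covered by the lent pcb lock */
};

static void
toy_lock(struct toy_socket *so)
{
	/* Any use of the lent lock after detach would trip this assert. */
	assert(so->pcb != NULL && !so->pcb->locked);
	so->pcb->locked = true;
}

static void
toy_unlock(struct toy_socket *so)
{
	assert(so->pcb != NULL && so->pcb->locked);
	so->pcb->locked = false;
}

/* pru_close analogue: orderly disconnect; protocol state stays allocated. */
static void
toy_pru_close(struct toy_socket *so)
{
	so->pcb->connected = false;
}

/* pru_abort analogue: abrupt disconnect; state still stays allocated. */
static void
toy_pru_abort(struct toy_socket *so)
{
	so->pcb->connected = false;
	so->pcb->data_lost = true;
}

/* pru_detach analogue: unconditional; frees the pcb and with it the lock. */
static void
toy_pru_detach(struct toy_socket *so)
{
	assert(!so->pcb->locked);
	free(so->pcb);
	so->pcb = NULL;
}

struct toy_socket *
toy_socreate(void)
{
	struct toy_socket *so = calloc(1, sizeof(*so));
	struct toy_pcb *pcb = calloc(1, sizeof(*pcb));

	assert(so != NULL && pcb != NULL);
	pcb->connected = true;
	so->pcb = pcb;
	so->refs = 1;
	return (so);
}

/*
 * Socket-layer teardown: disconnect and drop one reference under the lent
 * lock, then call detach only after the final unlock.  Returns true if the
 * protocol was detached.
 */
bool
toy_soclose(struct toy_socket *so, bool abrupt)
{
	int refs;

	toy_lock(so);
	if (abrupt)
		toy_pru_abort(so);
	else
		toy_pru_close(so);
	refs = --so->refs;
	toy_unlock(so);		/* last use of the lent lock */

	if (refs == 0) {
		toy_pru_detach(so);
		return (true);
	}
	return (false);
}
```

In this model, moving the toy_pru_detach() call above toy_unlock() would
fire the assertion in toy_unlock(), which is the toy equivalent of using a
lent lock after its owner has been freed.
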
Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Sat Jun 24 17:46:25 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DCF7E16A492 for ; Sat, 24 Jun 2006 17:46:25 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8E34643D5E for ; Sat, 24 Jun 2006 17:46:23 +0000 (GMT) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 0286751814; Sat, 24 Jun 2006 19:46:21 +0200 (CEST) Received: from localhost (dlj212.neoplus.adsl.tpnet.pl [83.24.39.212]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id A23995131F for ; Sat, 24 Jun 2006 19:46:16 +0200 (CEST) Date: Sat, 24 Jun 2006 19:43:31 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@FreeBSD.org Message-ID: <20060624174331.GB2134@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5I6of5zJg18YgZEa" Content-Disposition: inline X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.5 required=3.0 tests=BAYES_00,RCVD_IN_NJABL_DUL, RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: Subject: Accessing disks via their serial numbers. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 24 Jun 2006 17:46:25 -0000

--5I6of5zJg18YgZEa
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi.

I'd like to extend glabel(8) to create providers for disks based on their
serial numbers and, optionally, the driver name. For example, disk ad0 could
also be accessed via /dev/disk/ata/3JX0LMGA (/dev/disk// or /dev/disk/).

I want to discuss the mechanism for obtaining such information. Currently,
when the disk(9) KPI is used, BIO_GETATTR requests are not passed down to the
disks. We could change this, though probably by using an additional method
(not d_strategy). We also need not pass the entire bio structure, only the
attribute name and a buffer for the data.

This is also a good time to think about what other information we would like
to export using such a mechanism, so we know it will be flexible enough to
handle it.

It might also be useful to be able to ask the disk which attributes it has,
so we can fetch them in a loop. With BIO_GETATTR we don't know which
attributes a provider can return.

Comments, ideas?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--5I6of5zJg18YgZEa
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (FreeBSD)

iD8DBQFEnXnDForvXbEpPzQRAlLUAJ9P0kRP2VYVR6JipLLc30DS6iIA3QCfWDwF
Vkp8Ju3zROw+sdIR8jDsy6M=
=tAbB
-----END PGP SIGNATURE-----

--5I6of5zJg18YgZEa--
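The attribute interface discussed in the message above (a method that takes
only an attribute name and a buffer, plus a way to enumerate the available
attributes) could look roughly like the following userland sketch. Nothing
here is an existing disk(9) or GEOM interface: the toy_* names, the
"serial" and "model" attribute names, and the "TOY-DISK" model string are
assumptions made for illustration. Only the driver name "ata" and the serial
number 3JX0LMGA are taken from the example in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-disk method: copy the named attribute into buf. */
typedef int toy_getattr_t(void *drv, const char *name, char *buf, size_t len);

struct toy_disk {
	const char *driver;		/* e.g. "ata" */
	const char *const *attrs;	/* NULL-terminated attribute names */
	toy_getattr_t *getattr;		/* name + buffer only, no bio */
	void *drv;			/* driver-private state */
};

/* Attribute names a made-up ATA driver might advertise. */
static const char *const toy_ata_attrs[] = { "serial", "model", NULL };

static int
toy_ata_getattr(void *drv, const char *name, char *buf, size_t len)
{
	const char *val;

	(void)drv;			/* unused in this toy driver */
	if (strcmp(name, "serial") == 0)
		val = "3JX0LMGA";	/* serial from the example above */
	else if (strcmp(name, "model") == 0)
		val = "TOY-DISK";	/* placeholder value */
	else
		return (ENOENT);
	if (strlen(val) + 1 > len)
		return (ENOSPC);
	strcpy(buf, val);
	return (0);
}

const struct toy_disk toy_ad0 = {
	"ata", toy_ata_attrs, toy_ata_getattr, NULL
};

/* Build "disk/<driver>/<serial>", as a glabel-style class might. */
int
toy_disk_label(const struct toy_disk *dp, char *out, size_t len)
{
	char serial[64];
	int error, n;

	error = dp->getattr(dp->drv, "serial", serial, sizeof(serial));
	if (error != 0)
		return (error);
	n = snprintf(out, len, "disk/%s/%s", dp->driver, serial);
	if (n < 0 || (size_t)n >= len)
		return (ENOSPC);
	return (0);
}

/* Fetch every advertised attribute in a loop; return how many succeeded. */
int
toy_disk_count_attrs(const struct toy_disk *dp)
{
	char buf[64];
	int n = 0;

	for (const char *const *ap = dp->attrs; *ap != NULL; ap++)
		if (dp->getattr(dp->drv, *ap, buf, sizeof(buf)) == 0)
			n++;
	return (n);
}
```

Keeping the attribute list separate from the fetch method is one possible
answer to the enumeration question raised above; a single method that
accepts an index instead of a name would be another.
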