From: "Christian S.J. Peron"
Date: Tue, 31 Jul 2007 11:25:15 -0500
To: freebsd-net@freebsd.org
Cc: rwatson@freebsd.org
Subject: divert and deadlock issues
Message-ID: <20070731162515.GA3684@sub>

Group,

Robert Watson and I have been discussing some of the consequences of
not having Giant acquired in the network stack for mpsafenet=0. One
issue that kept coming up was a number of lock order reversals (LORs)
around divert. Upon quick inspection I found:

LOR #163 - Locking interactions between IPSEC and divert
LOR #181 - Locking interactions between PFIL and divert
LOR #202 - Locking interactions between multicast and divert (??)
LOR #203 - Locking interactions between IPFW and divert

Most of these exist because the lock ordering in the inbound direction
is the reverse of the ordering in the outbound direction. Also, the
notion of inbound and outbound can be slightly complicated in some
areas.

Upon quick inspection of the code, it looks like all of these issues
can be fixed by simply dropping the inp and divert pcb info locks
across the call to ip_output().

From ip_divert.c:

[..]
        INP_INFO_WLOCK(&divcbinfo);
        inp = sotoinpcb(so);
        INP_LOCK(inp);
        /*
         * Don't allow both user specified and setsockopt options,
         * and don't allow packet length sizes that will crash
         */
        if (((ip->ip_hl != (sizeof (*ip) >> 2)) && inp->inp_options) ||
             ((u_short)ntohs(ip->ip_len) > m->m_pkthdr.len)) {
                error = EINVAL;
                m_freem(m);
        } else {
                /* Convert fields to host order for ip_output() */
                ip->ip_len = ntohs(ip->ip_len);
                ip->ip_off = ntohs(ip->ip_off);

                /* Send packet to output processing */
                ipstat.ips_rawout++;                    /* XXX */

#ifdef MAC
                mac_create_mbuf_from_inpcb(inp, m);
#endif
                error = ip_output(m,
                            inp->inp_options, NULL,
                            ((so->so_options & SO_DONTROUTE) ?
                            IP_ROUTETOIF : 0) |
                            IP_ALLOWBROADCAST | IP_RAWOUTPUT,
                            inp->inp_moptions, NULL);
        }
        INP_UNLOCK(inp);
        INP_INFO_WUNLOCK(&divcbinfo);
[..]

With the locks dropped, inp->inp_options and inp->inp_moptions would
no longer be protected while ip_output() uses them. One idea was to
duplicate the socket options mbuf and pass in a NULL pointer for the
multicast options. Keep in mind that these are the multicast options
associated with a divert socket.

So I guess the questions are:

(1) Are there any users that are specifying multicast options on
    divert sockets?
(2) Are there any users that are specifying socket options in general
    for divert sockets?

Any feedback would be greatly appreciated.

Thanks

-- 
Christian S.J. Peron
csjp@FreeBSD.ORG
FreeBSD Committer