Date:      Mon, 6 Aug 2007 22:06:36 +0000 (UTC)
From:      "Christian S.J. Peron" <csjp@FreeBSD.org>
To:        src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   cvs commit: src/sys/netinet in_mcast.c ip_divert.c
Message-ID:  <200708062206.l76M6aP3064189@repoman.freebsd.org>

csjp        2007-08-06 22:06:36 UTC

  FreeBSD src repository

  Modified files:
    sys/netinet          in_mcast.c ip_divert.c 
  Log:
  Over the past couple of years, there have been a number of reports linking
  the use of divert sockets to deadlocks.  Lock order reversals (LORs) have
  been reported between divert and several other network subsystems, including
  IPSEC, pfil, multicast, ipfw and others.  Other deadlocks could occur because
  of recursive entry into the IP stack.  This change should take care of most,
  if not all, of these issues.
  
  A summary of the changes follows:
  
  - We disallow multicast operations on divert sockets.  It really doesn't make
    semantic sense to allow this, since typically you would set multicast
    parameters on multicast end points.
  
    NOTE: As a part of this change, we actually disallow multicast options on
    any socket that IS a divert socket OR IS NOT of type SOCK_RAW or SOCK_DGRAM.
  
  - We check to see if any socket options have been specified on the socket,
    and if there are (which is very uncommon and probably doesn't make sense
    to support), we duplicate the mbuf carrying the options.
  
  - We then drop the INP/INFO locks over the call to ip_output().  It should be
    noted that since we no longer support multicast operations on divert sockets
    and we have duplicated any socket options, we no longer need the reference
    to the pcb to be coherent.
  
  - Finally, we replaced the direct call to ip_input() with netisr queuing.
    This should remove the recursive entry into the IP stack from divert.
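
  The multicast restriction from the first item can be sketched as a small
  user-space predicate.  This is only an illustration of the stated rule, not
  the code committed to in_mcast.c: struct sketch_sock, sketch_mcast_check()
  and the choice of EOPNOTSUPP as the error value are assumptions made for
  the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>	/* SOCK_RAW, SOCK_DGRAM, SOCK_STREAM */

/* Hypothetical stand-in for the socket state the kernel would inspect. */
struct sketch_sock {
	int	so_type;	/* SOCK_RAW, SOCK_DGRAM, ... */
	bool	is_divert;	/* true if this is a divert socket */
};

/*
 * Return 0 if multicast options may be set on the socket.
 * Return EOPNOTSUPP (an assumed error value) if the socket IS a divert
 * socket OR IS NOT of type SOCK_RAW or SOCK_DGRAM.
 */
static int
sketch_mcast_check(const struct sketch_sock *so)
{
	assert(so != NULL);

	if (so->is_divert)
		return (EOPNOTSUPP);
	if (so->so_type != SOCK_RAW && so->so_type != SOCK_DGRAM)
		return (EOPNOTSUPP);
	return (0);
}
```

  Under this rule a divert socket is refused even though it is created as a
  raw socket, while ordinary raw and datagram sockets (as used by routing
  daemons) are unaffected.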
  
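  The "duplicate the options, then drop the locks" step can be illustrated
  with a toy user-space model.  Nothing below is kernel code: struct
  sketch_lock stands in for the INP/INFO locks, sketch_output() for
  ip_output(), and a malloc()ed byte buffer for the options mbuf.  All of
  these names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy lock that only records whether it is held (single-threaded model). */
struct sketch_lock {
	bool	held;
};

/* Hypothetical protocol control block: a lock plus optional options. */
struct sketch_pcb {
	struct sketch_lock	lock;
	unsigned char		*opts;	/* may be NULL */
	size_t			optlen;
};

/* Set by sketch_output() if it was ever entered with the lock held. */
static bool sketch_output_saw_lock;

/* Stand-in for the output routine; it works only on the private copy. */
static int
sketch_output(const struct sketch_pcb *pcb, const unsigned char *opts,
    size_t optlen)
{
	(void)opts;
	if (pcb->lock.held)
		sketch_output_saw_lock = true;
	return ((int)optlen);
}

static int
sketch_send(struct sketch_pcb *pcb)
{
	unsigned char *copy = NULL;
	size_t len = 0;
	int ret;

	assert(pcb != NULL && !pcb->lock.held);
	pcb->lock.held = true;			/* acquire */
	if (pcb->opts != NULL && pcb->optlen > 0) {
		/* Duplicate the options while the pcb is still stable. */
		copy = malloc(pcb->optlen);
		if (copy == NULL) {
			pcb->lock.held = false;
			return (-1);
		}
		memcpy(copy, pcb->opts, pcb->optlen);
		len = pcb->optlen;
	}
	pcb->lock.held = false;			/* drop before output */

	/* From here on, only the private copy is used. */
	ret = sketch_output(pcb, copy, len);
	free(copy);
	return (ret);
}
```

  Because the output routine only ever sees the private copy, it no longer
  matters whether the pcb changes after the lock is released, which is the
  property the item above relies on.
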
  By dropping the locks over the call to ip_output() we eliminate all the lock
  ordering issues above.  By switching over to netisr on the inbound path,
  we can no longer recursively enter the ip_input() code via divert.
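
  Why queuing removes the recursion can be shown with a small user-space
  model.  It is a sketch under stated assumptions, not the netisr
  implementation: struct sketch_queue, sketch_input() and sketch_dispatch()
  are hypothetical names, and a packet is modelled as an integer that is
  "re-injected" as pkt - 1 until it reaches zero.

```c
#include <assert.h>
#include <stddef.h>

#define	SKETCH_QLEN	8

/* Fixed-size FIFO standing in for a deferred-dispatch input queue. */
struct sketch_queue {
	int	pkts[SKETCH_QLEN];
	size_t	head, tail, count;
};

static int sketch_depth;	/* current nesting of sketch_input() */
static int sketch_max_depth;	/* deepest nesting observed */
static int sketch_processed;	/* packets handled so far */

static int
sketch_enqueue(struct sketch_queue *q, int pkt)
{
	assert(q != NULL);
	if (q->count == SKETCH_QLEN)
		return (-1);		/* queue full: drop the packet */
	q->pkts[q->tail] = pkt;
	q->tail = (q->tail + 1) % SKETCH_QLEN;
	q->count++;
	return (0);
}

/* Stand-in for input processing that may re-inject a packet. */
static void
sketch_input(struct sketch_queue *q, int pkt)
{
	sketch_depth++;
	if (sketch_depth > sketch_max_depth)
		sketch_max_depth = sketch_depth;
	sketch_processed++;
	if (pkt > 0) {
		/* Defer the re-injection instead of calling ourselves. */
		(void)sketch_enqueue(q, pkt - 1);
	}
	sketch_depth--;
}

/* Drain the queue; each packet is handled from the top level. */
static void
sketch_dispatch(struct sketch_queue *q)
{
	while (q->count > 0) {
		int pkt = q->pkts[q->head];

		q->head = (q->head + 1) % SKETCH_QLEN;
		q->count--;
		sketch_input(q, pkt);
	}
}
```

  However many times a packet is re-injected, sketch_input() never nests
  more than one level deep, because every re-injection goes back through
  the queue and is picked up by the dispatch loop.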
  
  I have tested this change by using the following command:
  
  ipfwpcap -r 8000 - | tcpdump -r - -nn -v
  
  This should exercise the input and re-injection (outbound) path, which is
  very similar to the workload performed by natd(8).  Additionally, I have
  run some OSPF daemons, which rely heavily on raw sockets and multicast.
  
  Approved by:    re@ (kensmith)
  MFC after:      1 month
  LOR:            163
  LOR:            181
  LOR:            202
  LOR:            203
  Discussed with: julian, andre et al (on freebsd-net)
  In collaboration with:  bms [1], rwatson [2]
  
  [1] bms helped out with the multicast decisions
  [2] rwatson submitted the original netisr patches and came up with some
      of the original ideas on how to combat this issue.
  
  Revision  Changes    Path
  1.3       +21 -0     src/sys/netinet/in_mcast.c
  1.129     +45 -10    src/sys/netinet/ip_divert.c
