Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 17 Nov 2009 22:27:26 GMT
From:      Jonathan Looney <jonlooney@gmail.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/140647: [patch] e1000 driver does not correctly handle multicast promiscuous mode with 128 or more multicast addresses
Message-ID:  <200911172227.nAHMRQPp014849@www.freebsd.org>
Resent-Message-ID: <200911172230.nAHMU2rk081334@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         140647
>Category:       kern
>Synopsis:       [patch] e1000 driver does not correctly handle multicast promiscuous mode with 128 or more multicast addresses
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 17 22:30:02 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Jonathan Looney
>Release:        9.0-CURRENT
>Organization:
>Environment:
FreeBSD freebsdcurrent-vm.edu.juniper.net 9.0-CURRENT FreeBSD 9.0-CURRENT #1: Tue Nov 17 12:05:06 UTC 2009     root@freebsdcurrent-vm.edu.juniper.net:/usr/obj/usr/src/sys/GENERIC  i386
>Description:
The e1000 driver contains two related bugs in the way it handles multicast promiscuous mode for an interface when there are MAX_NUM_MULTICAST_ADDRESSES (currently, 128) or more multicast addresses in use on an interface.

Currently, when there are less than MAX_NUM_MULTICAST_ADDRESSES (currently, 128) multicast addresses in use on an interface, the code downloads the list to the card.  When there are MAX_NUM_MULTICAST_ADDRESSES or more multicast addresses in use on an interface, the code instead just sets the card to be in multicast promiscuous mode.  The current code (from em_set_multi())is here:

	if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) {
		reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL);
		reg_rctl |= E1000_RCTL_MPE;
		E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl);
	} else
		e1000_update_mc_addr_list(&adapter->hw, mta, mcnt);

When there are MAX_NUM_MULTICAST_ADDRESSES or more multicast addresses in use on an interface, the code does not update the multicast address list on the card.  Therefore, if multicast promiscuous mode becomes disabled, the card will not have a current and complete list of all multicast addresses.

There are two bugs related to the handling of multicast promiscuous mode when there are MAX_NUM_MULTICAST_ADDRESSES (currently, 128) or more multicast addresses in use on an interface:

1) When there are MAX_NUM_MULTICAST_ADDRESSES or more multicast addresses in use on an interface, the code enables multicast promiscuous mode.  However, when the number of multicast addresses in use on an interface drops below MAX_NUM_MULTICAST_ADDRESSES, the code does not disable multicast promiscuous mode.  This may result in the system unnecessarily receiving extra multicast packets to process.

2) When the number of multicast addresses reaches MAX_NUM_MULTICAST_ADDRESSES, the code enables multicast promiscuous mode.  However, when the IFF_ALLMULTI or IFF_PROMISC flag are removed on an interface, the driver will disable multicast promiscuous mode without regard to whether multicast promiscuous mode is still needed for multicast support.  This bug could be triggered, for example, if someone ran tcpdump in promiscuous mode (the defaut mode for tcpdump) and then stopped it while the number of multicast groups joined on an interface was greater than or equal to MAX_NUM_MULTICAST_ADDRESSES.  In this case, the card may not receive traffic for all joined multicast addresses.

I have verified that these bugs are present in the CVS HEAD; however, note that these bugs are present at least as far back as 6.3.
>How-To-Repeat:
In order to demonstrate these bugs, I have written four files for use with mtest:

mtest-cmds-step1: This joins 99 groups (225.0.0.x, where x is a number between 1 and 99, inclusive).
mtest-cmds-step2: This joins 30 groups (225.0.0.x, where x is a number between 100 and 129, inclusive).
mtest-cmds-step3: This leaves 30 groups (225.0.0.x, where x is a number between 100 and 129, inclusive).
mtest-cmds-step4: This joins 50 groups (225.0.0.x, where x is a number between 100 and 149, inclusive).

To use these files to demonstrate this bug, you need the following:
a) The device under test: a FreeBSD host with interface em0.  (You will probably need two windows open to this host.  You can use an interface other than em0 if you edit the mtest files and change the interface in the commands in these instructions.)

b) A second host to generate multicast traffic.  This host should be on the same network as the device under test's em0 interface.

Procedure to demonstrate these bugs:

1) On the host generating the traffic, ping 225.0.0.145.

2) On the device under test, start tcpdump in non-promiscuous mode (for example, 'tcpdump -i em0 -p icmp').  You should not see the multicast traffic yet.

3) On a different window on the device under test, start mtest.

4) Type 'f mtest-cmds-step1' to join the 99 groups.  Unless you already have joined 29 other groups on this interface, this should not cause the number of multicast addresses to reach MAX_NUM_MULTICAST_ADDRESSES.  Therefore, the tcpdump should not receive the traffic destined to 225.0.0.145 (as that was not one of the addresses joined).

5) Type 'f mtest-cmds-step2' to join 30 additional groups.  This should cause the number of multicast addresses to reach MAX_NUM_MULTICAST_ADDRESSES.  Therefore, the driver should enable multicast promiscuous mode for the interface and tcpdump should now begin to receive traffic destined to 225.0.0.145 (even though that was not one of the addresses that was joined).

6) Type 'f mtest-cmds-step3' to leave the 30 groups joined in the last step.  This should cause the number of multicast addresses to fall below MAX_NUM_MULTICAST_ADDRESSES.  That *should* mean that the card should leave multicast promiscuous mode; however, because of the first bug listed above, the card will remain in multicast promiscuous mode.  You can verify this by observing that tcpdump is still receiving traffic destined to 225.0.0.145 (even though that is not one of the addresses that is currently joined).

7) Type 'f mtest-cmds-step4' to join 50 additional groups.  This should cause the number of multicast addresses to reach MAX_NUM_MULTICAST_ADDRESSES.  Therefore, the driver should put the interface in multicast promiscuous mode and tcpdump should receive traffic destined to 225.0.0.145 (and, that is one of the addresses that was joined in this step).

8) Type 'p em0 1' to cause the interface to enter promiscuous mode. Then, type 'p em0 0' to cause the interface to leave promiscuous mode.  That *should not* prevent the card from receiving traffic destined for multicast addresses that are joined on the interface; however, because of the second bug listed above, the interface will leave multicast promiscuous mode and the interface will only receive traffic for the multicast addresses last programmed into the card.  You can verify this by observing that tcpdump stops receiving traffic destined to 225.0.0.145, even though that multicast address was joined in step 7 (and is still joined).
>Fix:
I have written a patch that addresses these bugs in these ways:

1) When the number of multicast addresses in use on an interface is below MAX_NUM_MULTICAST_ADDRESSES, the code checks the E1000_RCTL to see if multicast promiscuous mode is enabled.  If it is enabled, it checks to see if the IFF_ALLMULTI or IFF_PROMISC flags are set in the if_flags.  If neither is set, it disables multicast promiscuous mode.  In order to avoid brief outages, the code only disables multicast promiscuous mode after it has already written the new multicast addresses to the card.

2) When the IFF_ALLMULTI or IFF_PROMISC if_flags change, the code currently calls em_disable_promisc() and then em_set_promisc().  These two functions will reset the unicast promiscuous, multicast promiscuous, and save bad packet modes as appropriate for the if_flags.  However, in order to ensure that changes to these flags do not inadvertently disable multicast promiscuous mode when multicast promiscuous mode is needed due to the number of multicast addresses joined on an interface, I added a call to em_set_multi() after em_set_promisc().  This will allow em_set_multi() to re-enable multicast promiscuous mode, if needed.

I have compiled and tested this patch on CURRENT (CVS head, updated today).  I have also applied this same patch to 8.0-RC3 (CVS RELENG_8_0 tag, updated today) and compiled and tested on that release.  I also have patches that I have compiled and tested for older releases, were that to be necessary.

Patch attached with submission follows:

Index: if_em.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/e1000/if_em.c,v
retrieving revision 1.26
diff -u -r1.26 if_em.c
--- if_em.c	10 Sep 2009 21:14:55 -0000	1.26
+++ if_em.c	17 Nov 2009 19:27:12 -0000
@@ -1277,6 +1277,17 @@
 				    (IFF_PROMISC | IFF_ALLMULTI)) {
 					em_disable_promisc(adapter);
 					em_set_promisc(adapter);
+					/* em_set_multi() will set
+					   multicast promiscuous mode
+					   when there is a large number
+					   of groups.  Therefore, if we
+					   play around with the
+					   promiscuous flags, we should
+					   call em_set_multi to give it
+					   a chance to re-enable
+					   multicast promiscuous mode,
+					   if required. */
+					em_set_multi(adapter);
 				}
 			} else
 				em_init_locked(adapter);
@@ -2555,12 +2566,20 @@
 	}
 	if_maddr_runlock(ifp);
 
+	reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL);
 	if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) {
-		reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL);
 		reg_rctl |= E1000_RCTL_MPE;
-		E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl);
-	} else
+	} else {
+		/* Turn off multicast promiscuous mode if it
+		   is on and it doesn't need to be enabled
+		   according to the if_flags. */
+		if ((reg_rctl & E1000_RCTL_MPE) &&
+		    !(ifp->if_flags & (IFF_ALLMULTI | IFF_PROMISC))) {
+			reg_rctl &= (~E1000_RCTL_MPE);
+		}
 		e1000_update_mc_addr_list(&adapter->hw, mta, mcnt);
+	}
+	E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl);
 
 	if (adapter->hw.mac.type == e1000_82542 && 
 	    adapter->hw.revision_id == E1000_REVISION_2) {


>Release-Note:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200911172227.nAHMRQPp014849>