Date:      Tue, 01 Feb 2011 11:56:30 -0800
From:      Sean Bruno <seanbru@yahoo-inc.com>
To:        Mike Tancsa <mike@sentex.net>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Ivan Voras <ivoras@freebsd.org>, Jack Vogel <jfvogel@gmail.com>, Jan Koum <jan@whatsapp.com>, "freebsd-hardware@freebsd.org" <freebsd-hardware@freebsd.org>
Subject:   Re: em driver, 82574L chip, and possibly ASPM
Message-ID:  <1296590190.2326.6.camel@hitfishpass-lx.corp.yahoo.com>
In-Reply-To: <4D42EA74.4090807@sentex.net>
References:  <icgd44$89l$1@dough.gmane.org> <1290533941.3173.50.camel@home-yahoo>	<4CEC0548.1080801@sentex.net> <AANLkTim82pWyf_X%2Bu72uj8RkWeRUb_4KSQ8B_HpNYsP9@mail.gmail.com> <AANLkTinO1yfN--_K63-yD1LY3wusOF7wB2wwG8DUd5Z4@mail.gmail.com> <4D2C636B.5040003@sentex.net> <AANLkTimFzYZOkwdExm5JPRB7BaN8Am8pPcgrMT0wVZqy@mail.gmail.com> <4D3C4795.40205@sentex.net>  <4D42EA74.4090807@sentex.net>


--=-jFCXdpagJIiUUJVyJJjK
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit

On Fri, 2011-01-28 at 08:10 -0800, Mike Tancsa wrote:
> On 1/23/2011 10:21 AM, Mike Tancsa wrote:
> > On 1/21/2011 4:21 AM, Jan Koum wrote:
> > One other thing I noticed is that when the nic is in its hung state, the
> > WOL option is gone ?
> > 
> > e.g
> > 
> > em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
> >         ether 00:15:17:ed:68:a4
> > 
> > vs
> > 
> > 
> > em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> >         options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
> >         ether 00:15:17:ed:68:a4
> 
> 
> Another hang last night :(
> 
> What's really strange is that WOL_MAGIC and TSO4 got turned back on
> somehow? I had explicitly turned WOL off, but when the NIC was in its
> bad state
> 
> em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=2198<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
> 
> ... it's back on along with TSO?  Not sure if it's a coincidence or a side
> effect or what.  For now, I have had to re-purpose this NIC for something
> else.
> 
> debug info shows
> 
> Jan 28 00:25:10 backup3 kernel: Interface is RUNNING and INACTIVE
> Jan 28 00:25:10 backup3 kernel: em1: hw tdh = 625, hw tdt = 625
> Jan 28 00:25:10 backup3 kernel: em1: hw rdh = 903, hw rdt = 903
> Jan 28 00:25:10 backup3 kernel: em1: Tx Queue Status = 0
> Jan 28 00:25:10 backup3 kernel: em1: TX descriptors avail = 1024
> Jan 28 00:25:10 backup3 kernel: em1: Tx Descriptors avail failure = 0
> Jan 28 00:25:10 backup3 kernel: em1: RX discarded packets = 0
> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Check = 903
> Jan 28 00:25:10 backup3 kernel: em1: RX Next to Refresh = 904
> Jan 28 00:25:27 backup3 kernel: em1: link state changed to DOWN
> Jan 28 00:25:30 backup3 kernel: em1: link state changed to UP
> 
> 
> 	---Mike


I'm trying to get some more testing done on my suggestions around the
OACTIVE assertions in the driver.  More or less, it looks like intense
periods of activity can push the driver into the OACTIVE hold-off state,
and the logic in igb(4) and em(4) isn't quite right to handle it.

I suspect that something like this modification to igb(4) may be
required for em(4).
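
To make the idea concrete, here is a toy userland model of the
re-enqueue condition.  It is only a sketch, not driver code: the
toy_* names, the ring size and the threshold values are all made up
for illustration, and it says nothing about what the hardware actually
does when it hangs.  It just shows that a cleanup pass can end with
OACTIVE set and packets still queued, and that testing OACTIVE as well
as "more" reschedules the task in that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up stand-ins for the ring size and for
 * IGB_TX_CLEANUP_THRESHOLD / IGB_TX_OP_THRESHOLD. */
#define TOY_RING_SIZE         8
#define TOY_CLEANUP_THRESHOLD 2
#define TOY_OP_THRESHOLD      1

struct toy_ring {
	int  tx_avail;   /* free descriptors */
	int  in_flight;  /* handed to "hardware", not yet completed */
	int  done;       /* completed, not yet reclaimed */
	int  queued;     /* packets waiting in the software send queue */
	bool oactive;    /* models IFF_DRV_OACTIVE */
};

struct toy_ring
toy_ring_init(int queued)
{
	struct toy_ring r = { TOY_RING_SIZE, 0, 0, queued, false };
	return (r);
}

/* Pretend the hardware finished everything it was given. */
void
toy_hw_complete(struct toy_ring *r)
{
	r->done += r->in_flight;
	r->in_flight = 0;
}

/* Reclaim completed descriptors; clear OACTIVE once there is room.
 * Returns true while descriptors are still outstanding. */
bool
toy_txeof(struct toy_ring *r)
{
	r->tx_avail += r->done;
	r->done = 0;
	assert(r->tx_avail <= TOY_RING_SIZE);
	if (r->tx_avail > TOY_OP_THRESHOLD)
		r->oactive = false;
	return (r->in_flight > 0);
}

/* Drain the send queue, checking the cleanup threshold on every
 * iteration (as in the patched loop) and setting OACTIVE when the
 * ring is too full to continue. */
void
toy_start_locked(struct toy_ring *r)
{
	while (r->queued > 0) {
		if (r->tx_avail <= TOY_CLEANUP_THRESHOLD)
			toy_txeof(r);
		if (r->tx_avail <= TOY_OP_THRESHOLD) {
			r->oactive = true;
			break;
		}
		r->tx_avail--;
		r->in_flight++;
		r->queued--;
	}
}

/* One pass of the deferred task.  The return value says whether the
 * task would be re-enqueued; resched_on_oactive selects between the
 * original test (more) and the proposed one (more || OACTIVE). */
bool
toy_handle_queue(struct toy_ring *r, bool resched_on_oactive)
{
	bool more = toy_txeof(r);

	toy_start_locked(r);
	return (more || (resched_on_oactive && r->oactive));
}
```

With 20 packets queued on an idle ring, one call to
toy_handle_queue(&r, false) fills the ring, leaves OACTIVE set with
packets still waiting, and returns false, so in this model nothing
reschedules the task.  The same call with true returns true.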

Comments?

Sean

--=-jFCXdpagJIiUUJVyJJjK
Content-Disposition: attachment; filename="if_igb.diff_oactive"
Content-Type: text/x-patch; name="if_igb.diff_oactive"; charset="UTF-8"
Content-Transfer-Encoding: 7bit

--- p4/freebsd_7/src/sys/dev/e1000/if_igb.c	2010-12-23 11:06:17.127417000 -0800
+++ p4/ybsd_7/src/sys/dev/e1000/if_igb.c	2010-12-23 11:28:50.476993000 -0800
@@ -784,10 +784,14 @@
 		return;
 
 	/* Call cleanup if number of TX descriptors low */
+#if 0
 	if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD)
 		igb_txeof(txr);
+#endif
 
 	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
+		if (txr->tx_avail <= IGB_TX_CLEANUP_THRESHOLD)
+			igb_txeof(txr);
 		if (txr->tx_avail <= IGB_TX_OP_THRESHOLD) {
 			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
 			break;
@@ -1162,10 +1166,10 @@
 		IGB_TX_LOCK(txr);
 		if (igb_txeof(txr))
 			more = TRUE;
-		if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
-			igb_start_locked(txr, ifp);
+		/*if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) Pointless as igb_start_locked() checks this right off the bat*/
+		igb_start_locked(txr, ifp);
 		IGB_TX_UNLOCK(txr);
-		if (more) {
+		if (more || (ifp->if_drv_flags & IFF_DRV_OACTIVE)) {
 			taskqueue_enqueue(que->tq, &que->que_task);
 			return;
 		}
@@ -1361,7 +1370,7 @@
 
 no_calc:
 	/* Schedule a clean task if needed*/
-	if (more_tx || more_rx) 
+	if (more_tx || more_rx || (ifp->if_drv_flags & IFF_DRV_OACTIVE))
 		taskqueue_enqueue(que->tq, &que->que_task);
 	else
 		/* Reenable this interrupt */
@@ -1535,6 +1545,14 @@
 	if (m_head->m_flags & M_VLANTAG)
 		cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
 
+/*
+ * We just did this before invocation, seems completely
+ * redundant, igb_handle_queue -> igb_txeof
+ * Pretty sure this is impossible as we check for the 
+ * IGB_TX_CLEANUP_THRESHOLD in igb_start_locked() which happens
+ * before this func is invoked
+ */
+#if 0
         /*
          * Force a cleanup if number of TX descriptors
          * available hits the threshold
@@ -1547,6 +1565,7 @@
 			return (ENOBUFS);
 		}
 	}
+#endif
 
 	/*
          * Map the packet for DMA.
 
 

--=-jFCXdpagJIiUUJVyJJjK--



