From owner-svn-src-user@FreeBSD.ORG  Thu Sep 29 17:13:23 2011
Return-Path: <owner-svn-src-user@FreeBSD.ORG>
Delivered-To: svn-src-user@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A3C8B106566B;
	Thu, 29 Sep 2011 17:13:23 +0000 (UTC)
	(envelope-from adrian@FreeBSD.org)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c])
	by mx1.freebsd.org (Postfix) with ESMTP id 940868FC08;
	Thu, 29 Sep 2011 17:13:23 +0000 (UTC)
Received: from svn.freebsd.org (localhost [127.0.0.1])
	by svn.freebsd.org (8.14.4/8.14.4) with ESMTP id p8THDN61032057;
	Thu, 29 Sep 2011 17:13:23 GMT (envelope-from adrian@svn.freebsd.org)
Received: (from adrian@localhost)
	by svn.freebsd.org (8.14.4/8.14.4/Submit) id p8THDN09032055;
	Thu, 29 Sep 2011 17:13:23 GMT (envelope-from adrian@svn.freebsd.org)
Message-Id: <201109291713.p8THDN09032055@svn.freebsd.org>
From: Adrian Chadd <adrian@FreeBSD.org>
Date: Thu, 29 Sep 2011 17:13:23 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
X-SVN-Group: user
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: 
Subject: svn commit: r225879 - user/adrian/if_ath_tx/sys/dev/ath
X-BeenThere: svn-src-user@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the experimental &quot; user&quot;
	src tree" <svn-src-user.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-user>,
	<mailto:svn-src-user-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-user>
List-Post: <mailto:svn-src-user@freebsd.org>
List-Help: <mailto:svn-src-user-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-user>,
	<mailto:svn-src-user-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Sep 2011 17:13:23 -0000

Author: adrian
Date: Thu Sep 29 17:13:23 2011
New Revision: 225879
URL: http://svn.freebsd.org/changeset/base/225879

Log:
  Put in a temporary workaround for the strange rx behaviour I was seeing
  when RXEOL occured.
  
  Normally, I'd expect RXEOL to be followed by a dequeue of ATH_RXBUF number
  of packets (here 512 for 11n.) But I'd occasionally see this only service
  a handful at a time (and sometimes it'd service 0 frames!)
  
  Since the RX descriptor list should be consistent and correctly linked
  together after being processed, I gathered that something was interfering
  with this process (eg rxlink was being set to NULL.)
  
  I gathered that kickpcu and the interrupt mask update was racing.
  So to work around it (and it's stopped the issue appearing so far) :
  
  * clear kickpcu after the queue reset has been re-established and
    the RX has recommenced. This likely won't be very good for
    SMP machines as the taskqueue and ath_intr() calls can occur
    simultaneously on different CPUs. This means I'll have to
    eventually correctly lock this (or use atomics here.)
  
  * If a MIB interrupt also pops in at the same time and kickpcu
    is currently 1, don't reset the interrupt mask. This stops the
    RXEOL/RXORN interrupt from being triggered until we've handled
    and cleared the original case.
  
  As mentioned, this is only somewhat correct (read: not correct) for
  the single CPU case, and definitely still very racy for the SMP case.
  I'll need to come up with a better solution for this kickpcu stuff.

Modified:
  user/adrian/if_ath_tx/sys/dev/ath/if_ath.c

Modified: user/adrian/if_ath_tx/sys/dev/ath/if_ath.c
==============================================================================
--- user/adrian/if_ath_tx/sys/dev/ath/if_ath.c	Thu Sep 29 15:43:02 2011	(r225878)
+++ user/adrian/if_ath_tx/sys/dev/ath/if_ath.c	Thu Sep 29 17:13:23 2011	(r225879)
@@ -1447,9 +1447,9 @@ ath_intr(void *arg)
 			 * is in the RX queue.
 			 * This will then kick the PCU.
 			 */
-			taskqueue_enqueue_fast(sc->sc_tq, &sc->sc_rxtask);
 			sc->sc_rxlink = NULL;
 			sc->sc_kickpcu = 1;
+			taskqueue_enqueue_fast(sc->sc_tq, &sc->sc_rxtask);
 		}
 		if (status & HAL_INT_TXURN) {
 			sc->sc_stats.ast_txurn++;
@@ -1484,7 +1484,14 @@ ath_intr(void *arg)
 			 * clear whatever condition caused the interrupt.
 			 */
 			ath_hal_mibevent(ah, &sc->sc_halstats);
-			ath_hal_intrset(ah, sc->sc_imask);
+			/*
+			 * Don't reset the interrupt if we've just
+			 * kicked the PCU, or we may get a nested
+			 * RXEOL before the rxproc has had a chance
+			 * to run.
+			 */
+			if (sc->sc_kickpcu == 0)
+				ath_hal_intrset(ah, sc->sc_imask);
 		}
 		if (status & HAL_INT_RXORN) {
 			/* NB: hal marks HAL_INT_FATAL when RXORN is fatal */
@@ -4057,24 +4064,34 @@ rx_next:
 	 * need to be handled, kick the PCU if there's
 	 * been an RXEOL condition.
 	 */
+	/*
+	 * XXX TODO!
+	 * It is very likely that we're unfortunately
+	 * racing with other places where ath_hal_intrset()
+	 * may be called. It may be that we do need to
+	 * add some more locking (eg the pcu lock from ath9k/
+	 * reference), or introduce some other way to cope
+	 * with this.
+	 */
 	if (sc->sc_kickpcu) {
-		sc->sc_kickpcu = 0;
 		CTR0(ATH_KTR_ERR, "ath_rx_proc: kickpcu");
-		/*
-		 * XXX this causes a 3ms delay; and shuts down a lof
-		 * XXX is it really needed? Or is it just enough to
-		 * XXX kick the PCU again to continue RXing?
-		 */
 		device_printf(sc->sc_dev, "%s: kickpcu; handled %d packets\n",
 		    __func__, npkts);
+
 #if 0
-		ath_stoprecv(sc);
-		sc->sc_imask |= (HAL_INT_RXEOL | HAL_INT_RXORN);
+		/*
+		 * This re-links all of the descriptors together.
+		 * (Is it possible that somehow, some state/issue
+		 * is leaving us with badly linked descriptors?)
+		 * Is it possible that we're receiving another RXEOL
+		 * _during_ this function?
+		 */
 		if (ath_startrecv(sc) != 0) {
 			if_printf(ifp,
 			    "%s: couldn't restart RX after RXEOL; resetting\n",
 			    __func__);
 			ath_reset(ifp);
+			sc->sc_kickpcu = 0;
 			return;
 		}
 #endif
@@ -4087,6 +4104,7 @@ rx_next:
 		ath_hal_startpcurecv(ah);	/* re-enable PCU/DMA engine */
 
 		ath_hal_intrset(ah, sc->sc_imask);
+		sc->sc_kickpcu = 0;
 	}
 
 	if ((ifp->if_drv_flags & IFF_DRV_OACTIVE) == 0) {