From owner-svn-src-all@FreeBSD.ORG  Sat Mar 13 12:05:15 2010
Date: Sat, 13 Mar 2010 23:05:11 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@delplex.bde.org
To: Pyun YongHyeon <yongari@freebsd.org>
In-Reply-To: <201003121818.o2CII4ri076014@svn.freebsd.org>
Message-ID: <20100313222131.K22847@delplex.bde.org>
References: <201003121818.o2CII4ri076014@svn.freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org,
	src-committers@freebsd.org
Subject: Re: svn commit: r205090 - head/sys/dev/bge

On Fri, 12 Mar 2010, Pyun YongHyeon wrote:

> Log:
>  Reorder interrupt handler a bit such that producer/consumer
>  index of status block is read first before acknowledging the
>  interrupts. Otherwise bge(4) may get stale status block as
>  acknowledging an interrupt may yield another status block update.
>
>  Reviewed by:	marius

Er, doesn't this give a race instead?  It undoes a critical part of
rev.1.169 but not the comment part which still says that the ack is
done first, and why (to ensure getting another interrupt if the status
block changes after we have looked at it).

% 	/*
% 	 * Do the mandatory PCI flush as well as get the link status.
% 	 */
% 	statusword = CSR_READ_4(sc, BGE_MAC_STS) & BGE_MACSTAT_LINK_CHANGED;
% 
% 	/* Make sure the descriptor ring indexes are coherent. */
% 	bus_dmamap_sync(sc->bge_cdata.bge_status_tag,
% 	    sc->bge_cdata.bge_status_map,
% 	    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
% 	rx_prod = sc->bge_ldata.bge_status_block->bge_idx[0].bge_rx_prod_idx;
% 	tx_cons = sc->bge_ldata.bge_status_block->bge_idx[0].bge_tx_cons_idx;
% 	sc->bge_ldata.bge_status_block->bge_status = 0;
% 	bus_dmamap_sync(sc->bge_cdata.bge_status_tag,
% 	    sc->bge_cdata.bge_status_map,
% 	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);

The above presumably gives sufficiently coherent accesses to the status
block, but what happens if a status update occurs now (before the ack)?
Doesn't the ack prevent an interrupt for this status update?  I think
rx_prod and tx_cons (read above) don't become stale since they are only
advanced by software, and we may process tx and rx descriptors beyond
the ones reported by status updates before or after the ack, but
statusword (read above) does become stale.
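
To make the concern concrete, here is a toy model of the two orderings.
It is only a sketch under assumptions that I have not checked against
the hardware: a status update advances the index and raises the
interrupt, and the ack clears whatever interrupt is pending.  All of
the toy_* names are made up for illustration; none of them exist in
bge(4).

```c
#include <stdbool.h>

/* Toy model only; nothing here is taken from the bge(4) driver. */
struct toy_dev {
	unsigned rx_prod;	/* producer index in the "status block" */
	bool irq_pending;	/* raised by an update, cleared by an ack */
};

/* Assumed device behaviour: an update advances the index and interrupts. */
static void
toy_status_update(struct toy_dev *d)
{
	d->rx_prod++;
	d->irq_pending = true;
}

/* Assumed ack behaviour: it clears any pending interrupt. */
static void
toy_ack(struct toy_dev *d)
{
	d->irq_pending = false;
}

/*
 * Run one handler pass with a single status update landing before the
 * first step (when == 0), between the two steps (when == 1) or after
 * both (when == 2).  Returns true if the update is noticed, i.e. the
 * handler read the new index or another interrupt is still pending.
 */
static bool
toy_update_noticed(bool ack_first, int when)
{
	struct toy_dev d = { 0, true };	/* entered with an interrupt pending */
	unsigned seen = 0;
	int step;

	for (step = 0; step < 2; step++) {
		if (when == step)
			toy_status_update(&d);
		if ((step == 0) == ack_first)
			toy_ack(&d);		/* the ack step */
		else
			seen = d.rx_prod;	/* the status block read */
	}
	if (when == 2)
		toy_status_update(&d);
	return (seen == d.rx_prod || d.irq_pending);
}
```

In this model the ack-first order notices the update wherever it lands,
while the read-first order loses it in exactly one case: when it lands
between the read and the ack.  Whether the real chip behaves like the
model is the open question.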

% 
% 	/*
% 	 * Ack the interrupt by writing something to BGE_MBX_IRQ0_LO.  Don't
% 	 * disable interrupts by writing nonzero like we used to, since with
% 	 * our current organization this just gives complications and
% 	 * pessimizations for re-enabling interrupts.  We used to have races
% 	 * instead of the necessary complications.  Disabling interrupts

I don't remember seeing races with the current order, but I seem to
remember seeing them when the ack was the last hardware thing in the
function.  As described in detail below, the latter gives quite a
large race window so it is easy to miss an interrupt.

% 	 * would just reduce the chance of a status update while we are
% 	 * running (by switching to the interrupt-mode coalescence
% 	 * parameters), but this chance is already very low so it is more
% 	 * efficient to get another interrupt than prevent it.

This describes why it doesn't matter if we get an extra interrupt due to
the status block being updated after the ack, even in rev.1.168 when the
race window was much larger (it was the entire runtime of bge_intr(),
which can be several hundred us; now it is several hundred ns).

% 	 *
% 	 * We do the ack first to ensure another interrupt if there is a
% 	 * status update after the ack.  We don't check for the status

But we don't do the ack first any more.

% 	 * changing later because it is more efficient to get another
% 	 * interrupt than prevent it, not quite as above (not checking is
% 	 * a smaller optimization than not toggling the interrupt enable,
% 	 * since checking doesn't involve PCI accesses and toggling require
% 	 * the status check).  So toggling would probably be a pessimization
% 	 * even with MSI.  It would only be needed for using a task queue.
% 	 */
% 	bge_writembx(sc, BGE_MBX_IRQ0_LO, 0);

Bruce