From owner-freebsd-net@FreeBSD.ORG  Wed Oct 25 19:14:42 2006
Message-ID: <453FB78F.7060402@samsco.org>
Date: Wed, 25 Oct 2006 13:14:23 -0600
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20060206
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Doug Ambrisko <ambrisko@ambrisko.com>
References: <200610251818.k9PIIe7p062530@ambrisko.com>
In-Reply-To: <200610251818.k9PIIe7p062530@ambrisko.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net <freebsd-net@freebsd.org>, John Polstra <jdp@polstra.com>
Subject: Re: em network issues

Doug Ambrisko wrote:
> John Polstra writes:
> | On 19-Oct-2006 Scott Long wrote:
> | > The performance measurements that Andre and I did early this year showed
> | > that the INTR_FAST handler provided a very large benefit.
> | 
> | I'm trying to understand why that's the case.  Is it because an
> | INTR_FAST interrupt doesn't have to be masked and unmasked in the
> | APIC?  I can't see any other reason for much of a performance
> | difference in that driver.  With or without INTR_FAST, you've got
> | the bulk of the work being done in a background thread -- either the
> | ithread or the taskqueue thread.  It's not clear to me that it's any
> | cheaper to run a task than it is to run an ithread.
> | 
> | A difference might show up if you had two or more em devices sharing
> | the same IRQ.  Then they'd share one ithread, but would each get their
> | own taskqueue thread.  But sharing an IRQ among multiple gigabit NICs
> | would be avoided by anyone who cared about performance, so it's not a
> | very interesting case.  Besides, when you first committed this
> | stuff, INTR_FAST interrupts were not sharable.
> | 
> | Another change you made in the same commit (if_em.c revision 1.98)
> | greatly reduced the number of PCI writes made to the RX ring consumer
> | pointer register.  That would yield a significant performance
> | improvement.  Did you see gains from INTR_FAST even without this
> | independent change?
> 
> Something that we've fixed locally in at least one version is:
>      1)	Limit the loop in em_intr to 3 iterations
>      2)	Pass a valid value to em_process_receive_interrupts/em_rxeof,
> 	a good value like 100 instead of -1, since this is the count
> 	for how many times to iterate over the rx stuff.  Seems this
> 	got lost in some change of APIs.
>      3)	In em_process_receive_interrupts/em_rxeof always decrement
> 	the count on every run through the loop.  If you notice,
> 	count is an int that starts at the passed-in value
> 	of -1.  It then does count-- until count==0.  Doing -1, -2, -3
> 	takes a while until the int rolls over to 0.   Passing 100
> 	limits it more :-)  So this can run 3 * 100 versus
> 	infinite * int roll over assuming we don't skip a count--.
> Doing these changes made our multiple em based machines a lot happier
> when slammed with traffic without starving other things that shared
> interrupts like other em cards (especially in 4.X).  Interrupt handlers
> should have limits on how long they are able to run, and then let
> someone else go.  We use this in 6.X as well and haven't had any problems
> with our configs that use this.  We haven't tested much without these
> since we need to fix other issues and this is now a non-issue for us.
> 
> I haven't pushed this more since I first found issue 1 and the concept
> was rejected ... my machine hung in the interrupt spin loop :-(
> 
> If someone wants to examine/play with it more then that's great.
> These issues (I think they are bugs) have been in there a while.
> 
> That's my 2 cents.
> 
> Doug A.

When I was first developing and testing the INTR_FAST patches, I did a
similar thing with limiting the loop.  I can't recall why I dropped that
(or if it was even me that dropped it).  I think it's generally a good
idea to have.  One concern that I've had with the whole
INTR_FAST/taskqueue scheme is that having the rx loop be unbounded could
cause a livelock on UP.  In fact, I'm pretty sure that the performance
measurements done with the smartbits included having the loop be bounded.

Scott
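
For anyone following along, the bounding Doug describes can be sketched
in a few lines of C.  This is only an illustration under stated
assumptions, not the em(4) code: struct rx_ring, rxeof_bounded(),
intr_bounded(), INTR_MAX_LOOPS and RX_BUDGET are hypothetical names, the
ring is reduced to a simple counter, and the limits of 3 and 100 are the
values suggested in the quoted message.  Treating a non-positive budget
as "do nothing" is a choice made for this sketch and is not claimed to
match the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RX descriptor ring (not the em(4) layout). */
struct rx_ring {
	size_t pending;		/* packets waiting to be processed */
};

#define INTR_MAX_LOOPS	3	/* assumed cap on handler iterations */
#define RX_BUDGET	100	/* assumed per-call packet budget */

/*
 * Process at most `count` packets and return how many were handled.
 * The budget is decremented on every pass through the loop, so no
 * code path can skip the count--.  A budget of -1 (or any value <= 0)
 * does no work here instead of counting down until the int wraps.
 */
int
rxeof_bounded(struct rx_ring *ring, int count)
{
	int done = 0;

	assert(ring != NULL);
	while (ring->pending > 0 && count > 0) {
		ring->pending--;	/* stands in for handling one packet */
		done++;
		count--;
	}
	return (done);
}

/*
 * Bounded handler body: at most INTR_MAX_LOOPS * RX_BUDGET packets per
 * invocation.  Returns true if work is left over, in which case a real
 * handler would reschedule itself and let something else run.
 */
bool
intr_bounded(struct rx_ring *ring, int *processed)
{
	int total = 0;

	assert(ring != NULL && processed != NULL);
	for (int i = 0; i < INTR_MAX_LOOPS && ring->pending > 0; i++)
		total += rxeof_bounded(ring, RX_BUDGET);
	*processed = total;
	return (ring->pending > 0);
}
```

With 1000 packets pending, one call to intr_bounded() handles 300 and
reports that work remains; with 42 pending it handles all 42 and reports
that the ring is drained.  The point is only that the worst case per
invocation is fixed, which is what keeps other consumers of a shared
interrupt from being starved.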