From owner-freebsd-net@FreeBSD.ORG  Wed Oct 25 18:33:10 2006
Message-ID: <2a41acea0610251133s7eadf41fn937aa6c43e6136a2@mail.gmail.com>
Date: Wed, 25 Oct 2006 11:33:06 -0700
From: "Jack Vogel" <jfvogel@gmail.com>
To: "Doug Ambrisko" <ambrisko@ambrisko.com>
In-Reply-To: <200610251818.k9PIIe7p062530@ambrisko.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <XFMail.20061019152433.jdp@polstra.com>
	<200610251818.k9PIIe7p062530@ambrisko.com>
Cc: freebsd-net <freebsd-net@freebsd.org>, Scott Long <scottl@samsco.org>,
	John Polstra <jdp@polstra.com>
Subject: Re: em network issues

On 10/25/06, Doug Ambrisko <ambrisko@ambrisko.com> wrote:
> John Polstra writes:
> | On 19-Oct-2006 Scott Long wrote:
> | > The performance measurements that Andre and I did early this year showed
> | > that the INTR_FAST handler provided a very large benefit.
> |
> | I'm trying to understand why that's the case.  Is it because an
> | INTR_FAST interrupt doesn't have to be masked and unmasked in the
> | APIC?  I can't see any other reason for much of a performance
> | difference in that driver.  With or without INTR_FAST, you've got
> | the bulk of the work being done in a background thread -- either the
> | ithread or the taskqueue thread.  It's not clear to me that it's any
> | cheaper to run a task than it is to run an ithread.
> |
> | A difference might show up if you had two or more em devices sharing
> | the same IRQ.  Then they'd share one ithread, but would each get their
> | own taskqueue thread.  But sharing an IRQ among multiple gigabit NICs
> | would be avoided by anyone who cared about performance, so it's not a
> | very interesting case.  Besides, when you first committed this
> | stuff, INTR_FAST interrupts were not sharable.
> |
> | Another change you made in the same commit (if_em.c revision 1.98)
> | greatly reduced the number of PCI writes made to the RX ring consumer
> | pointer register.  That would yield a significant performance
> | improvement.  Did you see gains from INTR_FAST even without this
> | independent change?
>
> Something that we've fixed locally in at least one version is:
>      1) Limit the loop in em_intr to 3 iterations.
>      2) Pass a valid value to em_process_receive_interrupts/em_rxeof,
>         a good value like 100 instead of -1, since this is the count
>         of how many times to iterate over the rx stuff.  It seems this
>         got lost in some change of APIs.
>      3) In em_process_receive_interrupts/em_rxeof, always decrement
>         the count on every run through the loop.  If you notice,
>         count is an int that starts at the passed-in value
>         of -1.  It then does count-- until count==0.  Doing -1, -2, -3
>         takes a while until the int rolls over to 0.  Passing 100
>         limits it more :-)  So this can run 3 * 100 versus
>         infinite * int rollover, assuming we don't skip a count--.
> Doing these changes made our multiple em based machines a lot happier
> when slammed with traffic without starving other things that shared
> interrupts like other em cards (especially in 4.X).  Interrupt handlers
> should have limits on how long they are able to run, and then let
> someone else go.  We use this in 6.X as well and haven't had any problems
> with our configs that use this.  We haven't tested much without these
> since we need to fix other issues and this is now a non-issue for us.
>
> I haven't pushed this more since I first found issue 1 and the concept
> was rejected ... my machine hung in the interrupt spin loop :-(
>
> If someone wants to examine/play with it more then that's great.
> These issues (I think they are bugs) have been in there a while.
>
> That's my 2 cents.
>
> Doug A.
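
For reference, the count behaviour in items 2 and 3 above can be
sketched roughly as follows. This is a simplified model, not the actual
if_em.c code: rx_drain() and intr_bounded() are hypothetical stand-ins
for em_rxeof and em_intr, and "pending" stands in for the number of
completed descriptors waiting in the RX ring.

```c
/*
 * Hypothetical stand-in for the RX cleanup routine.  It handles packets
 * while work is pending and count has not reached 0, decrementing count
 * on every pass.  A starting count of -1 never reaches 0 within any
 * realistic burst, so the loop is bounded only by the pending work.
 */
static int
rx_drain(int *pending, int count)
{
	int handled = 0;

	while (*pending > 0 && count != 0) {
		(*pending)--;	/* consume one received packet */
		handled++;
		count--;	/* always charge the budget */
	}
	return (handled);
}

/*
 * Hypothetical bounded interrupt handler: at most max_loops passes,
 * each with a positive RX budget, so one invocation handles no more
 * than max_loops * budget packets before returning.
 */
static int
intr_bounded(int *pending, int max_loops, int budget)
{
	int total = 0;

	for (int i = 0; i < max_loops && *pending > 0; i++)
		total += rx_drain(pending, budget);
	return (total);
}
```

With 1000 packets pending, rx_drain(&pending, -1) handles all 1000 in
one call, while intr_bounded(&pending, 3, 100) handles 300 and leaves
700 for a later pass, which is the "3 * 100" bound described above.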

Interesting, I had forgotten about a couple of these issues. Timely
email because I now have a test setup that has repro'd at least one
version of the reported problems and I am currently debugging. This
is something I can test.

Thanks Doug,

Jack