From: "Jack Vogel"
To: "Doug Ambrisko"
Cc: freebsd-net, Scott Long, John Polstra
Date: Wed, 25 Oct 2006 11:33:06 -0700
Subject: Re: em network issues
Message-ID: <2a41acea0610251133s7eadf41fn937aa6c43e6136a2@mail.gmail.com>
In-Reply-To: <200610251818.k9PIIe7p062530@ambrisko.com>

On 10/25/06, Doug Ambrisko wrote:
> John Polstra writes:
> | On 19-Oct-2006 Scott Long wrote:
> | > The performance measurements that Andre and I did early this year showed
> | > that the INTR_FAST handler provided a very large benefit.
> |
> | I'm trying to understand why that's the case.  Is it because an
> | INTR_FAST interrupt doesn't have to be masked and unmasked in the
> | APIC?  I can't see any other reason for much of a performance
> | difference in that driver.  With or without INTR_FAST, you've got
> | the bulk of the work being done in a background thread -- either the
> | ithread or the taskqueue thread.  It's not clear to me that it's any
> | cheaper to run a task than it is to run an ithread.
> |
> | A difference might show up if you had two or more em devices sharing
> | the same IRQ.  Then they'd share one ithread, but would each get their
> | own taskqueue thread.  But sharing an IRQ among multiple gigabit NICs
> | would be avoided by anyone who cared about performance, so it's not a
> | very interesting case.  Besides, when you first committed this
> | stuff, INTR_FAST interrupts were not sharable.
> |
> | Another change you made in the same commit (if_em.c revision 1.98)
> | greatly reduced the number of PCI writes made to the RX ring consumer
> | pointer register.  That would yield a significant performance
> | improvement.  Did you see gains from INTR_FAST even without this
> | independent change?
>
> Something that we've fixed locally in at least one version is:
>       1)  Limit the loop in em_intr to 3 iterations.
>       2)  Pass a valid value to em_process_receive_interrupts/em_rxeof,
>           a good value like 100 instead of -1, since this is the count
>           for how many times to iterate over the rx stuff.  Seems this
>           got lost in some change of APIs.
>       3)  In em_process_receive_interrupts/em_rxeof always decrement
>           the count on every run through the loop.  If you notice,
>           count is an int that starts at the passed-in value
>           of -1.  It then does count-- until count==0.  Doing -1, -2, -3
>           takes a while until the int rolls over to 0.  Passing 100
>           limits it more :-)  So this can run 3 * 100 versus
>           infinite * int rollover, assuming we don't skip a count--.
> Making these changes made our multiple em-based machines a lot happier
> when slammed with traffic, without starving other things that shared
> interrupts, like other em cards (especially in 4.X).  Interrupt handlers
> should have limits on how long they are able to run before letting
> someone else go.  We use this in 6.X as well and haven't had any problems
> with our configs that use this.  We haven't tested much without these
> changes since we need to fix other issues and this is now a non-issue
> for us.
>
> I haven't pushed this more since I first found issue 1 and the concept
> was rejected ... my machine hung in the interrupt spin loop :-(
>
> If someone wants to examine/play with it more then that's great.
> These issues (I think they are bugs) have been in there a while.
>
> That's my 2 cents.
>
> Doug A.

Interesting, I had forgotten about a couple of these issues.  Timely
email, because I now have a test setup that has repro'd at least one
version of the reported problems and I am currently debugging.  This
is something I can test.

Thanks Doug,

Jack
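
For anyone following along, the three changes Doug lists can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual if_em.c code: `struct fake_rx_ring`, `rx_process` and `intr_service` are hypothetical names, and the "ring" is just a counter of ready descriptors. It shows a positive count acting as a hard cap because it is decremented on every pass, while a count of -1 never reaches 0 in practice and so behaves as "no limit".

```c
/* Hypothetical stand-in for an RX descriptor ring (not the em driver's). */
struct fake_rx_ring {
    int ready;  /* descriptors the hardware has filled and we must process */
};

/*
 * Process at most `count` descriptors and return how many were handled.
 * `count` is decremented on EVERY pass through the loop (point 3), so a
 * positive value such as 100 is a hard upper bound (point 2).  A value of
 * -1 only moves further from 0 as it is decremented, so the loop runs
 * until the ring is empty: effectively unbounded.
 */
static int
rx_process(struct fake_rx_ring *ring, int count)
{
    int handled = 0;

    while (ring->ready > 0 && count != 0) {
        ring->ready--;   /* "receive" one descriptor */
        handled++;
        count--;         /* never skipped, whatever path the loop takes */
    }
    return handled;
}

/*
 * Interrupt-style service loop limited to `max_passes` iterations
 * (point 1), each of which may process up to `count` descriptors.
 */
static int
intr_service(struct fake_rx_ring *ring, int max_passes, int count)
{
    int total = 0;

    for (int pass = 0; pass < max_passes && ring->ready > 0; pass++)
        total += rx_process(ring, count);
    return total;
}
```

With 1000 descriptors pending, `intr_service(&ring, 3, 100)` handles 300 and leaves 700 for a later run, whereas `intr_service(&ring, 3, -1)` drains all 1000 in the first pass, which is the starvation case described above when traffic keeps the ring full.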