Date: Sun, 24 Dec 2006 01:33:47 +0300
From: Oleg Bulyzhin
To: Robert Watson
Cc: cvs-src@FreeBSD.org, Scott Long, src-committers@FreeBSD.org, cvs-all@FreeBSD.org, John Polstra
Subject: Re: cvs commit: src/sys/dev/bge if_bge.c
Message-ID: <20061223223347.GB33627@lath.rinet.ru>
In-Reply-To: <20061223213014.U35809@fledge.watson.org>
References: <20061223213014.U35809@fledge.watson.org>

On Sat, Dec 23, 2006 at 09:36:33PM +0000, Robert Watson wrote:
>
> On Sat, 23 Dec 2006, John Polstra wrote:
>
> >>That said, dropping and regrabbing the driver lock in the rxeof routine
> >>of any driver is bad. It may be safe to do, but it incurs horrible
> >>performance penalties.
> >>It essentially allows the time-critical, high
> >>priority RX path to be constantly preempted by the lower priority
> >>if_start or if_ioctl paths. Even without this preemption and priority
> >>inversion, you're doing an excessive number of expensive lock ops in the
> >>fast path.
> >
> >We currently make this a lot worse than it needs to be by handing off the
> >received packets one at a time, unlocking and relocking for every packet.
> >It would be better if the driver's receive interrupt handler would harvest
> >all of the incoming packets and queue them locally. Then, at the end, hand
> >off the linked list of packets to the network stack wholesale, unlocking
> >and relocking only once. (Actually, the list could probably be handed off
> >at the very end of the interrupt service routine, after the driver has
> >already dropped its lock.) We wouldn't even need a new primitive, if
> >ether_input() and the other if_input() functions were enhanced to deal
> >with a possible list of packets instead of just a single one.
>
> I try this experiment every few years, and generally don't measure much
> improvement. I'll try it again with 10 Gbps early next year once back in
> the office again. The more interesting transition is between the link
> layer and the network layer, which is high on my list of topics to look
> into in the next few weeks. In particular, reworking the ifqueue handoff.
> The tricky bit is balancing latency, overhead, and concurrency...
>
> FYI, there are several sets of patches floating around to modify if_em to
> hand off queues of packets to the link layer, etc. They probably need
> updating, of course, since if_em has changed quite a bit in the last year.
> In my implementation, I add a new input routine that accepts mbuf packet
> queues.

I'm just curious: do you remember the average length of the mbuf queue in
your tests?
While experimenting with the bge(4) driver (taskqueue, interrupt moderation,
bge_rxeof() converted to the above scheme), I found it quite easy to exhaust
the available mbuf clusters under load (when trying to queue hundreds of
received packets), so I had to limit the RX queue to a rather low length.

>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

-- 
Oleg.

================================================================
=== Oleg Bulyzhin -- OBUL-RIPN -- OBUL-RIPE -- oleg@rinet.ru ===
================================================================
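
For illustration only, here is a minimal userland sketch of the scheme
discussed in this thread: harvest received packets into a local list while
holding the driver lock once, cap the list at a small budget so it cannot
pin an unbounded number of buffers, then hand the whole list off after the
lock is dropped. Every name in it (struct pkt, struct rx_softc, rx_harvest(),
stack_input_list(), the lock counters) is hypothetical; it is not the bge(4)
code, not any of the if_em patches mentioned above, and it does not use the
kernel's mbuf or if_input() interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a received packet; NOT the kernel's struct mbuf. */
struct pkt {
    struct pkt *next;
};

/* Hypothetical driver state: a simulated RX ring plus counters for inspection. */
struct rx_softc {
    struct pkt *ring;       /* packets waiting in the simulated RX ring */
    size_t      ring_len;   /* number of packets in the ring */
    size_t      ring_pos;   /* next ring slot to harvest */
    unsigned    lock_ops;   /* lock + unlock calls performed */
    unsigned    handoffs;   /* lists handed to the stack */
    unsigned    delivered;  /* packets seen by the stack */
};

/* Lock stand-ins: they only count operations so the cost is visible. */
static void drv_lock(struct rx_softc *sc)   { sc->lock_ops++; }
static void drv_unlock(struct rx_softc *sc) { sc->lock_ops++; }

/* Stand-in for a list-aware input routine: consumes a whole packet list. */
static void stack_input_list(struct rx_softc *sc, struct pkt *head)
{
    sc->handoffs++;
    for (struct pkt *p = head; p != NULL; p = p->next)
        sc->delivered++;
}

/*
 * Harvest at most `budget` packets under a single lock/unlock pair, then
 * hand the list off with the lock already dropped. Packets beyond the
 * budget stay in the ring for the next call. Returns the number harvested.
 */
static size_t rx_harvest(struct rx_softc *sc, size_t budget)
{
    struct pkt *head = NULL;
    struct pkt **tailp = &head;   /* tail pointer keeps arrival order */
    size_t n = 0;

    assert(budget > 0);

    drv_lock(sc);
    while (n < budget && sc->ring_pos < sc->ring_len) {
        struct pkt *p = &sc->ring[sc->ring_pos++];
        p->next = NULL;
        *tailp = p;
        tailp = &p->next;
        n++;
    }
    drv_unlock(sc);

    if (head != NULL)
        stack_input_list(sc, head);   /* one hand-off per harvested batch */
    return n;
}
```

With a ring of 10 packets and a budget of 4, three calls drain the ring in
batches of 4, 4 and 2, at a cost of two lock operations per batch rather
than two per packet. In this sketch the budget is what keeps the local
queue short; a suitable value for a real driver would have to be measured.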