From owner-freebsd-stable@FreeBSD.ORG Thu Nov 18 18:09:26 2004
Date: Thu, 18 Nov 2004 18:07:47 +0000 (GMT)
From: Robert Watson
To: Daniel Eriksson
cc: freebsd-current@freebsd.org
cc: freebsd-stable@freebsd.org
Subject: RE: serious networking (em) performance (ggate and NFS) problem
List-Id: Production branch of FreeBSD source code

On Thu, 18 Nov 2004, Daniel Eriksson wrote:

> I have a Tyan Tiger MPX board (dual AthlonMP) that has two 64bit PCI
> slots. I have an Adaptec 29160 and a dual port Intel Pro/1000 MT
> plugged into those slots.
>
> As can be seen from the vmstat -i output below, em1 shares ithread with
> ahc0. This is with ACPI disabled. With ACPI enabled all devices get
> their own ithread (I think, not 100% sure). However, because of some
> hardware problem (interrupt routing?), em0 interrupts will somehow leak
> into atapci1+, generating a higher interrupt load.
> I'm not sure how expensive this is.

I see precisely this problem on several motherboards, including the Intel
Westville. There's some speculation on the source of the problem, but I
see related problems in 4.x as well. Either I get them on different
interrupts but both fire, or on the same interrupt. FYI, picking the right
one depends a bit on your configuration, but generally scheduling multiple
ithreads is more expensive than running multiple handlers in the same
ithread, so I think it's generally preferable to run with them on the same
interrupt. Especially if nothing on the same interrupt is acquiring Giant.
Acquiring and dropping Giant uncontended is cheaper than context
switching, however.

> Finally, my question. What would you recommend:
> 1) Run with ACPI disabled and debug.mpsafenet=1 and hope that the mix of
> giant-safe and giant-locked (em and ahc) doesn't trigger any bugs. This is
> what I currently do.

This shouldn't cause bugs; the ithread handler is smart and will acquire
Giant around the ahc code. That will also make it slower due to the extra
mutex operations, however.

> 2) Run with ACPI disabled and debug.mpsafenet=0 and accept lower network
> performance (it is a high-traffic server, so I'm not sure this is a valid
> option).
> 3) Run with ACPI enabled and debug.mpsafenet=1 and accept that em0
> interrupts "leak" to the atapci1+ ithread. This I have done in the past.

I think you want to run the ahc stuff, unfortunately. The good news is
that the higher the load, the more interrupt mitigation/coalescing will
kick in for if_em, so the fewer you'll see. Under load, usually my boxes
hang out at 4k-6k interrupts/sec for if_em and don't go much above that.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
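
For anyone wanting to check which devices end up on a shared interrupt line, and what rate each line runs at, here is a minimal sketch that parses `vmstat -i`-style text. It is not from the thread: the column layout (`irqN: dev [dev ...]  total  rate`) is an assumption about the output format, the helper names `parse_vmstat_i` and `shared_lines` are hypothetical, and the sample figures are invented purely for illustration.

```python
import re
from collections import namedtuple

# One record per interrupt line; "devices" lists every handler on that line.
IrqLine = namedtuple("IrqLine", "irq devices total rate")

def parse_vmstat_i(text):
    """Parse vmstat -i style text (assumed layout) into IrqLine records."""
    rows = []
    for line in text.splitlines():
        # Assumed shape: "irq18: em1 ahc0      120000000       6000"
        m = re.match(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$", line)
        if m:
            irq, devs, total, rate = m.groups()
            rows.append(IrqLine(irq, devs.split(), int(total), int(rate)))
    return rows

def shared_lines(rows):
    """Return only the interrupt lines that carry more than one device."""
    return [r for r in rows if len(r.devices) > 1]

# Hypothetical sample, NOT real output from the machines discussed here.
SAMPLE = """\
interrupt                          total       rate
irq0: clk                        5000000        100
irq14: ata0                        12000          0
irq17: em0                      90000000       4500
irq18: em1 ahc0                120000000       6000
Total                          215012000      10600
"""

if __name__ == "__main__":
    for row in shared_lines(parse_vmstat_i(SAMPLE)):
        print(row.irq, "shared by", ", ".join(row.devices),
              "at", row.rate, "interrupts/sec")
```

On a real system the text would come from running `vmstat -i` rather than from `SAMPLE`; a line listing both an em device and ahc0 corresponds to the shared-ithread case described above.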