From owner-freebsd-stable@FreeBSD.ORG  Thu Nov 18 18:09:26 2004
Date: Thu, 18 Nov 2004 18:07:47 +0000 (GMT)
From: Robert Watson <rwatson@freebsd.org>
X-Sender: robert@fledge.watson.org
To: Daniel Eriksson <daniel_k_eriksson@telia.com>
In-Reply-To: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA0VcX9IoJqUaXPS8MjT1PdsKAAAAQAAAA/qWWqwitlkyUSHwJEUT+bwEAAAAA@telia.com>
Message-ID: <Pine.NEB.3.96L.1041118180419.66045I-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-current@freebsd.org
cc: freebsd-stable@freebsd.org
Subject: RE: serious networking (em) performance (ggate and NFS) problem


On Thu, 18 Nov 2004, Daniel Eriksson wrote:

> I have a Tyan Tiger MPX board (dual AthlonMP) that has two 64bit PCI
> slots.  I have an Adaptec 29160 and a dual port Intel Pro/1000 MT
> plugged into those slots. 
> 
> As can be seen from the vmstat -i output below, em1 shares ithread with
> ahc0. This is with ACPI disabled. With ACPI enabled all devices get
> their own ithread (I think, not 100% sure). However, because of some
> hardware problem (interrupt routing?), em0 interrupts will somehow leak
> into atapci1+, generating a higher interrupt load. I'm not sure how
> expensive this is. 

I see precisely this problem on several motherboards, including the Intel
Westville.  There's some speculation about the source of the problem, but I
see related problems in 4.x as well: either the devices land on different
interrupts but both fire, or they land on the same interrupt.  FYI, which
arrangement is better depends a bit on your configuration, but scheduling
multiple ithreads is generally more expensive than running multiple
handlers in the same ithread, so I think it's usually preferable to run
with them on the same interrupt, especially if nothing on that interrupt
is acquiring Giant.  Even when something is, acquiring and dropping Giant
uncontended is cheaper than context switching.
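
A toy cost model of that comparison, as a minimal sketch.  The function
names and the unit costs are hypothetical placeholders, not measurements
from any system; the only thing taken from the discussion above is the
shape of the trade-off (one context switch per ithread, plus one
uncontended Giant acquire/release for each handler that needs it).

```python
def cost_separate_ithreads(n_handlers, n_giant, ctx_switch, giant_pair, handler):
    """Each handler has its own ithread: one context switch per handler.

    All costs are in arbitrary, hypothetical units.
    n_giant is the number of handlers that must run under Giant.
    """
    return n_handlers * (ctx_switch + handler) + n_giant * giant_pair


def cost_shared_ithread(n_handlers, n_giant, ctx_switch, giant_pair, handler):
    """All handlers share one ithread: a single context switch in total."""
    return ctx_switch + n_handlers * handler + n_giant * giant_pair


if __name__ == "__main__":
    # Placeholder costs chosen only so that a context switch is more
    # expensive than an uncontended Giant acquire/release pair.
    args = dict(ctx_switch=10.0, giant_pair=1.0, handler=5.0)
    print("separate:", cost_separate_ithreads(2, 1, **args))
    print("shared:  ", cost_shared_ithread(2, 1, **args))
```

In this model the shared ithread saves (n_handlers - 1) context switches,
while the Giant overhead is the same in both arrangements.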

> Finally, my question. What would you recommend:
> 1) Run with ACPI disabled and debug.mpsafenet=1 and hope that the mix of
> giant-safe and giant-locked (em and ahc) doesn't trigger any bugs. This is
> what I currently do.

This shouldn't cause bugs; the ithread handler is smart and will acquire
Giant around the ahc code.  That will also make it slower due to the extra
mutex operations, however.
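
The behaviour described here can be sketched in user space as follows.
This is an illustration only, not the kernel's ithread code: `giant` is an
ordinary lock standing in for Giant, and `Handler`, `run_ithread_pass`,
and `demo_trace` are hypothetical names.  It assumes, as in the quoted
setup, that em is giant-safe and ahc is giant-locked.

```python
import threading
from dataclasses import dataclass
from typing import Callable, List

giant = threading.Lock()  # stand-in for the kernel's Giant mutex


@dataclass
class Handler:
    name: str
    func: Callable[[], None]
    mpsafe: bool  # True if the driver does its own locking


def run_ithread_pass(handlers: List[Handler]) -> List[str]:
    """Run every handler attached to one shared interrupt, once each.

    Handlers that are not MP-safe are wrapped in Giant; the others run
    without it.  Returns a trace of what was done, in order.
    """
    trace = []
    for h in handlers:
        if h.mpsafe:
            h.func()
            trace.append(f"{h.name}: no Giant")
        else:
            with giant:  # extra lock/unlock pair paid only by this handler
                h.func()
            trace.append(f"{h.name}: under Giant")
    return trace


def demo_trace() -> List[str]:
    """em1 and ahc0 sharing one ithread, with no-op handler bodies."""
    handlers = [
        Handler("em1", lambda: None, mpsafe=True),
        Handler("ahc0", lambda: None, mpsafe=False),
    ]
    return run_ithread_pass(handlers)
```

The extra cost mentioned above corresponds to the lock and unlock around
each handler that is not MP-safe; the giant-safe handler on the same
ithread does not pay it.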

> 2) Run with ACPI disabled and debug.mpsafenet=0 and accept lower network
> performance (it is a high-traffic server, so I'm not sure this is a valid
> option).
> 3) Run with ACPI enabled and debug.mpsafenet=1 and accept that em0
> interrupts "leak" to the atapci1+ ithread. This I have done in the past.

I think you want to run the ahc stuff, unfortunately.  The good news is
that the higher the load, the more interrupt mitigation/coalescing will
kick in for if_em, so the fewer interrupts you'll see.  Under load, my
boxes usually hang out at 4k-6k interrupts/sec for if_em and don't go much
above that.
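
A deliberately simplified model of why the interrupt rate levels off
under load.  It treats mitigation as a plain cap on interrupts per
second, which is an assumption made for illustration and not a
description of how if_em or the hardware actually implements coalescing.
The function names are hypothetical, and the cap of 6000 in the example
is a placeholder picked only to sit in the range quoted above.

```python
def interrupts_per_sec(packets_per_sec: float, max_interrupts_per_sec: float) -> float:
    """Interrupt rate under a simple rate cap (assumed model)."""
    return min(packets_per_sec, max_interrupts_per_sec)


def packets_per_interrupt(packets_per_sec: float, max_interrupts_per_sec: float) -> float:
    """Average number of packets handled per interrupt in the same model."""
    rate = interrupts_per_sec(packets_per_sec, max_interrupts_per_sec)
    return packets_per_sec / rate if rate else 0.0


if __name__ == "__main__":
    cap = 6000.0  # hypothetical cap, not a driver default
    for pps in (1000.0, 6000.0, 60000.0):
        print(pps, interrupts_per_sec(pps, cap), packets_per_interrupt(pps, cap))
```

Below the cap every packet costs one interrupt; above it the interrupt
rate stays flat and each interrupt covers more packets, so the
per-packet interrupt overhead falls as load rises.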

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research