From: Joseph Koshy
To: Andrew Gallatin
Cc: freebsd-amd64@freebsd.org
Date: Fri, 17 Feb 2006 21:20:30 +0530
Subject: Re: non-temporal copyin/copyout?
Message-ID: <84dead720602170750j119080c9g32ec9f1ac0e3944d@mail.gmail.com>
In-Reply-To: <17397.58669.457047.277510@grasshopper.cs.duke.edu>
References: <17397.58669.457047.277510@grasshopper.cs.duke.edu>
List-Id: Porting FreeBSD to the AMD64 platform

> I'm bringing this up because I've noticed that FreeBSD 10GbE
> performance is far below Solaris/amd64 and linux/x86_64 when
> using the PCI-e 10GbE adaptor that I'm doing drivers for.
> For example, Solaris can receive a netperf TCP stream at

There was a bug in my port of netperf; I had left the `HISTOGRAM'
option turned on, which causes it to slow down significantly.
v2.3.1,1 is the latest & bugfixed version of the port.

> 9.75Gb/sec while using only 47% CPU as measured by vmstat.
> (eg, it is using a little less than a single core). In
> contrast, FreeBSD is limited to 7.7Gb/sec, and uses nearly
> 90% CPU. When profiling with hwpmc, I see a profile which
> shows up to 70% of the time is spent in copyout.

You could use the following events to probe the system:

  "k8-dc-miss": data cache misses
  "k8-bu-fill-request-l2-miss,mask=dc-fill": L2 fills for the
     data cache
  "k8-dc-misaligned-data-reference": in case there are any
  "k8-fr-interrupts-masked-while-pending-cycles": for finding
     spots in the code where spin-locks are being held for long.

You may need to tweak the sample rate (the -n option to pmcstat);
the default of 65536 events per sample may be too high or too low
for some of these. Using pmcstat -p EVENT will give a feel for
a good sample rate to choose for EVENT.

-- 
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy
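As a rough illustration of turning a counting run into a value for
-n, here is a minimal sketch. It assumes you have already read a
total event count for a known interval from a counting-mode run such
as pmcstat -p EVENT. The function name, the target of 1000 samples
per second, and the numbers in the usage lines are all hypothetical
choices for the example, not measurements or pmcstat defaults.

```python
def suggest_sample_rate(event_count, interval_seconds,
                        target_samples_per_second=1000):
    """Suggest an events-per-sample value (the -n argument).

    event_count: events observed during the counting run.
    interval_seconds: how long that counting run lasted.
    target_samples_per_second: assumed sampling budget; pick
        whatever overhead you are willing to tolerate.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if target_samples_per_second <= 0:
        raise ValueError("target_samples_per_second must be positive")

    events_per_second = event_count / interval_seconds
    # One sample is taken every `rate` events, so dividing the event
    # frequency by the desired sample frequency gives the rate.
    rate = int(events_per_second // target_samples_per_second)
    # A rate below 1 makes no sense; rare events get sampled every time.
    return max(rate, 1)


# Hypothetical numbers: 50 million events counted over 5 seconds.
print(suggest_sample_rate(50_000_000, 5))   # 10000
# A rare event: 100 occurrences in one second.
print(suggest_sample_rate(100, 1))          # 1
```

A frequent event such as data cache misses would end up with a
larger value than a rare one such as misaligned references, which is
why a single default rate may not suit every event.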