From owner-freebsd-performance@FreeBSD.ORG Wed Feb 6 20:28:12 2008
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 045A916A41B
	for ; Wed, 6 Feb 2008 20:28:12 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Received: from weak.local (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D55CC13C43E;
	Wed, 6 Feb 2008 20:28:09 +0000 (UTC)
	(envelope-from kris@FreeBSD.org)
Message-ID: <47AA1858.3050307@FreeBSD.org>
Date: Wed, 06 Feb 2008 21:28:08 +0100
From: Kris Kennaway
User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031)
MIME-Version: 1.0
To: Stefan Lambrev
References: <4794E6CC.1050107@moneybookers.com>
	<47A0B023.5020401@moneybookers.com>
	<47A3074A.3040409@moneybookers.com>
	<47A72EAB.6070602@moneybookers.com>
	<20080204182945.GA49276@heff.fud.org.nz>
	<47A780C0.2060201@moneybookers.com>
	<47A799A6.3070502@moneybookers.com>
	<47A84751.8020109@moneybookers.com>
	<47A8D233.8020506@FreeBSD.org>
	<47A8DCD6.3060209@moneybookers.com>
	<47A8E1F1.4040309@FreeBSD.org>
	<47A98CDC.2090407@moneybookers.com>
	<47A993D0.1060901@FreeBSD.org>
	<47A99736.8060809@moneybookers.com>
	<47A99B16.6030305@FreeBSD.org>
	<47A9B636.3040509@moneybookers.com>
	<47A9C43A.3030203@moneybookers.com>
In-Reply-To: <47A9C43A.3030203@moneybookers.com>
Content-Type: multipart/mixed;
	boundary="------------010607010803050401040208"
Cc: freebsd-performance@freebsd.org
Subject: Re: network performance
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Performance/tuning
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Feb 2008 20:28:12 -0000

This is a multi-part message in MIME format.
--------------010607010803050401040208
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Stefan Lambrev wrote:

>> I'll use again hwpmc and LOCK_PROFILING to see what's going on.
>> And will try the same benchmark on quad core processor as now numbers
>> of cores/cpus matter :)
>>
> Here are promised results - http://89.186.204.158/lock_profiling-8.txt

Thanks.  There is further work needed on the route locking, and you are
also hitting limitations of the em driver (or possibly the hardware; if
you only have a single transmit queue then outbound packets from
multiple CPUs have to be serialized in the driver no matter what).

Hopefully there will be further improvements in the coming months, and
these changes will also migrate into CVS.

If you want to start hacking things to see how much further progress is
feasible, you can apply the attached hack that nulls out all route
locking :)  This should be OK as long as your routes are not changing,
although you might get some spam on the console (if this is excessive,
comment out the printfs also ;-).  It may not help much though; all the
contention will probably just fall through onto the ethernet driver.

> Btw I got kernel panic first time when I run sysctl debug.lock.prof.stats

Yeah, it is a bit broken in 8.0 even in CVS.  Also make sure not to
reset it while the CPUs are loaded :)

> I'm still trying to get hwpmc working with my cpu's and new kernel.
> Do you have any patches Kris?
> Is it supposed to work with your sources on my CPU?
> I can fetch your latest src/lib/libpmc from p4 if this will help :)

It works on my systems... try with libpmc from my branch, and make sure
to install the new includes first and then rebuild and reinstall libpmc
and pmcstat.  I have attached a patch against the CVS libpmc which might
be easier than checking it out from p4.  It also relies on kernel
changes, though, which are in the kernel you already have but not in
CVS.

Kris

--------------010607010803050401040208
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
	name="pmc.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
	filename="pmc.diff"

--- //depot/vendor/freebsd/src/lib/libpmc/libpmc.c	2007/12/07 14:42:05
+++ //depot/user/kris/contention/lib/libpmc/libpmc.c	2007/12/28 20:32:24
@@ -46,16 +46,14 @@
 #if defined(__i386__)
 static int k7_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
     struct pmc_op_pmcallocate *_pmc_config);
+static int p5_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
+    struct pmc_op_pmcallocate *_pmc_config);
 #endif
 #if defined(__amd64__) || defined(__i386__)
 static int k8_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
     struct pmc_op_pmcallocate *_pmc_config);
 static int p4_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
     struct pmc_op_pmcallocate *_pmc_config);
-#endif
-#if defined(__i386__)
-static int p5_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
-    struct pmc_op_pmcallocate *_pmc_config);
 static int p6_allocate_pmc(enum pmc_event _pe, char *_ctrspec,
     struct pmc_op_pmcallocate *_pmc_config);
 #endif
@@ -1282,26 +1280,6 @@
 	return (0);
 }
 
-#endif
-
-#if defined(__i386__)
-
-/*
- * Pentium style PMCs
- */
-
-static struct pmc_event_alias p5_aliases[] = {
-	EV_ALIAS("cycles", "tsc"),
-	EV_ALIAS(NULL, NULL)
-};
-
-static int
-p5_allocate_pmc(enum pmc_event pe, char *ctrspec,
-    struct pmc_op_pmcallocate *pmc_config)
-{
-	return (-1 || pe || ctrspec || pmc_config); /* shut up gcc */
-}
-
 /*
  * Pentium Pro style PMCs.  These PMCs are found in Pentium II, Pentium III,
  * and Pentium M CPUs.
@@ -1629,9 +1607,30 @@
 
 	return (0);
 }
+
 #endif
 
+#if defined(__i386__)
+
 /*
+ * Pentium style PMCs
+ */
+
+static struct pmc_event_alias p5_aliases[] = {
+	EV_ALIAS("cycles", "tsc"),
+	EV_ALIAS(NULL, NULL)
+};
+
+static int
+p5_allocate_pmc(enum pmc_event pe, char *ctrspec,
+    struct pmc_op_pmcallocate *pmc_config)
+{
+	return -1 || pe || ctrspec || pmc_config; /* shut up gcc */
+}
+
+#endif
+
+/*
  * API entry points
  */
 
@@ -1940,6 +1939,8 @@
 		pmc_mdep_event_aliases = p5_aliases;
 		pmc_mdep_allocate_pmc = p5_allocate_pmc;
 		break;
+#endif
+#if defined(__amd64__) || defined(__i386__)
 	case PMC_CPU_INTEL_P6:		/* P6 ... Pentium M CPUs have */
 	case PMC_CPU_INTEL_PII:		/* similar PMCs. */
 	case PMC_CPU_INTEL_PIII:
@@ -1947,8 +1948,6 @@
 		pmc_mdep_event_aliases = p6_aliases;
 		pmc_mdep_allocate_pmc = p6_allocate_pmc;
 		break;
-#endif
-#if defined(__amd64__) || defined(__i386__)
 	case PMC_CPU_INTEL_PIV:
 		pmc_mdep_event_aliases = p4_aliases;
 		pmc_mdep_allocate_pmc = p4_allocate_pmc;

--------------010607010803050401040208
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
	name="routehack.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
	filename="routehack.diff"

==== //depot/user/kris/net/net/route.c#2 - /zoo/kris/net/net/route.c ====
@@ -1153,7 +1153,6 @@
 	struct radix_node_head *rnh = rt_tables[dst->sa_family];
 	int dlen = SA_SIZE(dst), glen = SA_SIZE(gate);
 
-again:
 	RT_LOCK_ASSERT(rt);
 
 	/*
@@ -1187,15 +1186,6 @@
 		return (EADDRINUSE); /* failure */
 	}
 	/*
-	 * Try to reacquire the lock on rt, and if it fails,
-	 * clean state and restart from scratch.
-	 */
-	if (!RT_TRYLOCK(rt)) {
-		RTFREE_LOCKED(gwrt);
-		RT_LOCK(rt);
-		goto again;
-	}
-	/*
 	 * If there is already a gwroute, then drop it. If we
 	 * are asked to replace route with itself, then do
 	 * not leak its refcounter.
==== //depot/user/kris/net/net/route.h#2 - /zoo/kris/net/net/route.h ====
@@ -288,6 +288,7 @@
 
 #define	RT_LOCK_INIT(_rt) \
 	rw_init_flags(&(_rt)->rt_lock, "rtentry", RW_DUPOK)
+#if 0
 #define	RT_LOCK(_rt)		rw_wlock(&(_rt)->rt_lock)
 #define	RT_TRYLOCK(_rt)		rw_try_wlock(&(_rt)->rt_lock)
 #define	RT_UNLOCK(_rt)		rw_wunlock(&(_rt)->rt_lock)
@@ -297,6 +298,16 @@
 #define	RT_LOCK_DESTROY(_rt)	rw_destroy(&(_rt)->rt_lock)
 #define	RT_LOCK_ASSERT(_rt)	rw_assert(&(_rt)->rt_lock, RA_LOCKED)
 #define	RT_UNLOCK_ASSERT(_rt)	rw_assert(&(_rt)->rt_lock, RA_UNLOCKED)
+#endif
+#define RT_LOCK(_rt)
+#define RT_TRYLOCK(_rt)
+#define RT_UNLOCK(_rt)
+#define RT_LOCK_SHARED(_rt)
+#define RT_UNLOCK_SHARED(_rt)
+#define RT_LOCK_DOWNGRADE(_rt)
+#define RT_LOCK_DESTROY(_rt)
+#define RT_LOCK_ASSERT(_rt)
+#define RT_UNLOCK_ASSERT(_rt)
 
 #define RT_ADDREF(_rt)	do {				\
 	RT_LOCK_ASSERT(_rt);					\

--------------010607010803050401040208--
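
For readers who want to see the route.h trick in isolation, here is a minimal userland sketch of the same idea: lock macros that expand either to a real lock operation or to nothing, chosen at compile time. Everything in it is hypothetical and for illustration only (struct demo_route, the DEMO_* macros, the DEMO_NO_ROUTE_LOCKING switch), and it uses a C11 atomic_flag spinlock rather than the kernel's rwlock, so it is not how the kernel code itself is written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rtentry; only what the sketch needs. */
struct demo_route {
	atomic_flag lock;	/* plays the role of rt_lock */
	int refcnt;
};

#ifndef DEMO_NO_ROUTE_LOCKING
/* Normal build: the macros perform real (spin)lock operations. */
#define DEMO_LOCK(_rt) \
	do { } while (atomic_flag_test_and_set(&(_rt)->lock))
#define DEMO_TRYLOCK(_rt)	(!atomic_flag_test_and_set(&(_rt)->lock))
#define DEMO_UNLOCK(_rt)	atomic_flag_clear(&(_rt)->lock)
#else
/*
 * "Hack" build, same shape as the route.h hunk: every macro expands
 * to nothing, so "DEMO_LOCK(rt);" is just an empty statement.
 * Note that "if (!DEMO_TRYLOCK(rt))" would then become "if (!)",
 * which does not compile; that is presumably why the route.c hunk
 * removes the RT_TRYLOCK retry block instead of leaving it in place.
 */
#define DEMO_LOCK(_rt)
#define DEMO_TRYLOCK(_rt)
#define DEMO_UNLOCK(_rt)
#endif

/* Put the route into a known unlocked state with no references. */
static void
demo_route_init(struct demo_route *rt)
{
	assert(rt != NULL);
	atomic_flag_clear(&rt->lock);
	rt->refcnt = 0;
}

/* Bump the reference count under the (possibly nulled-out) lock. */
static int
demo_route_addref(struct demo_route *rt)
{
	int n;

	assert(rt != NULL);
	DEMO_LOCK(rt);
	n = ++rt->refcnt;
	DEMO_UNLOCK(rt);
	return (n);
}
```

Built normally, demo_route_addref() takes and releases the lock around the increment. Built with -DDEMO_NO_ROUTE_LOCKING the same function runs with no locking at all, which is only tolerable under the condition stated in the message: nothing else is modifying the routes concurrently.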