From owner-freebsd-performance@FreeBSD.ORG Mon Jun 10 10:49:47 2013
From: Meny Yossefi
To: "freebsd-performance@freebsd.org"
Subject: NUMA awareness
Date: Mon, 10 Jun 2013 10:48:07 +0000

Hi,

I'm using FreeBSD 9.1 and would like to know the status of its NUMA support.
I understand there should be initial, though not full, support by now.
Is that correct?
Is there a new NUMA API for memory allocation?

Thanks,
Meny Yossefi | SW Engineer | FreeBSD team
Mellanox Technologies Ltd
Work: +972-74-7129121, Cell: +972-52-8379557

From owner-freebsd-performance@FreeBSD.ORG Wed Jun 12 22:58:51 2013
Date: Wed, 12 Jun 2013 15:58:49 -0700
From: "David O'Brien"
To: freebsd-performance@freebsd.org
Cc: Simon Gerraty
Subject: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
Message-ID: <20130612225849.GA2858@dragon.NUXI.org>
Reply-To: obrien@freebsd.org

Hi all,

$WORK is looking to see what throwing money at a build can do.
We've been building the product on FreeBSD 8.3/amd64 in a 7.1/i386 jail,
on Supermicro X8DT3
http://www.supermicro.com/products/motherboard/qpi/5500/x8dt3.cfm
w/Xeon X5690 3.47GHz: "2 package(s) x 6 core(s) x 2 SMT threads".
These are nicknamed "Montana-2" or "M2".

The new machines are Supermicro X9QR7-TF+/X9QRi-F+
http://www.supermicro.com/products/motherboard/Xeon/C600/X9QR7-TF_.cfm
w/Xeon E5-4650 2.70GHz: "4 package(s) x 8 core(s) x 2 SMT threads".
These are nicknamed "jbm" (build machine) during their testing phase.
As part of this we also upgraded the host FreeBSD to 9.1 from 8.3.

We've found that a "make -j28" build on the M2 machines is considerably
faster than on the jbm machines. While somewhat faster might not be
surprising given the faster CPUs, the build takes 2x longer on jbm.
(jbm runs are FreeBSD 9.1/amd64 with the same 7.1/i386 jail installation
as the M2.)

In addition, we've found the scaling on jbm to be quite bad as additional
jobs or threads or processes are run.

----------%<----------%<----------%<----------%<----------%<----------

In order to better quantify this, a co-worker who's been playing with
Bitcoin suggested the Bitcoin "vanitygen" multi-threaded application as a
good test. We have results of 1-64 threads on a jbm machine running
(1) FreeBSD 9.1/amd64, (2) FreeBSD 10-current/amd64 [no Witness],
(3) FreeBSD 8.4/amd64, (4) FreeBSD 10-current/amd64 SCHED_4BSD, &
(5) Fedora 18/amd64.

The results are graphed at
http://people.freebsd.org/~obrien/jbm/vanitygen/vanity-perf-graph.png

We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
considerably better than both on the same machine.

----------%<----------%<----------%<----------%<----------%<----------

To get more data, I ran /usr/ports/benchmarks/sysbench/ and saw similarly
concerning results between the M2 machines and jbm machines (on the native
amd64 host, not within the i386 jail). This is a multi-threaded application.
The graph for that is
http://people.freebsd.org/~obrien/jbm/sysbench/sysbench.png

----------%<----------%<----------%<----------%<----------%<----------

Simon also did some benchmarking using an integrity file signing server.
This one is a multi-process benchmark doing 2k RSA+SHA1 calculations.
The graph for that is
http://people.freebsd.org/~obrien/jbm/sigs/sigs.png

----------%<----------%<----------%<----------%<----------%<----------

We've tried various things and haven't been able to explain why FreeBSD
isn't scaling on the new hardware. Nor why it performs so much worse
than FreeBSD on the older "M2" machines.

Thoughts?

-- 
-- David  (obrien@FreeBSD.org)

P.S. vanitygen is from 'git clone git://github.com/samr7/vanitygen.git'
and builds with this patch:

--- Makefile.ORIG	2013-06-06 21:39:10.000000000 -0700
+++ Makefile	2013-06-06 21:38:45.000000000 -0700
@@ -1,2 +1,2 @@
-LIBS=-lpcre -lcrypto -lm -lpthread
-CFLAGS=-ggdb -O3 -Wall
+LIBS=-L/usr/local/lib -lpcre -lcrypto -lm -lpthread
+CFLAGS=-ggdb -O3 -Wall -I/usr/local/include

From owner-freebsd-performance@FreeBSD.ORG Thu Jun 13 11:32:48 2013
To: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
References: <20130612225849.GA2858@dragon.NUXI.org>
Date: Thu, 13 Jun 2013 06:32:41 -0500
From: "Mark Felder"
In-Reply-To: <20130612225849.GA2858@dragon.NUXI.org>

On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien wrote:

> We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
> considerably better than both on the same machine.

http://svnweb.freebsd.org/base?view=revision&revision=241246

The above link is likely why 8.4 is better than 9.1 on the same machine.

> We've tried various things and haven't been able to explain why FreeBSD
> isn't scaling on the new hardware. Nor why it performs so much worse
> than FreeBSD on the older "M2" machines.

The CPUs between those machines are quite different. I'm sure we're
looking at different cache sizes, different behavior for the
hyperthreading, etc.
I'm sure others would be greatly interested in you providing the same
benchmark results for a recent snapshot of HEAD as well.

From owner-freebsd-performance@FreeBSD.ORG Thu Jun 13 12:02:38 2013
Message-ID: <51B9B497.70800@activnetworks.com>
Date: Thu, 13 Jun 2013 14:01:27 +0200
From: Remy Nonnenmacher
To: Mark Felder
Cc: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
References: <20130612225849.GA2858@dragon.NUXI.org>

On 06/13/13 13:32, Mark Felder wrote:
> On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien
> wrote:
>
>> We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
>> considerably better than both on the same machine.
>
> http://svnweb.freebsd.org/base?view=revision&revision=241246
>
> The above link is likely why 8.4 is better than 9.1 on the same machine.
>
>> We've tried various things and haven't been able to explain why FreeBSD
>> isn't scaling on the new hardware. Nor why it performs so much worse
>> than FreeBSD on the older "M2" machines.
>
> The CPUs between those machines are quite different. I'm sure we're
> looking at different cache sizes, different behavior for the
> hyperthreading, etc. I'm sure others would be greatly interested in you
> providing the same benchmark results for a recent snapshot of HEAD as well.
> _______________________________________________
> freebsd-performance@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-performance
> To unsubscribe, send any mail to
> "freebsd-performance-unsubscribe@freebsd.org"

We had the same problem on 4x12-core (AMD) machines. After investigating
with hwpmc, it appeared that performance was killed by a scheduler
function trying to find the "least used CPU", which unfortunately works on
contended structures (i.e., lots of cores fighting to get work). A
solution was found by using an artificially long queue of stuck processes
(steal_thresh bumped to over 8) and by crafting CPU affinity.

This was a year ago and is from memory. I guess you could give it a try
to see if it helps.

Disregard if a scheduler specialist contradicts.

Thanks.

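The two workarounds mentioned above (raising steal_thresh and pinning work
to a fixed set of CPUs) might be scripted roughly as follows. This is only
a sketch under assumptions: it presumes the ULE scheduler, where the knob
is exposed as the sysctl kern.sched.steal_thresh, and uses cpuset(1) for
affinity. The value 9, the CPU list 0-11, the "make -j12" command, and the
helper names are illustrative and do not come from this thread.

```python
import subprocess
from typing import List


def steal_thresh_cmd(value: int) -> List[str]:
    """Build the sysctl(8) command that sets the ULE steal threshold."""
    if value < 1:
        raise ValueError("steal_thresh must be a positive integer")
    return ["sysctl", f"kern.sched.steal_thresh={value}"]


def pinned_cmd(cpu_list: str, argv: List[str]) -> List[str]:
    """Prefix a command with cpuset(1) so it runs only on the listed CPUs."""
    return ["cpuset", "-l", cpu_list] + list(argv)


def apply(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False.

    Actually running these needs a FreeBSD host (and root for sysctl).
    """
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values: a threshold "over 8" and the first 12 CPUs.
    apply(steal_thresh_cmd(9))
    apply(pinned_cmd("0-11", ["make", "-j12"]))
```

Run as-is it only prints the two command lines, so the effect can be
reviewed before anything is changed on the machine.
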
From owner-freebsd-performance@FreeBSD.ORG Thu Jun 13 18:31:22 2013
Date: Thu, 13 Jun 2013 11:31:19 -0700
From: "David O'Brien"
To: Mark Felder
Cc: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
Message-ID: <20130613183119.GA40198@dragon.NUXI.org>
References: <20130612225849.GA2858@dragon.NUXI.org>
Reply-To: obrien@freebsd.org

On Thu, Jun 13, 2013 at 06:32:41AM -0500, Mark Felder wrote:
> The CPUs between those machines are quite different.

I wouldn't say they are "quite" different. It's not like comparing
Netburst to Core2, or I believe even original Core2 to Sandybridge.
I may be wrong, I've not followed Intel cores from a micro-architecture
POV too closely. If anything it's typical for a newer micro-architecture
to perform the same at a lower clock speed.

> I'm sure we're
> looking at different cache sizes, different behavior for the
> hyperthreading,

Is there something specific you are thinking of?
The Xeon E5-4650 has 20M "smart" cache organized as ???
The Xeon X5690 has 12M "smart" cache organized as ???
I know the AMD cache hierarchy for L1 I&D, L2, L3; but I'm not seeing
this as clearly spelled out for these Xeons.

> etc. I'm sure others would be greatly interested in you
> providing the same benchmark results for a recent snapshot of HEAD as well.

10-CURRENT results were in
http://people.freebsd.org/~obrien/jbm/vanitygen/vanity-perf-graph.png
as "fbsd10". Or are you suggesting something else?

thanks for your thoughts!
-- 
-- David  (obrien@FreeBSD.org)

From owner-freebsd-performance@FreeBSD.ORG Thu Jun 13 18:36:32 2013
To: "David O'Brien"
Cc: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
References: <20130612225849.GA2858@dragon.NUXI.org> <20130613183119.GA40198@dragon.NUXI.org>
Date: Thu, 13 Jun 2013 13:36:24 -0500
From: "Mark Felder"
In-Reply-To: <20130613183119.GA40198@dragon.NUXI.org>

On Thu, 13 Jun 2013 13:31:19 -0500, David O'Brien wrote:

> 10-CURRENT results were in
> http://people.freebsd.org/~obrien/jbm/vanitygen/vanity-perf-graph.png
> as "fbsd10". Or are you suggesting something else?

Whoops, I missed fbsd10 on that graph. Sorry!

From owner-freebsd-performance@FreeBSD.ORG Fri Jun 14 02:05:11 2013
Message-ID: <51BA7A78.7010904@freebsd.org>
Date: Fri, 14 Jun 2013 10:05:44 +0800
From: David Xu
To: Remy Nonnenmacher
Cc: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
References: <20130612225849.GA2858@dragon.NUXI.org> <51B9B497.70800@activnetworks.com>
In-Reply-To: <51B9B497.70800@activnetworks.com>

On 2013/06/13 20:01, Remy Nonnenmacher wrote:
>
> On 06/13/13 13:32, Mark Felder wrote:
>> On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien
>> wrote:
>>
>>> We found FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
>>> considerably better than both on the same machine.
>>
>> http://svnweb.freebsd.org/base?view=revision&revision=241246
>>
>> The above link is likely why 8.4 is better than 9.1 on the same machine.
>>
>>> We've tried various things and haven't been able to explain why FreeBSD
>>> isn't scaling on the new hardware. Nor why it performs so much worse
>>> than FreeBSD on the older "M2" machines.
>>
>> The CPUs between those machines are quite different. I'm sure we're
>> looking at different cache sizes, different behavior for the
>> hyperthreading, etc. I'm sure others would be greatly interested in you
>> providing the same benchmark results for a recent snapshot of HEAD as
>> well.
>
> We had the same problem on 4x12-core (AMD) machines. After investigating
> with hwpmc, it appeared that performance was killed by a scheduler
> function trying to find the "least used CPU", which unfortunately works on
> contended structures (i.e., lots of cores fighting to get work). A
> solution was found by using an artificially long queue of stuck processes
> (steal_thresh bumped to over 8) and by crafting CPU affinity.
>
> This was a year ago and is from memory. I guess you could give it a try
> to see if it helps.
>
> Disregard if a scheduler specialist contradicts.
>
> Thanks.
>

AMD's cache is very different from Intel's. AFAIK, before Bulldozer,
AMD's L3 was an exclusive cache; as of Bulldozer, AMD describes the L3
cache as a “non-inclusive victim cache”, which is still different from
Intel's, which is inclusive.

"- In sched_pickcpu() change general logic of CPU selection. First
look for idle CPU, sharing last level cache with previously used one,
skipping SMT CPU groups. If none found, search all CPUs for the least
loaded one, where the thread with its priority can run now. If none
found, search just for the least loaded CPU."

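The three-step selection order in the commit message quoted above can be
sketched as a toy model. This is not the kernel's sched_pickcpu(): the Cpu
fields, the per-CPU "SMT sibling busy" flag, and the tie-breaking by CPU id
are simplifying assumptions made only for illustration. As in FreeBSD, a
numerically lower priority value is taken to mean a more urgent thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cpu:
    cpu_id: int
    llc_group: int       # CPUs sharing a last-level cache share this value
    load: int            # runnable threads on this CPU (0 means idle)
    running_pri: int     # priority of the running thread; lower = more urgent
    smt_sibling_busy: bool = False  # hypothetical stand-in for SMT group state


def pick_cpu(cpus: List[Cpu], prev_cpu_id: int, thread_pri: int) -> Cpu:
    """Pick a CPU for a thread using the three steps from the commit log."""
    prev = next(c for c in cpus if c.cpu_id == prev_cpu_id)

    # Step 1: an idle CPU sharing the last-level cache with the previous
    # one, skipping CPUs whose SMT sibling is busy.
    idle_near = [c for c in cpus
                 if c.llc_group == prev.llc_group
                 and c.load == 0
                 and not c.smt_sibling_busy]
    if idle_near:
        # Prefer the previous CPU itself, then the lowest CPU id.
        return min(idle_near, key=lambda c: (c.cpu_id != prev_cpu_id, c.cpu_id))

    # Step 2: the least loaded CPU on which this thread could run right
    # now, i.e. it is more urgent than what that CPU is running.
    can_run_now = [c for c in cpus if thread_pri < c.running_pri]
    if can_run_now:
        return min(can_run_now, key=lambda c: (c.load, c.cpu_id))

    # Step 3: otherwise just the least loaded CPU.
    return min(cpus, key=lambda c: (c.load, c.cpu_id))
```

Note that step 1 in this model only ever considers the last-level cache
group, which is the point the reply below takes issue with for exclusive
L3 caches.
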
With an exclusive cache, the L3 holds second-hand data, not hot data, so
migrating a thread has a negative effect: its hot data is lost.
I'd prefer to search for an idle CPU sharing L2 first, then L3.

From owner-freebsd-performance@FreeBSD.ORG Fri Jun 14 11:02:25 2013
Message-ID: <51BAF56E.7030700@activnetworks.com>
Date: Fri, 14 Jun 2013 12:50:22 +0200
From: Remy Nonnenmacher
To: David Xu
Cc: "freebsd-performance@freebsd.org"
Subject: Re: Scaling and performance issues with FreeBSD 9 (& 10) on 4 socket systems
References: <20130612225849.GA2858@dragon.NUXI.org> <51B9B497.70800@activnetworks.com> <51BA7A78.7010904@freebsd.org>
In-Reply-To: <51BA7A78.7010904@freebsd.org>

On 06/14/13 04:05, David Xu wrote:
> On 2013/06/13 20:01, Remy Nonnenmacher wrote:
>>
>> On 06/13/13 13:32, Mark Felder wrote:
>>> On Wed, 12 Jun 2013 17:58:49 -0500, David O'Brien
>>> wrote:
>>>
>>>> We found
>>>> FreeBSD 8.4 to perform better than FreeBSD 9.1, and Linux
>>>> considerably better than both on the same machine.
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=241246
>>>
>>> The above link is likely why 8.4 is better than 9.1 on the same machine.
>>>
>>>> We've tried various things and haven't been able to explain why FreeBSD
>>>> isn't scaling on the new hardware. Nor why it performs so much worse
>>>> than FreeBSD on the older "M2" machines.
>>>
>>> The CPUs between those machines are quite different. I'm sure we're
>>> looking at different cache sizes, different behavior for the
>>> hyperthreading, etc. I'm sure others would be greatly interested in you
>>> providing the same benchmark results for a recent snapshot of HEAD as
>>> well.
>>
>> We had the same problem on 4x12-core (AMD) machines. After investigating
>> with hwpmc, it appeared that performance was killed by a scheduler
>> function trying to find the "least used CPU", which unfortunately works on
>> contended structures (i.e., lots of cores fighting to get work). A
>> solution was found by using an artificially long queue of stuck processes
>> (steal_thresh bumped to over 8) and by crafting CPU affinity.
>>
>> This was a year ago and is from memory. I guess you could give it a try
>> to see if it helps.
>>
>> Disregard if a scheduler specialist contradicts.
>>
>> Thanks.
>>
>
> AMD's cache is very different from Intel's. AFAIK, before Bulldozer,
> AMD's L3 was an exclusive cache; as of Bulldozer, AMD describes the L3
> cache as a “non-inclusive victim cache”, which is still different from
> Intel's, which is inclusive.
>
> "- In sched_pickcpu() change general logic of CPU selection.
> First look for idle CPU, sharing last level cache with previously used
> one, skipping SMT CPU groups. If none found, search all CPUs for the
> least loaded one, where the thread with its priority can run now. If
> none found, search just for the least loaded CPU."
>
> With an exclusive cache, the L3 holds second-hand data, not hot data, so
> migrating a thread has a negative effect: its hot data is lost.
> I'd prefer to search for an idle CPU sharing L2 first, then L3.
>
>

The problem was not really the excellent job done on cache locality via
CPU detection. It was more a scaling problem with the number of cores,
which exacerbates contention when trying to steal work from other queues.
Basically, what happened (I say happened because I've not retested
recently) is that you may have 1 core running and 47 others fighting in
a loop where there is one winner and 46 losers, all of them playing with
locks, in O(N=48) loops. All in all, you see degraded performance with
little indication of a cause. This is where hwpmc is a wonderful tool...

Bumping steal_thresh up changes the pattern. If it works for you, then
the cause is probably the same.
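
The "1 winner and 46 losers" pattern described above can be illustrated
with a small counting model. It is a deliberately crude sketch and not
ULE's actual stealing code: it assumes every idle core scans every other
run queue once per round and tries to lock the busiest queue only if that
queue's load is at least steal_thresh. The function name and the load
figures are hypothetical.

```python
from typing import List, Tuple


def steal_round(loads: List[int], steal_thresh: int) -> Tuple[int, int]:
    """Model one round of idle cores looking for work to steal.

    Returns (queues_scanned, lock_attempts) summed over all idle cores.
    """
    n = len(loads)
    idle_cores = [i for i, load in enumerate(loads) if load == 0]
    queues_scanned = 0
    lock_attempts = 0
    for core in idle_cores:
        # Each idle core walks all other queues: O(N) work per core.
        others = [loads[j] for j in range(n) if j != core]
        queues_scanned += len(others)
        # It contends for a lock only if some queue is "long enough".
        if others and max(others) >= steal_thresh:
            lock_attempts += 1
    return queues_scanned, lock_attempts


if __name__ == "__main__":
    loads = [2] + [0] * 47          # 1 busy core, 47 idle ones
    print(steal_round(loads, 2))    # every idle core fights for one queue
    print(steal_round(loads, 9))    # nobody qualifies, so no lock fights
```

In this model raising the threshold does not reduce the scanning, only the
number of cores that go on to contend for the same queue lock, which is one
plausible reading of why bumping steal_thresh "changes the pattern".
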