From: "Chad J. Milios"
Date: Wed, 8 Jan 2014 07:50:55 -0500
Subject: Re: Working on NUMA support
To: Andrew Bates
Cc: "freebsd-hackers@freebsd.org", Adrian Chadd
Message-Id: <00E55D89-BDD1-41AD-BBF6-6752B90E8324@ccsys.com>
List-Id: Technical Discussions relating to FreeBSD

I'm very interested and excited to watch the progress you are making. Many
thank yous for grappling this task on FreeBSD!

Are you designing/implementing a fully hierarchy-aware approach (nested
domains) or is each memory domain separate (not nested, simply "mine" or
"not mine" in respect to a particular thread context)?

> On Jan 8, 2014, at 3:23 AM, Andrew Bates wrote:
>
> Hey Adrian,
>
> We spent the last few months on research/design and plan on spending the
> next few months preparing tests, building the physical server(s) to test
> on, and finishing the prototype'd functions.
>
> There is some white-board and pen/paper design that hasn't yet made it to
> the github repo, but the prototypes online are hopefully a solid base for
> what we intend to complete. In the meantime, we are excited to hear who
> else is interested in NUMA and if anyone has suggestions or concerns about
> our approach.
>
>
>> On Wed, Jan 8, 2014 at 12:03 AM, Adrian Chadd wrote:
>>
>> Cool! Do you have any working code to implement the API, or is this
>> just in the design phase right now?
>>
>>
>> -a
>>
>>
>>> On 6 January 2014 12:11, Andrew Bates wrote:
>>> Hey all,
>>>
>>> My name is Andrew Bates, and I would like to take a bit of your time to
>>> talk about NUMA support.
>>>
>>> Supporting Non-Uniform Memory Access in FreeBSD is something that has
>>> been brought up in the past
>>> <http://freebsd.1045724.n5.nabble.com/NUMA-Support-is-there-in-FreeBSD-td4865200.html>.
>>> This is becoming increasingly important now that multiprocessor systems
>>> are an expanding technology, thus performance is scaling in terms of cpu
>>> count, rather than just clock rate.
>>>
>>> There is a great opportunity here to optimize performance. After being
>>> asked to look into this by the EMC Isilon Storage Division, myself and a
>>> few colleagues advised by Andrew Pilloud and Jeff Roberson would like to
>>> propose APIs to handle basic memory allocation/management to specific
>>> NUMA domains.
>>>
>>> What we have devised so far consists of two levels. First there are the
>>> KPIs, to expose NUMA functionality at a thread level of domain affinity.
>>> Secondly, there would be a userspace/interface to take advantage of the
>>> proposed APIs, thus giving users the capability to make their
>>> applications NUMA-aware.
>>>
>>> We took the time to look into how many other systems (Linux, Macintosh,
>>> Solaris, Windows) already approach this problem, so there are some
>>> aspects of our solution that are similar to how Linux and Solaris handle
>>> NUMA. Unlike Linux libnuma, we are only proposing a few additions and a
>>> minimal library that can easily be expanded later to suit users’ needs.
>>>
>>>
>>> KISS in mind, we came up with the following KPI prototypes
>>> (freebsdnuma.h) to uncover NUMA in a usable fashion:
>>>
>>>
>>> - cpuset_get_memory_affinity()
>>> - cpuset_set_memory_affinity()
>>> - move_pages()
>>> - migrate_pages()
>>> - get_numa_cpus()
>>> - get_numa_weights()
>>>
>>>
>>> Then to the second part, we have the following userspace API prototypes
>>> (numanor.h) for our interface and testing purposes:
>>>
>>>
>>> - is_numa_available()
>>> - set_thread_on_domain()
>>> - set_memory_policy()
>>> - move_thread()
>>>
>>>
>>> In much much more detail, you can learn more about these prototypes, this
>>> project, view our progress, track along, and give input on our github
>>> repo <https://github.com/andrewbates09/freebsd-numa> or simply via email.
>>> This repo currently includes fully commented prototypes (like a mini man
>>> page) and will later include additions to the project.
>>>
>>> If anyone has any comments, suggestions, concerns, quandaries, or just
>>> general thoughts please feel free to contact us, as we would love to hear
>>> your input!
>>>
>>> The Leaders: Sakire Arslan Ay, Andrew Pilloud, Jeff Roberson
>>> The Team: Andrew Bates, Joshua Clark, Alex Schuldberg, Dustin Walker
>>>
>>> --
>>> V/Respectfully,
>>> Andrew M Bates
>>> _______________________________________________
>>> freebsd-hackers@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>>> To unsubscribe, send any mail to
>>> "freebsd-hackers-unsubscribe@freebsd.org"
>
>
>
> --
> V/Respectfully,
> Andrew M Bates