Date: Tue, 29 Jul 2014 14:51:19 -0700
From: Andrew Bates <andrewbates09@gmail.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Jeff Roberson <jeff@freebsd.org>
Subject: Re: Working on NUMA support
Message-ID: <CAPi5Lmk1zGSOg+p2Hdhw2Uk+MZfOX-=bFFTsdH+5ZZwd7_oqeg@mail.gmail.com>
In-Reply-To: <CAJ-Vmom-wWZLCuuAEKDO1vuaGaSQM-=4e3xoh3OeVibc6m9Z8A@mail.gmail.com>
References: <CAPi5LmkRO4QLbR2JQV8FuT=jw2jjcCRbP8jT0kj1g8Ks+7jv8A@mail.gmail.com> <CAJ-VmonJPT-NUSi=Wnu7a0oNwe8V=LQMZ-fZGriC7H44edRVLg@mail.gmail.com> <CAPi5Lm=8Z3fh_vxKY26qC3oEv1Ap+RvFGRAOhRosF5UEnDTVpw@mail.gmail.com> <00E55D89-BDD1-41AD-BBF6-6752B90E8324@ccsys.com> <CAJ-Vmom-wWZLCuuAEKDO1vuaGaSQM-=4e3xoh3OeVibc6m9Z8A@mail.gmail.com>
Hey Adrian,

Yes, there has been progress on this - although admittedly not as much as we'd like at this point. Regarding what you're talking about, I believe we have the layout for CPU affinity/locality. I need to go through and clean up a good half-dozen branches of code. Myself a mere mortal standing on the shoulders of giants in a room of titans, I have to merge my changes with Jeff's pertinent branch to get this closer to usable.

From my experience and research, in terms of access/response time:

1. localized DMA < all remote
2. (localized DMA + spillover remote) >= all remote

As ugly as it may be, I think I said that right. There have been a few changes since that original email, but yes, what we're working to address is the userland <---> kernelspace boundary.

On Sat, Jul 26, 2014 at 1:11 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> Hi all!
>
> Has there been any further progress on this?
>
> I've been working on making the receive side scaling support usable by
> mere mortals and I've reached a point where I'm going to need this
> awareness in the 10ge/40ge drivers for the hardware I have access to.
>
> I'm right now more interested in the kernel driver/allocator side of
> things, so:
>
> * when bringing up a NIC, figure out what are the "most local" CPUs to
> run on;
> * for each NIC queue, figure out what the "most local" bus resources
> are for NIC resources like descriptors and packet memory (eg mbufs);
> * for each NIC queue, figure out what the "most local" resources are
> for local driver structures that the NIC doesn't touch (eg per-queue
> state);
> * for each RSS bucket, figure out what the "most local" resources are
> for things like packet memory (mbufs), tcp/udp/inp control structures,
> etc.
>
> I had a chat with jhb yesterday and he reminded me that y'all at
> Isilon have been looking into this.
>
> He described a few interesting cases from the kernel side to me.
>
> * On architectures with external IO controllers, the path cost from an
> IO device to multiple CPUs may be (almost) equivalent, so there's not
> a huge penalty to allocate things on the wrong CPU. I think it'll be
> nice to get CPU local affinity where possible so we can parallelise
> DRAM access fully, but we can play with this and see.
> * On architectures with CPU-integrated IO controllers, there's a large
> penalty for doing inter-CPU IO,
> * .. but there's not such a huge penalty for doing inter-CPU memory access.
>
> Given that, we may find that we should always put the IO resources
> local to the CPU it's attached to, even if we decide to run some / all
> of the IO for the device on another CPU. Ie, any RAM that the IO
> device is doing data or descriptor DMA into should be local to that
> device. John said that in his experience it seemed the penalty for a
> non-local CPU touching memory was much less than device DMA crossing
> QPI.
>
> So the tricky bit is figuring that out and expressing it all in a way
> that allows us to do memory allocation and CPU binding in a more aware
> way. The other half of this tricky thing is to allow it to be easily
> overridden by a curious developer or system administrator who wants
> to experiment with different policies.
>
> Now, I'm very specifically only addressing the low-level kernel IO /
> memory allocation requirements here. There's other things to worry
> about up in userland; I think you're trying to address that in your
> KPI descriptions.
>
> Thoughts?
>
>
> -a

-- 
V/Respectfully,
Andrew M Bates