Date: Sat, 28 Mar 2015 21:10:26 +0300
From: Slawa Olhovchenkov
To: Adrian Chadd
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: irq cpu binding
Message-ID: <20150328181026.GB23643@zxy.spb.ru>
References: <20150328112035.GZ23643@zxy.spb.ru> <20150328154031.GA23643@zxy.spb.ru>

On Sat, Mar 28, 2015 at 10:43:07AM -0700, Adrian Chadd wrote:

> On 28 March 2015 at 08:40, Slawa Olhovchenkov wrote:
> > On Sat, Mar 28, 2015 at 08:20:08AM -0700, Adrian Chadd wrote:
> >
> >> On 28 March 2015 at 04:20, Slawa Olhovchenkov wrote:
> >> > Can someone describe how on FreeBSD/amd64 do interrupt handling?
> >> > Can be interrupt handler (hardware interrupt) direct dispatch to
> >> > specific CPU core (and only to this core)?
> >> > Can be all work be only on this core (ithread, device driver interrupt
> >> > handler, finalise)?
> >>
> >> Yes - you can use cpuset on the interrupt to get them bound that way.
> >>
> >> John and I are trying to make that whole process more automated and
> >> NUMA friendly. I'm debugging some of his work at the moment.
> >
> > cpuset don't work as expected -- I see irq handling on other cpu.
>
> Well, when you see "irq handling on other cpu", what do you mean?
> How are you using cpuset to move things around?

I have a dual Xeon and a dual 82599EB (handled by the second socket).

irq270: ix0:que 0              293090597       2483
irq271: ix0:que 1              290379344       2460
irq272: ix0:que 2              292648245       2479
irq273: ix0:link                       1          0
irq274: ix1:que 0              294816977       2498
irq275: ix1:que 1              292665696       2479
irq276: ix1:que 2              294411404       2494
irq277: ix1:link                       2          0

First, I ran the commands from 'cpuset -l 6 -x 270' through
'cpuset -l 11 -x 276'.

Then I ran
'cpuset -l 0 pmcstat -S CPU_CLK_UNHALTED_CORE -O sample.out -c 2 -l 10'
followed by 'pmcstat -R sample.out -G out.txt'.
I see many ixgbe_msix_que entries in out.txt (I saved out.txt, so you can
look at it, plot a flame graph, etc.).
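The binding step above can be sketched as a small script. This is only a
dry-run sketch: it prints the cpuset commands instead of running them. The
helper name print_irq_bindings is hypothetical, and the one-CPU-per-queue
mapping (CPUs 6 to 11, skipping the link interrupts 273 and 277) is an
assumption read off the first and last commands quoted above.

```shell
#!/bin/sh
# Dry-run sketch: print one 'cpuset -l CPU -x IRQ' command per queue interrupt.
# print_irq_bindings is a hypothetical helper; the sequential CPU assignment
# is an assumption, not something the original message spells out.

# usage: print_irq_bindings FIRST_CPU IRQ...
print_irq_bindings() {
    cpu=$1
    shift
    for irq in "$@"; do
        echo "cpuset -l $cpu -x $irq"
        cpu=$((cpu + 1))
    done
}

# Queue interrupts of ix0 and ix1 (link interrupts 273 and 277 are left out).
print_irq_bindings 6 270 271 272 274 275 276
```

Piping the output to sh as root would apply the bindings, and
'cpuset -g -x 270' should then report the mask for that interrupt.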
After this I see more than one irq handler per queue in
`ps -axdHO lwp | grep ix`:

# ps -axdHO lwp | grep ix
94661 100976  0  S+     0:00.00 | | `-- grep ix
   12 100064  -  WL   840:32.50 - [intr/irq270: ix0:qu]
   12 100066  -  WL   843:33.09 - [intr/irq271: ix0:qu]
   12 100068  -  RL   828:25.61 - [intr/irq272: ix0:qu]
   12 100070  -  WL     0:00.00 - [intr/irq273: ix0:li]
   12 100072  -  RL   858:21.86 - [intr/irq274: ix1:qu]
   12 100074  -  RL   858:43.72 - [intr/irq275: ix1:qu]
   12 100076  -  WL   843:01.04 - [intr/irq276: ix1:qu]
   12 100078  -  WL     0:00.00 - [intr/irq277: ix1:li]
    0 100065  -  RLs   70:03.18 [kernel/ix0 que]
    0 100067  -  RLs   71:06.67 [kernel/ix0 que]
    0 100069  -  DLs   68:23.48 [kernel/ix0 que]
    0 100071  -  DLs    0:01.48 [kernel/ix0 linkq]
    0 100073  -  DLs   50:25.37 [kernel/ix1 que]
    0 100075  -  DLs   49:08.89 [kernel/ix1 que]
    0 100077  -  DLs   48:18.12 [kernel/ix1 que]
    0 100079  -  DLs    0:00.00 [kernel/ix1 linkq]

I don't know which thread is bound by 'cpuset -x'. I think it is
intr/irqNNN (since it uses more CPU time).
So I also ran 'cpuset -l 6 -t 100065' .. 'cpuset -l 11 -t 100077'.

After this I ran
`cpuset -l 0 pmcstat -S CPU_CLK_UNHALTED_CORE -O sample2.out -c 2 -l 10`
and `pmcstat -R sample2.out -G out2.txt` (I saved out2.txt).
I still see ixgbe_msix_que:

06.94%  [17981]  tcp_output @ /boot/kernel/kernel
 100.0%  [17981]  tcp_do_segment
  100.0%  [17981]  tcp_input
   100.0%  [17981]  ip_input
    100.0%  [17981]  netisr_dispatch_src
     100.0%  [17981]  ether_demux
      100.0%  [17981]  ether_nh_input
       100.0%  [17981]  netisr_dispatch_src
        97.56%  [17542]  ixgbe_rxeof @ /boot/kernel/if_ixgbe.ko
         75.54%  [13252]  ixgbe_msix_que
          100.0%  [13252]  intr_event_execute_handlers @ /boot/kernel/kernel
           100.0%  [13252]  ithread_loop
            100.0%  [13252]  fork_exit
         24.46%  [4290]  ixgbe_handle_que @ /boot/kernel/if_ixgbe.ko
          100.0%  [4290]  taskqueue_run_locked @ /boot/kernel/kernel
           100.0%  [4290]  taskqueue_thread_loop
            100.0%  [4290]  fork_exit
        02.44%  [439]  tcp_lro_flush
         96.58%  [424]  ixgbe_rxeof @ /boot/kernel/if_ixgbe.ko
          100.0%  [424]  ixgbe_msix_que
           100.0%  [424]  intr_event_execute_handlers @ /boot/kernel/kernel
            100.0%  [424]  ithread_loop
             100.0%  [424]  fork_exit
         03.42%  [15]  tcp_lro_rx
          100.0%  [15]  ixgbe_rxeof @ /boot/kernel/if_ixgbe.ko
           100.0%  [15]  ixgbe_msix_que
            100.0%  [15]  intr_event_execute_handlers @ /boot/kernel/kernel
             100.0%  [15]  ithread_loop
              100.0%  [15]  fork_exit
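
To put numbers on the call graph above, the sample counts per driver entry
point can be totalled with a few lines of awk. This is a minimal sketch:
sum_samples is a hypothetical helper (not part of pmcstat), it assumes each
line has the form "PERCENT [COUNT] SYMBOL ...", and the input is reduced to
the four relevant lines of the excerpt.

```shell
#!/bin/sh
# Sketch: total the sample counts of all call-graph lines that name a symbol.
# Assumes 'pmcstat -G' style lines: PERCENT [COUNT] SYMBOL ...
# sum_samples is a hypothetical helper, not part of pmcstat.

# usage: sum_samples SYMBOL < callgraph.txt
sum_samples() {
    awk -v sym="$1" '
        $3 == sym {
            n = $2
            gsub(/\[|\]/, "", n)    # "[13252]" -> "13252"
            total += n
        }
        END { print total + 0 }
    '
}

# The driver entry points from the excerpt, reduced to the relevant lines.
graph='75.54% [13252] ixgbe_msix_que
24.46% [4290] ixgbe_handle_que @ /boot/kernel/if_ixgbe.ko
100.0% [424] ixgbe_msix_que
100.0% [15] ixgbe_msix_que'

printf '%s\n' "$graph" | sum_samples ixgbe_msix_que     # prints 13691
printf '%s\n' "$graph" | sum_samples ixgbe_handle_que   # prints 4290
```

That is, 13691 of the 17981 tcp_output samples (about 76%) come in through
ixgbe_msix_que under ithread_loop, and the remaining 4290 through the
taskqueue path. Against the full file it would be run as
'sum_samples ixgbe_msix_que < out2.txt'.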