From owner-freebsd-arch@FreeBSD.ORG Fri Jan 3 16:41:29 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A4F29178 for ; Fri, 3 Jan 2014 16:41:29 +0000 (UTC) Received: from mail.rlwinm.de (mail.rlwinm.de [IPv6:2a01:4f8:140:72e1::ac16:e45e]) by mx1.freebsd.org (Postfix) with ESMTP id 6A8AB1D7D for ; Fri, 3 Jan 2014 16:41:29 +0000 (UTC) Received: from hexe.rlwinm.de (p57A7DF07.dip0.t-ipconnect.de [87.167.223.7]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.rlwinm.de (Postfix) with ESMTPSA id 7F3ED6761 for ; Fri, 3 Jan 2014 16:41:28 +0000 (UTC) Message-ID: <52C6E83F.5050802@rlwinm.de> Date: Fri, 03 Jan 2014 17:41:35 +0100 From: Jan Bramkamp User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: Default gateway lost after netif restart References: <52BC5177.70903@hostek.com> In-Reply-To: <52BC5177.70903@hostek.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jan 2014 16:41:29 -0000 On 26.12.2013 16:55, Alex Long wrote: > I am new to FreeBSD so I apologize if this is the wrong place to post > this. But there is a flaw in the logic regarding restart of the netif > service. I understand that after restarting the netif service, you have > to manually restart the routing service. 
The problem is that if you are
> configuring a machine remotely and you have to restart the netif service
> for whatever reason, your default gateway is lost, thus preventing you
> from restarting the routing service and you lose connectivity to the
> machine.
>
> Now I get around this by creating a shell script that does both and just
> executing that script. This works but it is sloppy in my opinion. It
> does not make sense to restart a network service and lose ANY network
> functionality (i.e. your routes) once it comes back up.

While it is annoying to find out that way, it makes sense. The netif
script deals with interfaces and, by implication, with directly connected
routes. It doesn't deal with static routes. If you restart an interface,
it temporarily loses all its addresses. This flushes their directly
connected routes and, with them, the hull created over the next-hop
relation: every route whose next hop is only reachable through one of the
flushed routes, the default route included.

Restarting first netif, then routing, should solve your problem. Make
sure that no SIGHUP will kill your script.

From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 00:55:50 2014
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 58D7276A
 for ; Sat, 4 Jan 2014 00:55:50 +0000 (UTC)
Received: from mail-qe0-x235.google.com (mail-qe0-x235.google.com [IPv6:2607:f8b0:400d:c02::235])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1948A119F
 for ; Sat, 4 Jan 2014 00:55:50 +0000 (UTC)
Received: by mail-qe0-f53.google.com with SMTP id nc12so16097400qeb.26
 for ; Fri, 03 Jan 2014 16:55:48 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:date:message-id:subject:from:to:content-type;
 bh=LIhRyRl9dMcl+EqubfYVk6aTYddga12hkxhXFFT6ViE=;
b=i+G1L0+NuUodPA3oX2UCb23FJ2s3/5ZyDefZwB69LyBWPUTMP6jgv5BHHWN/ICL39z m9qZpHqJADX0ecmZ5sR2/8sB4tzZlkR0P52P8wJUm3TVqBo1KLwalXRMu03kEdIld4Tb H2v8K9m5qhpsAX28ZQUuMkzN6PbLuubYAB+1p7hlWGj0MXfZWEQ5iVgyOfnQjC2hFejr 3uguFwQm/jWFB1UH06G/UW4qLTAVuiX8mG9Ljvhhxt0tfMjRc+BLlqLz6zO6yMRhXG5l /m31caQPn3mGr3/Gy8ceHoub3okjG/uae7jdu/6W3ZUkF/IvElM1iTUfptfh1L7jGeo9 Ajag== MIME-Version: 1.0 X-Received: by 10.224.13.141 with SMTP id c13mr145750417qaa.76.1388796948485; Fri, 03 Jan 2014 16:55:48 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 3 Jan 2014 16:55:48 -0800 (PST) Date: Fri, 3 Jan 2014 16:55:48 -0800 X-Google-Sender-Auth: QE9XBmD7cGltLoDlq7Y04nacAaM Message-ID: Subject: Acquiring a lock on the same CPU that holds it - what can be done? From: Adrian Chadd To: "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jan 2014 00:55:50 -0000 Hi, So here's a fun one. When doing TCP traffic + socket affinity + thread pinning experiments, I seem to hit this very annoying scenario that caps my performance and scalability. Assume I've lined up everything relating to a socket to run on the same CPU (ie, TX, RX, TCP timers, userland thread): * userland code calls something, let's say "kqueue" * the kqueue lock gets grabbed * an interrupt comes in for the NIC * the NIC code runs some RX code, and eventually hits something that wants to push a knote up * and the knote is for the same kqueue above * .. so it grabs the lock.. * .. contests.. * Then the scheduler flips us back to the original userland thread doing TX * The userland thread finishes its kqueue manipulation and releases the queue lock * .. 
the scheduler then immediately flips back to the NIC thread
  waiting for the lock, grabs the lock, does a bit of work, then
  releases the lock

I see this on kqueue locks, sendfile locks (for sendfile notification)
and vm locks (for the VM page referencing/dereferencing).

This happens very frequently. It's very noticeable with large numbers
of sockets, as the chance of hitting a lock in the NIC RX path that
overlaps with something in the userland TX path that you are currently
fiddling with (e.g. kqueue manipulation) or sending data with (e.g. vm_page
locks or sendfile locks for things you're currently transmitting) is
very high. As I increase traffic and the number of sockets, the number
of context switches goes way up (to 300,000+) and the lock contention
/ time spent doing locking is non-trivial.

Linux doesn't "have this" problem - the lock primitives let you
disable driver bottom halves. So, in this instance, I'd just grab the
lock with spin_lock_bh() and none of the driver bottom halves would be
run while it is held. I'd thus not have this scheduler ping-ponging and
lock contention, as it'd never get a chance to happen.

So, does anyone have any ideas? Has anyone seen this? Shall we just
implement a way of doing selective thread disabling, a la
spin_lock_bh() mixed with spl${foo}() style stuff?
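
To make the cost of the ping-pong concrete, here is a toy model of one
collision on a single CPU. It is only a sketch, not FreeBSD or Linux code:
the function name, the thread labels and the defer_rx_while_locked switch
are made up for illustration, and it assumes the RX thread always preempts
the userland thread as soon as it is allowed to run.

```python
def switches_per_collision(defer_rx_while_locked):
    """Count context switches for one NIC interrupt that arrives while
    the userland thread holds the kqueue lock (toy single-CPU model)."""
    running = "user"      # thread currently on the CPU
    lock_owner = "user"   # userland thread is inside the kqueue lock
    switches = 0

    def run(thread):
        # Put `thread` on the CPU; count a switch only if it changes.
        nonlocal running, switches
        if thread != running:
            running = thread
            switches += 1

    # The NIC interrupt arrives while the lock is held.
    if not defer_rx_while_locked:
        run("nic")                    # RX thread preempts the lock holder,
        assert lock_owner == "user"   # finds the lock taken and blocks,
        run("user")                   # so the scheduler flips back.

    lock_owner = None                 # userland thread releases the lock
    run("nic")                        # RX thread runs (first time if deferred)
    lock_owner = "nic"                # it takes the lock, posts the knote
    lock_owner = None                 # and releases it again
    run("user")                       # back to the userland thread
    return switches


if __name__ == "__main__":
    print("block on the lock:", switches_per_collision(False))  # 4
    print("defer RX work:    ", switches_per_collision(True))   # 2
```

In this model, deferring the RX work until the lock is dropped halves the
switches per collision (2 instead of 4). Real numbers will depend on the
scheduler and on how often the two paths actually overlap.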
Thanks, -adrian From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 03:19:22 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DCDEFE98 for ; Sat, 4 Jan 2014 03:19:21 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C6B321A8E for ; Sat, 4 Jan 2014 03:19:21 +0000 (UTC) Received: from xp5k.my.domain (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s043JKAa011265 for ; Sat, 4 Jan 2014 03:19:21 GMT (envelope-from listlog2011@gmail.com) Message-ID: <52C77DB8.5020305@gmail.com> Date: Sat, 04 Jan 2014 11:19:20 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done? References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list Reply-To: davidxu@freebsd.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jan 2014 03:19:22 -0000 On 2014/01/04 08:55, Adrian Chadd wrote: > Hi, > > So here's a fun one. > > When doing TCP traffic + socket affinity + thread pinning experiments, > I seem to hit this very annoying scenario that caps my performance and > scalability. 
> > Assume I've lined up everything relating to a socket to run on the > same CPU (ie, TX, RX, TCP timers, userland thread): > > * userland code calls something, let's say "kqueue" > * the kqueue lock gets grabbed > * an interrupt comes in for the NIC > * the NIC code runs some RX code, and eventually hits something that > wants to push a knote up > * and the knote is for the same kqueue above > * .. so it grabs the lock.. > * .. contests.. > * Then the scheduler flips us back to the original userland thread doing TX > * The userland thread finishes its kqueue manipulation and releases > the queue lock > * .. the scheduler then immediately flips back to the NIC thread > waiting for the lock, grabs the lock, does a bit of work, then > releases the lock > > I see this on kqueue locks, sendfile locks (for sendfile notification) > and vm locks (for the VM page referencing/dereferencing.) > > This happens very frequently. It's very noticable with large numbers > of sockets as the chances of hitting a lock in the NIC RX path that > overlaps with something in the userland TX path that you are currently > fiddling with (eg kqueue manipulation) or sending data (eg vm_page > locks or sendfile locks for things you're currently transmitting) is > very high. As I increase traffic and the number of sockets, the amount > of context switches goes way up (to 300,000+) and the lock contention > / time spent doing locking is non-trivial. > > Linux doesn't "have this" problem - the lock primitives let you > disable driver bottom halves. So, in this instance, I'd just grab the > lock with spin_lock_bh() and all the driver bottom halves would not be > run. I'd thus not have this scheduler ping-ponging and lock contention > as it'd never get a chance to happen. > > So, does anyone have any ideas? Has anyone seen this? Shall we just > implement a way of doing selective thread disabling, a la > spin_lock_bh() mixed with spl${foo}() style stuff? 
>
> Thanks,
>
>
> -adrian
>

This is how a turnstile-based mutex works. AFAIK it is meant for realtime
use, the same as a POSIX pthread priority-inheritance mutex. Realtime does
not mean high performance; in fact, it introduces more context switches
and hurts throughput. I think the default mutex could be patched to
call critical_enter() when the mutex is locked, spin until it is acquired
instead of blocking, and call critical_exit() when the mutex is unlocked,
bypassing the turnstile.

The turnstile design assumes the whole system must be scheduled
on a global thread priority, but who said a system must be based on this?
Recently I ported a Linux CFS-like scheduler to FreeBSD on our
Perforce server. It is based on start-time fair queueing, and I found the
turnstile to be a real obstacle: it means I cannot schedule threads based
on class (rt > timeshare > idle), but have to deal with global thread
priority changes.

I have stopped porting it, although it now fully works on UP. It supports
nested group scheduling; I can watch video smoothly while doing "make
-j10 buildworld" on the same UP machine. My scheduler does not work on SMP:
the amount of priority propagation work drove me away. A non-preemptible
spinlock works well for such a system; propagating thread weight up a
scheduler tree is not practical.
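
The patched-mutex idea can be sketched as a toy model. This is not kernel
code: ToyCpu, NoPreemptMutex and all their fields are hypothetical names,
and the model assumes that a preemption requested inside a critical
section is not dropped but deferred until the section is left.

```python
class ToyCpu:
    """One CPU with a critical-section nesting count (hypothetical model)."""

    def __init__(self):
        self.critnest = 0          # nesting depth of critical sections
        self.owe_preempt = False   # a preemption arrived while critnest > 0
        self.log = []              # what the toy scheduler did, in order

    def critical_enter(self):
        self.critnest += 1

    def critical_exit(self):
        self.critnest -= 1
        # Leaving the outermost section pays off a deferred preemption.
        if self.critnest == 0 and self.owe_preempt:
            self.owe_preempt = False
            self.log.append("deferred preemption runs")

    def interrupt_wants_cpu(self):
        # An interrupt thread became runnable and would like to preempt.
        if self.critnest > 0:
            self.owe_preempt = True
        else:
            self.log.append("immediate preemption")


class NoPreemptMutex:
    """Mutex that brackets the hold time with a critical section."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.owner = None

    def lock(self, thread):
        self.cpu.critical_enter()
        # On one CPU the holder cannot be preempted, so the lock is never
        # found held here. On SMP a real implementation would spin instead.
        assert self.owner is None
        self.owner = thread

    def unlock(self):
        self.owner = None
        self.cpu.critical_exit()
```

A short usage example: lock, let the interrupt arrive, unlock.

```python
cpu = ToyCpu()
m = NoPreemptMutex(cpu)
m.lock("user")
cpu.interrupt_wants_cpu()   # arrives while the lock is held
m.unlock()
print(cpu.log)              # ['deferred preemption runs']
```

The interrupt thread never sees the lock held, so it never blocks on it.
The price is that it has to wait for the whole hold time of the lock.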
Regards, David Xu From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 03:25:23 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 79363F60; Sat, 4 Jan 2014 03:25:23 +0000 (UTC) Received: from mail-qa0-x231.google.com (mail-qa0-x231.google.com [IPv6:2607:f8b0:400d:c00::231]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 252E21AEF; Sat, 4 Jan 2014 03:25:23 +0000 (UTC) Received: by mail-qa0-f49.google.com with SMTP id ii20so1145992qab.8 for ; Fri, 03 Jan 2014 19:25:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=B5bSnR4zc4F5EI414D0CKvcNGoDZ8ch7jN3oHrNUXeE=; b=TO8H7rgOW3mUe9GT+Sz8Y1Yv65eYR9AX+mAgW/TDQulH2Nf9LiqPK7V5EyrT74kQ80 eRyEeqZk5KjfCmvepb0I6/Lk4RoLYamtiyseiCrISNHsV1DjJbQ3D8I/0AMPD01wWm0Y EJaadMJWcERP/BNZtaXqQfaWY2KW/RzA28WrNFfhWQSOZGMcrznGU9PFsckmDLEfZPRb GZMbdn0X1MnlfD0ivfFTEGGaJNrmAjduQimbLObwU1f8sSxkWgJu5LtHbpdL1kzZQfEh uc61DqHClHoojVCkyDFyOKbU0jQa1elij6Y6N4kiWTC9QW/DkE+93r6ColO8ZThNpgTY NO4A== MIME-Version: 1.0 X-Received: by 10.224.13.141 with SMTP id c13mr146664121qaa.76.1388805921968; Fri, 03 Jan 2014 19:25:21 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 3 Jan 2014 19:25:21 -0800 (PST) In-Reply-To: <52C77DB8.5020305@gmail.com> References: <52C77DB8.5020305@gmail.com> Date: Fri, 3 Jan 2014 19:25:21 -0800 X-Google-Sender-Auth: BpUuRvy4lYraVjrPG1IP3m6A_n8 Message-ID: Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done? 
From: Adrian Chadd To: David Xu Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jan 2014 03:25:23 -0000 Doesn't critical_enter / exit enable/disable interrupts? We don't necessarily want to do -that-, as that can be expensive. Just not scheduling certain tasks that would interfere would be good enough. -a On 3 January 2014 19:19, David Xu wrote: > On 2014/01/04 08:55, Adrian Chadd wrote: >> Hi, >> >> So here's a fun one. >> >> When doing TCP traffic + socket affinity + thread pinning experiments, >> I seem to hit this very annoying scenario that caps my performance and >> scalability. >> >> Assume I've lined up everything relating to a socket to run on the >> same CPU (ie, TX, RX, TCP timers, userland thread): >> >> * userland code calls something, let's say "kqueue" >> * the kqueue lock gets grabbed >> * an interrupt comes in for the NIC >> * the NIC code runs some RX code, and eventually hits something that >> wants to push a knote up >> * and the knote is for the same kqueue above >> * .. so it grabs the lock.. >> * .. contests.. >> * Then the scheduler flips us back to the original userland thread doing TX >> * The userland thread finishes its kqueue manipulation and releases >> the queue lock >> * .. the scheduler then immediately flips back to the NIC thread >> waiting for the lock, grabs the lock, does a bit of work, then >> releases the lock >> >> I see this on kqueue locks, sendfile locks (for sendfile notification) >> and vm locks (for the VM page referencing/dereferencing.) >> >> This happens very frequently. 
It's very noticable with large numbers >> of sockets as the chances of hitting a lock in the NIC RX path that >> overlaps with something in the userland TX path that you are currently >> fiddling with (eg kqueue manipulation) or sending data (eg vm_page >> locks or sendfile locks for things you're currently transmitting) is >> very high. As I increase traffic and the number of sockets, the amount >> of context switches goes way up (to 300,000+) and the lock contention >> / time spent doing locking is non-trivial. >> >> Linux doesn't "have this" problem - the lock primitives let you >> disable driver bottom halves. So, in this instance, I'd just grab the >> lock with spin_lock_bh() and all the driver bottom halves would not be >> run. I'd thus not have this scheduler ping-ponging and lock contention >> as it'd never get a chance to happen. >> >> So, does anyone have any ideas? Has anyone seen this? Shall we just >> implement a way of doing selective thread disabling, a la >> spin_lock_bh() mixed with spl${foo}() style stuff? >> >> Thanks, >> >> >> -adrian >> > > This is how turnstile based mutex works, AFAIK it is for realtime, > same as POSIX pthread priority inheritance mutex, realtime does not > mean high performance, in fact, it introduces more context switches > and hurts throughput. I think default mutex could be patched to > call critical_enter when mutex_lock is called, and spin forever, > and call critical_leave when the mutex is unlocked, bypass turnstile. > The turnstile design assumes the whole system must be scheduled > on global thread priority, but who did say a system must be based on this? > Recently, I had ported Linux CFS like scheduler to FreeBSD on our > perforce server, > it is based on start-time fair queue, and I found turnstile is such a > bad thing. > it makes me can not schedule thread based on class: rt > timeshare > idle, > but must face with a global thread priority change. 
> I have stopped porting it, although it is now fully work on UP, it supports > nested group scheduling, I can watch video smoothly while doing "make > -j10 buildwork" on > same UP machine. My scheduler does not work on SMP, too much priority > propagation > work makes me go away, non-preemption spinlock works well for such > a system, propagating thread weight on a scheduler tree is not practical. > > Regards, > David Xu > > > > > > > > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 03:45:36 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 59BB5201; Sat, 4 Jan 2014 03:45:36 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 424FC1CFA; Sat, 4 Jan 2014 03:45:36 +0000 (UTC) Received: from xp5k.my.domain (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s043jYPD016976; Sat, 4 Jan 2014 03:45:35 GMT (envelope-from listlog2011@gmail.com) Message-ID: <52C783DE.1060102@gmail.com> Date: Sat, 04 Jan 2014 11:45:34 +0800 From: David Xu User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done? 
References: <52C77DB8.5020305@gmail.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: David Xu , "freebsd-arch@freebsd.org"
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
Reply-To: davidxu@freebsd.org
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 04 Jan 2014 03:45:36 -0000

On 2014/01/04 11:25, Adrian Chadd wrote:
> Doesn't critical_enter / exit enable/disable interrupts?
>
> We don't necessarily want to do -that-, as that can be expensive. Just
> not scheduling certain tasks that would interfere would be good
> enough.
>
>
> -a

Does critical_enter disable interrupts? A long time ago, I saw that it
does not. If I remember correctly, spinlock_enter disables interrupts;
critical_enter still allows interrupts, but the current thread cannot be
preempted: the preemption is deferred.

>
> On 3 January 2014 19:19, David Xu wrote:
>> On 2014/01/04 08:55, Adrian Chadd wrote:
>>> Hi,
>>>
>>> So here's a fun one.
>>>
>>> When doing TCP traffic + socket affinity + thread pinning experiments,
>>> I seem to hit this very annoying scenario that caps my performance and
>>> scalability.
>>>
>>> Assume I've lined up everything relating to a socket to run on the
>>> same CPU (ie, TX, RX, TCP timers, userland thread):
>>>
>>> * userland code calls something, let's say "kqueue"
>>> * the kqueue lock gets grabbed
>>> * an interrupt comes in for the NIC
>>> * the NIC code runs some RX code, and eventually hits something that
>>> wants to push a knote up
>>> * and the knote is for the same kqueue above
>>> * .. so it grabs the lock..
>>> * .. contests..
>>> * Then the scheduler flips us back to the original userland thread doing TX
>>> * The userland thread finishes its kqueue manipulation and releases
>>> the queue lock
>>> * ..
the scheduler then immediately flips back to the NIC thread >>> waiting for the lock, grabs the lock, does a bit of work, then >>> releases the lock >>> >>> I see this on kqueue locks, sendfile locks (for sendfile notification) >>> and vm locks (for the VM page referencing/dereferencing.) >>> >>> This happens very frequently. It's very noticable with large numbers >>> of sockets as the chances of hitting a lock in the NIC RX path that >>> overlaps with something in the userland TX path that you are currently >>> fiddling with (eg kqueue manipulation) or sending data (eg vm_page >>> locks or sendfile locks for things you're currently transmitting) is >>> very high. As I increase traffic and the number of sockets, the amount >>> of context switches goes way up (to 300,000+) and the lock contention >>> / time spent doing locking is non-trivial. >>> >>> Linux doesn't "have this" problem - the lock primitives let you >>> disable driver bottom halves. So, in this instance, I'd just grab the >>> lock with spin_lock_bh() and all the driver bottom halves would not be >>> run. I'd thus not have this scheduler ping-ponging and lock contention >>> as it'd never get a chance to happen. >>> >>> So, does anyone have any ideas? Has anyone seen this? Shall we just >>> implement a way of doing selective thread disabling, a la >>> spin_lock_bh() mixed with spl${foo}() style stuff? >>> >>> Thanks, >>> >>> >>> -adrian >>> >> This is how turnstile based mutex works, AFAIK it is for realtime, >> same as POSIX pthread priority inheritance mutex, realtime does not >> mean high performance, in fact, it introduces more context switches >> and hurts throughput. I think default mutex could be patched to >> call critical_enter when mutex_lock is called, and spin forever, >> and call critical_leave when the mutex is unlocked, bypass turnstile. >> The turnstile design assumes the whole system must be scheduled >> on global thread priority, but who did say a system must be based on this? 
>> Recently, I had ported Linux CFS like scheduler to FreeBSD on our >> perforce server, >> it is based on start-time fair queue, and I found turnstile is such a >> bad thing. >> it makes me can not schedule thread based on class: rt > timeshare > idle, >> but must face with a global thread priority change. >> I have stopped porting it, although it is now fully work on UP, it supports >> nested group scheduling, I can watch video smoothly while doing "make >> -j10 buildwork" on >> same UP machine. My scheduler does not work on SMP, too much priority >> propagation >> work makes me go away, non-preemption spinlock works well for such >> a system, propagating thread weight on a scheduler tree is not practical. >> >> Regards, >> David Xu >> >> >> >> >> >> >> >> >> _______________________________________________ >> freebsd-arch@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-arch >> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 04:14:51 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 967C134E; Sat, 4 Jan 2014 04:14:51 +0000 (UTC) Received: from mail-qe0-x230.google.com (mail-qe0-x230.google.com [IPv6:2607:f8b0:400d:c02::230]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 43BB71E4C; Sat, 4 Jan 2014 04:14:51 +0000 (UTC) Received: by mail-qe0-f48.google.com with SMTP id gc15so16496067qeb.35 for ; Fri, 03 Jan 2014 20:14:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=D1x1eL5b2OBYSlcC08Of/x/8DxwChtXLW4Wc4hWWPjE=; 
b=b8r2YQnATzXYG999QGBWHZwPQR6PTAMXIIXnWXmgjfb3jHLP7+Y37BPT59kuVSAo02 GIY3qVUT7u7M6n8tiJ5vZE7LVDKD2Ug82a409X7Ovyr/Eg7Q7uiDDJcon28wR40V1u6q cfeuP4UoS6+xzk3psAYHk6im7P22TMrxcIC3hbZ2uXM9eFkswU+fQvMD5Nh113HpnHgr a6/QeSTmTUHeJKXL8Q+bhQtXnvizzhKU4Y8yuxH95YP6y5nCFlPLg6hMjMtYTsZQzIDI funr1PQD++LyyITv/Ee5dt3RmycaD+D013Kv40ov9anAU0NtnYFYTI7NwixKmn2RTcRg sACw== MIME-Version: 1.0 X-Received: by 10.224.14.132 with SMTP id g4mr48887831qaa.26.1388808890481; Fri, 03 Jan 2014 20:14:50 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 3 Jan 2014 20:14:50 -0800 (PST) In-Reply-To: <52C783DE.1060102@gmail.com> References: <52C77DB8.5020305@gmail.com> <52C783DE.1060102@gmail.com> Date: Fri, 3 Jan 2014 20:14:50 -0800 X-Google-Sender-Auth: xCkpTTl3UnId6p-rY5it5OsvDMg Message-ID: Subject: Re: Acquiring a lock on the same CPU that holds it - what can be done? From: Adrian Chadd To: David Xu Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jan 2014 04:14:51 -0000 On 3 January 2014 19:45, David Xu wrote: > Does critical_enter disable interrupts ? Long time ago, I saw it does > not. If I remembered it correctly, spinlock_enter disables interrupt, > critical_enter still allows interrupt, but current thread can not be > preempted, it is deferred. Ah there we go. Yes, okay. My bad. 
-a From owner-freebsd-arch@FreeBSD.ORG Sat Jan 4 04:22:40 2014 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2CE207FF for ; Sat, 4 Jan 2014 04:22:40 +0000 (UTC) Received: from mail-qa0-x235.google.com (mail-qa0-x235.google.com [IPv6:2607:f8b0:400d:c00::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E39721FCE for ; Sat, 4 Jan 2014 04:22:39 +0000 (UTC) Received: by mail-qa0-f53.google.com with SMTP id j5so1172326qaq.5 for ; Fri, 03 Jan 2014 20:22:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=Aj79LJd6RHRZ5Ga0fy7JhiL57F7oaogjVnr0m8zZkWM=; b=QQxamRszrDoi8BFLeaOGncMm3bUjwkReSYJ7mb2zLiJAS0Zirq5jSOZ0Iq0cPRKe8T g4weq4AsS+sy/f7P5QOr7atA7sT+aRkuctSrOKeC0QwGIBmZwOFxprNSgZQVtLXIq0ug WB0pV2B+cIUQ0Ooxy43/k74BAMNco2TK30G8v/3wldAG1B6iSrd5El7Sc2DdZBYBTGKw XcBLrs7mBmMPEX8KC+SvQdQI+T6PPGf73DqBr2WjSRTO3xD1x8X4RHrZGVAD6zvc2ADq aPPoHRqFkCRAUSRvacFMlmAqqgL3lHoA6s5RxcZT+jJnkvDb5Zzoay0Qd9Llsx7Xt7aB 3jog== MIME-Version: 1.0 X-Received: by 10.49.34.207 with SMTP id b15mr158746247qej.49.1388809359133; Fri, 03 Jan 2014 20:22:39 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 3 Jan 2014 20:22:39 -0800 (PST) Date: Fri, 3 Jan 2014 20:22:39 -0800 X-Google-Sender-Auth: cjM_IYWXpcQxJ7C3eDGm2KfoN_k Message-ID: Subject: profiling hangs - quiesce_cpus() From: Adrian Chadd To: "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Sat, 04 Jan 2014 04:22:40 -0000

Here's another fun one.

If the idle thread on a CPU doesn't run, then quiesce_cpus() (and thus
quiesce_all_cpus()) doesn't finish. The td_generation on the idle thread
doesn't get bumped, as the idle thread doesn't get run.

Because of this, things like lock profiling just hang when trying to get
stats or to enable it, as they do a CPU synchronisation via the above
function calls, which never return.

Any ideas? It makes it a bit annoying to do fine-grained lock contention
profiling when I'm doing this. :-)

Thanks!

-a
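
The hang can be reproduced in a toy model of the wait loop as described
above. This is a sketch, not the kernel implementation: ToyCpu, tick() and
the max_ticks bound are hypothetical, and the bound exists only so the
example terminates (the behaviour described in this message has no such
limit).

```python
class ToyCpu:
    """CPU whose idle thread only runs when the CPU is not saturated."""

    def __init__(self, always_busy):
        self.always_busy = always_busy
        self.idle_generation = 0   # stand-in for the idle thread's td_generation

    def tick(self):
        # The idle thread gets scheduled only if nothing else wants the CPU.
        if not self.always_busy:
            self.idle_generation += 1


def quiesce(cpus, max_ticks):
    """Wait until every CPU's idle thread has run since the snapshot.

    Returns the tick at which that happened, or None if max_ticks passed
    first (the unbounded version would wait forever in that case).
    """
    snapshot = [cpu.idle_generation for cpu in cpus]
    for tick in range(1, max_ticks + 1):
        for cpu in cpus:
            cpu.tick()
        if all(cpu.idle_generation != gen for cpu, gen in zip(cpus, snapshot)):
            return tick
    return None


if __name__ == "__main__":
    print(quiesce([ToyCpu(False), ToyCpu(False)], max_ticks=10))    # 1
    print(quiesce([ToyCpu(False), ToyCpu(True)], max_ticks=1000))   # None
```

One saturated CPU is enough to keep the wait from completing, which matches
the symptom when a pinned workload never lets the idle thread run.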