Date: Sat, 20 Nov 2010 23:38:50 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Kostik Belousov
Cc: freebsd-performance@freebsd.org, Andriy Gapon
Subject: Re: TTY task group scheduling

On Fri, 19 Nov 2010, Kostik Belousov wrote:

> On Fri, Nov 19, 2010 at 11:50:49AM +0200, Andriy Gapon wrote:
>> on 19/11/2010 11:46 Bruce Cran said the following:
>>> [removed current@ and stable@ from the Cc list]
>>>
>>> On Fri, 19 Nov 2010 15:41:29 +1100
>>> Andrew Reilly wrote:
>>>
>>>> On Linux. Have you ever seen those sorts of UI problems on FreeBSD?

Not since FreeBSD-1 or earlier, but I don't run much bloatware.

>>>> I don't watch much video on my systems, but I haven't seen that.
>>>> FreeBSD has always been good at keeping user-interactive processes
>>>> responsive while compiles or what-not are going on in the background.
>>>
>>> I've definitely seen problems when running builds in an xterm. I've
>>> often resorted to canceling it and running it on a syscons console
>>> instead to improve performance.
>>
>> So, what was it a problem with scheduler or with, e.g., "something X"
>> being too slow rendering glyphs?

Who can tell...

> Probably will pay a lot in negative karma by posting anything in the
> thread. But I can confirm your words, that tty->xterm->X server chain
> of output indeed significantly slows down the build processes.

I just tried a kernel build with -j256 on a 1-core system to be
unreasonable, and didn't see any sluggishness (and I notice programs
taking > 10 msec to start up), but this was under my version of 5.2
with my version of SCHED_4BSD.

> I usually never start build in the barebone xterm, always running screen
> under xterm. make -j 10 on 4 core/HTT cpu slows up to a half, from my
> unscientific impression, when run in the active screen window. Switching
> to other window in screen significantly speeds it up (note the prudent
> omission of any measured numbers).

For me, make -s -j 256 on 1 core ran at the same speed in an xterm,
with another xterm watching it using top. Without -s it took 5%
longer. The X output is apparently quite slow, but I rarely run X;
syscons output is much more efficient.

make(1) has interesting problems determining when jobs finish. It
used to wait 10 msec (?) between checks, and that gave a lot of dead
time once 10 msec became a long time relative to a job's runtime.
Maybe X is interfering with its current mechanism.
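(I haven't re-read make's job-control code, so the following is only
an illustrative sketch of the general problem, not make's actual
implementation: with a fixed poll interval, every job that exits just
after a poll leaves its slot dead for up to a whole interval.)

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical polling reaper, for illustration only. */
void
reap_jobs_polling(void)
{
        int status;
        pid_t pid;

        for (;;) {
                pid = waitpid(-1, &status, WNOHANG);
                if (pid > 0)
                        continue;       /* reaped a job; look for more */
                if (pid == -1)
                        break;          /* ECHILD: no children left */
                usleep(10000);          /* pid == 0: up to 10 msec dead */
        }
}

/* Event-driven alternative: block until some child exits. */
void
reap_jobs_blocking(void)
{
        int status;

        while (waitpid(-1, &status, 0) > 0)
                ;                       /* refill a job slot immediately */
}

int
main(void)
{
        int i;

        /* Spawn a few trivially short "jobs", then reap them. */
        for (i = 0; i < 4; i++)
                if (fork() == 0)
                        _exit(0);
        reap_jobs_polling();            /* or reap_jobs_blocking() */
        return (0);
}

With jobs that only run for a few milliseconds, the 10 msec per job
in the polling version is pure dead time that the blocking version
avoids entirely.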
During the make -j256, the load average went up to about 100, and
most of the cc1's reached a low (numerically high) priority very
quickly, especially on the second run when the load average was high
to start with (my version of SCHED_4BSD may affect how fast the
priority ramps up). An interactive process competing with these cc1's
has a very easy time getting scheduled to run, provided it is not a
bloated one that runs enough to gain a high priority itself. If it
runs as much as the cc1's, then it becomes just one of 257 processes
wanting to run, and it takes a very unfair scheduler to do much
better than run one per quantum (default 100 msec), and thus take a
default of 25.7 seconds to cycle back to the interactive one.

At least old versions of SCHED_4BSD had significant bugs that often
resulted in very unfair scheduling, which happened to favour
interactive processes but sometimes went the other way. The most
interesting one is still there :-( : sched_exit_thread() adds the
child's td_estcpu to the parent's td_estcpu. Combined with the
inheritance of td_estcpu on fork(), this makes td_estcpu exponential
in the number of reaped children, except that td_estcpu is clamped
to a maximum, so it quickly reaches that maximum (and td_priority
quickly reaches the minimum (numerically maximum) user priority)
after a few fork/exit/waits. The process doing the fork/waits is
often a shell, and its interactivity becomes bad when its priority
becomes low.

Between about 1995 and 2000, this bug was much worse. Then there was
no clamp, so td_estcpu was fully exponential in the number of
children, except that after about 32 doublings it overflowed to a
negative value. But before it became negative, it became large, so
its process gained the maximum priority and therefore found it hard
to run enough to create more children. This still happens with the
clamp, but "large" is now quite small and decays after a few seconds
or minutes. Without the clamp, the decay took minutes or hours, if
not days.

The doubling is fixed in my version by setting the parent's td_estcpu
to the maximum of the parent's and child's estcpu on exit. This risks
not inheriting enough (I now see a simple better method: add only the
part of the child's td_estcpu that was actually due to child activity
and is not just virtual cpu created on fork). The doubling was
originally implemented to improve interactivity, and it "worked"
bogusly by inhibiting forks. E.g., for -j 256, it would probably have
stopped make from forking long before it created 256 jobs. Now, with
the clamp, make will just take longer to create the 256 jobs, once
reaping a few of the first jobs it started has increased its
td_estcpu to more than theirs.
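(To make the growth concrete, here is a minimal userland sketch: the
clamp value is invented, and the kernel's scaling and decay of estcpu
are omitted, so only the shape of the curve matters.)

#include <stdio.h>

#define ESTCPU_CLAMP    1000    /* stands in for the kernel's limit */

/* sched_exit_thread() behaviour: add the child estcpu to the parent. */
unsigned
reap_sum(unsigned parent, unsigned child)
{
        unsigned e = parent + child;

        return (e > ESTCPU_CLAMP ? ESTCPU_CLAMP : e);
}

/* The fix described above: take the maximum of the two instead. */
unsigned
reap_max(unsigned parent, unsigned child)
{
        return (parent > child ? parent : child);
}

int
main(void)
{
        unsigned sum_ver = 1, max_ver = 1;
        int i;

        for (i = 1; i <= 12; i++) {
                /*
                 * fork() copies the parent's estcpu into the child.
                 * Assume the child does no work of its own, so all
                 * of its estcpu is virtual cpu created on fork.
                 */
                sum_ver = reap_sum(sum_ver, sum_ver);
                max_ver = reap_max(max_ver, max_ver);
                printf("%2d fork/exit/waits: sum %4u  max %u\n",
                    i, sum_ver, max_ver);
        }
        return (0);
}

The summing version doubles on every reap and hits the clamp after
about 10 fork/exit/waits, while the max version stays flat because
the child did no real work. The refinement above would add only the
part of the child's estcpu above what it inherited at fork, so
genuine child activity would still count against the parent.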
Well, I tried this under -current, but I only have SCHED_ULE handy to
test (on a FreeBSD cluster machine). -j256 didn't seem to be enough
to cause latency (even over the Pacific link). Interactivity remained
perfect with -j1512. The only noticeable difference (apart from the
8 cores) in top was that the load average barely reached 15 (instead
of 100), with a couple of hundred sleeping processes. 8 cores might
do that.

Bruce