Date: Fri, 17 Oct 2003 15:59:27 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Jeff Roberson
cc: current@freebsd.org
Subject: Re: More ULE bugs fixed.

On Wed, 15 Oct 2003, Jeff Roberson wrote:

> I fixed two bugs that were exposed due to more of the kernel running
> outside of Giant.  ULE had some issues with priority propagation that
> stopped it from working very well.
>
> Things should be much improved.  Feedback, as always, is welcome.  I'd
> like to look into making this the default scheduler for 5.2 if things
> start looking up.  I hope that scares you all into using it more. :-)

How would one test if it was an improvement on the 4BSD scheduler?  It
is not even competitive in my simple tests.

Test for scheduling buildworlds:

        cd /usr/src/usr.bin
        for i in obj depend all
        do
                MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
        done >/tmp/zqz 2>&1

(Run this with an empty /somewhere/obj.  The all stage doesn't quite
finish.)  On an ABIT BP6 system with a 400 MHz and a 366 MHz CPU, with
/usr (including /usr/src) nfs-mounted (over 100 Mbps ethernet from a
reasonably fast server) and /somewhere/obj ufs1-mounted (on a fairly
slow disk; no soft updates), this gives the following times:

SCHED_ULE-yesterday, with not so careful setup:
       40.37 real         8.26 user         6.26 sys
      278.90 real        59.35 user        41.32 sys
      341.82 real       307.38 user        69.01 sys

SCHED_ULE-today, run immediately after booting:
       41.51 real         7.97 user         6.42 sys
      306.64 real        59.66 user        40.68 sys
      346.48 real       305.54 user        69.97 sys

SCHED_4BSD-yesterday, with not so careful setup:
      [same as today, except the depend step was 10 seconds slower (real)]

SCHED_4BSD-today, run immediately after booting:
       18.89 real         8.01 user         6.66 sys
      128.17 real        58.33 user        43.61 sys
      291.59 real       308.48 user        72.33 sys

SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz CPU) with
many local changes and not so careful setup:
       17.39 real         8.28 user         5.49 sys
      130.51 real        60.97 user        34.63 sys
      390.68 real       310.78 user        60.55 sys

Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
obj and depend stages.  These stages have little parallelism.  SCHED_ULE
was only 19% slower for the all stage.  It apparently misses many
opportunities to actually run useful processes.  This may be related to
/usr being nfs-mounted.  There is lots of idling waiting for nfs even
in the SCHED_4BSD case.
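(Those ratios are just the "real" times above divided; a quick bc(1)
check, not part of the test itself, comparing the "today" runs of
SCHED_ULE and SCHED_4BSD:

        echo "scale=3; 41.51/18.89; 306.64/128.17; 346.48/291.59" | bc
        2.197
        2.392
        1.188
)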
The system times are smaller for SCHED_ULE, but this might not be
significant.  E.g., zeroing pages can account for several percent of
the system time in buildworld, but on unbalanced systems that have too
much idle time, most page zeroing gets done in idle time and doesn't
show up in the system time.

Test 1 for fair scheduling related to niceness:

        for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        do
                nice -$i sh -c "while :; do echo -n;done" &
        done
        top -o time

[Output deleted].  This shows only a vague correlation between niceness
and runtime for SCHED_ULE.  However, top -o cpu shows a strong
correlation between %CPU and niceness.  Apparently, %CPU is very
inaccurate and/or not enough history is kept for long-term scheduling
to be fair.

Test 5 for fair scheduling related to niceness:

        for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
        do
                nice -$i sh -c "while :; do echo -n;done" &
        done
        time top -o cpu

With SCHED_ULE, this now hangs the system, but it worked yesterday.
Today it doesn't get as far as running top, and it stops the nfs server
responding.  To unhang the system and see what the above does, run a
shell at rtprio 0 and start top before running the above, then use top
to kill processes (I normally use "killall sh" to kill all the shells
generated by tests 1-5, but killall doesn't work when it is itself on
the nfs mount and the nfs server is not responding).
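Something like this should do; the exact invocation here is only a
sketch (check rtprio(1) and top(1) for the precise syntax):

        rtprio 0 sh     # get a shell at realtime priority 0
        top -o cpu      # run top from that shell before starting the test;
                        # use its 'k' command (or "killall sh") to clean up

Bruce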