From owner-freebsd-stable@FreeBSD.ORG Thu Dec 15 00:42:07 2011
Date: Wed, 14 Dec 2011 16:42:05 -0800
From: Jeremy Chadwick
To: "O. Hartmann"
Cc: Tom Evans, Attilio Rao, George Mitchell, freebsd-stable@freebsd.org
Subject: Re: SCHED_ULE should not be the default
Message-ID: <20111215004205.GA11556@icarus.home.lan>
In-Reply-To: <4EE933C6.4020209@zedat.fu-berlin.de>
References: <4EE1EAFE.3070408@m5p.com> <4EE2AE64.9060802@m5p.com> <4EE88343.2050302@m5p.com> <4EE933C6.4020209@zedat.fu-berlin.de>

On Thu, Dec 15, 2011 at 12:39:50AM +0100, O.
Hartmann wrote:
> On 12/14/11 18:54, Tom Evans wrote:
> > On the other hand, we have very many benchmarks showing how poorly
> > 4BSD scales on things like postgresql. We get much more load out of
> > our 8.1 ULE DB and web servers than we do out of our 7.0 ones. It's
> > easy to look at what you do and say "well, what suits my environment
> > is clearly the best default", but I think there are probably more
> > users typically running IO bound processes than CPU bound processes.
>
> You compare SCHED_ULE on FBSD 8.1 with SCHED_4BSD on FBSD 7.0? Shouldn't
> you compare SCHED_ULE and SCHED_4BSD on the very same platform?

Agreed -- this is a bad comparison.

Again, I'm going to tell people to do the one thing that's painful and
nobody likes to do: *look at commits* and pay close attention to the
branches and any commits that involve "tagging" for a release (so you
can determine what "version" of the code you might be running).

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_4bsd.c

I'm a bit busy today, otherwise I would offer to go over the SCHED_4BSD
changes between 7.0-RELEASE and 8.1-RELEASE (I would need Tom to confirm
those are the exact versions being used; I wish people would stop saying
things like "FreeBSD x.y" because it's inaccurate). But the data is
there at the above URLs, including the committers/those involved.

> > I believe the correct thing to do is to put some extra documentation
> > into the handbook about scheduler choice, noting the potential issues
> > with loading NCPU+1 CPU bound processes. Perhaps making it easier to
> > switch scheduler would also help?

Replying to Tom's comment here:

It is already easy to switch schedulers. You change the option in your
kernel config, rebuild the kernel (world isn't necessary as long as you
haven't csup'd between your last rebuild and now), make installkernel,
shutdown -r now, done.
If what you're proposing is to make the scheduler changeable in
real-time, I think that would require a **lot** of work for something
that very few people would benefit from (please stop for a moment and
think about the majority of the userbase, not just niche environments; I
say this politely, not with any condescension BTW). Sure, it'd be "nice
to have", but should be extremely low on the priority list (IMO).

> Many people more expert in the issue than myself revealed some issues
> in the code of both SCHED_ULE and even SCHED_4BSD. It would be a pity
> if all the discussions get flushed away like a "toilet-business" as
> it has been done all the way in the past.

Gut feeling says this is what will happen, and that's because the people
who are touching (or have in the past touched) the scheduler bits are
not involved in this conversation. We're not going to get anywhere
unless those people are involved and are available to make
adjustments/etc.

I would love to start CC'ing them all, but I don't think that's
necessarily effective.

I will take the time to point out/remind folks that the people who
*truly understand* the schedulers are few and far between. We're talking
single-digit numbers, folks. And those people are already busy enough
as-is. This makes solving this problem difficult.

So, what I think WOULD be effective would be for someone to catalogue a
list of their systems/specifications/benchmarks/software/etc. that show
exactly where the problems are in their workspace when using ULE vs.
4BSD, or vice-versa. That may give the developers some leads as to how
to progress.

Let's also not forget about the compiler ordeal; gcc versions greatly
differ (some folks overwrite the default base gcc with ones in ports),
and then there's the clang stuff... Sigh.

> Well, I'd like to see a kind of "standardized" benchmark. Like on
> openbenchmark.org or at phoronix.com.
> I know that Phoronix' way of
> performing benchmarks is questionable and does not reveal much of the
> issues, but it is better than nothing.

I would love to run such benchmarks on all of our systems, but I have no
idea what kind of benchmark suites/etc. would be beneficial for the
developers who maintain/touch the schedulers. You understand what I'm
saying? For example, some folks earlier in the thread said the best
thing to do for this would be buildworld, but then further follow-ups
from others said buildworld is not effective given the I/O demands.

Furthermore, I want whatever benchmark/app suite thing to be minimal as
hell. It should be standalone, no dependencies (or only 1 or 2).

Regarding threading: a colleague of mine, a former co-worker who now
works at Apple as a developer, wrote a C program while he was at my
current workplace which -- pardon my French -- "beat the shit out of our
Solaris boxes, thread-wise". It was customisable via command-line. The
thing got some of our Solaris machines up to load averages of nearly
42000 (yes you read that right!), and spit out some benchmark-esque
results when finished.

I'll mention this thread to him, let him read it, and see if he has
anything to say. He is *extremely* busy (even more so with the holiday
coming up), so I have little faith he can/will help here, but he may
give the code for it if he still has it. I believe he did have me run it
on FreeBSD, but it was a long time ago.

> I'm always surprised by the worse
> performance of FreeBSD when it comes to threaded I/O. The differences
> between Linux and FreeBSD of the same development maturity are
> tremendous and scary!

Agreed. Linux has the upper hand in many areas, and this is one of them.
Please do not think this means "Linux is better, FreeBSD sucks". It
simply means that there are more people active/working on these things
in Linux. People need to use what works best for them!
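
To illustrate the "standalone, no dependencies" idea and the NCPU+1
CPU-bound case Tom raised, a minimal sketch might look like the
following. It assumes a stock Python 3 interpreter is an acceptable sole
dependency. It is not the C program mentioned above and is not an
agreed-upon benchmark. The names run_workers, summarize and
WORKER_SOURCE, and the loop count, are made up for illustration only.

```python
import os
import subprocess
import sys

# Hypothetical worker: a fixed amount of pure-CPU work, timed inside the child.
WORKER_SOURCE = """
import sys, time
n = int(sys.argv[1])
start = time.perf_counter()
total = 0
for i in range(n):
    total += i * i
print(time.perf_counter() - start)
"""


def run_workers(count, iterations):
    """Start `count` CPU-bound processes at once; return each one's elapsed seconds."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(iterations)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(count)
    ]
    times = []
    for proc in procs:
        out, _ = proc.communicate()  # waits for the child to exit
        times.append(float(out.strip()))
    return times


def summarize(times):
    """Return (fastest, slowest, slowest/fastest) for a list of elapsed times."""
    fastest, slowest = min(times), max(times)
    ratio = slowest / fastest if fastest > 0 else float("inf")
    return fastest, slowest, ratio


if __name__ == "__main__":
    ncpu = os.cpu_count() or 1
    iterations = 2_000_000  # arbitrary; raise it for longer, steadier runs
    for count in (ncpu, ncpu + 1):
        fastest, slowest, ratio = summarize(run_workers(count, iterations))
        print(f"{count} workers: min {fastest:.2f}s  max {slowest:.2f}s  "
              f"ratio {ratio:.2f}")
```

Run on the same machine under a SCHED_ULE kernel and then a SCHED_4BSD
kernel, the min/max spread at NCPU versus NCPU+1 workers would be one
small data point of the kind worth cataloguing. It says nothing about
I/O-bound or threaded workloads, and whether it is useful to the
scheduler developers is an open question.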
-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |