From owner-freebsd-stable@FreeBSD.ORG Fri Dec 23 00:23:30 2011
From: Adrian Chadd
To: Steve Kargl
Cc: freebsd-stable@freebsd.org, Andriy Gapon
Date: Thu, 22 Dec 2011 16:23:29 -0800
Subject: Re: SCHED_ULE should not be the default
In-Reply-To: <20111222194740.GA36796@troutmask.apl.washington.edu>
References: <4EE1EAFE.3070408@m5p.com> <20111215215554.GA87606@troutmask.apl.washington.edu> <20111222005250.GA23115@troutmask.apl.washington.edu> <20111222103145.GA42457@onelab2.iet.unipi.it> <20111222184531.GA36084@troutmask.apl.washington.edu> <4EF37E7B.4020505@FreeBSD.org> <20111222194740.GA36796@troutmask.apl.washington.edu>

On 22 December 2011 11:47, Steve Kargl wrote:

[snip]

Thank you for posting some actual measurements!

> There is the additional observation in one of my 2008
> emails (URLs have been posted) that if you have N+1
> cpu-bound jobs with, say, job0 and job1 ping-ponging
> on cpu0 (due to ULE's cpu-affinity feature) and if I
> kill job2 running on cpu1, then neither job0 nor job1
> will migrate to cpu1.  So, one now has N cpu-bound
> jobs running on N-1 cpus.

.. and this sounds like a pretty serious regression. Have you ever
filed a PR for it?

> Finally, my initial post in this email thread was to
> tell O. Hartman to quit beating his head against
> a wall with ULE (in an HPC environment).  Switch to
> 4BSD.  This was based on my 2008 observations and
> I've now wasted 2 days gather additional information
> which only re-affirms my recommendation.

I personally don't think this is time wasted. You've done something
that no one else has actually done - provided actual results from
real-life testing, rather than a hundred posts of "I remember seeing X,
so I don't use ULE."

If you can definitely and consistently reproduce that N-1 cpu-bound job
bug, you're now in a great position to easily test and re-report
KTR/schedtrace results to see what impact they have. Please don't
underestimate exactly how valuable this is.

How often are those two jobs migrating between CPUs?

How am I supposed to read "CPU load"? Why isn't it just sitting at 100%
the whole time?

Would you mind repeating this with 4BSD (the N+1 jobs) so we can see
how the jobs are scheduled/interleaved? Something tells me we'll see
the jobs being scheduled evenly.

Adrian
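
For anyone who wants to try the N+1 scenario themselves, here is a rough sketch of one way to set it up. It is not the test Steve ran; the names `WORKER` and `run_jobs` are made up for illustration. It starts a number of CPU-bound interpreter processes, optionally kills one halfway through, and reports how many loop iterations each survivor completed. If two jobs keep sharing a CPU after the kill while another CPU sits idle, their counts should come out noticeably lower than those of jobs that had a CPU to themselves. Which jobs end up sharing a CPU is up to the scheduler, so several runs may be needed.

```python
import subprocess
import sys
import time

# Hypothetical CPU-bound worker: spin for a fixed wall-clock time given in
# argv[1], then print the number of loop iterations completed.
WORKER = r"""
import sys, time
deadline = time.monotonic() + float(sys.argv[1])
n = 0
while time.monotonic() < deadline:
    for _ in range(10000):
        n += 1
print(n)
"""


def run_jobs(njobs, seconds, kill_index=None):
    """Run njobs spinners for `seconds`; optionally kill one halfway.

    Returns one entry per job: the iteration count it printed, or None
    for a job that was killed before it could report.
    """
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(seconds)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(njobs)
    ]
    if kill_index is not None:
        time.sleep(seconds / 2)
        procs[kill_index].kill()
    counts = []
    for p in procs:
        out, _ = p.communicate()  # waits for the process to exit
        counts.append(int(out) if out.strip() else None)
    return counts
```

On a machine with N CPUs, something like `run_jobs(N + 1, 60, kill_index=N)` approximates the case described above. The iteration counts are only a relative measure and include interpreter overhead, so compare jobs within a run rather than across runs or machines.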