From: David Xu
To: Bruce Evans
Cc: cvs-src@freebsd.org, src-committers@freebsd.org, cvs-all@freebsd.org
Date: Mon, 13 Nov 2006 20:48:07 +0800
Subject: Re: cvs commit: src/sys/kern sched_4bsd.c
Message-Id: <200611132048.08180.davidxu@freebsd.org>
In-Reply-To: <20061113193924.L75708@delplex.bde.org>
References: <200611111311.kABDBVNH042993@repoman.freebsd.org>
 <200611130717.03734.davidxu@freebsd.org>
 <20061113193924.L75708@delplex.bde.org>

On Monday 13 November 2006 17:58, Bruce Evans wrote:
> > It might not be a bug in NO_KSE; the problem is in sched_fork() and
> > sched_exit(). For a process which quickly fork()s a child that then
> > exits quickly, the parent's estcpu is quickly doubled too; this
> > fairness is really unfair,
>
> That can't be the problem, since there are no exits in the above.
>

I have tried the change I mentioned, and top runs quickly and the system
does not have the problem you described.

> > I think your example is that scenario; however, I don't know
> > why KSE works better. This might be fixed by remembering the inherited
> > estcpu in the child and decaying it every second. When the child exits,
> > it adds only the estcpu it really used to the parent. The code looks
> > like this:
> >
> > in sched_fork(), we remember the inherited estcpu:
> >   td->td_inherited_estcpu = parent->td_estcpu;
> > in schedcpu(), we decay it every second (should be fixed in sched_wakeup
> > too):
> >   td->td_inherited_estcpu = decaycpu(loadfac, td->td_inherited_estcpu);
> > in sched_exit():
> >   parent->td_estcpu = ESTCPULIM(parent->td_estcpu +
> >       childtd->td_estcpu - childtd->td_inherited_estcpu);
> >
> > This should fix the quick fork() and exit() problem for the parent
> > process.
>
> I've known about this bug since Peter Dufault told me about it in late
> 1999, and now use the code at the end of this mail to avoid it. However,
> I remembered it incorrectly and may have misdescribed it to you. I
> thought I remembered actual doubling, with estcpu soon reaching
> "infinity", but the ESTCPULIM() clamp prevents it getting preposterously
> high now, and I couldn't find any version that let it reach "infinity".
> Versions before late 1999 had a bogus limit of UCHAR_MAX and that may
> have been responsible for shells appearing to hang because it was a
> better approximation to "infinity".
>
> I now use the following:
>
> % Index: sched_4bsd.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/kern/sched_4bsd.c,v
> % retrieving revision 1.41
> % diff -u -2 -r1.41 sched_4bsd.c
> % --- sched_4bsd.c 21 Jun 2004 23:47:47 -0000 1.41
> % +++ sched_4bsd.c 8 Dec 2005 11:11:52 -0000
> % @@ -550,9 +641,20 @@
> %
> %  void
> % -sched_exit_ksegrp(struct ksegrp *kg, struct ksegrp *child)
> % +sched_exit_ksegrp(struct ksegrp *parent, struct ksegrp *child)
> %  {
> %
> %      mtx_assert(&sched_lock, MA_OWNED);
> % -    kg->kg_estcpu = ESTCPULIM(kg->kg_estcpu + child->kg_estcpu);
> % +    /*
> % +     * XXX adding all of the child's cpu to the parent's like we used to
> % +     * do would be wrong, since we duplicate the parent's cpu at fork
> % +     * time so adding it all back would give exponential growth.  In
> % +     * practice, the growth would have been limited by ESTCPULIM, but that
> % +     * would be wrong too since it is very nonlinear.  Splitting the cpu
> % +     * at fork time would be better, but adding it all back here would
> % +     * still give nonlinearities since multiple processes tend to
> % +     * accumulate more cpu than single ones.
> % +     */
> % +    if (parent->kg_estcpu < child->kg_estcpu)
> % +        parent->kg_estcpu = child->kg_estcpu;
> %  }
> %
>
> This seems to work well enough in practice. It grows the parent's estcpu
> quite slowly if there are a lot of fork/exits.
>

Yes, I knew about the patch.

> Previous versions did something different on fork too. Splitting or
> otherwise reducing estcpu on fork isn't such a good idea since it
> reduces the limit on the real resource hogs -- all the children, when
> there are lots of children that all want to run. When the children
> don't exit, hacking on the parent's estcpu doesn't help, and doubling
> the child's estcpu on fork and halving it on exit is closer to being
> correct than the reverse.
>
> At least one of Peter Dufault's versions removed all explicit accesses
> to p_estcpu on fork and exit. I think the change on fork is only
> cosmetic -- p_estcpu should have been automatically copied on fork.
>
> Anyway, this isn't the bug in non-KSE. I didn't look hard for the
> reasons. Top seemed to show the priorities of the hogs not decreasing
> (numerically increasing) fast enough.
>

I still cannot find the bug, although I have read all the changes several
times.

> Bruce
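
For reference, the three exit-time policies discussed in this thread can be
compared with a small user-space sketch. This is not the kernel code: the
clamp value SKETCH_ESTCPU_MAX, the enum, and simulate_fork_exit() are all
hypothetical names made up for illustration, and the sketch assumes each
child exits before any schedcpu() decay runs, so the "inherited" amount is
never decayed.

```c
#include <assert.h>

/* Hypothetical clamp; the real ESTCPULIM() is derived from scheduler constants. */
#define SKETCH_ESTCPU_MAX 255u

static unsigned
sketch_estcpulim(unsigned e)
{
	return (e > SKETCH_ESTCPU_MAX ? SKETCH_ESTCPU_MAX : e);
}

enum exit_policy {
	EXIT_ADD_ALL,	/* old code: parent += child */
	EXIT_TAKE_MAX,	/* quoted patch: parent = max(parent, child) */
	EXIT_ADD_DELTA	/* proposal: parent += child - inherited */
};

/*
 * Simulate nforks back-to-back fork()+exit() cycles.  Each child copies
 * the parent's estcpu at fork, accumulates child_work more, then exits.
 * Returns the parent's estcpu after the last exit.
 */
unsigned
simulate_fork_exit(enum exit_policy policy, unsigned parent_estcpu,
    unsigned child_work, int nforks)
{
	assert(nforks >= 0);
	parent_estcpu = sketch_estcpulim(parent_estcpu);
	for (int i = 0; i < nforks; i++) {
		/* fork: the child starts with a copy of the parent's estcpu */
		unsigned inherited = parent_estcpu;
		unsigned child_estcpu = sketch_estcpulim(inherited + child_work);

		/* exit: fold the child's estcpu back into the parent */
		switch (policy) {
		case EXIT_ADD_ALL:
			parent_estcpu = sketch_estcpulim(parent_estcpu + child_estcpu);
			break;
		case EXIT_TAKE_MAX:
			if (parent_estcpu < child_estcpu)
				parent_estcpu = child_estcpu;
			break;
		case EXIT_ADD_DELTA:
			parent_estcpu = sketch_estcpulim(parent_estcpu +
			    (child_estcpu - inherited));
			break;
		}
	}
	return (parent_estcpu);
}
```

Starting from estcpu 10 with one unit of work per child, four cycles under
EXIT_ADD_ALL follow p' = 2p + 1 (10, 21, 43, 87, 175) and hit the clamp soon
after, while the other two policies grow the parent by only one unit per
cycle. Under these simplified assumptions the max and delta policies give the
same result; they would differ once children run long enough for decay to
matter.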