From owner-freebsd-current@FreeBSD.ORG Mon May 10 00:17:39 2004 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F24E516A4CE for ; Mon, 10 May 2004 00:17:38 -0700 (PDT) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id BFF1943D2D for ; Mon, 10 May 2004 00:17:37 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.0.86])i4A7HQ4u017712; Mon, 10 May 2004 17:17:26 +1000 Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i4A7HNI2014852; Mon, 10 May 2004 17:17:25 +1000 Date: Mon, 10 May 2004 17:17:23 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Julian Elischer In-Reply-To: Message-ID: <20040510162831.X1105@gamplex.bde.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: FreeBSD current users Subject: Re: exit1()/scheduler question.. possible typo? X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 May 2004 07:17:39 -0000 On Sun, 9 May 2004, Julian Elischer wrote: > in exit1 there is an assertion to test that the exiting process > is not init. (proc 1 by tradition..) > > yes later, nearly at the end we see: > > /* > * Allow the scheduler to adjust the priority of the > * parent when a kseg is exiting. > */ > if (p->p_pid != 1) > sched_exit(p->p_pptr, td); > > > firstly, the comment is wrong but, the question comes.. > "if init can not get here then why have the test?" This code rotted from differently broken code in wait1(). In RELENG_4, the corresponding code is: % if (p->p_stat == SZOMB) { % /* charge childs scheduling cpu usage to parent */ % if (curproc->p_pid != 1) { % curproc->p_estcpu = % ESTCPULIM(curproc->p_estcpu + p->p_estcpu); % } Here there are 2 processes involved (the parent (curproc) doing the wait, and the child (p) being reaped), and the check for the parent is correct. It is needed to avoid clobbering init's estcpu by adding the child's estcpu to it. [Bugs in this code begin with English usage errors in the comment and end with algorithmic brokenness. Adding the child's estcpu gives an estcpu and thus a priority that wants to be exponential in the number of children, but the clamps on estcpu and the priority prevent complete brokenness. See revs.1.4 and 1.6 of kern_exit.c and associated changes in kern_fork.c for the initial reimplementation of this bug (it was first implemented in FreeBSD-1, and caused mysterious shell hangs esprecially before rev.1.6). This is now handled better by SCHED_ULE in -current and by SCHED_4BSD in my version of -current.] Anyway, the code has moved from wait1() to exit1() and sched_exit(). The SCHED_4BSD version of sched_exit() + sched_exit_ksegrp() is still similar. It even preseves most of the English usage errors in the comment. The move to exit1() is dubious. A special case for the init parent still seems like a good idea, but the parent is not known known until wait() and often changes to init after exit(). However, this part of the bug is harmless provided p->p_pptr is locked down (which it doesn't seem to be). We can just accumulate estcpu (or something different for SCHED_ULE) in the current parent, and then throw it away (or otherwise special-case it) when the parent (or another ancestor) is reaped by init. > I get the impression that possibly this should be p->p_pptr->p_ppid p_pid > but I don't know enough about ULE to know if that makes sense. > > (maybe it should be (p->p_pptr != initproc) My version already used initproc here and elsewhere, but was missing the fix for the LHS. The call to sched_exit() was moved from wait1() to exit1() without special mention in: % RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v % Working file: kern_exit.c % head: 1.229 % ... % ---------------------------- % revision 1.209 % date: 2003/04/11 03:39:07; author: jeff; state: Exp; lines: +7 -11 % - Adjust sched hooks for fork and exec to take processes as arguments instead exit (?) % of ksegs since they primarily operation on processes. % - KSEs take ticks so pass the kse through sched_clock(). % - Add a sched_class() routine that adjusts a ksegrp pri class. % - Define a sched_fork_{kse,thread,ksegrp} and sched_exit_{kse,thread,ksegrp} % that will be used to tell the scheduler about new instances of these % structures within the same process. These will be used by THR and KSE. % - Change sched_4bsd to reflect this API update. % ---------------------------- The move seems to be just a cleanup/optimization/bug. It makes the sched* call match the caller's name, avoids some locking, and adds the bug. Bruce