From owner-cvs-all@FreeBSD.ORG Wed Feb 11 16:12:50 2004 Return-Path: Delivered-To: cvs-all@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5AA5416A4CE; Wed, 11 Feb 2004 16:12:50 -0800 (PST) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id AD12D43D1D; Wed, 11 Feb 2004 16:12:49 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])i1C0CmLE024029; Thu, 12 Feb 2004 11:12:48 +1100 Received: from gamplex.bde.org (katana.zip.com.au [61.8.7.246]) i1C0CiEJ007044; Thu, 12 Feb 2004 11:12:45 +1100 Date: Thu, 12 Feb 2004 11:12:44 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Nate Lawson In-Reply-To: <20040210104736.M81968@root.org> Message-ID: <20040212102818.Q83287@gamplex.bde.org> References: <200402061930.i16JUCpa011145@repoman.freebsd.org> <20040210101904.C50462@gamplex.bde.org> <20040209164309.L75509@root.org> <200402101120.11815.jhb@FreeBSD.org> <20040210104736.M81968@root.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: cvs-all@FreeBSD.org cc: cvs-src@FreeBSD.org cc: src-committers@FreeBSD.org cc: Tim Robbins cc: John Baldwin Subject: Re: cvs commit: src/sys/kern kern_resource.c X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Feb 2004 00:12:50 -0000 On Tue, 10 Feb 2004, Nate Lawson wrote: > On Tue, 10 Feb 2004, John Baldwin wrote: > > On Monday 09 February 2004 07:45 pm, Nate Lawson wrote: > > > On Tue, 10 Feb 2004, Bruce Evans wrote: > > > > Postponing it for a quantum wouldn't help much (it would either give > > > > very inaccurate times or cost the same as now), but postponing it > > > > forever might help. exit1() would have to read the current time (this > > > > is needed anyway to set switchtime), but there is no need to calculate > > > > the resource usage unless an ancestor actually uses > > > > getrusage(RUSAGE_CHILDREN, ...). ruadd() would add tick counts instead > > > > of times and the RUSAGE_CHILDREN case in getrusage() would call calcru() > > > > to scale the tick counts like the RUSAGE_SELF case already does. This > > > > would be a tiny optimization but has another advantage: calcru() has > > > > rounding errors that accumulate in p_cru and delaying calcru() would > > > > prevent them accumuating. > > > > > > I think this is an excellent idea. I suggested postponing because I > > > thought the data was used for CPU quotas also, but if it's only used by > > > getrusage(), your idea should work and make the caller pay the cost of > > > units conversion. > > > > Are one of you two going to implement that then? > > I am having trouble even finding time for ACPI at least through the end of > the month so I'm not taking on any new tasks right now. Perhaps phk can > add this task to the JKH page for one of our new recruits to pick up. :) I will do it. Parts of the hackish version are: %%% Index: sys/resourcevar.h =================================================================== RCS file: /home/ncvs/src/sys/sys/resourcevar.h,v retrieving revision 1.40 diff -u -2 -r1.40 resourcevar.h --- sys/resourcevar.h 6 Feb 2004 19:35:14 -0000 1.40 +++ sys/resourcevar.h 11 Feb 2004 22:38:40 -0000 @@ -53,4 +49,21 @@ struct rusage p_ru; /* stats for this proc */ struct rusage p_cru; /* sum of stats for reaped children */ + /* + * The next 7 struct members extend p_cru. The times in p_ru and + * p_cru are never up to date. The actual times are kept in the + * runtimes and tick counts (with control info in the "previous" + * times), and are converted when userland asks for rusage info. + * For p_ru, the actual times are in the proc struct but for p_cru + * they are here. Backwards compatibility prevents putting them + * in the user-visible rusage struct, but we should put them in + * a new kernel rusage struct. + */ + struct bintime p_cruntime; + u_int64_t p_cuticks; + u_int64_t p_csticks; + u_int64_t p_citicks; + u_int64_t p_cuu; + u_int64_t p_csu; + u_int64_t p_ciu; struct itimerval p_timer[3]; /* virtual-time timers */ #define pstat_endzero pstat_startcopy @@ -108,4 +121,5 @@ +void calccru(struct proc *p, struct timeval *up, struct timeval *sp); void calcru(struct proc *p, struct timeval *up, struct timeval *sp, struct timeval *ip); int chgproccnt(struct uidinfo *uip, int diff, int max); int chgsbsize(struct uidinfo *uip, u_int *hiwat, u_int to, Index: kern/kern_exit.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v retrieving revision 1.220 diff -u -2 -r1.220 kern_exit.c --- kern/kern_exit.c 4 Feb 2004 21:52:55 -0000 1.220 +++ kern/kern_exit.c 11 Feb 2004 22:03:04 -0000 @@ -111,5 +110,6 @@ exit1(struct thread *td, int rv) { + struct bintime new_switchtime; struct proc *p, *nq, *q; struct tty *tp; struct vnode *ttyvp; @@ -421,15 +421,5 @@ } - /* - * Save exit status and final rusage info, adding in child rusage - * info and self times. - */ PROC_LOCK(p); - p->p_xstat = rv; - *p->p_ru = p->p_stats->p_ru; - mtx_lock_spin(&sched_lock); - calcru(p, &p->p_ru->ru_utime, &p->p_ru->ru_stime, NULL); - mtx_unlock_spin(&sched_lock); - ruadd(p->p_ru, &p->p_stats->p_cru); /* @@ -509,14 +498,35 @@ wakeup(p->p_pptr); PROC_UNLOCK(p->p_pptr); - cnt.v_swtch++; - binuptime(PCPU_PTR(switchtime)); + + /* Do the same timestamp bookkeeping that mi_switch() would do. */ + binuptime(&new_switchtime); + bintime_add(&p->p_runtime, &new_switchtime); + bintime_sub(&p->p_runtime, PCPU_PTR(switchtime)); + PCPU_SET(switchtime, new_switchtime); PCPU_SET(switchticks, ticks); + /* + * Save exit status and finalize rusage info except for times, + * adding in child rusage info. + */ + p->p_xstat = rv; + *p->p_ru = p->p_stats->p_ru; + bintime_add(&p->p_runtime, &p->p_stats->p_cruntime); + p->p_uticks += p->p_stats->p_cuticks; + p->p_sticks += p->p_stats->p_csticks; + p->p_iticks += p->p_stats->p_citicks; + p->p_uu += p->p_stats->p_cuu; + p->p_su += p->p_stats->p_csu; + p->p_iu += p->p_stats->p_ciu; + ruadd(p->p_ru, &p->p_stats->p_cru); + + cnt.v_swtch++; cpu_sched_exit(td); /* XXXKSE check if this should be in thread_exit */ + /* * Allow the scheduler to adjust the priority of the * parent when a kseg is exiting. */ - if (p->p_pid != 1) + if (p != initproc) sched_exit(p->p_pptr, p); @@ -621,6 +635,8 @@ } if (uap->rusage) { - bcopy(p->p_ru, &ru, sizeof(ru)); + ru = *p->p_ru; + calccru(p, &ru.ru_utime, &ru.ru_stime); PROC_UNLOCK(p); + /* XXX copyout with proctree_lock. */ if ((error = copyout(&ru, uap->rusage, sizeof (struct rusage)))) { @@ -668,4 +684,11 @@ PROC_UNLOCK(p); PROC_LOCK(q); + bintime_add(&q->p_stats->p_cruntime, &p->p_runtime); + q->p_stats->p_cuticks += p->p_uticks; + q->p_stats->p_csticks += p->p_sticks; + q->p_stats->p_citicks += p->p_iticks; + q->p_stats->p_cuu += p->p_uu; + q->p_stats->p_csu += p->p_su; + q->p_stats->p_ciu += p->p_iu; ruadd(&q->p_stats->p_cru, p->p_ru); PROC_UNLOCK(q); %%% calccru() is like calcru() except for the child. It only requires the proc lock. The above makes calcru() and calccru() very rarely called, so their efficiency doesn't matter much. I quite haven't succeeded with the original goal getting rid of sched_lock for the whole of calcru(), so interrupt latency is still a problem. The explicit output parameters can now always point to local variables, but there are implicit output parameters in the proc pointer (the monotonization times). The proc lock is enough to protect them, but calcru() can't simply aquire the proc lock since sched_lock may be held. Bruce