From owner-freebsd-arch@FreeBSD.ORG Wed May 30 19:04:27 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A18D616A421 for ; Wed, 30 May 2007 19:04:27 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 5842513C45E for ; Wed, 30 May 2007 19:04:27 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l4UJ4OHO074016 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Wed, 30 May 2007 15:04:25 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Wed, 30 May 2007 12:04:17 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Bruce Evans In-Reply-To: <20070530201618.T13220@besplex.bde.org> Message-ID: <20070530115752.F661@10.0.0.1> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 May 2007 19:04:27 -0000 Ok, hopefully we're in the home stretch here. Patch is at the same location. I did the following: 1) Renamed p_ru back. I erroneously thought some code was accessing pstats indirectly through here. That's why I changed the name. 2) Cleaned up some more comments, casts, style bugs, etc. 3) Fixed a problem where we migh've called callout_init_mtx() multiple times. 4) Renamed ruxcollect since it doesn't do the same thing as rucollect(). 5) Resorted lim_cb() which now holds the sched_lock over much less code but is certainly safe as is. Hopefully you will find this to your liking. I intend to fix some more of the races in a follow-up commit when I change the scheduler locking. Sorry for the top post. Jeff On Wed, 30 May 2007, Bruce Evans wrote: > On Tue, 29 May 2007, Jeff Roberson wrote: > >> I have updated the patch at: >> >> http://people.freebsd.org/~jeff/rusage3.diff > > New minor points about an even later update: > > % Index: kern/kern_exit.c > % =================================================================== > % RCS file: /usr/home/ncvs/src/sys/kern/kern_exit.c,v > % retrieving revision 1.298 > % diff -u -p -r1.298 kern_exit.c > % --- kern/kern_exit.c 14 May 2007 22:21:58 -0000 1.298 > % +++ kern/kern_exit.c 29 May 2007 21:38:21 -0000 > % ... > % @@ -229,7 +233,7 @@ retry: > % */ > % EVENTHANDLER_INVOKE(process_exit, p); > % % - MALLOC(p->p_ru, struct rusage *, sizeof(struct rusage), > % + MALLOC(ru, struct rusage *, sizeof(struct rusage), > % M_ZOMBIE, M_WAITOK); > % /* > % * If parent is waiting for us to exit or exec, > > Perhaps this should not be micro-optimized for space (allocate it inside > struct proc for the whole process lifetime). Alternatively, inherit it > from the first thread in the process that exits. > > % Index: kern/kern_resource.c > % =================================================================== > % RCS file: /usr/home/ncvs/src/sys/kern/kern_resource.c,v > % retrieving revision 1.171 > % diff -u -p -r1.171 kern_resource.c > % --- kern/kern_resource.c 27 May 2007 20:50:23 -0000 1.171 > % +++ kern/kern_resource.c 29 May 2007 22:06:05 -0000 > % @@ -619,6 +620,47 @@ setrlimit(td, uap) > % return (error); > % } > % % +static void > % +lim_cb(void *arg) > % +{ > % + struct rlimit rlim; > % + struct thread *td; > % + struct proc *p; > > Unsorted decls. > > % + > % + p = (struct proc *)arg; > > Unnecessary cast. > > % + PROC_LOCK_ASSERT(p, MA_OWNED); > % + /* > % + * Check if the process exceeds its cpu resource allocation. If > % + * it reaches the max, arrange to kill the process in ast(). > % + */ > % + mtx_lock_spin(&sched_lock); > % + FOREACH_THREAD_IN_PROC(p, td) > % + ruxcollect(&p->p_rux, td); > % + if (p->p_cpulimit != RLIM_INFINITY && > > This should just retun if the limit is infinity. > > p_cpulimit is still marked as locked by sched_lock, but proc locking > should be enough now, so sched locking (or whatever it becomes) is > not needed for the early test and retrun. The early return rarely > happens but it gives simpler code as well as saving time when it > happens. > > % + p->p_rux.rux_runtime > p->p_cpulimit * cpu_tickrate()) { > % + lim_rlimit(p, RLIMIT_CPU, &rlim); > % + if (p->p_rux.rux_runtime >= rlim.rlim_max * cpu_tickrate()) { > > Hmm, the tick rate conversion bug gives more than wrong stats. Here it > shoots processs. E.g., suppose you have a cpulimit of 1 hour and you > throttle the CPU to 8 times slower. Processes that have run for > 7.5 > minutes at the old (higher) tick rate are then killed here. > > % ... > % + if (p->p_cpulimit != RLIM_INFINITY) > % + callout_reset(&p->p_limco, hz, lim_cb, (void *)p); > > Unnecessary cast. > > % Index: kern/kern_synch.c > % =================================================================== > % RCS file: /usr/home/ncvs/src/sys/kern/kern_synch.c,v > % retrieving revision 1.295 > % diff -u -p -r1.295 kern_synch.c > % --- kern/kern_synch.c 18 May 2007 07:10:45 -0000 1.295 > % +++ kern/kern_synch.c 29 May 2007 21:35:40 -0000 > % @@ -401,35 +401,17 @@ mi_switch(int flags, struct thread *newt > % ... > % /* > % * Compute the amount of time during which the current > % - * process was running, and add that to its total so far. > % + * thread was running, and add that to its total so far. > % */ > % new_switchtime = cpu_ticks(); > % - p->p_rux.rux_runtime += (new_switchtime - PCPU_GET(switchtime)); > % - p->p_rux.rux_uticks += td->td_uticks; > % - td->td_uticks = 0; > % - p->p_rux.rux_iticks += td->td_iticks; > % - td->td_iticks = 0; > % - p->p_rux.rux_sticks += td->td_sticks; > % - td->td_sticks = 0; > % - > % + td->td_runtime += (new_switchtime - PCPU_GET(switchtime)); > > Clean further by removing unnecesary parens. > > % td->td_generation++; /* bump preempt-detect counter */ > % - > % - /* > % - * Check if the process exceeds its cpu resource allocation. If > % - * it reaches the max, arrange to kill the process in ast(). > % - */ > % - if (p->p_cpulimit != RLIM_INFINITY && > % - p->p_rux.rux_runtime >= p->p_cpulimit * cpu_tickrate()) { > % - p->p_sflag |= PS_XCPU; > % - td->td_flags |= TDF_ASTPENDING; > % - } > % - > % /* > % * Finish up stats for outgoing thread. > % */ > > Clean further by merging the "start" and "finish" code (so that > switchtime is cleared immediately after it is copied, etc). There is > nothing in between any more. > > Bruce >