Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 8 Jun 2007 12:22:31 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-arch@freebsd.org
Cc:        Attilio Rao <attilio@freebsd.org>
Subject:   Re: Updated rusage patch
Message-ID:  <200706081222.32457.jhb@freebsd.org>
In-Reply-To: <20070607135511.P606@10.0.0.1>
References:  <20070529105856.L661@10.0.0.1> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thursday 07 June 2007 04:59:03 pm Jeff Roberson wrote:
> On Wed, 6 Jun 2007, Jeff Roberson wrote:
> 
> > On Tue, 5 Jun 2007, Bruce Evans wrote:
> >
> >> 
> >> This can probably be fixed more simply by calling rufetch() to reset the
> >> time state in threads as a side effect.  Do this before resetting the
> >> state in the process.
> >
> > Ok, I agree with bde here, just call rufetch and this will clear each 
thread, 
> > and then you can clear the rux in the proc.
> >
> > I'd like to make a list of the remaining problems with rusage and 
potential 
> > fixes.  Then we can decide which ones myself and attilio will resolve 
> > immediately to clean up some of the effect of the sched lock changes.
> >
> > 1)  The ruadd() in thread_exit() is not safe since we're accessing another 
> > thread's unlocked rusage structure.  Potential solution is to allocate 
p_ru 
> > as part of the proc struct and add into there, which will be protected by 
the 
> > PROC_SLOCK, which bde seemed to like better anyway.
> >
> > 2)  We may lose information between exit1() and thread_exit() due to the 
way 
> > p_ru is initialized before we're done exiting.  There also seems to be a 
race 
> > where wait() operates on a process before it's done in thread_exit() which 
> > means wait may return rusage information without the child added in!  The 
> > solution will be to fix this race, and then access p_ru directly in 
wait().
> 
> The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 
> and 2 as well as the p_runtime iniitialization problem.  This moves the 
> collection of child rusage back into exit1() and changes the exiting 
> threads to accumulate their rusage into p_ru under protection of the 
> process spinlock.  This also removes the gross lock/unlock of proc slock 
> (formerly sched_lock) from wait and implements something more sensible.

I think the comment explaining the race still needs to be there for future 
code readers so they don't remove the locking.  I also don't see what you 
gain by moving the lock earlier.

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200706081222.32457.jhb>