From owner-freebsd-hackers@FreeBSD.ORG  Tue Jan  3 17:14:02 2012
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D76D4106566B;
	Tue,  3 Jan 2012 17:14:02 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id AD9068FC16;
	Tue,  3 Jan 2012 17:14:02 +0000 (UTC)
Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [96.47.65.170])
	by cyrus.watson.org (Postfix) with ESMTPSA id 65D8746B3B;
	Tue,  3 Jan 2012 12:14:02 -0500 (EST)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id C3FE9B91E;
	Tue,  3 Jan 2012 12:14:01 -0500 (EST)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Tue, 3 Jan 2012 12:13:54 -0500
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p8; KDE/4.5.5; amd64; ; )
References: <4E3CC033.6070604@rawbw.com> <4E3D808F.1030101@rawbw.com>
	<201108160925.20568.jhb@freebsd.org>
In-Reply-To: <201108160925.20568.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201201031213.54336.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Tue, 03 Jan 2012 12:14:01 -0500 (EST)
Cc: Yuri <yuri@rawbw.com>, Alexander Best <arundel@freebsd.org>
Subject: Re: top(1) loses process user time count when threads end
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Jan 2012 17:14:02 -0000

On Tuesday, August 16, 2011 9:25:20 am John Baldwin wrote:
> On Saturday, August 06, 2011 1:57:35 pm Yuri wrote:
> > On 08/06/2011 02:11, Alexander Best wrote:
> > > On Fri Aug  5 11, Yuri wrote:
> > >> I have the process that first runs in 3 threads but later two active
> > >> threads exit.
> > >>
> > >> top(1) shows this moment this way (1 sec intervals):
> > >> 30833 yuri            3  76    0  4729M  4225M nanslp  4   0:32 88.62% app
> > >> 30833 yuri            3  76    0  4729M  4225M nanslp  6   0:34 90.92% app
> > >> 30833 yuri            1  96    0  4729M  4225M CPU1    1   0:03  1.17% app
> > >> 30833 yuri            1  98    0  4729M  4226M CPU1    1   0:04 12.89% app
> > >>
> > >> Process time goes down: 0:34 ->  0:03. Also WCPU goes down 90.92% ->
> > >> 1.17% even though this process is CPU bound and does intense things
> > >> right after threads exit.
> > >>
> > >> getrusage(2) though, called in the process, shows the correct user time.
> > >>
> > >> I think this is a major bug in the process time accounting.
> > > could you check, whether kern/128177 or kern/140892 describe your situation?
> > 
> > I have ULE scheduler. kern/128177 talks about single thread with ULE 
> > scheduler, and my issue is with threads. So I am not sure if it is 
> > related. There has been no motion on kern/128177 since Feb 9, 2009.
> > kern/140892 is probably the same as mine.
> > 
> > In any case, both these PRs have to be fixed since they are very user 
> > visible, not just some obscure issues.

Actually, I now think I know what this is.  This is probably fixed now by the
kernel changes in revision 188764 and my changes to top in 224062.  I think
what happened before is that top(1) "lost" the runtime of exited threads
because it used to sum up the runtime of the currently executing threads to
get the process' runtime.  Now it uses the kernel's value for the process
runtime, which should include both exited threads and currently running
threads.  I can't tell from your message how recent your kernel/world are,
though, so I don't know whether you have both of these changes.
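
To illustrate the difference between the two ways of computing the runtime,
here is a toy model in Python.  It is only a sketch, not the actual top(1) or
kernel code: the Process class, its fields, and both runtime functions are
hypothetical names, and the per-thread split of the 34 seconds is invented to
mirror the 0:34 -> 0:03 drop in the output quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy model of a process; every name here is hypothetical."""
    # thread id -> accumulated runtime in seconds, for threads still alive
    live_threads: dict = field(default_factory=dict)
    # runtime folded in from threads that have already exited
    exited_runtime: int = 0

    def thread_exit(self, tid):
        # When a thread exits, keep its runtime in the process-wide total.
        self.exited_runtime += self.live_threads.pop(tid)


def runtime_old(proc):
    # Old approach: sum only the threads that currently exist,
    # so the time of exited threads silently disappears.
    return sum(proc.live_threads.values())


def runtime_new(proc):
    # New approach: a process-wide value that covers both exited
    # and currently running threads.
    return proc.exited_runtime + sum(proc.live_threads.values())


# Three threads with an assumed split of 3 + 16 + 15 = 34 seconds.
proc = Process(live_threads={1: 3, 2: 16, 3: 15})
before = (runtime_old(proc), runtime_new(proc))

# The two busy worker threads exit; only thread 1 remains.
proc.thread_exit(2)
proc.thread_exit(3)
after = (runtime_old(proc), runtime_new(proc))

print("before exit:", before)
print("after exit: ", after)
```

In this model both computations agree while all three threads are alive, but
once the two workers exit the summed value falls to the surviving thread's 3
seconds while the process-wide value stays at 34.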

-- 
John Baldwin