From owner-freebsd-questions@FreeBSD.ORG Fri Nov 28 12:01:30 2008 Return-Path: Delivered-To: freebsd-questions@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 431091065673 for ; Fri, 28 Nov 2008 12:01:30 +0000 (UTC) (envelope-from fbsd.questions@rachie.is-a-geek.net) Received: from mail.rachie.is-a-geek.net (rachie.is-a-geek.net [66.230.99.27]) by mx1.freebsd.org (Postfix) with ESMTP id C07438FC1F for ; Fri, 28 Nov 2008 12:01:29 +0000 (UTC) (envelope-from fbsd.questions@rachie.is-a-geek.net) Received: from localhost (mail.rachie.is-a-geek.net [192.168.2.101]) by mail.rachie.is-a-geek.net (Postfix) with ESMTP id C71C4AFCF5D; Fri, 28 Nov 2008 03:01:28 -0900 (AKST) From: Mel To: freebsd-questions@freebsd.org Date: Fri, 28 Nov 2008 13:01:25 +0100 User-Agent: KMail/1.9.7 References: <49202F59.7050802@zzz.ee> <492FC589.2090001@FreeBSD.org> <492FCD6A.4050103@zzz.ee> In-Reply-To: <492FCD6A.4050103@zzz.ee> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200811281301.26721.fbsd.questions@rachie.is-a-geek.net> Cc: Kris Kennaway , Ott =?iso-8859-1?q?K=F6stner?= Subject: Re: top incorrectly reporting process time X-BeenThere: freebsd-questions@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: User questions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Nov 2008 12:01:30 -0000 On Friday 28 November 2008 11:52:26 Ott K=F6stner wrote: > Kris Kennaway wrote: > > Ott K=F6stner wrote: > >> Here is the problem. top returns only 9:42 -- the time of the first > >> thread. ps returns the sum of the thread times, which is correct. > > > > OK, I thought you were claiming the numbers were completely unrelated. > > Yeah, top and ps are just reporting different things (runtime of one > > thread vs total runtime of all threads). top may have a configuration > > option about aggregating the thread runtimes, or it may require a > > source code change. > > > > Kris > > Now I understand -- there are actually 2 separate problems -- one with > top code, which is the same on all systems and another porblem wihth ps > on two amd64 machines I am running: > > First computer FreeBSD 7.1-PRERELEASE #2 (exact copy / paste) : > > # ps -ax|grep mysql; echo; ps -axH|grep mysql > 981 con- IW 0:00.00 /bin/sh /usr/local/bin/mysqld_safe > --defaults-extra-file=3D/var/db/mysql/my.c > 1019 con- S 98:00.87 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 12266 p0 S+ 0:00.00 grep mysql > > 981 con- IW 0:00.00 /bin/sh /usr/local/bin/mysqld_safe > --defaults-extra-file=3D/var/db/mysql/my.c > 1019 con- S 0:01.21 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 1019 con- S 0:00.11 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 1019 con- I 3:13.38 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 1019 con- I 0:00.08 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 1019 con- S 0:00.02 /usr/local/libexec/mysqld > --defaults-extra-file=3D/var/db/mysql/my.cnf --base > 12268 p0 R+ 0:00.00 grep mysql > > 3.xx minutes seems to be correct here, not 98. > > Second computer FreeBSD 7.1-PRERELEASE #3 (exact copy / paste): > > # ps -ax|grep mysql; echo; ps -axH|grep mysql > 1015 con- I 0:00.01 /bin/sh /usr/local/bin/mysqld_safe > --defaults-extra-file=3D/var/db/mysql/my.c > 1079 con- S 582:49.60 [mysqld] > > 1015 con- I 0:00.01 /bin/sh /usr/local/bin/mysqld_safe > --defaults-extra-file=3D/var/db/mysql/my.c > 1079 con- S 2:00.40 [mysqld] > 1079 con- I 0:00.00 [mysqld] > 1079 con- I 0:01.32 [mysqld] > 1079 con- I 0:47.04 [mysqld] > 1079 con- I 0:03.56 [mysqld] > 1079 con- S 0:26.43 [mysqld] > 1079 con- S 3:13.97 [mysqld] > 1079 con- S 4:12.72 [mysqld] > 1079 con- S 0:03.72 [mysqld] > > 582 minutes is clearly wrong. Not if it's the sum of all threads that lived and died during the lifetime = of=20 the process. It's value is taken from the kernel's idea of the runtime. Wit= h=20 KERN_PROC_INC_THREAD set, it will look at the thread storage for active=20 threads, including the 'main()' thread. I haven't looked into detail, but I suspect when a thread dies it gets adde= d=20 to process runtime, and is stored nowhere else. Aside from the different machines, you also took 2 different daemons, which= =20 fits this: named uses a static thread pool, by design, sum(nthreads) will equal the=20 process time mysqld uses a dynamic thread pool, sum(nthreads) is really=20 sum(nthreads_active). I haven't looked into detail =2D-=20 Mel Problem with today's modular software: they start with the modules and never get to the software part.