Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 8 Oct 2005 01:49:41 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Bruce Evans <bde@FreeBSD.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/usr.bin/gprof printgprof.c
Message-ID:  <20051008011750.E58285@delplex.bde.org>
In-Reply-To: <200510071059.j97AxgfL051248@repoman.freebsd.org>
References:  <200510071059.j97AxgfL051248@repoman.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 7 Oct 2005, Bruce Evans wrote:

>  (2) If a function has a zero call count but has a nonzero self or child
>      time, then print its total self time in the self time per call
>      column as a percentage of the total (self + child) time.  It is
>      not possible to print the times per call in this case because the
>      call count is zero.  Previously, this was handled by leaving both
>      per-call columns blank.  The self time is printed in another column
>      but there was no way to recover the total time.
> ...
>  (2) improves mainly cases like kernel threads.  Most kernel threads
>  appear to be never called because they are always started before
>  userland can run to turn on profiling.  As for main(), the fact that
>  they are called is not very interesting and their callers are
>  uninteresting, but their relative self time is interesting since they
>  are long-running.

gprof output for hi-res (should be normal) kernel profiling now looks
like this:

% granularity: each sample hit covers 16 byte(s) for 0.00% of 65.15 seconds
% 
%   %   cumulative   self              self     total 
%  time   seconds   seconds    calls  ns/call  ns/call  name 
%  86.8     56.584   56.584   103196   548312   548412  sleepq_add [4]
%   3.1     58.573    1.990        0  100.00%           mcount [10]
%   2.3     60.085    1.512      381  3967867  3967867  cpu_idle_default [15]
%   1.8     61.244    1.159        0  100.00%           mexitcount [16]
%   0.8     61.741    0.497        0  100.00%           cputime [23]
%   0.5     62.063    0.321        0  100.00%           user [29]
%   0.3     62.247    0.184        9 20480664 31839385  vmtotal [34]
%   0.3     62.421    0.174    60381     2876     2876  sse_pagecopy [42]
%   0.2     62.526    0.105   108567      970      970  cpu_switch [48]
% [Only lines with self% shown from here on -- mostly for kernel threads]
%   0.0     65.037    0.002        0   12.21%           bintr [120]
%   0.0     65.098    0.001        0  100.00%           __mcount [316]
%   0.0     65.107    0.001        0    0.48%           ithread_loop [39]
%   0.0     65.111    0.001        0   60.26%           scheduler [293]
%   0.0     65.126    0.001        0  100.00%           __mexitcount [361]
%   0.0     65.147    0.000        0    0.11%           random_kthread [77]
%   0.0     65.150    0.000        0    0.24%           sched_sync [156]
%   0.0     65.151    0.000        0    0.00%           idle_proc [13]
%   0.0     65.152    0.000        0    0.15%           buf_daemon [197]
%   0.0     65.152    0.000        0    0.13%           vnlru_proc [196]
%   0.0     65.152    0.000        0    0.01%           schedcpu_thread [60]
%   0.0     65.153    0.000        0    0.12%           vm_pageout [283]
%   0.0     65.153    0.000        0    0.10%           poll_idle [284]
%   0.0     65.153    0.000        0    0.00%           btrap [1]

I forget what this was profiling.  It was for a kernel long ago and
far away from -current.

All threads that make a function call show up as having a self% (even
if the percentage is 0, since the profiling clock is an i386 TSC so
the unscaled counts are nonzero since a function call takes more than
0 cycles).  Leaf functions for profiling support always have a self
usage of 100% because they are not profiled in the normal way so they
appear to have no callers (and being leaves, have no children).  "user"
(userland) has a self usage of 100% because its callers are not seen
and its children are handled specially under Xsyscall, btrap and bintr
for historical and efficiency reasons.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20051008011750.E58285>