Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 8 Oct 2008 22:09:24 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-net@freebsd.org, freebsd-questions@freebsd.org, freebsd-threads@freebsd.org
Subject:   Re: kernel profiling: spinlock_exit consumes 36% CPU time.
Message-ID:  <20081008210104.S20625@delplex.bde.org>
In-Reply-To: <200810070938.04673.jhb@freebsd.org>
References:  <004001c92871$fdec0a10$01000001@china.huawei.com> <200810070938.04673.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--0-242184313-1223467764=:20625
Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

On Tue, 7 Oct 2008, John Baldwin wrote:

> On Tuesday 07 October 2008 07:44:00 am =C7=F1=BD=A3 wrote:
>> Hi, folks,
>>
>> I did kernel profiling when a single thread client sends UDP packets to =
a
>> single thread server on the same machine.
>>
>> In the output kernel profile, the first few kernel functions that consum=
es
>> the most CPU time are listed below:
>>
>> granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 second=
s
>>
>>   %   cumulative   self              self     total
>>  time   seconds   seconds    calls  ms/call  ms/call  name
>>  42.4      10.88    10.88        0  100.00%           __mcount [1]
>>  36.1      20.14     9.26 17937541     0.00     0.00  spinlock_exit [4]
>>   4.2      21.22     1.08  3145728     0.00     0.00  in_cksum_skip [40]
>>   1.8      21.68     0.45  7351987     0.00     0.00  generic_copyin [43=
]
>>   1.1      21.96     0.29  3146028     0.00     0.00  generic_copyout [4=
8]
>>   1.0      22.21     0.24  2108904     0.00     0.00  Xint0x80_syscall [=
3]
>>   0.8      22.42     0.21  6292131     0.00     0.00  uma_zalloc_arg [46=
]
>>   0.8      22.62     0.20  1048576     0.00     0.00  soreceive_generic =
[9]
>>
>> It is very strange that spinlock_exit consumes over 36% CPU time while i=
t
>> seems a very simple function.
>
> It's because the intr_restore() re-enables interrupts and the resulting t=
ime
> spent executing the handlers for any pending interrupts are attributed to
> spinlock_exit().

This is one of many defects that are not present in high resolution
kernel profiling (kgmon -B instead of kgmon -b; availaible on amd64
and i386).  However, high resolution kernel profiling doesn't work
right with SMP, and was completely broken by gcc-4.  Ordinary profiling
was less completely broken by gcc-4, and you can recover the old
behaviour by turning off new optimizations (mainly -funit-at-a-time
and/or -finline-functions-called-once and or all of -O2).

Bruce
--0-242184313-1223467764=:20625--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20081008210104.S20625>