Date:      Tue, 26 Feb 2013 17:57:48 +0200
From:      Alexander Motin <mav@FreeBSD.org>
To:        attilio@FreeBSD.org
Cc:        svn-src-projects@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r247319 - in projects/calloutng/sys: kern sys
Message-ID:  <512CDB7C.5050905@FreeBSD.org>
In-Reply-To: <CAJ-FndAGgOMa5n4DH1S1rz4gMwWo9WdXNa57YMcDPKpV%2BnQT%2BQ@mail.gmail.com>
References:  <201302261525.r1QFPhLt058080@svn.freebsd.org> <CAJ-FndDunwgsV4FVFNhDcmQm8L21YMrGANjh8caUqL-v4qvhDQ@mail.gmail.com> <512CD8D7.60306@FreeBSD.org> <CAJ-FndAGgOMa5n4DH1S1rz4gMwWo9WdXNa57YMcDPKpV%2BnQT%2BQ@mail.gmail.com>

On 26.02.2013 17:49, Attilio Rao wrote:
> On Tue, Feb 26, 2013 at 4:46 PM, Alexander Motin <mav@freebsd.org> wrote:
>> On 26.02.2013 17:28, Attilio Rao wrote:
>>> On Tue, Feb 26, 2013 at 4:25 PM, Alexander Motin <mav@freebsd.org> wrote:
>>>> Author: mav
>>>> Date: Tue Feb 26 15:25:43 2013
>>>> New Revision: 247319
>>>> URL: http://svnweb.freebsd.org/changeset/base/247319
>>>>
>>>> Log:
>>>>   Optimize callout_process() to use fewer variables and fewer conditions to
>>>>   implement the same logic.  Now it fits better into CPU registers and,
>>>>   according to PMC, significantly reduces the number of resource stalls,
>>>>   reducing the CPU time it consumes during the usleep(1) benchmark by 30%.
>>>
>>> Is that all due to better use of i-cache capacity and better dynamic
>>> branch prediction (hwpmc has counters for both, FWIW)?
>>
>> I don't think I-cache capacity is significant there, as the loop is quite
>> small. I believe it was branch misprediction, complicated by the additional
>> latency of memory accesses. I haven't analyzed the cause more deeply, as the
>> PMC man pages are neither the most informative nor the easiest reading.
> 
> Well, the I-cache is really very small, so I think you may also get some
> improvement for the function you were trying to optimize.
> You can get all the counter descriptions by running: pmccontrol -L
> From there you may find some hwpmc counters showing i-cache and dynamic
> branch prediction miss statistics.

I've noticed that even without any branching changes, removing one
variable, which allowed the compiler to reuse the register (checked in the
assembler output), gave a measurable result. I think that would not happen
if the cause were on the instruction-fetch side. But sure, I'll continue
experimenting with HWPMC.

-- 
Alexander Motin
