Date:      Sun, 19 Oct 2003 17:49:50 +0200
From:      Shill <devnull@kma.eu.org>
To:        Bruce M Simpson <bms@spc.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Why is PCE not set in CR4?
Message-ID:  <3F92B29E.7090604@kma.eu.org>
In-Reply-To: <20031018170010.GG7662@saboteur.dek.spc.org>
References:  <3F7AA0D8.1080801@kma.eu.org> <20031018170010.GG7662@saboteur.dek.spc.org>

>> I have read the perfmon documentation and source code. For several 
>> reasons, I do not think it is totally adequate in my situation.
>> 
>> It was designed in 1996 with the Pentium Pro in mind, which, 
>> apparently, only has two performance counters:
>>
>>  #define NPMC 2
>>  if (pmc < 0 || pmc >= NPMC) return EINVAL;
>> 
>> Assume I get perfmon to work with my K7's 4 performance-monitoring 
>> counters. Since PCE is not set, I am not allowed to call RDPMC from 
>> ring 3. I have to make a system call, just to read the counters.
> 
> I've since read over perfmon and some notes on using performance monitoring
> counters in "The Indispensable PC Hardware Book".
> 
> It looks to me as though perfmon *will* do what you want. There isn't
> really any need to reinvent the wheel. If you want to configure *all* your
> PMCs to read particular events, then the best way to do this is as follows:
> 
> Conditionalise the PMC allocation code in perfmon.c to use a boot-time
> tunable, or an int, which is set by the identcpu.c code. Allocate the PMC
> structures in perfmon.c at boot-time (or preferably module init time).
> 
> Then, add the necessary code to perfmon_init() and a new writectlXX()
> function pertaining to the particular Athlon you're using.
> 
>> I will pay in terms of computation overhead to process a system 
>> call, instead of a single instruction. But more importantly, it will 
>> wreck the cache, and possibly the TLB.
>> 
>> There is no point in monitoring an event if the monitoring tools 
>> disturb the environment too much.
> 
> Ignore the patch I sent previously. perfmon is i386 specific anyway, so
> hacking perfmon.c is acceptable. What I would suggest instead is to add
> two new ioctls to perfmon to do this.
> 
>  PMIOGPCE get pce bit value on current CPU
>  PMIOSPCE set pce bit value on current CPU (if superuser)
> 
> This will allow you to set PMC enable on and off for the uniprocessor
> case OK, and let you use RDPMC from ring 3. This is not valid for the
> SMP case, however.
> 
> Unless you can achieve CPU binding (not affinity) with one of the current
> scheduler(s) then reading the counters is likely to yield useless results
> if your code spins across CPUs in an SMP system.
> 
> An IPI of some kind will be necessary if you want to tell all processors
> to turn on their PCE bit at the same time. peter@freebsd.org is a good
> guy to ask about this sort of thing.
> 
> I'd like to know how you're progressing with this.

Hello Bruce,

I must confess that I am somewhat intimidated by the prospect of
hacking the FreeBSD kernel, as I've never done it. There are several
things inside perfmon.c which I do not quite understand.

For now, I have taken the "quick and dirty" route (shame on me): I
wrote a tiny kernel module which sets PCE in CR4 and writes 4 values
to my Athlon's 4 event select registers.
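
To make the bit manipulation concrete, here is a minimal sketch of
the CR4 part, not the module itself. PCE is bit 8 of CR4. The helper
names are hypothetical and only operate on a plain integer, so they
can be checked in user space; in the kernel, I assume the value would
come from rcr4() and be written back with load_cr4().

```c
#include <stdint.h>

/* CR4.PCE is bit 8: when set, RDPMC may be executed from ring 3. */
#define CR4_PCE 0x00000100u

/*
 * Hypothetical helpers working on a copy of the CR4 value.
 * In a kernel module the assumed usage would be something like
 *     load_cr4(cr4_with_pce(rcr4()));
 */
static uint32_t
cr4_with_pce(uint32_t cr4)
{
	return (cr4 | CR4_PCE);		/* set PCE, keep all other bits */
}

static uint32_t
cr4_without_pce(uint32_t cr4)
{
	return (cr4 & ~CR4_PCE);	/* clear PCE, keep all other bits */
}

static int
cr4_pce_is_set(uint32_t cr4)
{
	return ((cr4 & CR4_PCE) != 0);
}
```

On SMP, each CPU has its own CR4, so this would have to be done on
every processor.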

I would be happy to take this opportunity to contribute to FreeBSD,
and code something nice which makes it into the kernel. I might just
need a little help along the way. Can I send my questions to the list?

Terry Lambert said:

  PCE counters are a scarce resource, and the kernel needs
  to run interference on their allocation and deallocation
  by user space applications, to avoid collisions between
  applications; this is the same reason we have AGP and
  sound card device drivers in the kernel.

It seems to me that every application should get its own set of
performance-monitoring counters and event select registers, the same
way each application gets its own set of general-purpose registers.
Each application can then monitor itself, without any interference
from other applications.

In other words, and in my opinion, the kernel should save and
restore PMCs and event select registers for each application.
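
A rough sketch of what I mean, as a user-space simulation rather than
real kernel code. The structure and function names are hypothetical,
and the "hardware" is just another structure standing in for one
CPU's registers; real code would use RDMSR/WRMSR at context-switch
time.

```c
#include <stdint.h>
#include <string.h>

#define NPMC 4	/* the K7 has four counters */

/* One set of performance registers: either a CPU's or a saved copy. */
struct pmc_regs {
	uint64_t evtsel[NPMC];	/* event select registers */
	uint64_t counter[NPMC];	/* performance counters */
};

/* Copy the CPU's registers into the outgoing process's saved state. */
static void
pmc_save(const struct pmc_regs *cpu, struct pmc_regs *saved)
{
	memcpy(saved, cpu, sizeof(*saved));
}

/* Load the incoming process's saved state into the CPU's registers. */
static void
pmc_restore(struct pmc_regs *cpu, const struct pmc_regs *saved)
{
	memcpy(cpu, saved, sizeof(*cpu));
}

/* What the context switch would do on one CPU. */
static void
pmc_switch(struct pmc_regs *cpu, struct pmc_regs *prev,
    const struct pmc_regs *next)
{
	pmc_save(cpu, prev);
	pmc_restore(cpu, next);
}
```

With this scheme, a process which counts 10 events on one CPU, is
switched out, and later counts 7 more on another CPU ends up with 17
in its saved counter, regardless of what ran in between.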

At work, I use Linux/IA-64 and Stephane Eranian's excellent
performance monitor framework. I believe he patched the Linux kernel
to save and restore the performance registers.

http://www.hpl.hp.com/research/linux/perfmon/perfmon.php4

This also makes the SMP situation easier: even if a process runs on
different CPUs, since the performance registers are restored, the
numbers do make sense.

Now, let me remind you all that I don't have any experience with
kernel hacking, so maybe I am overlooking some serious hurdles?

Random question: perfmon makes its functions available through ioctl
requests. Could I change that to system calls, or would that have
serious drawbacks?

[ Please note that the kma.eu.org domain is a spam honeypot. Nobody
reads the mail sent to it. I have subscribed to the list. ]

Shill


