Date:      Sat, 17 Jul 2010 17:25:00 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        freebsd-sparc64@FreeBSD.org
Subject:   Re: [RFC] Event timers on sparc64/sun4v
Message-ID:  <20100717152459.GU4706@alchemy.franken.de>
In-Reply-To: <4C40D6F5.6070208@FreeBSD.org>
References:  <4C404018.6040405@FreeBSD.org> <20100716213503.GT4706@alchemy.franken.de> <4C40D6F5.6070208@FreeBSD.org>

On Sat, Jul 17, 2010 at 01:02:29AM +0300, Alexander Motin wrote:
> Marius Strobl wrote:
> 
> > - using the stick instead of the tick counter for machines with CPUs
> >   and thus tick counters running at different speeds has turned out
> >   to be suboptimal, probably due to the fact that the 12.5MHz the
> >   stick counters typically are driven by don't provide sufficient
> >   granularity.  
> 
> On x86 ACPI HPET timers often run about 15MHz, i8254 - about 1.2MHz.
> What's wrong with 12.5MHz here?

When using the stick counter instead of the tick one on machines
consisting of CPUs running at the same speed, everything seems fine
except that the top(1) TIME output is implausible. Given that with
this setup the only difference between using the stick and the tick
counter is the frequency at which the counter is driven, my best
bet is that the stick counter just doesn't provide sufficient
granularity.
Using the stick counter on machines consisting of CPUs running at
different speeds (well, actually all the combinations of using
stick/tick for hardclock, timecounter, CPU ticker and cycle
counter I tried, as they didn't appear totally wrong) additionally
has the problem of processes getting killed because they are
diagnosed as having exceeded their maximum CPU limit. This happens
although, as far as the MD initialization is concerned, the in-tree
code should use only the timecounter provided by the host-PCI-bridge
for this calculation when the stick counter drives hardclock.
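
To put numbers on the granularity point above, here is a minimal
sketch of the arithmetic. Only the 12.5MHz figure comes from this
thread; the function names are made up for illustration, and any CPU
clock or hz value you feed in for comparison is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Period of one counter increment in picoseconds.
 * Hypothetical helper, not part of the in-tree code.
 */
static uint64_t
counter_period_ps(uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return (UINT64_C(1000000000000) / freq_hz);
}

/*
 * Number of counter increments elapsing during one clock tick
 * of 1/hz seconds.  Hypothetical helper as well.
 */
static uint64_t
counts_per_tick(uint64_t freq_hz, uint64_t hz)
{
	assert(hz > 0);
	return (freq_hz / hz);
}
```

With a 12.5MHz stick counter, counter_period_ps(12500000) yields
80000, i.e. 80ns per increment, whereas a tick counter driven by an
assumed 1GHz CPU clock would resolve 1ns.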

> 
> >   Thus the more desireable variant for these machines
> >   probably is to provide the tick counter of the BSP as the only
> >   non-per-CPU timer and forward it to the APs via IPIs. 
> 
> It would be possible if timer was programmable from any CPU. But as I
> understand - it require thread to be binded, which handled by
> infrastructure only for per-CPU timer.

Wouldn't it be sufficient to bind curthread to the BSP in
tick_et_start() in that case? For one-shot mode this probably
is too much overhead (assuming a tickless kernel), but for
periodic mode this approach should be sound IMO.
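
The binding idea above could look roughly like the following userspace
model. This is only a sketch: every name in it is hypothetical, the
model_sched_bind()/model_sched_unbind() stubs merely stand in for the
scheduler's sched_bind()/sched_unbind(), and it assumes the BSP has
CPU ID 0.

```c
#include <assert.h>

#define	MODEL_BSP_CPUID	0	/* assumption: the BSP is CPU 0 */

static int model_cur_cpu = 2;		/* CPU the calling thread runs on */
static int model_bound = 0;		/* non-zero while thread is bound */
static int model_programmed_on = -1;	/* CPU whose tick timer got armed */

/* Stand-in for sched_bind(): migrate the caller and pin it to a CPU. */
static void
model_sched_bind(int cpu)
{
	model_cur_cpu = cpu;
	model_bound = 1;
}

/* Stand-in for sched_unbind(): allow the caller to migrate again. */
static void
model_sched_unbind(void)
{
	model_bound = 0;
}

/* Arm the tick timer of whatever CPU we are currently running on. */
static void
model_program_tick(void)
{
	model_programmed_on = model_cur_cpu;
}

/* Model of a start method which always arms the timer on the BSP. */
static int
model_tick_et_start(void)
{
	model_sched_bind(MODEL_BSP_CPUID);
	model_program_tick();
	model_sched_unbind();
	return (0);
}
```

Each start pays for a migration to the BSP and back, which is why
this looks acceptable for periodic mode but costly for one-shot mode.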

> 
> >   This also
> >   leaves the stick counter of all >= US-III machines generally
> >   available for driving statclock, which likely is also desirable.
> 
> It would be nice, but I don't know how separate their interrupts.

I think this should be possible in the soft interrupt dispatch.
However, meanwhile it occurred to me that there was a problem
with using the stick counter on US-IIIi machines (which can also
only consist of CPUs running at the same frequency, though).

> 
> > - I'd like to keep the tick grace check as this caused problems in
> >   the past. Probably tick_et_start() just should return an error
> >   in this case.
> 
> I think it would be nice to move it to MI code. MI code knows about base
> frequency, so theoretically can adapt to it. May be we could fetch some
> additional info there, if needed. ACPI HPET timer also defines minimal
> reliable period, so solution could/should be common.

Fine

> 
> > - I don't like wasting CPU cycles for determining whether the
> >   workaround for BlackBird CPUs is needed (assuming #1 above is
> >   implemented so distinguishing stick/tick is no longer needed)
> >   with every (s)tick interrupt which are a lot as this just won't
> >   ever change during runtime, i.e. I'd like to keep the different
> >   interrupt handlers which are set up as needed.
> 
> Does it worth code duplication? Won't it be always cached/ predicted/
> prefetched? I have doubt that difference can ever be measured, as this
> function is minor part of things done on interrupt.

I wouldn't be surprised if these branches actually made a measurable
difference; e.g. moving the update of the PIL counter from before
calling the tick interrupt handler to afterwards reduced the delay
until it's called by 30% on average on a US-II SMP machine, in turn
resulting in a steadier clock and less drift which needs
compensation (see r157825). Besides, the code is already there, just
don't nuke it :)
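
To illustrate the design being defended here, a minimal userspace
sketch of choosing the handler once at setup time instead of testing
for the workaround on every interrupt. All names are hypothetical and
the handler bodies are reduced to counters; this is not the in-tree
code.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*model_intr_handler_t)(void);

static unsigned long model_plain_calls;
static unsigned long model_bbwar_calls;

/* Handler variant for CPUs which don't need the workaround. */
static void
model_tick_intr_plain(void)
{
	model_plain_calls++;
}

/* Handler variant including the (here omitted) BlackBird workaround. */
static void
model_tick_intr_bbwar(void)
{
	model_bbwar_calls++;
}

static model_intr_handler_t model_tick_intr_handler = NULL;

/* Decide once, at setup time, which variant will be installed. */
static void
model_tick_setup(int needs_bbwar)
{
	model_tick_intr_handler = needs_bbwar ?
	    model_tick_intr_bbwar : model_tick_intr_plain;
}

/* Per-interrupt path: an indirect call, no workaround check. */
static void
model_tick_intr_dispatch(void)
{
	assert(model_tick_intr_handler != NULL);
	model_tick_intr_handler();
}
```

The decision is made once in model_tick_setup(), so the per-interrupt
path carries no condition whose outcome cannot change at runtime.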

Marius



