From owner-freebsd-current@FreeBSD.ORG Mon Aug 30 05:54:51 2010
Date: Mon, 30 Aug 2010 00:24:58 -0500
From: Brandon Gooch
To: Alexander Motin
Cc: freebsd-hackers@freebsd.org, FreeBSD-Current
Subject: Re: One-shot-oriented event timers management
In-Reply-To: <4C7A5C28.1090904@FreeBSD.org>
References: <4C7A5C28.1090904@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1
List-Id:
Discussions about the use of FreeBSD-current

2010/8/29 Alexander Motin:
> Hi.
>
> I would like to present my new work on the timer management code.
>
> In my previous work I mostly focused on reimplementing existing
> functionality in a better way. The result seemed not bad, but after
> looking at the prospects of using event timers in one-shot (aperiodic)
> mode I understood that the complexity of that code made it hardly
> possible. So I had to cut it down significantly and rewrite it with a
> new approach, which is instead primarily oriented toward using timers
> in one-shot mode. Since some systems have only periodic timers, I have
> kept that functionality, though it is slightly limited.
>
> The new management code implements two modes of operation: one-shot
> and periodic. The specific mode to be used depends on hardware
> capabilities and can be controlled.
>
> In one-shot mode hardware timers are programmed to generate a single
> interrupt precisely at the time of the next wanted event. This is done
> by comparing the current binuptime with the next scheduled times of
> system events (hard-/stat-/profclock). This approach has several
> benefits: event timer precision is now irrelevant for system
> timekeeping; hard- and statclocks are not aliased, even though only
> one timer is used for them; and, most important, it allows us to
> define which events we really want to handle and exactly when,
> without strict dependence on fixed hz, stathz and profhz periods.
> Sure, our callout system depends heavily on the hz value, but now at
> least we can skip interrupts when we have no callouts to handle at
> that time. Later we can go further.
>
> Periodic mode now also uses similar principles for scheduling events.
> But a timer running in periodic mode is simply unable to handle
> arbitrary events, and event timers may not be synchronized to the
> system timecounter and may drift from it, causing jitter. So I have
> used the timer events themselves as the time source for scheduling.
> As a result, the periodic timer runs at a fixed frequency that is a
> multiple of the hz rate, while statclock and profclock are generated
> by dividing it accordingly. (If somebody would tell me that hardclock
> jitter is not really a big problem, I would happily rip that
> artificial timekeeping out of there to simplify the code.)
> Unfortunately, this approach makes it impossible to use two event
> timers to completely separate hard- and statclocks any more, but as I
> have said, this mode is required only for the limited set of systems
> without one-shot capable timers. Judging by my recent experience with
> different platforms, that is not a big fraction.
>
> The management code still handles both per-CPU and global timers.
> Per-CPU timer usage is obvious. A global timer is programmed to handle
> the needs of all CPUs. In periodic mode the global timer generates
> periodic interrupts to one CPU, and the management code then
> redistributes them, using IPIs, to the CPUs that really need them. In
> one-shot mode the timer is always programmed to handle the first
> scheduled event throughout the system. When that interrupt arrives, it
> is likewise redistributed to the CPUs that want it with IPIs.
>
> To demonstrate the features that can be obtained from such high
> flexibility, I have incorporated the idea and some parts of the
> dynamic ticks patches of Tsuyoshi Ozawa. Now, when a CPU goes down
> into the C2/C3 ACPI sleep state, that CPU stops scheduling
> hard-/stat-/profclock events until the next registered callout event.
> If the CPU is woken before that time by some unrelated interrupt, the
> missed ticks are called artificially (this is needed for now to keep
> realistic system stats). After the system is up to date, the
> interrupt is handled.
> For now this is implemented only for ACPI systems with C2/C3 state
> support, because ACPI resumes the CPU with interrupts disabled, which
> makes it possible to catch up on missed time before an interrupt
> handler or some other process (in case of an unexpected task switch)
> may need it. As far as I can see, Linux does similar things at the
> beginning of every interrupt handler.
>
> I have actively tested this code for a few days on my amd64 Core2Duo
> laptop and i386 Core-i5 desktop system. With C2/C3 states enabled,
> the systems experience only about 100-150 interrupts per second with
> HZ set to 1000. These events are mostly caused by several
> event-greedy processes in our tree. I have traced and hacked several
> of the most aggressive ones in this patch:
> http://people.freebsd.org/~mav/tm6292_idle.patch .
> It allowed me to get down to as low as 50 interrupts per second
> system-wide, including IPIs! Here is the output of `systat -vm 1`
> from my test system:
> http://people.freebsd.org/~mav/systat_w_oneshot.txt .
> Obviously, with additional tuning the results can be improved even
> more.
>
> My latest patch against 9-CURRENT can be found here:
> http://people.freebsd.org/~mav/timers_oneshot4.patch
>
> Comments, ideas and suggestions are welcome!
>
> Thanks to all who read this. ;)

Totally awesome work mav@!

One thing I see:

Where is *frame pointing? It isn't initialized in the function, so...

+static int
+handleevents(struct bintime *now, int fake)
 {
+	struct trapframe *frame;
+	struct pcpu_state *state;
+	uintfptr_t pc;
+	int usermode;
+	int done;
-	if (doconfigtimer(0))
-		return (FILTER_HANDLED);
-	return (hardclockhandler(frame));
+	done = 0;
+#ifdef KDTRACE_HOOKS
+	/*
+	 * If the DTrace hooks are configured and a callback function
+	 * has been registered, then call it to process the high speed
+	 * timers.
+	 */
+	if (cyclic_clock_func[curcpu] != NULL)
+		(*cyclic_clock_func[curcpu])(frame);
+#endif

Also, for those of us testing, should we "reset" our timer settings
back to defaults and work from there[1] (that is, should we still be
futzing around with timer event sources, kern.hz, etc.)?

Thanks again for tackling these tough but important issues. I'm very
much looking forward to testing this out!

-Brandon

[1] http://wiki.freebsd.org/TuningPowerConsumption
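
For anyone following the quoted description of one-shot mode, here is a minimal standalone sketch of the idea as I read it: take the earliest of the pending hard-/stat-/profclock deadlines and program the timer for the distance from "now" to that point. This is not code from the patch. The names (oneshot_delay, EV_*) and the use of plain nanosecond counters instead of struct bintime are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds, named after the clocks in the quoted mail. */
enum { EV_HARDCLOCK, EV_STATCLOCK, EV_PROFCLOCK, EV_COUNT };

/*
 * Return the delay, in nanoseconds, until the earliest pending event.
 * next[] holds absolute deadlines on the same time base as now.
 * A deadline that is already due yields 0, so the caller can fire
 * the event immediately instead of programming the timer.
 */
static uint64_t
oneshot_delay(uint64_t now, const uint64_t next[EV_COUNT])
{
	uint64_t earliest;
	int i;

	assert(next != NULL);
	earliest = next[0];
	for (i = 1; i < EV_COUNT; i++)
		if (next[i] < earliest)
			earliest = next[i];
	return (earliest > now ? earliest - now : 0);
}
```

The point of the sketch is only that the three clocks no longer need a shared fixed period: whichever deadline comes first decides when the single timer fires.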
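
The "missed ticks are called artificially" step after a C2/C3 wakeup can be pictured with a small standalone sketch as well. Again, this is my own illustration and not the patch's logic: catch_up is a hypothetical helper, time is a plain counter in arbitrary units, and the real code works on per-CPU state and struct bintime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Given the deadline of the next periodic event (*next), its period,
 * and the current time, return how many events were missed while the
 * CPU was idle and advance *next to the first deadline after now.
 * A deadline exactly equal to now counts as due.
 */
static uint64_t
catch_up(uint64_t *next, uint64_t period, uint64_t now)
{
	uint64_t missed;

	assert(next != NULL);
	assert(period > 0);
	if (now < *next)
		return (0);		/* nothing was missed */
	missed = (now - *next) / period + 1;
	*next += missed * period;
	return (missed);
}
```

A caller would replay the returned number of ticks for statistics before letting the interrupt that woke the CPU proceed, which matches the ordering described in the quoted text.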