From: Hans Petter Selasky
Date: Fri, 05 Jun 2015 09:09:26 +0200
To: FreeBSD Current
Subject: [CFR] Replacing while loops with proper division and multiplication
Message-ID: <55714B26.6060802@selasky.org>

Hi,

I was going through some timer code and found some unnecessary while loops in kern/kern_clocksource.c. I added some prints and found that during boot, "runs" can exceed 2000, while during regular usage runs is typically 1.

Do you think it is worth converting these loops into divisions and multiplications? It might run a tiny bit faster through the CPU pipeline, since some conditional branches are skipped, and it might also improve readability. What do you think?

--HPS

> Index: kern/kern_clocksource.c
> ===================================================================
> --- kern/kern_clocksource.c	(revision 283606)
> +++ kern/kern_clocksource.c	(working copy)
> @@ -155,10 +155,11 @@
>  handleevents(sbintime_t now, int fake)
>  {
>  	sbintime_t t, *hct;
> +	sbintime_t runs;
>  	struct trapframe *frame;
>  	struct pcpu_state *state;
>  	int usermode;
> -	int done, runs;
> +	int done;
>  
>  	CTR3(KTR_SPARE2, "handle at %d: now %d.%08x",
>  	    curcpu, (int)(now >> 32), (u_int)(now & 0xffffffff));
> @@ -173,12 +174,10 @@
>  
>  	state = DPCPU_PTR(timerstate);
>  
> -	runs = 0;
> -	while (now >= state->nexthard) {
> -		state->nexthard += tick_sbt;
> -		runs++;
> -	}
> -	if (runs) {
> +	runs = (now - state->nexthard) / tick_sbt;
> +	if (runs > 0) {
> +		printf("R%d ", (int)runs);
> +		state->nexthard += tick_sbt * runs;
>  		hct = DPCPU_PTR(hardclocktime);
>  		*hct = state->nexthard - tick_sbt;
>  		if (fake < 2) {
> @@ -186,25 +185,25 @@
>  			done = 1;
>  		}
>  	}
> -	runs = 0;
> -	while (now >= state->nextstat) {
> -		state->nextstat += statperiod;
> -		runs++;
> +	runs = (now - state->nextstat) / statperiod;
> +	if (runs > 0) {
> +		printf("S%d ", (int)runs);
> +		state->nextstat += statperiod * runs;
> +		if (fake < 2) {
> +			statclock_cnt(runs, usermode);
> +			done = 1;
> +		}
>  	}
> -	if (runs && fake < 2) {
> -		statclock_cnt(runs, usermode);
> -		done = 1;
> -	}
>  	if (profiling) {
> -		runs = 0;
> -		while (now >= state->nextprof) {
> -			state->nextprof += profperiod;
> -			runs++;
> +		runs = (now - state->nextprof) / profperiod;
> +		if (runs > 0) {
> +			printf("T%d ", (int)runs);
> +			state->nextprof += profperiod * runs;
> +			if (!fake) {
> +				profclock_cnt(runs, usermode, TRAPF_PC(frame));
> +				done = 1;
> +			}
>  		}
> -		if (runs && !fake) {
> -			profclock_cnt(runs, usermode, TRAPF_PC(frame));
> -			done = 1;
> -		}
>  	} else
>  		state->nextprof = state->nextstat;
>  	if (now >= state->nextcallopt) {
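
For comparing the two forms outside the kernel, here is a minimal userland sketch of the loop-to-division conversion. It is only an illustration under stated assumptions: the period is positive, the `sbintime_t` typedef is a stand-in for the kernel type, and `catchup_loop` and `catchup_div` are hypothetical helper names, not functions in kern_clocksource.c. Because the loop condition is `now >= next`, the loop body runs floor((now - next) / period) + 1 times whenever it runs at all, so the sketch adds 1 to the quotient and returns early when `now < next`; a plain quotient would count one run fewer.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's sbintime_t (assumed signed 64-bit here). */
typedef int64_t sbintime_t;

/* Loop form: advance *next one period at a time until it is past now. */
static int64_t
catchup_loop(sbintime_t now, sbintime_t *next, sbintime_t period)
{
	int64_t runs = 0;

	assert(period > 0);
	while (now >= *next) {
		*next += period;
		runs++;
	}
	return (runs);
}

/*
 * Division form with the same result: if now >= *next, the loop above
 * executes (now - *next) / period + 1 times; otherwise it does not run.
 */
static int64_t
catchup_div(sbintime_t now, sbintime_t *next, sbintime_t period)
{
	int64_t runs;

	assert(period > 0);
	if (now < *next)
		return (0);
	runs = (now - *next) / period + 1;
	*next += period * runs;
	return (runs);
}
```

Both helpers return the number of elapsed periods and leave `*next` at the first deadline strictly after `now`, so they can be checked against each other for any `now`, `next` and positive `period`.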