From: Peter Jeremy
To: Brian Candler
Cc: freebsd-current@freebsd.org
Date: Sat, 29 Jul 2006 07:01:55 +1000
Subject: Re: vmstat's entries type
Message-ID: <20060728210154.GC748@turion.vk2pj.dyndns.org>
In-Reply-To: <20060728134701.GA45273@uk.tiscali.com>

On Fri, 2006-Jul-28 14:47:01 +0100, Brian Candler wrote:
>On Fri, Jul 28, 2006 at 09:28:36AM -0400, John Baldwin wrote:
>>      lock incl counter
>>      jnc 1f
>>      lock incl counter+4
>> 1:

This approach still requires the reader to loop with something like

        do {
                a.lo = counter.lo;
                a.hi = counter.hi;
                b.lo = counter.lo;
                b.hi = counter.hi;
        } while (a.hi != b.hi || a.lo > b.lo);

to ensure that the reader doesn't read the middle of an update.

>The 'polling' argument says just do
>       lock incl counter
>and poll all counters every 5 minutes, looking for a wrap. I think that's
>almost certainly going to be cheaper, as long as you can keep track of where
>all these counters are located.

Lock prefixes are always going to be extremely expensive on an MP
system because they require physical bus cycles. RISC architectures
usually only have TAS (test-and-set) lock primitives (because "inc mem"
doesn't exist) and so require a spinlock to perform an atomic update.

In an MP configuration where it doesn't particularly matter whether a
given update gets counted this time or next time, I think the
cheapest option is to have per-CPU 32-bit counters (so no locks are
needed to update the counters) with a polling function to accumulate
all the individual counters into a 64-bit total. This pushes the cost
from the update (very frequent) into the read (which is relatively
infrequent), for a lower overall cost.

This turns the update into something like:

        PCPU_SET(counter, PCPU_GET(counter)+1);
or
        incl %fs:counter
(no locks or atomic operations)

Whilst the poll/read pseudo code looks something like

        lock counter
        foreach cpu {
                uint32 a = cpu->counter;
                uint32 b = cpu->last_counter;
                uint32 c = counter.lo;
                /*
                 * a - b is computed modulo 2^32, so it is the correct
                 * delta even when the per-CPU counter has wrapped
                 * (b > a); no separate carry is needed for that case.
                 */
                counter.lo += a - b;
                if (counter.lo < c)     /* counter.lo overflowed */
                        counter.hi++;
                cpu->last_counter = a;
        }
        unlock counter;

(the lock prevents multiple readers updating counter simultaneously).

You execute this whenever a reader wants the counter value (eg via
SYSCTL_PROC), as well as at a rate sufficient to prevent missing wraps
(eg every 2 seconds for a byte counter on a 10Gbit/s link, where 2^32
bytes take roughly 3.4 seconds at line rate). This rate is sufficiently
lower than the update rate to make the whole exercise worthwhile.

-- 
Peter Jeremy
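
For anyone who wants to experiment with the accumulation step outside the kernel, here is a minimal userland sketch in C. It is only an illustration under assumptions: NCPU, struct pcpu_counter, counter_inc() and counter_poll() are hypothetical names, the per-CPU data is simulated with a plain array instead of PCPU_GET/PCPU_SET, the hi/lo pair is replaced by a single uint64_t, and the lock around the poll is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4                  /* hypothetical CPU count for this sketch */

/* Per-CPU state: the live 32-bit counter and the value seen at the last poll. */
struct pcpu_counter {
        uint32_t counter;
        uint32_t last_counter;
};

static struct pcpu_counter pcpu[NCPU];  /* stands in for real per-CPU data */
static uint64_t total;                  /* 64-bit accumulated total */

/*
 * Update path: a plain, unlocked increment of this CPU's own counter.
 * The counter is allowed to wrap modulo 2^32.
 */
static void
counter_inc(unsigned cpu)
{
        assert(cpu < NCPU);
        pcpu[cpu].counter++;
}

/*
 * Poll/read path: fold every CPU's delta since the previous poll into
 * the 64-bit total.  Real code would hold a lock around this loop so
 * that two readers cannot update the total at the same time.
 */
static uint64_t
counter_poll(void)
{
        for (unsigned i = 0; i < NCPU; i++) {
                uint32_t a = pcpu[i].counter;
                /* Unsigned subtraction is modulo 2^32, so a wrap is harmless. */
                uint32_t delta = a - pcpu[i].last_counter;

                total += delta;
                pcpu[i].last_counter = a;
        }
        return total;
}
```

The sketch only stays correct if counter_poll() runs before any single per-CPU counter can advance by 2^32 between two polls, which is the reason for the periodic poll mentioned above.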