Date:      Sat, 29 Jul 2006 07:01:55 +1000
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Brian Candler <B.Candler@pobox.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: vmstat's entries type
Message-ID:  <20060728210154.GC748@turion.vk2pj.dyndns.org>
In-Reply-To: <20060728134701.GA45273@uk.tiscali.com>
References:  <200607251254.k6PCsBef092737@lurza.secnetix.de> <200607271058.13055.jhb@freebsd.org> <20060728121525.GA44917@uk.tiscali.com> <200607280928.36573.jhb@freebsd.org> <20060728134701.GA45273@uk.tiscali.com>

On Fri, 2006-Jul-28 14:47:01 +0100, Brian Candler wrote:
>On Fri, Jul 28, 2006 at 09:28:36AM -0400, John Baldwin wrote:
>> 	lock incl counter
>> 	jnc 1f
>> 	lock incl counter+4
>> 1:

This approach still requires the reader to loop with something like
	do {
		a.lo = counter.lo;
		a.hi = counter.hi;
		b.lo = counter.lo;
		b.hi = counter.hi;
	} while (a.hi != b.hi || a.lo > b.lo);
to ensure that the reader doesn't read the middle of an update.

>The 'polling' argument says just do
>    lock incl counter
>and poll all counters every 5 minutes, looking for a wrap. I think that's
>almost certainly going to be cheaper, as long as you can keep track of where
>all these counters are located.

lock prefixes are always going to be extremely expensive on an MP
system because they require physical bus cycles.  RISC architectures
usually only have TAS lock primitives (because "inc mem" doesn't
exist) and so require a spinlock to perform an atomic update.

In an MP configuration where it doesn't particularly matter if a
particular update gets counted this time or next time, I think the
cheapest option is to have per-CPU 32-bit counters (so no locks are
needed to update the counters) with a polling function to accumulate
all the individual counters into a 64-bit total.  This pushes the cost
from the update (very frequent) into the read (which is relatively
infrequent), for a lower overall cost.

This turns the update into something like:
	PCPU_SET(counter, PCPU_GET(counter)+1);
or
	incl	%fs:counter
(no locks or atomic operations)

Whilst the poll/read pseudo code looks something like
	lock counter
	foreach cpu {
		uint32 a = cpu->counter;
		uint32 b = cpu->last_counter;
		uint32 c = counter.lo;
		/* a - b is taken modulo 2^32, so it is the correct delta
		 * even when cpu->counter has wrapped since the last poll */
		counter.lo += a - b;
		if (counter.lo < c)
			counter.hi++;
		cpu->last_counter = a;
	}
	unlock counter;
(the lock prevents multiple readers updating counter simultaneously).
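
As a rough illustration, the update and poll paths could be written in C
along the following lines.  This is a sketch only: struct pcpu_counter,
struct total_counter, counter_inc(), counter_poll() and NCPU are names
invented here, a plain array stands in for real per-CPU storage, and the
caller is assumed to hold whatever lock serialises pollers.  It relies on
unsigned 32-bit subtraction to obtain the delta across a wrap of the
per-CPU counter, so the only carry it tracks is the one out of the low
word of the total.

```c
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for this sketch */

/* Per-CPU state: the live 32-bit counter and its value at the last poll. */
struct pcpu_counter {
	uint32_t counter;
	uint32_t last_counter;
};

/* Accumulated 64-bit total, kept as two 32-bit halves. */
struct total_counter {
	uint32_t lo;
	uint32_t hi;
};

/* Update path: a plain increment of the owning CPU's counter, no lock. */
static void
counter_inc(struct pcpu_counter *pc)
{
	pc->counter++;
}

/*
 * Poll path: fold each CPU's delta since the last poll into the total
 * and return the total as a single 64-bit value.
 * Assumption: the caller already holds the lock that serialises pollers.
 */
static uint64_t
counter_poll(struct total_counter *t, struct pcpu_counter pcpu[], int ncpu)
{
	for (int i = 0; i < ncpu; i++) {
		uint32_t a = pcpu[i].counter;
		uint32_t old_lo = t->lo;

		/* Modulo-2^32 subtraction is correct across one wrap. */
		t->lo += (uint32_t)(a - pcpu[i].last_counter);
		if (t->lo < old_lo)
			t->hi++;	/* carry out of the low word */
		pcpu[i].last_counter = a;
	}
	return (((uint64_t)t->hi << 32) | t->lo);
}
```

For example, if one CPU's counter goes from 0xffffffff to 1 between two
polls, the computed delta is 2 and the total grows by 2.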

You execute this whenever a reader wants the counter value (eg via
SYSCTL_PROC), as well as at a rate sufficient to prevent missing wraps
(eg every 2 seconds for a 10g byte counter).  This rate is sufficiently
lower than the update rate to make the whole exercise worthwhile.
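
To put a number on that rate: assuming "10g" means a 10 Gbit/s link,
bytes arrive at up to 1.25e9 per second, so a 32-bit byte counter wraps
roughly every 3.4 seconds and a 2 second poll leaves some margin.  The
helper below (wrap_interval_seconds() is a made-up name) just does that
division.

```c
/*
 * Seconds until a 32-bit counter wraps when it advances at `rate`
 * units per second.  Assumption: the rate is constant and non-zero.
 */
static double
wrap_interval_seconds(double rate)
{
	return (4294967296.0 / rate);	/* 2^32 / rate */
}
```

The poll interval has to stay below this value at the highest rate the
counter can advance, otherwise a wrap can go unnoticed.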

-- 
Peter Jeremy
