Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 25 Jun 2013 13:24:32 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, Gleb Smirnoff <glebius@FreeBSD.org>, src-committers@FreeBSD.org
Subject:   Re: svn commit: r252032 - head/sys/amd64/include
Message-ID:  <20130625130051.W1285@besplex.bde.org>
In-Reply-To: <20130625102023.K899@besplex.bde.org>
References:  <20130621081116.E1151@besplex.bde.org> <20130621090207.F1318@besplex.bde.org> <20130621064901.GS1214@FreeBSD.org> <20130621184140.G848@besplex.bde.org> <20130621135427.GA1214@FreeBSD.org> <20130622110352.J2033@besplex.bde.org> <20130622124832.S2347@besplex.bde.org> <20130622174921.I3112@besplex.bde.org> <20130623073343.GY91021@kib.kiev.ua> <20130623181458.J2256@besplex.bde.org> <20130624170849.GH91021@kib.kiev.ua> <20130625102023.K899@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 25 Jun 2013, I wrote:

> My current best design:
> - use ordinary mutexes to protect counter fetches in non-per-CPU contexts.
> - use native-sized or always 32-bit counters.  Counter updates are done
>  by a single addl on i386.  Fix pcpu.h on arches other than amd64 and
>  i386 and use the same method as there.
> - counter fetches add the native-sized pcpu counters to 64-bit non-pcpu
>  counters, when the native-size counters are in danger of overflowing
>  or always, under the mutex.  Transferring data uses an ordinary
>  atomic_cmpset.  To avoid ifdefs, always use u_int pcpu counters.
>  The 64-bit non-pcpu counters can easily be changed to pcpu counters
>  if the API is extended to support pcpu data.
> - run a daemon every few minutes to fetch all the counters, so that
>  the native-sized counters are in no danger of overflowing on systems
>  that don't run statistics programs often enough to fetch the counters
>  to actually use.

There is at least 1 counter decrement (add -1) in tcp, so the native counters
need to be signed.

> ...
> With my design:
>
> 	extern struct mtx counter_locks[];
> 	extern uint64_t counters[];

This is pseudo-code.  The extra structure must be dynamically allocated
with each counter.  I'm not sure how to do that.  uint64_t_pcpu_zone
is specialized for pcpu counters, and a non-pcpu part is needed.

> 	uint64_t r;
> 	volatile u_int *p;
> 	u_int v;

Change to int.

> 	int cnum;
>
> 	cnum = ctonum(c);
> 	mtx_lock(&counter_locks[cnum]);	/* might not need 1 per counter */
> 	r = counters[cnum];
> 	for (i = 0; i < mp_ncpus; i++) {
> 		p = (u_int *)((char *)c + sizeof(struct pcpu) * i);

Change to int *.

> 		v = *p;			/* don't care if it is stale */
> 		if (v >= 0x80000000) {

Change the critical level to 2 critical levels, 0x40000000 for positive
values and -0x40000000 for negative values.

> 			/* Transfer only when above critical level. */
> 			while (atomic_cmpset_rel_int(p, v, 0) == 0)
> 				v = *p;	/* still don't care if it is stale */
> 			counters[cnum] += v;

Even though full counter values are not expected to become negative,
the native counters can easily become negative when a decrement occurs
after the transfer resets them to 0.

> 		}
> 		r += v;
> 	}
> 	mtx_unlock(&counter_locks[cnum]);
> 	return (r);
>
> Mutexes give some slowness in the fetching code, but fetches eare expected
> to be relatively rare.
>
> I added the complication to usually avoid atomic ops at the last minute.
> The complication might not be worth it.

The complication is to test v so as to normally avoid doing the transfer
and its atomic ops (and to not use atomic ops for loading v).  The
complication is larger with 2 thresholds.  If we always transferred,
then *p would usually be small and often 0, so that decrementing it
would often make it -1.  This -1 must be transferred by adding -1, not
by adding 0xfffffffff.  Changing the type of the native counter to int
gives this.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130625130051.W1285>