Date:      Sun, 16 Nov 2003 07:19:53 -0500 (EST)
From:      Jeff Roberson <jroberson@chesapeake.net>
To:        Luigi Rizzo <rizzo@icir.org>
Cc:        cvs-src@FreeBSD.org
Subject:   Re: cvs commit: src/sys/netinet in_var.h ip_fastfwd.c ip_flow.c ip_flow.h ip_input.c ip_output.c src/sys/sys mbuf.h src/sys/conf files src/sys/net if_arcsubr.c if_ef.c if_ethersubr.c if_fddisubr.c if_
Message-ID:  <20031116071813.S10222-100000@mail.chesapeake.net>
In-Reply-To: <20031116003939.A10853@xorpc.icir.org>


On Sun, 16 Nov 2003, Luigi Rizzo wrote:

> On Sun, Nov 16, 2003 at 12:48:14PM +1100, Peter Jeremy wrote:
> ...
> > >I will try to measure that with more precision. You did have
> > >code which was able to record and timestamp events several
> > >thousand times per second. Do you still have that code somewhere?
>
> there is some MD code in the RELENG_4 tree, the kernel option
> you need is "options KERN_TIMESTAMP" and a description on how
> to use it is in sys/i386/include/param.h
>
> Note that you have to be careful when timing very short sections
> of code, because the CPU can execute instructions out of order
> so something like
>
> 	rdtsc();
> 	<short section of code>
> 	rdtsc()
>
> might result in the second timing call being executed before
> the section of code in the middle is complete. There is
> some nonintuitive instruction (which i now forget) to flush the
> execution pipeline which can be used around the section of
> code you want to time.

Reading the TSC is also a serializing instruction.  I often use it to
accurately measure things that take as few as 20 cycles.

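A minimal sketch of that kind of measurement, assuming GCC or Clang on
x86 (the names read_cycles() and read_overhead() are made up for this
example).  It issues CPUID before each TSC read as a conservative fence,
on the assumption that the read alone may not order earlier instructions
on every CPU.  Other targets fall back to the monotonic clock in
nanoseconds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime() in the fallback */
#endif
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* __cpuid(), GCC/Clang */
#include <x86intrin.h>	/* __rdtsc(), GCC/Clang */

/* Read the TSC after a CPUID, so earlier instructions have completed. */
static inline uint64_t
read_cycles(void)
{
	unsigned int a, b, c, d;

	__cpuid(0, a, b, c, d);
	(void)a; (void)b; (void)c; (void)d;
	return (__rdtsc());
}
#else
/* Fallback for non-x86 builds: nanoseconds from the monotonic clock. */
static inline uint64_t
read_cycles(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec);
}
#endif

/* Smallest delta between two back-to-back reads, i.e. the fixed cost. */
static uint64_t
read_overhead(int rounds)
{
	uint64_t best = UINT64_MAX;

	for (int i = 0; i < rounds; i++) {
		uint64_t t0 = read_cycles();
		uint64_t t1 = read_cycles();

		if (t1 - t0 < best)
			best = t1 - t0;
	}
	return (best);
}
```

To time a section, take t0 = read_cycles(), run the code, take
t1 = read_cycles(), and subtract read_overhead() from t1 - t0.  For
sections of a few tens of cycles the overhead term dominates, so it is
worth taking the minimum over many rounds.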
> In an SMP environment this code probably will not work well
> because it does not consider parallel access to the counters,
> nor the CPU where events occur (probably you could use a
> per-cpu index, and then in many cases you only care about
> relative measurements of events on the same CPU).
>
> 	cheers
> 	luigi
>
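The per-CPU index idea could look roughly like the sketch below.
Everything in it is hypothetical (MAX_CPUS, LOG_SIZE, cpu_logs,
log_stamp(), last_interval()); the caller is assumed to supply its own
CPU number and a timestamp read on that CPU.  It only removes contention
between CPUs, so an interrupt on the same CPU could still interleave
with an update.

```c
#include <stdint.h>

#define	MAX_CPUS	8		/* hypothetical upper bound on CPUs */
#define	LOG_SIZE	(1 << 4)	/* hypothetical; must be a power of two */

struct cpu_log {
	unsigned int	next;			/* written only by the owning CPU */
	uint64_t	stamp[LOG_SIZE];	/* timestamps from that CPU's counter */
};

static struct cpu_log cpu_logs[MAX_CPUS];

/* Record a timestamp in the log owned by `cpu`. */
static void
log_stamp(unsigned int cpu, uint64_t now)
{
	struct cpu_log *cl = &cpu_logs[cpu % MAX_CPUS];

	cl->stamp[cl->next] = now;
	cl->next = (cl->next + 1) & (LOG_SIZE - 1);
}

/* Elapsed time between the two most recent events on one CPU. */
static uint64_t
last_interval(unsigned int cpu)
{
	const struct cpu_log *cl = &cpu_logs[cpu % MAX_CPUS];
	unsigned int last = (cl->next - 1) & (LOG_SIZE - 1);
	unsigned int prev = (cl->next - 2) & (LOG_SIZE - 1);

	return (cl->stamp[last] - cl->stamp[prev]);
}
```

Because intervals are only ever computed within one CPU's log, the
counters on different CPUs do not need to agree with each other.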
> > I've done similar things a couple of times using circular buffers
> > along the following lines:
> >
> > #define	RING_SIZE	(1 << some_suitable_value)
> > int	next_entry;
> > struct entry {
> > 	some_time_t	now;
> > 	foo_t		event;
> > }	ring[RING_SIZE];
> >
> > void __inline insert_event(foo_t event)
> > {
> > 	int	ix;
> > 	/* following two lines need to be atomic to make this re-entrant */
> > 	ix = next_entry;
> > 	next_entry = (ix + 1) & (RING_SIZE - 1);
> > 	ring[ix].now = read_time();
> > 	ring[ix].event = event;
> > }
> >
> > In userland, mmap(2) next_entry and ring to unload the events.  Pick
> > RING_SIZE and the time types to suit requirements.  The TSC has the
> > lowest overhead but worst jitter.
> >
> > Peter
>
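One way to make the two index lines atomic is a single fetch-and-add,
sketched here with C11 <stdatomic.h>.  This is a variation on the code
above, not the same thing: next_entry becomes a free-running counter
that is masked on use, the types are stand-ins for some_time_t and
foo_t, RING_SIZE is an arbitrary pick, and the timestamp is passed in
rather than read inside the function.

```c
#include <stdatomic.h>
#include <stdint.h>

#define	RING_SIZE	(1u << 10)	/* hypothetical; must be a power of two */

struct entry {
	uint64_t	now;	/* stand-in for some_time_t */
	int		event;	/* stand-in for foo_t */
};

static struct entry	ring[RING_SIZE];
static atomic_uint	next_entry;	/* free-running; masked on use */

/* Reserve a slot with one atomic add, then fill it in. */
static inline unsigned int
insert_event(int event, uint64_t now)
{
	unsigned int ix;

	ix = atomic_fetch_add(&next_entry, 1) & (RING_SIZE - 1);
	ring[ix].now = now;
	ring[ix].event = event;
	return (ix);
}
```

Two callers can no longer claim the same slot, but a reader may still
see a slot that has been reserved and not yet filled in, and it has to
apply the same mask to next_entry before using it as an index.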


