Date:      Thu, 20 Dec 2007 01:54:21 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Mark Fullmer <maf@eng.oar.net>
Cc:        freebsd-net@freebsd.org, freebsd-stable@freebsd.org, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: Packet loss every 30.999 seconds
Message-ID:  <20071220011626.U928@besplex.bde.org>
In-Reply-To: <CD187AD1-8712-418F-9F49-FA3407BA1AC7@eng.oar.net>
References:  <D50B5BA8-5A80-4370-8F20-6B3A531C2E9B@eng.oar.net> <20071217102433.GQ25053@tnn.dglawrence.com> <CD187AD1-8712-418F-9F49-FA3407BA1AC7@eng.oar.net>

On Tue, 18 Dec 2007, Mark Fullmer wrote:

> A little progress.
>
> I have a machine with a KTR enabled kernel running.
>
> Another machine is running David's ffs_vfsops.c's patch.
>
> I left two other machines (GENERIC kernels) running the packet loss test
> overnight.  At ~ 32480 seconds of uptime the problem starts.  This is really

Try it with "find / -type f >/dev/null" to duplicate the problem almost
instantly.

> marks are the intervals between test runs.  The window of missing packets
> (timestamps between two packets where a sequence number is missing)
> is usually less than 4 us, although I'm not sure gettimeofday() can be
> trusted for measuring this.  See https://www.eng.oar.net/~maf/bsd6/p3.png

gettimeofday() can normally be trusted to better than 1 us for time
differences of up to about 1 second.  However, gettimeofday() should
not be used in any program written after clock_gettime() became standard
in 1993 (POSIX.1b).  clock_gettime() has a resolution of 1 ns.  It isn't quite
that accurate on current machines, but I trust it to measure differences
of 10 nsec between back to back clock_gettime() calls here.  Sample
output from wollman@'s old clock-watching program converted to
clock_gettime():

%%%
2007/12/05 (TSC) bde-current, -O2 -mcpu=athlon-xp
min 238, max 99730, mean 240.025380, std 77.291436
1th: 239 (1203207 observations)
2th: 240 (556307 observations)
3th: 241 (190211 observations)
4th: 238 (50091 observations)
5th: 242 (20 observations)

2007/11/23 (TSC) bde-current
min 247, max 11890, mean 247.857786, std 62.559317
1th: 247 (1274231 observations)
2th: 248 (668611 observations)
3th: 249 (56950 observations)
4th: 250 (23 observations)
5th: 263 (8 observations)

2007/05/19 (TSC) plain -current-noacpi
min 262, max 286965, mean 263.941187, std 41.801400
1th: 264 (1343245 observations)
2th: 263 (626226 observations)
3th: 265 (26860 observations)
4th: 262 (3572 observations)
5th: 268 (8 observations)

2007/05/19 (TSC) plain -current-acpi
min 261, max 68926, mean 279.848650, std 40.477440
1th: 261 (999391 observations)
2th: 320 (473325 observations)
3th: 262 (373831 observations)
4th: 321 (148126 observations)
5th: 312 (4759 observations)

2007/05/19 (ACPI-fast timecounter) plain -current-acpi
min 558, max 285494, mean 827.597038, std 78.322301
1th: 838 (1685662 observations)
2th: 839 (136980 observations)
3th: 559 (72160 observations)
4th: 837 (48902 observations)
5th: 558 (31217 observations)

2007/05/19 (i8254) plain -current-acpi
min 3352, max 288288, mean 4182.774148, std 257.977752
1th: 4190 (1423885 observations)
2th: 4191 (440158 observations)
3th: 3352 (65261 observations)
4th: 5028 (39202 observations)
5th: 5029 (15456 observations)
%%%

"min" here gives the minimum latency of a clock_gettime() syscall.
The improvement from 247 nsec to 240 nsec in the "mean" due to -O2
-mcpu=athlon-xp can be trusted to be measured very accurately since
it is an average over more than 100 million trials, and the improvement
from 247 nsec to 238 nsec for "min" can be trusted because it is
consistent with the improvement in the mean.

The program had to be converted to use clock_gettime() a few years
ago when CPU speeds increased so much that the correct "min" became
significantly less than 1 us.  With gettimeofday()'s 1 us resolution,
it cannot distinguish between an overhead of 1 ns and an overhead of
1 us.

For the ACPI and i8254 timecounters, you can see that the low-level
timecounters have a low frequency clock from the large gaps between
the observations.  There is a gap of 279-280 ns for the acpi timecounter.
This is the period of the acpi timecounter's clock (frequency
14318182/4 Hz, i.e., a period of 279.3651 ns).  Since we can observe
this period to within 1 ns, we must have a basic accuracy of nearly
1 ns, but if we make only 2 observations we are likely to have an
inaccuracy of 279 ns due to the granularity of the clock.  The TSC
has a clock granularity of 6 ns on my CPU, and delivers almost that
much accuracy with only 2 observations, but technical problems prevent
general use of the TSC.

Bruce


