From owner-freebsd-net@FreeBSD.ORG Wed Dec 19 14:54:30 2007 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BAFFA16A420; Wed, 19 Dec 2007 14:54:30 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail12.syd.optusnet.com.au (mail12.syd.optusnet.com.au [211.29.132.193]) by mx1.freebsd.org (Postfix) with ESMTP id 5679B13C468; Wed, 19 Dec 2007 14:54:30 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c211-30-219-213.carlnfd3.nsw.optusnet.com.au [211.30.219.213]) by mail12.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id lBJEsLqA004385 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 20 Dec 2007 01:54:22 +1100 Date: Thu, 20 Dec 2007 01:54:21 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Mark Fullmer In-Reply-To: Message-ID: <20071220011626.U928@besplex.bde.org> References: <20071217102433.GQ25053@tnn.dglawrence.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org, David G Lawrence Subject: Re: Packet loss every 30.999 seconds X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 14:54:30 -0000 On Tue, 18 Dec 2007, Mark Fullmer wrote: > A little progress. > > I have a machine with a KTR enabled kernel running. > > Another machine is running David's ffs_vfsops.c's patch. > > I left two other machines (GENERIC kernels) running the packet loss test > overnight. At ~ 32480 seconds of uptime the problem starts. This is really Try it with "find / -type f >/dev/null" to duplicate the problem almost instantly. > marks are the intervals between test runs. The window of missing packets > (timestamps between two packets where a sequence number is missing) > is usually less than 4us, altough I'm not sure gettimeofday() can be > trusted for measuring this. See https://www.eng.oar.net/~maf/bsd6/p3.png gettimeofday() can normally be trusted to better than 1 us for time differences of up to about 1 second. However, gettimeofday() should not be used in any program written after clock_gettime() became standard in 1994. clock_gettime() has a resolution of 1 ns. It isn't quite that accurate on current machines, but I trust it to measure differences of 10 nsec between back to back clock_gettime() calls here. Sample output from wollman@'s old clock-watching program converted to clock_gettime(): %%% 2007/12/05 (TSC) bde-current, -O2 -mcpu=athlon-xp min 238, max 99730, mean 240.025380, std 77.291436 1th: 239 (1203207 observations) 2th: 240 (556307 observations) 3th: 241 (190211 observations) 4th: 238 (50091 observations) 5th: 242 (20 observations) 2007/11/23 (TSC) bde-current min 247, max 11890, mean 247.857786, std 62.559317 1th: 247 (1274231 observations) 2th: 248 (668611 observations) 3th: 249 (56950 observations) 4th: 250 (23 observations) 5th: 263 (8 observations) 2007/05/19 (TSC) plain -current-noacpi min 262, max 286965, mean 263.941187, std 41.801400 1th: 264 (1343245 observations) 2th: 263 (626226 observations) 3th: 265 (26860 observations) 4th: 262 (3572 observations) 5th: 268 (8 observations) 2007/05/19 (TSC) plain -current-acpi min 261, max 68926, mean 279.848650, std 40.477440 1th: 261 (999391 observations) 2th: 320 (473325 observations) 3th: 262 (373831 observations) 4th: 321 (148126 observations) 5th: 312 (4759 observations) 2007/05/19 (ACPI-fast timecounter) plain -current-acpi min 558, max 285494, mean 827.597038, std 78.322301 1th: 838 (1685662 observations) 2th: 839 (136980 observations) 3th: 559 (72160 observations) 4th: 837 (48902 observations) 5th: 558 (31217 observations) 2007/05/19 (i8254) plain -current-acpi min 3352, max 288288, mean 4182.774148, std 257.977752 1th: 4190 (1423885 observations) 2th: 4191 (440158 observations) 3th: 3352 (65261 observations) 4th: 5028 (39202 observations) 5th: 5029 (15456 observations) %%% "min" here gives the minimum latency of a clock_gettime() syscall. The improvement from 247 nsec to 240 nsec in the "mean" due to -O2 -march-athlon-xp can be trusted to be measured very accurately since it is an average over more than 100 million trials, and the improvement from 247 nsec to 238 nsec for "min" can be trusted because it is consistent with the improvement in the mean. The program had to be converted to use clock_gettime() a few years ago when CPU speeds increased so much that the correct "min" became significantly less than 1. With gettimeofday(), it cannot distinguish between an overhead of 1 ns and an overhead of 1 us. For the ACPI and i8254 timecounter, you can see that the low-level timecounters have a low frequency clock from the large gaps between the observations. There is a gap of 279-280 ns for the acpi timecounter. This is the period of the acpi timecounter's clock (frequency 14318182/4 = period 279.3651 ns. Since we can observe this period to within 1 ns, we must have a basic accuracy of nearly 1 ns, but if we make only 2 observations we are likely to have an inaccuracy of 279 ns due to the granularity of the clock. The TSC has a clock granuarity of 6 ns on my CPU, and delivers almost that much accuracy with only 2 observations, but technical problems prevent general use of the TSC. Bruce