Date: Mon, 20 Jun 2011 22:03:43 +1000 (EST)
From: Bruce Evans
To: Peter Jeremy
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Bruce Evans
Subject: Re: svn commit: r222866 - head/sys/x86/x86
Message-ID: <20110620213851.D1479@besplex.bde.org>
In-Reply-To: <20110620090640.GA64900@server.vk2pj.dyndns.org>
References: <201106081938.p58JcWuB044252@svn.freebsd.org> <20110609055112.P2870@besplex.bde.org> <201106081913.09272.jkim@FreeBSD.org> <20110618210815.W889@besplex.bde.org> <20110620090640.GA64900@server.vk2pj.dyndns.org>

On Mon, 20 Jun 2011, Peter Jeremy wrote:

> On 2011-Jun-18 22:05:06 +1000, Bruce Evans wrote:
>> My clock measurement program (mostly an old program by Wollman) shows
>> the following histogram of times for a non-invariant TSC timecounter
>> on a 2GHz UP
>> system:
>>
>> % min 273, max 265102, mean 273.998217, std 79.069534
>> % 1th: 273 (1727219 observations)
>> % 2th: 274 (265607 observations)
>> % 3th: 275 (6984 observations)
>> % 4th: 280 (11 observations)
>> % 5th: 290 (8 observations)
>>
>> The variance is small, and differences of a single nS can be seen clearly.
>
> Unfortunately, Intel broke this in their P-state invariant TSC
> implementation.  Rather than incrementing the TSC by one at the
> CPU core frequency, they increment by the core multiplier at the
> FSB frequency.  This gives a result like the following on my Atom
> N270:
> delta   samples
>    24   49637124
>    36   50312540
>    48      44658
>    60         77
>
> This makes it virtually impossible to measure short periods.
>
> Luckily, AMD seem to have gotten this right.

I tested a FreeBSD cluster machine in userland, since it doesn't have a
usable TSC timecounter (iterating $(sysctl kern.timecounter...) is too slow).

%%%
#include <sys/types.h>
#include <machine/cpufunc.h>
#include <stdio.h>

static unsigned buf[17];
static volatile unsigned v;

int
main(void)
{
	int i;

	/* Back-to-back reads; rdtsc() is truncated to 32 bits on store. */
	for (i = 0; i < 17; i++)
		buf[i] = rdtsc();
	/* Print the 16 deltas between successive reads. */
	for (i = 0; i < 16; i++)
		printf("%u\n", buf[i + 1] - buf[i]);
	/* Average cycles per rdtsc over 1e6 iterations. */
	buf[0] = rdtsc();
	for (i = 0; i < 1000000; i++)
		v = rdtsc();
	printf("%.1f\n", (v - buf[0]) / 1e6);
	return (0);
}
%%%
Output:
77
63
63
70
63
63
63
70
63
63
70
63
63
63
70
63
65.2
%%%

It seems to always give a multiple of 7, so that might be the multiplier.
63 is also a lot, and limits the resolution to ~34 nS at 1.86GHz.

On an original Athlon64:

%%%
34
8
5
8
5
8
5
8
5
8
5
8
5
8
5
8
6.5
%%%

Phenom specs say 42 instead of ~6.5 IIRC.  Only slightly better than 63.

This is execution latency, but although rdtsc is non-serialized, there is
only 1 of it at least on old CPUs, so it can never deliver results faster
than its latency, on average.  The 5's in the above seem to be lower than
the latency, due to the 8's being delivered late.

I normally write tests like the above in asm to get more control over the
loop overhead, but the above behaviour is interesting since it is what will
happen for normal unsynchronized use of rdtsc.

Bruce