Date: Fri, 9 Aug 2013 09:17:13 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Mark R V Murray <mark@grondar.org>
Cc: Arthur Mesh <arthurmesh@gmail.com>, Steve Kargl <sgk@troutmask.apl.washington.edu>, secteam@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Re: random(4) plugin infrastructure for mulitple RNG in a modular fashion
Message-ID: <20130809081923.N1044@besplex.bde.org>
In-Reply-To: <50BE6942-CC39-413C-8E14-C6B93440901B@grondar.org>
References: <20130807182858.GA79286@dragon.NUXI.org>
 <20130807192736.GA7099@troutmask.apl.washington.edu>
 <CAGE5yCq+s6kYtVYyxi27RAqPmvpV42nNNykm2+2x1EJGCihYXw@mail.gmail.com>
 <5203968D.7060508@freebsd.org>
 <7018AAA9-0A88-430F-96B7-867E5F529B36@bsdimp.com>
 <50BE6942-CC39-413C-8E14-C6B93440901B@grondar.org>

On Thu, 8 Aug 2013, Mark R V Murray wrote:

> I still want to get back something like the original get_cyclecount();
> simple and quick. I don't care what it's called, but it doesn't need to
> be the massive thing that the current get_cyclecount() has grown to be
> on x86. rdtsc(), I think it was.

The simple and quick version cannot exist, and never did. The original
i386 version (rev 1.50, markm, 21-Nov-00) was:

	/*
	 * Return contents of in-cpu fast counter as a sort of "bogo-time"
	 * for non-critical timing.
	 */
	static __inline u_int64_t
	get_cyclecount(void)
	{
	#if defined(I386_CPU) || defined(I486_CPU)
		struct timespec tv;

		if ((cpu_feature & CPUID_TSC) == 0) {
			nanotime(&tv);
			return (tv.tv_sec * (u_int64_t)1000000000 + tv.tv_nsec);
		}
	#endif
		return (rdtsc());
	}

This is not so simple, and is unquick if there is no TSC. If I386_CPU
or I486_CPU is configured, then it is suboptimal even if there is a TSC.
Other arches are even further from always having a TSC.

The simple and quick version would always return 0 or a kernel global
like time.tv_nsec if there is no TSC and no other frequently changing
timer or noise source that can be read almost as fast as memory. It
wouldn't guarantee any entropy.

The current version is only slightly unsimpler and unquicker:
- on amd64, it is still just inline rdtsc()

On other versions, the nanotime() in it was first improved to
binuptime(). This also gave more noise in the extra low bits, and mixing
of the bits made it less abusable as a timer. The latter has been broken
on some arches.
- on arm, the bits are still mixed by ((sec << 56) | (frac >> 8)) (8 bits
  of sec and 56 bits of frac). I don't like losing some low bits (it is
  better to xor things), but the result is fairly unusable as a timer and
  perhaps there is nothing useful in the low bits on arm (it takes a very
  high frequency clock like a TSC and/or delicate ntpd adjustments that
  aren't very noisy to put anything there). The 8-bit seconds count isn't
  too good when KTR abuses get_cyclecount().
- on i386, get_cyclecount() is still inline, but the inline just calls the
  function pointer cpu_ticks(). If there is a TSC, then cpu_ticks points
  to an un-inline rdtsc() and the result is a slightly pessimized version
  of the above if I386_CPU or I486_CPU is configured and a more pessimized
  version of the above if neither is configured. Otherwise, the result is
  the accumulated tick count of the currently active timecounter. This is
  much better for noise in get_cyclecount() and much worse for its primary
  purpose of timing than is binuptime() with bits mixed to form a timer.
  The active timecounter can change, and then the frequency and offset of
  its ticker change. Its primary use is for process times, and there is
  some recalibration for this, but this is incomplete and buggy. But for
  get_cyclecount(), the noise is a feature. The noise from this is bad
  when KTR abuses get_cyclecount(). Otherwise, this is better for
  get_cyclecount() than the old binuptime() method.
- on ia64, get_cyclecount is #define'd as another function. The
  declaration and definition of the other function are even more obscure.
  They are generated by a macro. Standard namespace pollution in
  sys/systm.h is depended on to join the definitions.
- mips is like ia64 except the obfuscation chain is shorter.
  <machine/cpu.h> provides its own namespace pollution, so sys/systm.h
  and its pollution aren't depended on...
- on powerpc, get_cyclecount() reads a counter using inline asm.
  It spells the 2 32-bit components of the counter as essentially
  time._upper and time._lower, so it isn't clear if they are actually
  times to begin with.
- sparc64 uses inline asm to read some register which is hopefully a
  counter.

So, get_cyclecount() is actually simple and quick (except for macros
hiding the simplicity) on all arches except arm and old i386. But it is
very MD (machine-dependent), so it takes a lot of code with different
simplicity to support it for all arches. Still better than #ifdefing it
wherever it is used.

Bruce
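
To make the arm mixing discussed above concrete, here is a minimal
userland sketch in C. It is not the kernel code: struct bt is a
hypothetical stand-in for struct bintime, mix_xor() is only one guess at
what an xor-based mix could look like, and mix_demo() is an invented
helper. It shows that the shift-and-or form drops the low 8 bits of frac
and wraps the seconds every 256 s, which is why it serves KTR poorly as a
timer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct bintime: whole seconds
 * plus a 64-bit binary fraction of a second.
 */
struct bt {
	uint64_t sec;
	uint64_t frac;
};

/*
 * The arm-style mix quoted above: the low 8 bits of sec land in the top
 * byte, the top 56 bits of frac fill the rest, and the low 8 bits of
 * frac are discarded.
 */
static uint64_t
mix_shift_or(struct bt t)
{
	return ((t.sec << 56) | (t.frac >> 8));
}

/*
 * An xor-based alternative (an assumption, not code from the tree):
 * every bit of frac still influences the result.
 */
static uint64_t
mix_xor(struct bt t)
{
	return ((t.sec << 56) ^ t.frac);
}

/* Invented helper that checks the properties described in the text. */
static void
mix_demo(void)
{
	struct bt a = { 1, 0x8000000000000000ULL };	/* 1.5 s */
	struct bt b = { 257, 0x8000000000000000ULL };	/* 257.5 s */
	struct bt c = { 1, 0x80000000000000FFULL };	/* low frac bits set */

	/* 8 bits of sec on top, 56 bits of frac below. */
	assert(mix_shift_or(a) == 0x0180000000000000ULL);
	/* Seconds wrap after 256: 1.5 s and 257.5 s are indistinguishable. */
	assert(mix_shift_or(a) == mix_shift_or(b));
	/* The low 8 bits of frac are lost by the shift... */
	assert(mix_shift_or(a) == mix_shift_or(c));
	/* ...but still affect the xor-based mix. */
	assert(mix_xor(a) != mix_xor(c));
}
```

Calling mix_demo() exercises the three cases; the asserts pass silently
when the properties hold.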