Date: Tue, 18 Oct 2005 10:25:14 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Scott Long <scottl@samsco.org>
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, David Xu <davidxu@FreeBSD.org>, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
Message-ID: <17237.1482.52148.283282@grasshopper.cs.duke.edu>
In-Reply-To: <435501B9.4070401@samsco.org>
References: <200510172310.j9HNAVPL013057@repoman.freebsd.org> <20051018094402.A29138@grasshopper.cs.duke.edu> <435501B9.4070401@samsco.org>

Scott Long writes:
> Andrew Gallatin wrote:
> > David Xu [davidxu@FreeBSD.org] wrote:
> >
> >>davidxu 2005-10-17 23:10:31 UTC
> >>
> >> FreeBSD src repository
> >>
> >> Modified files:
> >> sys/amd64/amd64 cpu_switch.S machdep.c
> >> Log:
> >> Micro optimization for context switch. Eliminate code for saving gs.base
> >> and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
> >> when user wants to set them, in context switch routine, we only need to
> >> write them into registers, we never have to read them out from registers
> >> when thread is switched away. Since rdmsr is a serialization instruction,
> >> micro benchmark shows it is worthy to do.
> >
> > Nice. This reduces lmbench context switch latency by about 0.4us (7.2
> > -> 6.8us), and reduces TCP loopback latency by about 0.9us (36.1 ->
> > 35.2) on my dual core 3800+
> >
> > It is a shame we can't find a way to use the TSC as a timecounter on
> > SMP systems. It seems that about 40% of the context switch time is
> > spent just waiting for the PIO read of the ACPI-fast or i8254 to
> > return.
> >
> > Drew
> >
>
> The TSC represents the clock rate of the CPU, and thus can vary wildly
> when thermal and power management controls kick in, and there is no way
> to know when it changes. Because of this, I think that it's
> practically useless on Pentium-Mobile and Pentium-M chips, among many
> others. There is also the issue of multiple CPUs having to keep their
> TSC's somewhat in sync in order to get consistent counting in the
> system. The best that you can do is to periodically read a stable
> counter and try to recalibrate, but then you'll likely start getting
> wild operational variances.

As I pointed out in another thread, both linux and solaris do it.
Solaris seems to have a nice algorithm for keeping things in sync, and
accounting for the TSC getting cleared after suspend/resume etc.
At my level of understanding, this argument is nothing more than "but
Mom, all the other kids are doing it". I was just hoping that
somebody with real understanding could pick up on it.

> It's a shame that a PIO read is still so
> expensive. I'd hate to see just how bad your benchmark becomes when
> ACPI-slow is used instead of ACPI-fast.

It seems like reading ACPI-fast is "only" 3us or so, but when the ctx
switch is otherwise 4us, it adds up. i8254 is much worse on this
system (6.5us).

> I wonder if moving to HZ=1000 on amd64 and i386 was really all that good
> of an idea. Having preemption in the kernel means that ithreads can run
> right away instead of having to wait for a tick, and various fixes to
> 4BSD in the past year have eliminated bugs that would make the CPU wait
> for up to a tick to schedule a thread. So all we're getting now is a
> 10x increase in scheduler overhead, including reading the timecounters.

Yeah. I moved mine back to hz=1000 when I noticed 4000 interrupts/sec
on an idle system.

Drew
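
The timecounter figures above can be sanity-checked with a quick back-of-the-envelope script. This is only a rough sketch: it assumes exactly one timecounter read per context switch, uses the approximate numbers quoted in this message (about 4us of other work, about 3us for an ACPI-fast read, 6.5us for i8254), and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the timecounter share of a context switch.
# Assumption: one timecounter read per switch; all figures are the rough
# microsecond values quoted in the message, not new measurements.

def timecounter_share(read_us, other_us):
    """Fraction of one context switch spent in the timecounter read."""
    return read_us / (read_us + other_us)

OTHER_US = 4.0                               # switch cost without the read
READ_US = {"ACPI-fast": 3.0, "i8254": 6.5}   # cost of one timecounter read

if __name__ == "__main__":
    for name, read_us in READ_US.items():
        share = timecounter_share(read_us, OTHER_US)
        total = read_us + OTHER_US
        print(f"{name}: {total:.1f}us per switch, "
              f"{share:.0%} spent in the timecounter read")
```

Under those assumptions ACPI-fast comes out a little over 40% of the switch, which lines up with the estimate quoted earlier in the thread, and i8254 takes well over half.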