Date:      Tue, 18 Oct 2005 10:25:14 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Scott Long <scottl@samsco.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, David Xu <davidxu@FreeBSD.org>, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
Message-ID:  <17237.1482.52148.283282@grasshopper.cs.duke.edu>
In-Reply-To: <435501B9.4070401@samsco.org>
References:  <200510172310.j9HNAVPL013057@repoman.freebsd.org> <20051018094402.A29138@grasshopper.cs.duke.edu> <435501B9.4070401@samsco.org>


Scott Long writes:
 > Andrew Gallatin wrote:
 > > David Xu [davidxu@FreeBSD.org] wrote:
 > > 
 > >>davidxu     2005-10-17 23:10:31 UTC
 > >>
 > >>  FreeBSD src repository
 > >>
 > >>  Modified files:
 > >>    sys/amd64/amd64      cpu_switch.S machdep.c 
 > >>  Log:
 > >>  Micro optimization for context switch. Eliminate code for saving gs.base
 > >>  and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
 > >>  when user wants to set them, in context switch routine, we only need to
 > >>  write them into registers, we never have to read them out from registers
 > >>  when thread is switched away. Since rdmsr is a serialization instruction,
 > >>  micro benchmark shows it is worthy to do.
 > > 
 > > 
 > > Nice.  This reduces lmbench context switch latency by about 0.4us (7.2
 > > -> 6.8us), and reduces TCP loopback latency by about 0.9us (36.1 ->
 > > 35.2) on my dual core 3800+
 > > 
 > > It is a shame we can't find a way to use the TSC as a timecounter on
 > > SMP systems.  It seems that about 40% of the context switch time is
 > > spent just waiting for the PIO read of the ACPI-fast or i8254 to
 > > return.
 > > 
 > > 
 > > Drew
 > > 
 > > 
 > > 
 > 
 > The TSC represents the clock rate of the CPU, and thus can vary wildly
 > when thermal and power management controls kick in, and there is no way
 > to know when it changes.  Because of this, I think that it's
 > practically useless on Pentium-Mobile and Pentium-M chips, among many
 > others.  There is also the issue of multiple CPUs having to keep their
 > TSC's somewhat in sync in order to get consistent counting in the
 > system.  The best that you can do is to periodically read a stable
 > counter and try to recalibrate, but then you'll likely start getting
 > wild operational variances.  

As I pointed out in another thread, both Linux and Solaris do it.
Solaris seems to have a nice algorithm for keeping things in sync, and
accounting for the TSC getting cleared after suspend/resume etc.  At
my level of understanding, this argument is nothing more than "but
Mom, all the other kids are doing it".  I was just hoping that
somebody with real understanding could pick up on it.
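
To make the recalibration idea quoted above concrete, here is a minimal sketch of "periodically read a stable counter and recalibrate". It is not the Solaris or Linux algorithm, which the thread does not describe; the class name `TscCalibrator`, its methods, and the sample numbers are all hypothetical, and plain integers stand in for real TSC and reference-counter reads.

```python
class TscCalibrator:
    """Convert raw cycle counts to nanoseconds using periodic
    (cycle_count, reference_ns) samples from a stable counter.
    Hypothetical illustration only, not any kernel's algorithm."""

    def __init__(self, tsc0, ref_ns0):
        self.base_tsc = tsc0
        self.base_ns = ref_ns0
        self.ns_per_cycle = None  # unknown until a second sample arrives

    def recalibrate(self, tsc, ref_ns):
        cycles = tsc - self.base_tsc
        if cycles > 0:
            # Re-estimate the rate over the interval since the last sample.
            self.ns_per_cycle = (ref_ns - self.base_ns) / cycles
        # If cycles <= 0 the counter was reset (e.g. suspend/resume):
        # keep the old rate and only move the base point.
        self.base_tsc = tsc
        self.base_ns = ref_ns

    def to_ns(self, tsc):
        if self.ns_per_cycle is None:
            raise RuntimeError("need at least one recalibration first")
        return self.base_ns + (tsc - self.base_tsc) * self.ns_per_cycle


cal = TscCalibrator(0, 0)
cal.recalibrate(2_000_000, 1_000_000)      # 2 cycles per ns, i.e. 2 GHz

# The clock then drops to 1 GHz without the calibrator knowing.
# 1,000,000 more cycles really take 1,000,000 ns, but the stale
# rate estimates only 500,000 ns:
estimate_before = cal.to_ns(3_000_000)     # 1,500,000 ns, true value 2,000,000

cal.recalibrate(3_000_000, 2_000_000)      # the next sample corrects the rate
estimate_after = cal.to_ns(3_500_000)      # 2,500,000 ns at the new rate
```

The gap between `estimate_before` and the true time is the error that accumulates between calibration points when the clock rate changes unannounced, which is the objection raised in the quoted text.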

 >				 It's a shame that a PIO read is still so
 > expensive.  I'd hate to see just how bad your benchmark becomes when
 > ACPI-slow is used instead of ACPI-fast.

It seems like reading ACPI-fast is "only" 3us or so, but when the ctx
switch is otherwise 4us, it adds up. i8254 is much worse on this
system (6.5us).
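
As a rough check of these figures against the earlier "about 40%" estimate, the arithmetic can be sketched as follows. It assumes exactly one timecounter read per context switch and that the costs simply add, neither of which is stated in the thread; the function name is hypothetical.

```python
def timecounter_share(read_us, other_us):
    """Fraction of one context switch spent reading the timecounter,
    assuming a single read per switch and purely additive costs."""
    return read_us / (read_us + other_us)


# Figures quoted in this message: ~3us for ACPI-fast, 6.5us for i8254,
# and ~4us for the rest of the context switch.
acpi_fast_share = timecounter_share(3.0, 4.0)
i8254_share = timecounter_share(6.5, 4.0)

print(f"ACPI-fast: {acpi_fast_share:.0%}, i8254: {i8254_share:.0%}")
```

Under those assumptions ACPI-fast accounts for roughly 43% of the switch, in line with the "about 40%" above, and i8254 for roughly 62%.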

 > I wonder if moving to HZ=1000 on amd64 and i386 was really all that good
 > of an idea.  Having preemption in the kernel means that ithreads can run
 > right away instead of having to wait for a tick, and various fixes to
 > 4BSD in the past year have eliminated bugs that would make the CPU wait
 > for up to a tick to schedule a thread.  So all we're getting now is a
 > 10x increase in scheduler overhead, including reading the timecounters.

Yeah.  I moved mine back to hz=1000 when I noticed 4000 interrupts/sec
on an idle system.

Drew


