Date:      Tue, 18 Oct 2005 09:50:37 -0700
From:      Nate Lawson <nate@root.org>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        cvs-src@FreeBSD.org, Poul-Henning Kamp <phk@phk.freebsd.dk>, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
Message-ID:  <435527DD.3040007@root.org>
In-Reply-To: <17237.286.236279.883806@grasshopper.cs.duke.edu>
References:  <20051018094402.A29138@grasshopper.cs.duke.edu>	<68671.1129643256@critter.freebsd.dk> <17237.286.236279.883806@grasshopper.cs.duke.edu>

Andrew Gallatin wrote:
> Poul-Henning Kamp writes:
>  > In message <20051018094402.A29138@grasshopper.cs.duke.edu>, Andrew Gallatin writes:
>  > 
>  > >It is a shame we can't find a way to use the TSC as a timecounter on
>  > >SMP systems.  It seems that about 40% of the context switch time is
>  > >spent just waiting for the PIO read of the ACPI-fast or i8254 to
>  > >return.
>  > 
>  > No, the shame is that the scheduler tries to partition time rather
>  > than cpu cycles because that approximation got goldplated in some
>  > random standard years back.
> 
> Sorry if I misspoke.  I guess the shame is twofold.
> 
> First we insist on not trying to keep the TSC in sync and so we don't use
> it for SMP timekeeping like other OSes do, which means that getting a
> micro-second granularity timestamp is orders of magnitude more
> expensive for us.  To compound the problem, we insist on using the
> expensive non-TSC binuptime() to get a runtime measurement on each
> context switch, rather than being able to use something cheap like
> ticks, or a per-cpu cycle counter.

I have good information that in the near future, most designs will have 
guaranteed synchronized TSC across all CPUs.

> If anybody is looking for low-hanging fruit in the SMP context switch
> path, figuring some acceptable way to avoid reading the ACPI or i8254
> timecounter is it.

The ACPI timecounter involves a 32-bit read from IO space.  The actual 
timecounter is 24 or 32 bits.  Since it's maintained in the chipset and 
has strict requirements for being reliable in many modes of system 
operation (e.g. C3), this read takes a while.
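
Since the counter is only 24 or 32 bits wide, any code that differences 
two reads has to allow for wraparound.  A minimal sketch of that 
arithmetic follows.  The name pmtimer_delta() is hypothetical and is not 
taken from the FreeBSD source, and the sketch assumes the counter wraps 
at most once between the two reads.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks elapsed between two raw reads of a free-running counter that is
 * `bits` wide (24 or 32 for the ACPI timer).  Unsigned subtraction plus
 * masking gives the right answer across a single wrap.
 * Hypothetical helper for illustration only.
 */
static uint32_t
pmtimer_delta(uint32_t earlier, uint32_t later, unsigned bits)
{
	uint32_t mask;

	assert(bits == 24 || bits == 32);
	mask = (bits == 32) ? 0xffffffffu : ((1u << bits) - 1);
	return ((later - earlier) & mask);
}
```

For example, with a 24-bit counter, reads of 0xfffff0 and then 0x000010 
are 0x20 ticks apart.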

Using it at task switch time is overkill.  As you suggest, it's better 
to use TSC and calibrate via the ACPI timer.  More info on this in my 
next email.
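
As a rough illustration of what calibrating the TSC via the ACPI timer 
could look like, here is a sketch of the arithmetic only.  It assumes 
the caller has already sampled both counters at the start and end of 
some interval.  The function names are hypothetical, and the 3.579545 
MHz constant is the ACPI PM timer rate from the ACPI specification, not 
something stated in this thread.

```c
#include <stdint.h>

/* ACPI PM timer frequency in Hz (fixed by the ACPI specification). */
#define PMTIMER_HZ	3579545ULL

/*
 * Estimate the TSC frequency from cycle and PM-timer tick counts taken
 * over the same interval.  Returns 0 if no PM-timer ticks elapsed.
 * The multiplication is safe for tsc_delta below roughly 5e12 cycles,
 * which is far longer than a calibration window needs.
 * Hypothetical helper for illustration only.
 */
static uint64_t
tsc_hz_estimate(uint64_t tsc_delta, uint32_t pm_delta)
{
	if (pm_delta == 0)
		return (0);
	return (tsc_delta * PMTIMER_HZ / pm_delta);
}

/*
 * Convert a TSC cycle count to microseconds using a calibrated rate.
 * Splitting into whole seconds and a remainder avoids overflowing the
 * 64-bit intermediate for large cycle counts.
 */
static uint64_t
tsc_to_usec(uint64_t cycles, uint64_t tsc_hz)
{
	if (tsc_hz == 0)
		return (0);
	return (cycles / tsc_hz * 1000000 +
	    cycles % tsc_hz * 1000000 / tsc_hz);
}
```

Once calibrated this way, the per-switch cost is a register read and a 
little integer math rather than an IO-space read.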

--
Nate


