Date:      Thu, 20 Oct 2005 15:39:20 +0800
From:      David Xu <davidxu@freebsd.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        cvs-src@freebsd.org, Scott Long <scottl@samsco.org>, src-committers@freebsd.org, Andrew Gallatin <gallatin@cs.duke.edu>, cvs-all@freebsd.org
Subject:   Re: cvs commit: src/sys/amd64/amd64 cpu_switch.S machdep.c
Message-ID:  <435749A8.5070309@freebsd.org>
In-Reply-To: <20051020145234.H99720@delplex.bde.org>
References:  <200510172310.j9HNAVPL013057@repoman.freebsd.org> <20051018094402.A29138@grasshopper.cs.duke.edu> <435501B9.4070401@samsco.org> <17237.1482.52148.283282@grasshopper.cs.duke.edu> <4355080C.302@samsco.org> <20051020145234.H99720@delplex.bde.org>

Bruce Evans wrote:
> On Tue, 18 Oct 2005, Scott Long wrote:
> 
> [Excessive quoting retained since I want to comment on separate points.]
> 
>> Andrew Gallatin wrote:
>>
>>> Scott Long writes:
>>>  > Andrew Gallatin wrote:
>>>  > > David Xu [davidxu@FreeBSD.org] wrote:
>>>  > >>davidxu     2005-10-17 23:10:31 UTC
>>>  > >>
>>>  > >>  FreeBSD src repository
>>>  > >>
>>>  > >>  Modified files:
>>>  > >>    sys/amd64/amd64      cpu_switch.S machdep.c
>>>  > >>  Log:
>>>  > >>  Micro optimization for context switch. Eliminate code for 
>>> saving gs.base
>>>  > >>  and fs.base. We always update pcb.pcb_gsbase and pcb.pcb_fsbase
>>>  > >>  when user wants to set them, in context switch routine, we 
>>> only need to
>>>  > >>  write them into registers, we never have to read them out from 
>>> registers
>>>  > >>  when thread is switched away. Since rdmsr is a serialization 
>>> instruction,
>>>  > >>  micro benchmark shows it is worthy to do.
> 
> 
>>>  > > Nice.  This reduces lmbench context switch latency by 
>>> about 0.4us (7.2
>>>  > > -> 6.8us), and reduces TCP loopback latency by about 0.9us (36.1 ->
>>>  > > 35.2) on my dual core 3800+
> 
> 
> I wonder if this reduces the context switch latency from about 1.320
> usec to 0.900 usec on my A64-3000.  The latency is only .520 usec in
> i386 mode.  I use a TSC timecounter of course.
> 
> The fastest loopback latency that I've seen is 5.638 usec under
> Linux-2.2.9 on the same machine.  In Linux-2.6.10, it has regressed
> to 17.1 usec.  In FreeBSD last year, it was 10.8 usec on the same
> machine in i386 mode and 19.0 in amd64 mode.  So the A64 can almost
> keep up with an AXP-1400 running a pre-SMPng version of FreeBSD where
> it was 9.94 usec.

We can also avoid reloading the userland GS.base and FS.base MSRs for
system threads, though I am not sure whether that would reduce interrupt
thread latency.
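
To make the idea concrete, here is a rough user-space model in C. It is
only a sketch, not the code in cpu_switch.S: the MSRs are plain variables,
and sim_rdmsr(), sim_wrmsr(), struct thread, is_kernel_thread and the
switch_* functions are hypothetical names made up for illustration. Only
pcb_fsbase and pcb_gsbase come from the commit log. It shows that once
every "set base" request updates the pcb, the save on switch-out reads
back a value the pcb already holds, so the rdmsr can be dropped, and that
the reload could be skipped for a thread that never runs in userland.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated MSRs and access counters (stand-ins, not real rdmsr/wrmsr). */
static uint64_t msr_fsbase, msr_gsbase;
static unsigned rdmsr_count, wrmsr_count;

static uint64_t sim_rdmsr(const uint64_t *msr) { rdmsr_count++; return *msr; }
static void sim_wrmsr(uint64_t *msr, uint64_t val) { wrmsr_count++; *msr = val; }

struct pcb {
	uint64_t pcb_fsbase;
	uint64_t pcb_gsbase;
};

struct thread {			/* hypothetical, much reduced */
	struct pcb td_pcb;
	bool is_kernel_thread;	/* assumed flag for system threads */
};

/* Setting a base always records it in the pcb before touching the MSR. */
static void set_user_fsbase(struct thread *td, uint64_t base)
{
	td->td_pcb.pcb_fsbase = base;
	sim_wrmsr(&msr_fsbase, base);
}

/* Invariant for a running user thread: the pcb mirrors the MSRs. */
static void assert_pcb_in_sync(const struct thread *td)
{
	assert(td->td_pcb.pcb_fsbase == msr_fsbase);
	assert(td->td_pcb.pcb_gsbase == msr_gsbase);
}

/* Old scheme: read the outgoing thread's bases, then load the new ones. */
static void switch_with_save(struct thread *old, struct thread *next)
{
	old->td_pcb.pcb_fsbase = sim_rdmsr(&msr_fsbase);
	old->td_pcb.pcb_gsbase = sim_rdmsr(&msr_gsbase);
	sim_wrmsr(&msr_fsbase, next->td_pcb.pcb_fsbase);
	sim_wrmsr(&msr_gsbase, next->td_pcb.pcb_gsbase);
}

/* New scheme: the pcb is already current, so only write on switch-in. */
static void switch_no_save(struct thread *old, struct thread *next)
{
	(void)old;		/* nothing to save */
	sim_wrmsr(&msr_fsbase, next->td_pcb.pcb_fsbase);
	sim_wrmsr(&msr_gsbase, next->td_pcb.pcb_gsbase);
}

/* Possible follow-up: skip the reload when the next thread is a system thread. */
static void switch_skip_kernel(struct thread *old, struct thread *next)
{
	if (next->is_kernel_thread)
		return;		/* userland bases are never used by it */
	switch_no_save(old, next);
}
```

In this model the old path costs two reads and two writes per switch, the
new path two writes, and a switch into a system thread costs nothing. The
stale userland values left in the MSRs are harmless in the model because
the next user thread always gets its own values written on switch-in.
Whether the saved wrmsr matters for real interrupt threads would have to
be measured.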

David Xu




