Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 1 Dec 2008 14:07:06 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, Scott Long <scottl@samsco.org>, src-committers@freebsd.org, Kip Macy <kmacy@freebsd.org>
Subject:   Re: svn commit: r185162 - in head: . sys/amd64/include sys/arm/include sys/conf sys/dev/bce sys/dev/cxgb sys/dev/cxgb/sys sys/dev/cxgb/ulp/iw_cxgb sys/dev/mxge sys/dev/nxge sys/i386/include sys/i386/in...
Message-ID:  <200812011407.06563.jhb@freebsd.org>
In-Reply-To: <20081123154138.GS6408@deviant.kiev.zoral.com.ua>
References:  <200811220555.mAM5tuIJ007781@svn.freebsd.org> <3c1674c90811221651u338294frcdbd99b386733851@mail.gmail.com> <20081123154138.GS6408@deviant.kiev.zoral.com.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sunday 23 November 2008 10:41:38 am Kostik Belousov wrote:
> On Sun, Nov 23, 2008 at 12:51:58AM +0000, Kip Macy wrote:
> > On Sat, Nov 22, 2008 at 11:08 PM, Scott Long <scottl@samsco.org> wrote:
> > > Kostik Belousov wrote:
> > >>
> > >> On Sat, Nov 22, 2008 at 03:05:22PM -0700, Scott Long wrote:
> > >>>
> > >>> A neat hack would be for the kernel linker to scan the text and do a
> > >>> drop-in replacement of the opcode that is appropriate for the 
platform.
> > >>> I can't see how a CPU_XXX definition would work because it's just a
> > >>> compile time construct, one that can be included with any kernel
> > >>> compile.
> > >>
> > >> Yes, it is possible to do that. Less drastic change is to directly
> > >> check features. I moved slow code to separate section to eliminate
> > >> unconditional jump in fast path.
> > >> Only compile-tested.
> > >>
> > >
> > > As long as it works, I think it's a step in the right direction; I'm
> > > assuming that cpu_feature is a symbol filled in at runtime and not a
> > > macro for the cpuid instruction, right?
> > >
> > > Scott
> > >
> > 
> > i386/include/md_var.h:
> > <..>
> > extern  u_int   cpu_exthigh;
> > extern  u_int   cpu_feature;
> > extern  u_int   cpu_feature2;
> > extern  u_int   amd_feature;
> > extern  u_int   amd_feature2;
> > <...>
> > 
> > I'm not thrilled with it, but we can revisit the issue if it makes a
> > measurable difference on someone's workload.
> 
> Below is the updated patch. It includes changes made after private comments
> by bde@ and uses symbolic definitions for the bits in the features words.
> I thought about accessing a per-CPU word for serialized instruction in the
> slow path, but decided that it does not beneficial.\

Is the branch really better than just doing what the atomic operations for 
mutexes, etc. do and just use 'lock addl $0,%esp' for a barrier in all cases 
on i386 and only bother with using the fancier instructions on amd64?  Even 
amd64 doesn't use *fence yet for the atomic ops actually.  I have had a patch 
to use it for years, but during testing there was no discernable difference 
between the existing 'lock addl' approach vs '*fence'.  I'd much rather just 
use 486 code for all i386 machines than add a branch, esp. if 
the "optimization" the branch is doing isn't an actual optimization.

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200812011407.06563.jhb>