Date:      Tue, 21 Apr 2015 00:52:35 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, Jung-uk Kim <jkim@freebsd.org>,  Alan Cox <alc@rice.edu>, John Baldwin <jhb@freebsd.org>,  src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r280279 - head/sys/sys
Message-ID:  <20150421003316.M10305@besplex.bde.org>
In-Reply-To: <20150420220347.B9956@besplex.bde.org>
References:  <201503201027.t2KAR6Ze053047@svn.freebsd.org> <550DA656.5060004@FreeBSD.org> <20150322080015.O955@besplex.bde.org> <17035816.lxyzYKiOWV@ralph.baldwin.cx> <552BFEB2.8040407@rice.edu> <552C215D.8020107@FreeBSD.org> <20150420115351.GD2390@kib.kiev.ua> <20150420220347.B9956@besplex.bde.org>

On Tue, 21 Apr 2015, Bruce Evans wrote:

> On Mon, 20 Apr 2015, Konstantin Belousov wrote:
>
>> On Mon, Apr 13, 2015 at 04:04:45PM -0400, Jung-uk Kim wrote:
>>> Please try the attached patch.
>>> ...
>>> -	__asm __volatile("xorl %k0,%k0;popcntq %1,%0"
>>> -	    : "=&r" (result) : "rm" (elem));
>>> ...
>>> +			__asm __volatile("xorl %k0, %k0; popcntq %1, %0"
>>> +			    : "=r" (count) : "m" (pc_map[field]));
>>  ...
>> Yes, this worked for me the same way as for you, the argument is taken
>> directly from memory, without temporary spill.  Is this due to silly
>> inliner ?  Whatever the reason is, I think a comment should be added
>> noting the subtlety.
>> 
>> Otherwise, looks fine.
>
> Erm, this looks silly.  It apparently works by making things too complicated
> for the compiler to "optimize" (where one of the optimizations actually
> gives pessimal spills).  Its main changes are:
> ...
> It works better to change the constraint to "r":

It's even sillier than that.  The problem is not limited to this function.
clang seems to prefer memory whenever you use the "rm" constraint.  The
silliest case is when you have a chain of simple asm functions.  Say the
original popcntq (without the xorl):

 	return (popcntq(popcntq(popcntq(popcntq(popcntq(x))))));

gcc compiles this to 5 sequential popcntq instructions, but clang
spills the results of the first 4 to memory.
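
A minimal sketch of the chained case, for comparing the two constraints
side by side.  The wrapper names (popcntq_rm, popcntq_r, popcnt_chain_rm,
popcnt_chain_r) are made up for illustration and are not the ones in
cpufunc.h; the non-amd64 fallback via __builtin_popcountll is likewise an
assumption added only so that the file builds elsewhere.

```c
#include <stdint.h>

/*
 * Hypothetical wrappers for illustration only; these names do not
 * come from cpufunc.h.
 */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* "rm": the compiler may satisfy the input with a register or memory. */
static inline uint64_t
popcntq_rm(uint64_t elem)
{
	uint64_t result;

	__asm __volatile("popcntq %1,%0" : "=r" (result) : "rm" (elem));
	return (result);
}

/* "r": the input must be in a register. */
static inline uint64_t
popcntq_r(uint64_t elem)
{
	uint64_t result;

	__asm __volatile("popcntq %1,%0" : "=r" (result) : "r" (elem));
	return (result);
}
#else
/* Assumed fallback so the sketch builds on other targets. */
static inline uint64_t
popcntq_rm(uint64_t elem)
{
	return ((uint64_t)__builtin_popcountll(elem));
}

static inline uint64_t
popcntq_r(uint64_t elem)
{
	return ((uint64_t)__builtin_popcountll(elem));
}
#endif

/* The chain from the text, once per constraint. */
static inline uint64_t
popcnt_chain_rm(uint64_t x)
{
	return (popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(popcntq_rm(x))))));
}

static inline uint64_t
popcnt_chain_r(uint64_t x)
{
	return (popcntq_r(popcntq_r(popcntq_r(popcntq_r(popcntq_r(x))))));
}
```

Building this with "cc -O2 -S" under both gcc and clang and reading the
generated code for the two chain functions should show whether the
intermediate results stay in registers.  Both variants return the same
value; only the generated code is expected to differ.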

This is an old bug.  clang does this on FreeBSD[9-11].  cc does this
on FreeBSD[10-11] (not on FreeBSD-9, since cc = gcc there).

Asms should always use "rm" if "m" works.  Ones in cpufunc.h always
do except for lidt(), lldt() and ltr().  These 3 are fixed in my version.
So with clang, cpufunc.h almost always asks for the pessimization.

Bruce


