Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 15 Nov 2012 14:30:25 -0500
From:      Jung-uk Kim <jkim@FreeBSD.org>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-arch@FreeBSD.org
Subject:   Re: [RFC] Generic population count function
Message-ID:  <50A542D1.6030006@FreeBSD.org>
In-Reply-To: <20121115181009.D42162@besplex.bde.org>
References:  <50A43B52.8030102@FreeBSD.org> <20121115181009.D42162@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2012-11-15 03:32:50 -0500, Bruce Evans wrote:
> On Wed, 14 Nov 2012, Jung-uk Kim wrote:
> 
>> I implemented generic population count function.  Please see the 
>> attachment.  It is also available from here:
>> 
>> http://people.freebsd.org/~jkim/bitcount.diff
>> 
>> The idea is to make use of CPU-supported population count
>> instructions if available and the compiler supports them (i.e.,
>> clang), especially for larger than 32-bit data.  The patch also
>> has use cases for the new function (i.e., counting number of bits
>> in IPv6 address[*]).
>> 
>> Any objection?
> 
> This is rather over-engineered for something that is used once or
> twice, but it is good to move the functions out of sys/systm.h
> where they weren't even usable outside the kernel.

There's little history behind it, actually.  When I wrote the
following patch, I was looking for fast population count function
available in FreeBSD:

http://svnweb.freebsd.org/ports/head/java/openjdk6/files/patch-set?annotate=306138#l5437

I searched everywhere and found this:

http://www.dalkescientific.com/writings/diary/archive/2011/11/02/faster_popcount_update.html

but I ended up (ab)using count() method for bitset in C++ because
there's no portable way to do so.  The GCC-supplied bitset header uses
__builtin_popcount(), naturally. :-)

BTW, I intent to add CPU_COUNT() macro to sys/cpuset.h with this
bitcount() function.

[Snip ffs() issues]

> I only noticed the following in your version: - including
> <sys/cdefs.h> is a style bug because sys/types.h includes it

Will fix, thanks!

> - you removed the silly special inline for bitcount16() but still 
> implement bitcount16(), presumably for compatibiltiy.

Precisely.  I don't want to see angry kib@. ;-)

> But it is incompatible since the old version has the bogus return
> type uint16_t.

Yes, I noticed that, too.  I believe kib@ added it to make Linux ->
FreeBSD conversion easier, i.e.,

http://lxr.linux.no/#linux+v3.6.6/drivers/gpu/drm/i915/intel_sdvo.c#L1279
http://lxr.linux.no/#linux+v3.6.6/drivers/gpu/drm/i915/intel_sdvo.c#L1913

BTW, I found Linux has very simple ("naive") population count (aka,
Hamming weight) macros:

http://lxr.linux.no/#linux+v3.6.6/include/asm-generic/bitops/const_hweight.h

__builtin_popcount() family should do much better than this. ;-)

> It seems to be used just twice: in intel_sdvo.c, it is blindly
> converted to bool in 1 place and to unsigned int in 1 place.  Both
> of these uses are for device initialization so their efficiency is
> especially
uniportant.

Correct.

> - bitcount32() is also incompatible since the old version has the
> slightly less bogus return type uint32_t.  At least the args have
> correct type, unlike the ffs() family.

Yes, it is little incompatible but *intentional*.  I looked through
entire source code and found no problem.  The return value is always
<= 32 anyway. :-)

Thanks for the review,

Jung-uk Kim
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlClQtAACgkQmlay1b9qnVMZUwCfQhimEbrL87o+KOITt27nNgNS
pWAAniWs9Jzf6W2CkiIyozSSfLqW1YJK
=DOpK
-----END PGP SIGNATURE-----



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?50A542D1.6030006>