Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 23 Oct 2015 21:02:17 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        "Conrad E. Meyer" <cem@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r289765 - in head/sys: conf libkern sys
Message-ID:  <20151023200654.T1687@besplex.bde.org>
In-Reply-To: <201510222028.t9MKSbRu032869@repo.freebsd.org>
References:  <201510222028.t9MKSbRu032869@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 22 Oct 2015, Conrad E. Meyer wrote:

> Log:
>  Add libkern ffsll() for parity with flsll()
>
>  Sponsored by:	EMC / Isilon Storage Division


This fills a much needed gap, like the one filled by flsll().

It uses the long long abomination.

uintmax_t has been Standard since 1999, but is still not supported
by the ffs() family.

It further expands the design error in ffs().
   (ffs() only supports ints.  Its behaviour on negative values is
   undocumented, but needs documentation more than for non-negative
   values because even in the 2's complement case, the sign bit can
   be anywhere).  The bad BSD documented is for ffs() is duplicated
   in POSIX in at least old versions.  I think this requires ffs()
   to operate on values, so it cannot work in the 1's complement
   case.  The -0 has all bits 1 but value 0.  ffs(-0) by value must
   be -1, and ABIs are probably allowed to pass -0 as +0 so ffs(-0)
   is probably -1 even if ffs() is by bits.)
Due to this design error, it is not obvious that even u_char is
directly supported by the ffs() family.  It is supported because
POSIX requires 8-bit chars so chars get promoted to non-negative
ints when passed to ffs().  This doesn't happen for u_int, but
you can promote u_int manually to long long to get correct and
pessimal code.  This doesn't work for u_foo that is already
of larger rank than long long.

> Added: head/sys/libkern/ffsll.c
> ==============================================================================
> --- /dev/null	00:00:00 1970	(empty, because file is newly added)
> +++ head/sys/libkern/ffsll.c	Thu Oct 22 20:28:37 2015	(r289765)
> + ...
> +/*
> + * Find First Set bit
> + */
> +int
> +ffsll(long long mask)
> +{
> +	int bit;
> +
> +	if (mask == 0)
> +		return (0);
> +	for (bit = 1; !(mask & 1); bit++)
> +		mask = (unsigned long long)mask >> 1;
> +	return (bit);
> +}

This has the usual implementation for negative values.  They are cast
to the correct unsigned type to get defined behaviour for the shift.
The cast itself gives implementation-defined behaviour.  Even when the
compiler was gcc so that it was documented in man pages and info pages,
the documentation for this was hard to find.  Whatever it is, it gives
the undocumented behaviour of the function.

This implementation is pessimal unless the compiler can convert it to
special instructions.  On i386, 2 32-bit ffs()'s are more portable and
much faster, since each one uses a special instruction and it is not
possible to do much better.  (But fls64() can be done better in userland
using the FPU or perhaps SSE.)

clang optimizes similar simple loops for popcount() but doesn't optimize
the ones used in libc for the ffs() family.  If it did optimize these
loops, 2 32-bit ffs()'s with asms inside them wouldn't be so good, but
for bswap64() we arranged for compilers to good combining for even more
complicated variations.  (For bswap*, clang converts the complicated C
expression giving the rearrangement to minimal bswap instruction(s),
but we hard-code these instructions in asm because gcc can't do this
in the old versions in FreeBSD, and then arrange so that clang can
combine the asms efficiently on i386.)

Efficiency of the ffs() family is rarely important.  bitset.h uses ffsl()
but bitstring.h uses a simple loop.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20151023200654.T1687>