Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 24 Mar 2015 15:58:14 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Mateusz Guzik <mjg@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r280407 - head/sys/kern
Message-ID:  <20150324135350.N1665@besplex.bde.org>
In-Reply-To: <201503240010.t2O0ACZb089906@svn.freebsd.org>
References:  <201503240010.t2O0ACZb089906@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 24 Mar 2015, Mateusz Guzik wrote:

> Log:
>  filedesc: microoptimize fget_unlocked by getting rid of fd < 0 branch

This has no effect.  Compilers optimize to the equivalent of the the
unsigned cast hack if this is good.  On x86, it is good since no
instructions are needed for the conversion, and the only difference
between the code generated by

 	if (fd >= fdt->fdt_nfiles)

and

 	if (fd < 0 || fd >= fdt->fdt_nfiles)

is to change from jl (jump if less than) to jb (jump if below) (both
jumps over the if clause).  Negative values satisfy jl but not jb.

>  Casting fd to an unsigned type simplifies fd range coparison to mere checking
>  if the result is bigger than the table.

No, it obfuscates the range comparison.

On some arches, conversion to unsigned is slow.  Then compilers should
optimize in the opposite direction by first undoing the bogus cast to
get back to the range check and then optimizing the range check using
the best strategy.  Compilers should probably first undo the bogus
cast even on x86, so as to reduce to the range check case.  Range checks
are more important and more uniform than bogus casts, so they are more
likely to be optimized.  Similarly if fd has type u_int to begin with.
   (Grosser forms of the obfuscation do this.  The gross forms are still
   common :-(.  The first one in syscalls.master is "u_int fd" for dup().
   But this doesn't even survive as far as do_dup(), since sys_dup()
   explicitly converts to int.  The conversions are:
   - the arg "int fd" is type-punned to "u_int fd" in the args struct
   - uap->fd is cast to before passing it to do_dup().
   dup2() is similar.
   The first one in syscalls.master that uses the obfuscation is getgroups()
   or perhaps getlogin().  The conversions and obfuscations for getgroups
   are:
   - the arg "int gidsetlen" is type-punned to "u_int gidsetsize" in the
     args struct.  This also converts the name to a worse one.  The arg
     gives the number of elements, not a length or a size
   - uap->gidsetsize is compared with a maximum value.  This depends on
     it being u_int to reject negative values.
   - uap->gidsetsize passed is without a cast to copyout().  It becomes a
     size_t.  This conversion cannot overflow for the checked value
     even if uap->gidsetsize has the correct type (int), but an explicit
     cast might be needed to avoid compiler warnings then (sys_dup() does
     such a cast).  copyout() has related naming problems.  Its count arg
     is 'named' len.  bcopy() and memcpy()'s count arg is also named
     'len' in FreeBSD, but in C99 and POSIX memcpy()'s count arg is just
     named 'n'.

> Modified: head/sys/kern/kern_descrip.c
> ==============================================================================
> --- head/sys/kern/kern_descrip.c	Tue Mar 24 00:01:30 2015	(r280406)
> +++ head/sys/kern/kern_descrip.c	Tue Mar 24 00:10:11 2015	(r280407)
> @@ -2342,7 +2342,7 @@ fget_unlocked(struct filedesc *fdp, int
> #endif
>
> 	fdt = fdp->fd_files;
> -	if (fd < 0 || fd >= fdt->fdt_nfiles)
> +	if ((u_int)fd >= fdt->fdt_nfiles)
> 		return (EBADF);
> 	/*
> 	 * Fetch the descriptor locklessly.  We avoid fdrop() races by

This undoes the changes that removed the obfuscation.  Many commits
were involved:
- 4.4BSD-Lite1 uses the obfuscation in a modification of the above
   form, with "u_int" spelled in non-BSD style as "unsigned" for at
   least most of fcntl(), close(), compat_43_fstat(), fstat(),
   fpathconf(), flock() and dupfdopen().
- 4.4BSD-Lite2 fixed the style bug, so that these functions used the
   obfuscation in exactly the above form
- FreeBSD never merged these changes from Lite2
- dupfd_open() apparently had the BSD spelling in Lite1, so FreeBSD-2+
   always had that.  jhb committed my change to unobfuscate dupfd_open()
   in 2002.
- I apparently missed half of the obfuscations with the "unsigned"
   spelling.  But half of them were moved into _fget(), fget() or
   fget_locked().
- early in 2002, the obfuscation was in _fget() in exactly the above
   form.  _fget() far too large for an inline function, but was static
   inline in kern_descrip.c.
- a little later in 2002, the obfuscated part of _fget() and a little
   more was turned into fget_locked().  fget_locked() was not too large
   for an inline function, and was static inline in filedesc.h.  But this
   change turned the obfuscation into nonsense.  It added the explicit
   check for the lower bound, but kept the bogus cast.
- I asked the committer to fix this, but my instructions were apparently
   too tl;dr (maybe 1/4 as long as this mail), and the result was less
   than reversion to the previous obfuscated version -- it was that plus,
   plus another bogus cast to u_int (the second one has no effect not
   already given by the first one), and a comment documenting the
   obfuscation.  The comment takes about 4 times as much source code as
   the unobfuscated version.
- I fixed this in 2004.  The unobfuscated version is slightly shorter
   then the version with the doubled bogus cast, even without the comment.
- this version of fget_locked() survived until at least FreeBSD-9, but
   in recent versions its style regressed in other ways.  It expanded
   from 2 lines of statements to 6 to add 1 assert statement and 3 lines
   with style bugs (2 extra blank lines and one loss of use of a
   conditional expression).
- fget_locked() seems to be unchanged recently, so it doesn't have the
   obfuscation.  fget_unlocked() is newer than the unobfuscated version
   of fget_locked().  It is in a different file, but may have copied the
   unobfuscated range check from fget_locked().  This commit restores the
   obfuscation to it alone.

There are now some other range checks in kern_desc.c that are missing
the obfuscation: grepping for "fd <" gives:
- a magic treatment for negative fd's in closefrom()
- the range check in an unobfuscated by confusing form in fdisused().
   The check has to be inverted since it is in a KASSERT(), and the
   inversion is done by reversing the inequalities.
- similarly in fdalloc().

I used to use this hack a lot 30 years ago, but stopped when compilers
got better 20-25 years ago.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150324135350.N1665>