Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 30 Jul 2015 10:37:35 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Baptiste Daroussin <bapt@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r285997 - head/usr.sbin/pw
Message-ID:  <20150730081432.K1227@besplex.bde.org>
In-Reply-To: <201507290623.t6T6N7lq073984@repo.freebsd.org>
References:  <201507290623.t6T6N7lq073984@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 29 Jul 2015, Baptiste Daroussin wrote:

> Log:
>  Actually add the new code

I shouldn't have asked for this.  It gives more to clean up.  It has 1
large bug and many style bugs.

> Added: head/usr.sbin/pw/strtounum.c
> ==============================================================================
> --- /dev/null	00:00:00 1970	(empty, because file is newly added)
> +++ head/usr.sbin/pw/strtounum.c	Wed Jul 29 06:23:06 2015	(r285997)
> @@ -0,0 +1,73 @@
> ...
> +#define INVALID		"invalid"
> +#define TOOSMALL	"too small"
> +#define	TOOLARGE	"too large"

Style bugs:
- space instead of tab after #define
- not even consistent space instead of tab after #define
- unsorted defines.

The unsorting is for bug compatible with strtonum.c.  strtonum.c doesn't
have the tab bugs.  It uses these definitions to obfuscate the code even
better.  Here the definintions are of string literals which are used only
once except for INVALID.  The obfuscation is to add a layer of indirection
to these literals.  This might save a few bytes of object code for a 1975
compiler by avoiding duplication of the "invalid".  strtonum.c uses the
larger obfuscation of indirection through a table.  This allows it to
have more compact code (basically by setting the index and falling through
to common code for returning).  This probably saves a few bytes of object
code even for current compilers.

> +uintmax_t
> +strtounum(const char *numstr, uintmax_t minval, uintmax_t maxval,
> +    const char **errstrp)
> +{
> +	uintmax_t ret = 0;

Style bug: initialization in declaration.  strtonum.c has the same bug.
The bug is larger here.  Here, the result of the initialization is not
even used.  In strtonum.c, it is part of the obfuscations.  The variable
must be inititialized early so that it is initialized after falling
through to the common code at the end.  But initializing in its
declaration obfuscates that it is being initialixed for this reason.

> +	char *ep;

Style bug: unsorted declarations.  Pointers should be sorted before scalars.
strtonum.c has the same bug.

The rest of the initializations are much cleaner than in strtonum.c.  There
are no more, but strtonum has a mess to set up the table, and another
valriable initialized in its declaration to use after falling through.
The table is static and const, but is declared as auto and non-const.
This is a good pessimization for old compilers.  They will laboriously
construct it on every entry to the function.

> +
> +	if (minval > maxval) {
> +		errno = EINVAL;
> +		if (errstrp != NULL)
> +			*errstrp = INVALID;
> +		return (0);

strtonum.c uses its fall-through obfuscations here and elsewhere.  It
sets only an index variable here, and falls through to its general code
that looks up the errno and string associated with this index.

This is much better, except it should just set *errstrp to "invalid",
or better change the API by changing this string to something like
"doofus" to indicate that it is a programming error and not invalid
input.  The change is easier without the complications from the table
or an API to preserve.

> +	}
> +

Style bug: extra blank line.

> +	ret = strtoumax(numstr, &ep, 10);

Large bug here.  The setting of errno to 0 is missing.  This bug is not
in strtonum.c.

> +	if (errno == EINVAL || numstr == ep || *ep != '\0') {

Style bugs:
- unportable code
- dead code
- backwards order of check in dead code

Setting errno to EINVAL for certain errors is a POSIX extension.  It is
not in C99.  This style bug is made fatal by not setting errno to 0
before calling the function.  On success, errno has its previous value
which may be EINVAL.

All this unportability does is allow you to omit the numstr == ep check,
but the above doesn't even omit it.

Using the unportable check takes twice as much code for a non-broken
version, or 3 times as much with the redundancy:

Broken:
 	if (errno == EINVAL ...
Correct:
 	errno = 0; ...
 	if (errno == EINVAL ...
Broken and redundant:
 	if (errno == EINVAL || numstr == ep ...
Correct and redundant:
 	errno = 0; ...
 	if (errno == EINVAL || numstr == ep ...

except errno must be set to 0 in most uses so that the ERANGE check is
not broken and then using the unportability doesn't take more code.

The backwards order is numstr == ep instead of ep == numstr.

> +		errno = EINVAL;
> +		if (errstrp != NULL)
> +			*errstrp = INVALID;

This is the only definition that is used twice.  It is still clearer to
return the literal "invalid".

> +		return (0);
> +	} else if ((ret == 0 && errno == ERANGE) || ret < minval) {

Style bug: redndant 'else'.  The 'if' clause always returns, so 'else' here
has no effect except to obfuscate that.  An 'else if' ladder with some
redundant else's can be clearer if in more complicated cases if it is
done consistently.

Style bug: nonsense check of the non-error value 0.  ret can never be 0
after an actual range error, since strtoumax() returns UINTMAX_MAX for
range errors.  ret can be 0 with errno = ERANGE only becase of the
missing initialization of errno.

> +		errno = ERANGE;
> +		if (errstrp != NULL)
> +			*errstrp = TOOSMALL;
> +		return (0);
> +	} else if ((ret == UINTMAX_MAX && errno == ERANGE) || ret > maxval) {

Style bug: redundant 'else'.

Style bug: redundant check of the not-always-error value UINTMAX_MAX.
strtonum.c has to do redundant checks for not-always-error values
INTMAX_MIN and INTMAX_MAX to distinguish the 2 "ends" of a range error.
For unsigned values, there is only one "end" -- ERANGE always means
too large.

Normally it is a bug to look at the "end" value[s], since ERANGE tells you
if there is a range error.  I always suspect that code which does such
checks gets other details of range checking wrong and works around the
bugs in most cases by using bogus checks of the "end" value[s].  The
checked turned out to be almost no-ops here, but the range checking
was very broken.

> +		errno = ERANGE;
> +		if (errstrp != NULL)
> +			*errstrp = TOOLARGE;
> +		return (0);
> +	}
> +

Stykle bug: extra blank line.

> +	return (ret);

Style bug: inconsistent redundant 'else' clause.  This is the contents
of a redundant 'else' clause that we reach at the end of the 'else if'
ladder.  This clause is closely associated with the ladder, but is
obfuscated by separating it first by the blank line and then by not
using a redundant 'else' for it.

> +}

Fixing these bugs gives:

X #include <sys/cdefs.h>
X __FBSDID("$FreeBSD$");
X 
X #include <errno.h>
X #include <inttypes.h>
X #include <limits.h>
X #include <stdlib.h>
X 
X #include "pw.h"
X 
X uintmax_t
X strtounum(const char *numstr, uintmax_t minval, uintmax_t maxval,
X     const char **errstrp)
X {
X 	char *ep;
X 	uintmax_t ret;
X 
X 	if (minval > maxval) {
X 		errno = EINVAL;
X 		if (errstrp != NULL)
X 			*errstrp = "invalid";
X 		return (0);
X 	}
X 	errno = 0;
X 	ret = strtoumax(numstr, &ep, 10);
X 	if (ep == numstr || *ep != '\0') {
X 		errno = EINVAL;
X 		if (errstrp != NULL)
X 			*errstrp = "invalid";
X 		return (0);
X 	}
X 	if (ret < minval) {
X 		errno = ERANGE;
X 		if (errstrp != NULL)
X 			*errstrp = "too small";
X 		return (0);
X 	}
X 	if (errno == ERANGE || ret > maxval) {
X 		errno = ERANGE;
X 		if (errstrp != NULL)
X 			*errstrp = "too large";
X 		return (0);
X 	}
X 	return (ret);
X }

This can be shortened a little at a cost of a minor obfuscation: call
strtoumax before checking (minval > maxval) and use the same error
handling code.

Some variable names could be improved.  numstr is the same as in
strtonum.c, but the man page uses the better name nptr.  nptr is
actually better for the implementation than the man page, but the
implementation should use simply np.  errstrp is used by both
strtonum.c and strtounum.c, but the man page spells it errstrp.

I like to use a p suffix for all pointers.  This is not so important
for man pages.  C99 and FreeBSD man pages and implementation use
consistent names for the strtol family:
- the string to be converted is always named 'nptr'
- the pointer for returning the position reached is always named 'endptr'
I.e., they consistently use a 'p' suffix but spell it verbostly as 'ptr'.
It would be better to expand 'n' to 'num' than 'p' to 'ptr', but they
don't do that.  endptr is doubly indirect so it should have too 'p's
but it only has 1 'ptr.

strtonum introduces many inconsistencies:
- it still uses 'nptr' in the man page but not in the implementation
- it uses 'str' in 'errstr' in the man page.  That emphasizes the
   string and not the not the pointer.  Sort of backwards.  It uses
   'errstrp' in the implementation.  That is basically 2 'p' suffixes
   with the inner one distinguished as being for a string.  This is
   good, but the man page needs such details more than the implementatin.

I would just use np and errpp in the implementation.  Also, change ep
to endp so that the 'e' in it is more obviously unrelated to errors.

Another easy fix: apart from intmax_t, strto*num() hasn't caught up
with C99 'restrict' yet.  See the strtol() prototype for this.  I think
it can be fixed without breaking the API too much.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150730081432.K1227>