Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 1 Dec 2016 11:19:19 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Eric van Gyzen <vangyzen@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r309331 - head/usr.bin/locale
Message-ID:  <20161201082459.T1285@besplex.bde.org>
In-Reply-To: <201611301834.uAUIYfQs075427@repo.freebsd.org>
References:  <201611301834.uAUIYfQs075427@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 30 Nov 2016, Eric van Gyzen wrote:

> Log:
>  Include limits.h for CHAR_MAX
>
>  This was needed on stable/10.  Apparently, sys/param.h supplies CHAR_MAX
>  on head.  Include limits.h anyway, for consistency, and because C says so.

sys/param.actually supplies CHAR_MAX (undocumented namespace pollution) in
all versions of FreeBSD.

The function that uses this has many bugs, but most are only style bugs due
to various magic.

X char *
X format_grouping(const char *binary)
X {
X 	static char rval[64];
X 	const char *cp;
X 	size_t len;
X 
X 	rval[0] = '\0';
X 	for (cp = binary; *cp != '\0'; ++cp) {
X 		char group[sizeof("127;")];

This hard-codes CHAR_MAX is 127.  This works accidentally on POSIX >= 2100
since POSIX now specifies 8-bit chars, and even if chars are unsigned so
that CHAR_MAX is 255, that fits in the same space as 127.

X 		snprintf(group, sizeof(group), "%hhd;", *cp);
X 		len = strlcat(rval, group, sizeof(rval));
X 		if (len >= sizeof(rval)) {
X 			len = sizeof(rval) - 1;
X 			break;
X 		}

I don't like the error handling.  It is too careful, yet not careful
enough to be correct.  Use of snprintf() with no error handling instead
of sprintf() has no effect unless there are bugs, but there are bugs.
The first few are:
- if chars are signed, then *cp may be negative.  This is an invalid
   format, but we should check it.  Negative chars are converted to
   garbage starting with a minus sign.  group[] is too small to hold
   large ones.  We defend against buffer overruns by using snprintf(),
   but don't check for errors.  Truncation gives further garbage.
- %hhd is a bogus format if chars are signed.  Then hh in it has no
   effect
- %hhd is a broken format if chars are unsigned.  It says to convert
   the arg to signed char.  On non-exotic machines with CHAR_MAX = 255,
   this corrupts values between 128 and 255 to negative ones.  Large
   values are unlikely/physically impossible, but CHAR_MAX is a magic
   sentinel value which I think we want to print as itself (users also
   need to know what CHAR_MAX is to decode it).
- on exotic machines, CHAR_MAX can be UINT_MAX.  Then char promotes to
   u_int and %d format would give undefined behaviour if the value
   exceeds INT_MAX.  %hhd is no better for avoiding the undefined
   behviour, and when it works normally it corrupts large values as for
   the non-exotic case.

X 		if (*cp == CHAR_MAX) {
X 			break;
X 		}
X 	}
X 
X 	/* Remove the trailing ';'. */
X 	rval[len - 1] = '\0';

This writes outside of rval when binary is the null string.  Possible
for at least invalid formats.  len should be initialized to 0 in this
case, but it actually uninitialized, so compilers will warn at high
WARNS even if this case is unreachable.

X 
X 	return (rval);
X }

Untested fixes and cleanups:

Y diff -u2 locale.c~ locale.c
Y --- locale.c~	2016-11-25 08:26:48.000000000 +0000
Y +++ locale.c	2016-12-01 00:09:23.603179000 +0000
Y @@ -495,27 +495,27 @@
Y  	static char rval[64];
Y  	const char *cp;
Y -	size_t len;
Y +	size_t roff;
Y +	int len;
Y 
Y  	rval[0] = '\0';
Y +	roff = 0;
Y  	for (cp = binary; *cp != '\0'; ++cp) {
Y -		char group[sizeof("127;")];
Y -		snprintf(group, sizeof(group), "%hhd;", *cp);
Y -		len = strlcat(rval, group, sizeof(rval));
Y -		if (len >= sizeof(rval)) {
Y -			len = sizeof(rval) - 1;
Y -			break;
Y -		}
Y -		if (*cp == CHAR_MAX) {
Y -			break;
Y -		}
Y +		if (*cp < 0)
Y +			break;		/* garbage input */
Y +		len = snprintf(&rval[roff], sizeof(rval) - roff, "%u;", *cp);
Y +		if (len < 0 || len >= sizeof(rval) - roff)
Y +			break;		/* insufficient space for output */
Y +		roff += len;
Y +		if (*cp == CHAR_MAX)
Y +			break;		/* special termination */
Y  	}
Y 
Y -	/* Remove the trailing ';'. */
Y -	rval[len - 1] = '\0';
Y +	/* Truncate at the last successfully snprintf()ed semicolon. */
Y +	if (roff != 0)
Y +		rval[roff - 1] = '\0';
Y 
Y -	return (rval);
Y +	return (&rval[0]);
Y  }
Y 
Y -
Y  /*
Y   * keyword value lookup helper (via localeconv())

It was much easier to just use snprintf().

The error handling can be further simplified by checking if rval[] is
large enough up front.  It needs to have size

     strlen("255;") * strlen(binary) + 1,

where 255 is POSIX CHAR_MAX hard-coded.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20161201082459.T1285>