From owner-freebsd-bugs@FreeBSD.ORG  Fri May  9 14:20:04 2008
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C6CE61065671
	for <freebsd-bugs@hub.freebsd.org>;
	Fri,  9 May 2008 14:20:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 9F8A78FC14
	for <freebsd-bugs@hub.freebsd.org>;
	Fri,  9 May 2008 14:20:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m49EK4ee060018
	for <freebsd-bugs@freefall.freebsd.org>; Fri, 9 May 2008 14:20:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m49EK44w060017;
	Fri, 9 May 2008 14:20:04 GMT (envelope-from gnats)
Resent-Date: Fri, 9 May 2008 14:20:04 GMT
Resent-Message-Id: <200805091420.m49EK44w060017@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org,
	Romain Tartiere <romain@blogreen.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 45C01106567D
	for <FreeBSD-gnats-submit@freebsd.org>;
	Fri,  9 May 2008 14:15:50 +0000 (UTC)
	(envelope-from romain@blogreen.org)
Received: from smtp3-g19.free.fr (smtp3-g19.free.fr [212.27.42.29])
	by mx1.freebsd.org (Postfix) with ESMTP id D95A08FC24
	for <FreeBSD-gnats-submit@freebsd.org>;
	Fri,  9 May 2008 14:15:49 +0000 (UTC)
	(envelope-from romain@blogreen.org)
Received: from smtp3-g19.free.fr (localhost.localdomain [127.0.0.1])
	by smtp3-g19.free.fr (Postfix) with ESMTP id C560C17B546
	for <FreeBSD-gnats-submit@freebsd.org>;
	Fri,  9 May 2008 16:15:48 +0200 (CEST)
Received: from marvin.blogreen.org (marvin.blogreen.org [82.247.213.140])
	by smtp3-g19.free.fr (Postfix) with ESMTP id A7DF517B543
	for <FreeBSD-gnats-submit@freebsd.org>;
	Fri,  9 May 2008 16:15:48 +0200 (CEST)
Received: by marvin.blogreen.org (Postfix, from userid 1001)
	id 4B39E5C070; Fri,  9 May 2008 16:15:48 +0200 (CEST)
Message-Id: <20080509141548.4B39E5C070@marvin.blogreen.org>
Date: Fri,  9 May 2008 16:15:48 +0200 (CEST)
From: Romain Tartiere <romain@blogreen.org>
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc: 
Subject: bin/123553: [patch] Prevent indent(1) from splitting unrecognized
	tokens
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Romain Tartiere <romain@blogreen.org>
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs>
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 09 May 2008 14:20:04 -0000


>Number:         123553
>Category:       bin
>Synopsis:       [patch] Prevent indent(1) from splitting unrecognized tokens
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri May 09 14:20:04 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Romain Tartiere
>Release:        FreeBSD 7.0-STABLE i386
>Organization:
>Environment:
System: FreeBSD marvin.blogreen.org 7.0-STABLE FreeBSD 7.0-STABLE #14: Fri Apr 18 18:27:58 CEST 2008 root@marvin.blogreen.org:/usr/obj/usr/src/sys/MARVIN i386

>Description:

When indent(1) is used to reformat source code, unrecognized tokens such as "0b00101010" are split in two (e.g. "0 b00101010").

Such constructs are, however, valid with avr-gcc from the ports collection, and upcoming releases of gcc will support this binary notation [1].

References:
  1. As noticed by Frank Behrens: http://lists.freebsd.org/pipermail/freebsd-hackers/2008-April/024343.html

>How-To-Repeat:

% echo "int x = 0b00101010 ;" > foo.c
% avr-gcc -c foo.c
% indent foo.c
% avr-gcc -c foo.c
foo.c:1: error: expected ',' or ';' before 'b00101010'
% cat foo.c
int             x = 0 b00101010;

>Fix:

The following patch attempts to detect numbers in different bases and check that they are valid, while avoiding splitting tokens on unrecognized data:

--- lexi.c.diff begins here ---
--- /usr/src/usr.bin/indent/lexi.c	2005-11-20 14:48:15.000000000 +0100
+++ lexi.c	2008-04-27 15:09:21.000000000 +0200
@@ -121,6 +121,10 @@
     1, 1, 1, 0, 3, 0, 3, 0
 };
 
+enum base {
+	BASE_2, BASE_8, BASE_10, BASE_16
+};
+
 int
 lexi(void)
 {
@@ -158,16 +162,37 @@
 	    int         seendot = 0,
 	                seenexp = 0,
 			seensfx = 0;
-	    if (*buf_ptr == '0' &&
-		    (buf_ptr[1] == 'x' || buf_ptr[1] == 'X')) {
-		*e_token++ = *buf_ptr++;
-		*e_token++ = *buf_ptr++;
-		while (isxdigit(*buf_ptr)) {
+	    enum base	in_base = BASE_10;
+
+	    if (*buf_ptr == '0') {
+		if (buf_ptr[1] == 'b' || buf_ptr[1] == 'B')
+		    in_base = BASE_2;
+		else if (buf_ptr[1] == 'x' || buf_ptr[1] == 'X')
+		    in_base = BASE_16;
+		else
+		    in_base = BASE_8;
+	    }
+
+	    *e_token++ = *buf_ptr++;
+	    if (in_base == BASE_2 || in_base == BASE_16)
+		*e_token++ = *buf_ptr++;	/* Read the second character from
+						 * 0b... / 0x... expressions.
+						 */
+
+	    switch (in_base) {
+	    case BASE_2:
+		while (*buf_ptr == '0' || *buf_ptr == '1') {
 		    CHECK_SIZE_TOKEN;
 		    *e_token++ = *buf_ptr++;
 		}
-	    }
-	    else
+		break;
+	    case BASE_8:
+		while (*buf_ptr >= '0' && *buf_ptr <= '7') {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+		break;
+	    case BASE_10:
 		while (1) {
 		    if (*buf_ptr == '.') {
 			if (seendot)
@@ -209,6 +234,29 @@
 		}
 		break;
 	    }
+
+	    	break;
+	    case BASE_16:
+		while (isxdigit(*buf_ptr)) {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+	    	break;
+	    }
+	    if (isalnum(*buf_ptr)) {
+		char *buf;
+		/* current token is malformed */
+		if (asprintf(&buf, "Ignoring invalid numeric "
+		    "expression '%s%c...'", s_token, *buf_ptr) != -1) {
+		    diag2(0, buf);
+		    free(buf);
+		}
+		/* consume the rest of the current token */
+		while (isalnum(*buf_ptr)) {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+	    }
 	}
 	else
 	    while (chartype[(int)*buf_ptr] == alphanum || *buf_ptr == BACKSLASH) {
--- lexi.c.diff ends here ---
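
For readers who want to experiment with the idea outside of indent(1), the scanning rule the patch describes can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code from lexi.c: the names detect_base(), is_digit_in() and scan_number() are hypothetical, the input is a plain NUL-terminated string rather than indent's buf_ptr/e_token buffers, and the decimal branch is deliberately simplified (no '.', exponent or u/l/f suffix handling, which the real lexer keeps).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum base { BASE_2, BASE_8, BASE_10, BASE_16 };

/* Hypothetical helper: classify a literal by its prefix, as the patch does. */
enum base
detect_base(const char *s)
{
	if (s[0] != '0')
		return BASE_10;
	if (s[1] == 'b' || s[1] == 'B')
		return BASE_2;
	if (s[1] == 'x' || s[1] == 'X')
		return BASE_16;
	return BASE_8;
}

/* Hypothetical helper: is c a valid digit in base b? */
int
is_digit_in(enum base b, int c)
{
	switch (b) {
	case BASE_2:
		return c == '0' || c == '1';
	case BASE_8:
		return c >= '0' && c <= '7';
	case BASE_10:
		return isdigit(c) != 0;
	case BASE_16:
		return isxdigit(c) != 0;
	}
	return 0;
}

/*
 * Return the length of the numeric token that starts at s (s must begin
 * with a digit).  When valid digits are followed directly by other
 * alphanumeric characters, *malformed is set to 1 and those characters
 * are swallowed into the same token instead of starting a new one.
 * Simplification (assumption): floats, exponents and suffixes are ignored.
 */
size_t
scan_number(const char *s, int *malformed)
{
	enum base b;
	size_t n;

	assert(s != NULL && malformed != NULL);
	assert(isdigit((unsigned char)s[0]));

	b = detect_base(s);
	/* Skip the first digit, plus the 'b'/'x' marker when present. */
	n = (b == BASE_2 || b == BASE_16) ? 2 : 1;

	while (is_digit_in(b, (unsigned char)s[n]))
		n++;

	*malformed = 0;
	if (isalnum((unsigned char)s[n])) {
		*malformed = 1;
		while (isalnum((unsigned char)s[n]))
			n++;
	}
	return n;
}
```

With this sketch, scan_number("0b00101010 ;", &bad) covers all ten characters of the literal as one token, whereas an input such as "0b102;" is reported as malformed but still kept in one piece, which is the behaviour the patch aims for.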


>Release-Note:
>Audit-Trail:
>Unformatted: