From owner-freebsd-bugs@FreeBSD.ORG Fri May  9 14:20:04 2008
Delivered-To: freebsd-bugs@hub.freebsd.org
Resent-Date: Fri, 9 May 2008 14:20:04 GMT
Resent-Message-Id: <200805091420.m49EK44w060017@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Romain Tartiere
Message-Id: <20080509141548.4B39E5C070@marvin.blogreen.org>
Date: Fri, 9 May 2008 16:15:48 +0200 (CEST)
From: Romain Tartiere
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Subject: bin/123553: [patch] Prevent indent(1) from splitting unrecognized tokens
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Romain Tartiere
List-Id: Bug reports
X-List-Received-Date: Fri, 09 May 2008 14:20:04 -0000

>Number:         123553
>Category:       bin
>Synopsis:       [patch] Prevent indent(1) from splitting unrecognized tokens
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri May 09 14:20:04 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Romain Tartiere
>Release:        FreeBSD 7.0-STABLE i386
>Organization:
>Environment:
System: FreeBSD marvin.blogreen.org 7.0-STABLE FreeBSD 7.0-STABLE #14: Fri Apr 18 18:27:58 CEST 2008 root@marvin.blogreen.org:/usr/obj/usr/src/sys/MARVIN i386

>Description:
When indent(1) is used to indent source code, tokens it does not
recognize, such as "0b00101010", are split in two (e.g. "0 b00101010").
Such constants are, however, valid for avr-gcc from the ports, and
upcoming releases of gcc will support this binary notation [1].

References:
  1. As noticed by Frank Behrens:
     http://lists.freebsd.org/pipermail/freebsd-hackers/2008-April/024343.html

>How-To-Repeat:
% echo "int x = 0b00101010 ;" > foo.c
% avr-gcc -c foo.c
% indent foo.c
% avr-gcc -c foo.c
foo.c:1: error: expected ',' or ';' before 'b00101010'
% cat foo.c
int x = 0 b00101010;

>Fix:
The following patch attempts to detect numbers in different bases and
checks that each one is valid for its base, while no longer splitting
tokens on unrecognized data:

--- lexi.c.diff begins here ---
--- /usr/src/usr.bin/indent/lexi.c	2005-11-20 14:48:15.000000000 +0100
+++ lexi.c	2008-04-27 15:09:21.000000000 +0200
@@ -121,6 +121,10 @@
     1, 1, 1, 0, 3, 0, 3, 0
 };
 
+enum base {
+	BASE_2, BASE_8, BASE_10, BASE_16
+};
+
 int
 lexi(void)
 {
@@ -158,16 +162,37 @@
 	    int         seendot = 0,
 	                seenexp = 0,
 			seensfx = 0;
-	    if (*buf_ptr == '0' &&
-		    (buf_ptr[1] == 'x' || buf_ptr[1] == 'X')) {
-		*e_token++ = *buf_ptr++;
-		*e_token++ = *buf_ptr++;
-		while (isxdigit(*buf_ptr)) {
+	    enum base	in_base = BASE_10;
+
+	    if (*buf_ptr == '0') {
+		if (buf_ptr[1] == 'b' || buf_ptr[1] == 'B')
+		    in_base = BASE_2;
+		else if (buf_ptr[1] == 'x' || buf_ptr[1] == 'X')
+		    in_base = BASE_16;
+		else
+		    in_base = BASE_8;
+	    }
+
+	    *e_token++ = *buf_ptr++;
+	    if (in_base == BASE_2 || in_base == BASE_16)
+		*e_token++ = *buf_ptr++;	/* Read the second character from
+						 * 0b... / 0x... expressions.
+						 */
+
+	    switch (in_base) {
+	    case BASE_2:
+		while (*buf_ptr == '0' || *buf_ptr == '1') {
 		    CHECK_SIZE_TOKEN;
 		    *e_token++ = *buf_ptr++;
 		}
-	    }
-	    else
+		break;
+	    case BASE_8:
+		while (*buf_ptr >= '0' && *buf_ptr <= '7') {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+		break;
+	    case BASE_10:
 		while (1) {
 		    if (*buf_ptr == '.') {
 			if (seendot)
@@ -209,6 +234,29 @@
 		    }
 		    break;
 		}
+
+		break;
+	    case BASE_16:
+		while (isxdigit(*buf_ptr)) {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+		break;
+	    }
+	    if (isalnum(*buf_ptr)) {
+		char *buf;
+		/* current token is malformed */
+		if (asprintf(&buf, "Ignoring invalid numeric "
+			    "expression '%s%c...'", s_token, *buf_ptr) != -1) {
+		    diag2(0, buf);
+		    free(buf);
+		}
+		/* consume the rest of the current token */
+		while (isalnum(*buf_ptr)) {
+		    CHECK_SIZE_TOKEN;
+		    *e_token++ = *buf_ptr++;
+		}
+	    }
 	}
 	else
 	    while (chartype[(int)*buf_ptr] == alphanum || *buf_ptr == BACKSLASH) {
--- lexi.c.diff ends here ---

>Release-Note:
>Audit-Trail:
>Unformatted:
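
For illustration only, the scanning strategy used by the patch (pick a
base from the prefix, consume the digits valid in that base, then
swallow any trailing alphanumerics so the token stays in one piece) can
be sketched as a standalone function outside of indent(1). The names
scan_integer, is_base_digit and enum num_base are hypothetical and do
not exist in lexi.c. Floating-point constants and integer suffixes,
which the BASE_10 branch of the real lexer handles, are deliberately
left out, so this is a simplified sketch rather than the code from the
patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for this sketch; none of them exist in indent(1). */
enum num_base { NUM_BASE_2, NUM_BASE_8, NUM_BASE_10, NUM_BASE_16 };

/* Return nonzero if c is a valid digit in the given base. */
static int
is_base_digit(enum num_base base, int c)
{
	switch (base) {
	case NUM_BASE_2:
		return (c == '0' || c == '1');
	case NUM_BASE_8:
		return (c >= '0' && c <= '7');
	case NUM_BASE_16:
		return (isxdigit(c) != 0);
	default:
		return (isdigit(c) != 0);
	}
}

/*
 * Copy one integer-like token from src into out (NUL-terminated and
 * truncated to fit outsz) and return the number of characters consumed
 * from src.  *malformed is set to 1 when characters that are not valid
 * in the detected base had to be swallowed to keep the token whole.
 * Simplification: no floating-point or suffix handling.
 */
static size_t
scan_integer(const char *src, char *out, size_t outsz, int *malformed)
{
	enum num_base base = NUM_BASE_10;
	size_t n = 0;		/* characters consumed from src */
	size_t len = 0;		/* characters stored in out */
	size_t prefix;

	assert(outsz > 0);
	*malformed = 0;
	out[0] = '\0';
	if (!isdigit((unsigned char)src[0]))
		return (0);

	if (src[0] == '0') {
		if (src[1] == 'b' || src[1] == 'B')
			base = NUM_BASE_2;
		else if (src[1] == 'x' || src[1] == 'X')
			base = NUM_BASE_16;
		else
			base = NUM_BASE_8;
	}
	/* "0b" and "0x" take two prefix characters, everything else one. */
	prefix = (base == NUM_BASE_2 || base == NUM_BASE_16) ? 2 : 1;

	while (n < prefix || is_base_digit(base, (unsigned char)src[n])) {
		if (len + 1 < outsz)
			out[len++] = src[n];
		n++;
	}
	if (isalnum((unsigned char)src[n])) {
		/* Malformed for this base: keep eating, do not split. */
		*malformed = 1;
		while (isalnum((unsigned char)src[n])) {
			if (len + 1 < outsz)
				out[len++] = src[n];
			n++;
		}
	}
	out[len] = '\0';
	return (n);
}
```

With the input "0b00101010 ;" the sketch consumes 10 characters and
stores "0b00101010" without flagging anything. With "0789" it stores
the whole token "0789" and sets *malformed, which corresponds to the
case where the patch emits its "Ignoring invalid numeric expression"
diagnostic instead of splitting the token.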