Date: Fri, 1 Jul 2011 11:09:36 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Stefan Esser <se@freebsd.org> Cc: Alexander Best <arundel@freebsd.org>, Poul-Henning Kamp <phk@phk.freebsd.dk>, standards@freebsd.org, Bruce Evans <bde@freebsd.org> Subject: Re: RESENT with patch ... Re: [RFC] Consistent numeric range for "expr" on all architectures Message-ID: <20110701102746.I1335@besplex.bde.org> In-Reply-To: <4E0CAE8B.6030309@freebsd.org> References: <99048.1309258976@critter.freebsd.dk> <4E0A0774.3090004@freebsd.org> <20110629082103.O1084@besplex.bde.org> <4E0B1C47.4010201@freebsd.org> <20110630073705.P1117@besplex.bde.org> <4E0C2F41.2060302@freebsd.org> <4E0CA55E.2090102@freebsd.org> <20110630165050.GB82980@zim.MIT.EDU> <4E0CAE8B.6030309@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 30 Jun 2011, Stefan Esser wrote: > Am 30.06.2011 18:50, schrieb David Schultz: >> On Thu, Jun 30, 2011, Stefan Esser wrote: >>> int >>> +is_integer(const char *s) >>> +{ >>> + if (*s == '-') >>> + s++; >>> + while (isdigit(*s)) >>> + s++; >>> + return *s == '\0'; >>> +} >> >> I only glanced at the patch for a few seconds, but your >> is_integer() routine will accept a bare minus sign ("-") as an >> integer. It also invokes undefined behaviour by applying isdigit() to a value that is not necessarily representable as an unsigned char or equal to the value EOF, and gives wrong behaviour if there is a digit whose value as a plain char promotes to an int with the value EOF (perhaps the latter can't happen). strto*() is more careful. It also has a style bug (missing silly parentheses around return value). Does you current version use only the above to classify integers, so that leading whitespace is not permitted? "expr [-e] ' 2' + ' 2'" now gives 4 since strtoimax() ignores the leading whitespace, but I think the syntax doesn't allow this. This is another example of why expr's syntax is broken as designed. Shell arithmetic seems to ignore leading whitespace. This might be just because the whitespace is kept when a variable is expanded, but has no significance in shell expressions. Trailing whitespace is already disallowed in expr. This is presumably just because strto*() stops parsing digits at the whitespace and then the check for trailing garbage treats even whitespace as garbage. Shell arithmetic seems to ignore trailing whitespace. > I mentioned in the message, that I do not test for empty numeric > operands. I had considered: This seems to be another bug in old expr. It gives 4 for "expr [-e] 2 + '' + 2". > int > is_integer(const char *s) > { > if (*s == '-') > s++; > if (!isdigit(*s++) > return 0; > while (isdigit(*s)) > s++; > return *s == '\0'; > } > > But this will break script that do > > while : > do > i=`expr "$i" + 1` > done > > without first setting i=0. And I assume, such scripts exist in huge numbers. Eeek, Doesn't the syntax require them to break? The resulting expression is similar to unary plus, which is useful for similar things. > But "-" will not be accepted as operand (neither integer nor string). > > Didn't I give the example "expr -- - : -" which fails with a syntax > error? > > Therefore, I do not need to check for actual digits in the input (and > thus may accept "" as 0 for backwards compatibility), but "-" will not > be allowed as parameter to expr by the parser ("syntax error"). I tried a Linux expr (from a FreeBSD package in 2002): the following are all syntax errors ("non-numeric argument"): '+1' + 1 (unary plus) ' 1' + 1 (leading whitespace) '1 ' + 1 (trailing whitespace) '' + 1 (no digits) The following wrap silently, but not as signed long (this is on i386, but int64_t is apparently used): 1111111111111111111 + 1111111111111111111 -> 2222222222222222222 9223372036854775806 + 1 -> 9223372036854775807 9223372036854775806 + 2 -> -9223372036854775808 We have the compat mode for the broken scripts, but since they would have broken in Linux in 2002, hopefully there aren't any. Someone should check a current Linux expr. I guess it hasn't changed for unary plus and whitepsace, since it was perfectly broken to POSIX spec in 2002 and the spec hasn't changed. It might have changed the behaviour on overflow (say to make it nonsilent), since the spec doesn't require any particular brokenness then. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110701102746.I1335>