From: Bruce Evans
Date: Wed, 29 Jun 2011 09:06:57 +1000 (EST)
To: Stefan Esser
Cc: Alexander Best, Poul-Henning Kamp, standards@FreeBSD.org, Bruce Evans
Subject: Re: [RFC] Consistent numeric range for "expr" on all architectures

[I changed developers to standards instead of removing it]

On Tue, 28 Jun 2011, Stefan Esser wrote:

> On 28.06.2011 13:02, Poul-Henning Kamp wrote:
>> In message <4E09AF8E.5010509@freebsd.org>, "Stefan Esser" writes:
>>
>>> Due to (false, according to BDE) considerations for POSIX compliance,
>>> the 64bit code was made conditional on a command line option in 2002.
>>
>> I think 64bit is the wrong thing to focus on, shouldn't it be
>> "intmax_t" so we will not have to revisit this again ?
>
> Well, actually it already *is* intmax_t, which happens to be 64bit
> on all architectures I checked ;-)
>
> My proposal is just to not produce overflows when easily avoidable.
> This takes little effort, simplifies the code and makes scripts more
> portable accross architectures.
>
> Are there any supported architectures with intmax_t smaller than 64bit?

There cannot be, since C99 requires long long to be at least 64 bits
(counting the sign bit) and it requires intmax_t to be capable of
representing any value of any signed integer type.

While checking this, I noticed that:
- preprocessor arithmetic is done using intmax_t or uintmax_t.  This
  causes portability problems similar to those for expr.  Expressions
  like ULONG_MAX + ULONG_MAX suddenly started in 1999 giving twice
  ULONG_MAX instead of ULONG_MAX-1, but only on arches where
  ULONG_MAX < UINTMAX_MAX.  (I use unsigned values in this example to
  give defined behaviour on overflow, so that the expression
  ULONG_MAX + ULONG_MAX is not just a bug.  expr doesn't have this
  complication.)
- C99 doesn't require intmax_t to be the logically longest type.  Thus
  it permits FreeBSD's rather bizarre implementation of intmax_t being
  plain long, which is logically shorter than long long.

Other points:
- `expr -e 10000000000000000000 + 0' (19 zeros) gives "Result too
  large", but it is the arg that is too large, not the result.  This
  message is strerror(ERANGE) after strtoimax() sets errno to ERANGE.
  `expr -e 1000000000000000000 \* 10' gives "overflow".  This message
  is correct, but it is in a different style to strerror()
  (uncapitalized, and more concise).
- `expr 10000000000000000000' (19 or even 119 zeros) gives no error.
  It is documented that the arg is parsed as a string in this case,
  and the documentation for -e doesn't clearly say that -e changes
  this.  And -e doesn't change this if the arg clearly isn't a number
  (e.g., if it is 10000000000000000000mumble), or even if it is a
  non-decimal number (e.g., if it is 010, 0x10 or 10.0).  If the arg
  isn't a decimal integer, then (except for -e on decimal integers)
  there is an error irrespective of -e when arithmetic is attempted
  (e.g., adding 0).  The error message for this bogusly says
  "non-numeric argument" when the arg is numeric but not a decimal
  integer.
- POSIX requires brokenness for bases other than 10, but I wonder if
  an arg like 0x10 invokes undefined behaviour and thus can be made to
  work.  (I wanted to use a hex number since I can never remember what
  INTMAX_MAX is in decimal and wanted to type it in hex for checking
  the range and overflow errors.)  Allowing hex args causes fewer
  problems than allowing decimal args larger than INT32_MAX, since
  they are obviously unportable.  Some FreeBSD utilities, e.g., dd,
  support hex args and don't worry about POSIX restricting them.
- POSIX unfortunately requires args larger than INT32_MAX to be
  unportable (to work if longs are longer than 32 bits, else to give
  undefined (?) behaviour).  For portability there could be a -p
  switch that limits args to INT32_MAX even if longs are longer than
  32 bits.
- I hope POSIX doesn't require benign overflow.  Thus treating all
  overflows as errors is good for portability and doesn't require any
  switch.

Bruce
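
The two error cases discussed above (an arg that is out of range for
intmax_t versus a result that overflows) can be told apart in C roughly
as follows.  This is only a minimal sketch under stated assumptions, not
the code in FreeBSD's expr: the helper names parse_decimal(),
add_overflows() and mul_overflows() are hypothetical, treating a leading
zero as plain decimal is an assumption, and the examples assume a 64-bit
intmax_t.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse a plain decimal integer arg into *out.
 * Returns 0 on success, EINVAL if the arg is not a decimal integer
 * (e.g. "0x10", "10.0", "10mumble"), or ERANGE if the arg itself does
 * not fit in intmax_t.  Leading zeros are accepted as decimal here,
 * which is an assumption of this sketch.
 */
static int
parse_decimal(const char *s, intmax_t *out)
{
	const char *p = s;

	assert(s != NULL && out != NULL);
	if (*p == '-')
		p++;
	if (*p == '\0')
		return (EINVAL);
	for (; *p != '\0'; p++)
		if (*p < '0' || *p > '9')
			return (EINVAL);
	errno = 0;
	*out = strtoimax(s, NULL, 10);
	if (errno == ERANGE)
		return (ERANGE);	/* the arg is too large, not the result */
	return (0);
}

/* True if a + b cannot be represented in intmax_t. */
static bool
add_overflows(intmax_t a, intmax_t b)
{
	return ((b > 0 && a > INTMAX_MAX - b) ||
	    (b < 0 && a < INTMAX_MIN - b));
}

/* True if a * b cannot be represented in intmax_t. */
static bool
mul_overflows(intmax_t a, intmax_t b)
{
	if (a == 0 || b == 0)
		return (false);
	if (a > 0)
		return (b > 0 ? a > INTMAX_MAX / b : b < INTMAX_MIN / a);
	return (b > 0 ? a < INTMAX_MIN / b : a < INTMAX_MAX / b);
}
```

With these, a caller could report a range error for the arg when
parse_decimal() returns ERANGE, and an overflow error only when one of
the *_overflows() checks fires, so neither case needs a switch.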