Date: Wed, 6 Oct 2004 03:10:30 GMT
Message-Id: <200410060310.i963AUOh064833@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Giorgos Keramidas
Subject: Re: bin/72370: awk in -current dumps core

The following reply was made to PR bin/72370; it has been noted by GNATS.

From: Giorgos Keramidas
To: Joseph Koshy
Cc: "David O'Brien", bug-followup@freebsd.org
Subject: Re: bin/72370: awk in -current dumps core
Date: Wed, 6 Oct 2004 06:06:26 +0300

On 2004-10-06 02:18, Joseph Koshy wrote:
> awk in 5-current dumps core if asked to dereference a positional
> parameter at a large positive index.  There also seems to be numeric
> overflow occurring behind the scenes.  The following examples show the
> difference between GNU awk in 4-STABLE and the awk in 5-current.

Others have reported awk allocating huge amounts of memory if a program
references a variable with a huge index, which seems to be related to
this.

> $ echo | /4/usr/bin/awk '{ x = 2147483648; print $x }'
> awk: cmd. line:1: (FILENAME=- FNR=1) fatal: attempt to access field -2147483648

Looking at the sources of contrib/one-true-awk I can see several places
where an overflow/truncation of values can occur.  One example is the
code of indirect() in run.c, which calls getfval() with:

: Awkfloat getfval(Cell *);
:
: Cell *indirect(Node **a, int n) /* $( a[0] ) */
: {
:         Cell *x;
:         int m;
:
:         m = (int) getfval(x);
:         ...

There is no guarantee that a plain `int' can hold all the values of an
Awkfloat, so here's truncation waiting to happen.

The excessive memory allocation is probably caused by the code in lib.c
which, in the body of the fldbld() function, fails to check the field
counter for overflow; a plain `int' again:

: void fldbld(void)       /* create fields from current record */
: {
:         ...
:         int i, j, n;
:         ...
:         for (i = 0; ; ) {
:                 ...
:                 i++;
:                 if (i > nfields)
:                         growfldtab(i);

There's no check for an overflow of `i' here, so all sorts of funny
things can happen if one asks for a large field number.

What you see below:

> $ echo | /4/usr/bin/awk '{ x = 2147483647; print $x }'
> *blank line*
> $ echo | /5/usr/bin/awk '{ x = 2147483648; print $x }'
> /5/usr/bin/awk: trying to access field -2147483648
>  input record number 1, file
>  source line number 1

is a result of the fieldadr() function in lib.c, which does:

: Cell *fieldadr(int n)   /* get nth field */
: {
:         if (n < 0)
:                 FATAL("trying to access field %d", n);
:         if (n > nfields)        /* fields after NF are empty */
:                 growfldtab(n);  /* but does not increase NF */
:         return(fldtab[n]);
: }

so negative field numbers are a fatal error, but field numbers greater
than the number of existing fields silently grow the field table and
come back as empty strings.

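To make the (int) cast problem concrete, here is a minimal sketch of a
range check that could sit between getfval() and fieldadr().  It is not
code from one-true-awk: the helper name fval_to_index() is made up for
illustration, and the typedef assumes Awkfloat is a double.  Note that in
C, converting a floating-point value whose integral part does not fit in
an int is undefined behaviour, so the check has to happen before the
cast, not after it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: Awkfloat is a double, as awk.h is believed to define it. */
typedef double Awkfloat;

/*
 * Hypothetical helper (not part of one-true-awk): store f as a field
 * index in *out only if it fits in a non-negative int.
 * Returns 0 on success, -1 if f is negative, too large, or NaN.
 */
static int
fval_to_index(Awkfloat f, int *out)
{
	assert(out != NULL);

	/*
	 * Written as a negated "in range" test so that NaN, for which
	 * every comparison is false, is rejected as well.
	 */
	if (!(f >= 0.0 && f <= (Awkfloat)INT_MAX))
		return -1;

	*out = (int)f;	/* safe: 0 <= f <= INT_MAX, fraction is dropped */
	return 0;
}
```

A caller along the lines of indirect() could then report a fatal error
when the helper returns -1 instead of passing a wrapped-around value to
fieldadr().  This only addresses the conversion; it does not by itself
limit how far growfldtab() is asked to grow the field table.
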
David O'Brien is the one who imported this version of awk into our tree,
so he's the right person to decide whether we can change one-true-awk to
fix these problems or should do something else, and what that
'something else' should be.