Explicit Input with `getline'
=============================
So far we have been getting our input data from `awk''s main input
stream--either the standard input (usually your terminal) or the files
specified on the command line. The `awk' language has a special
built-in command called `getline' that can be used to read input under
your explicit control.
This command is quite complex and should *not* be used by beginners.
It is covered here because this is the chapter on input. The examples
that follow the explanation of the `getline' command include material
that has not been covered yet. Therefore, come back and study the
`getline' command *after* you have reviewed the rest of this manual and
have a good knowledge of how `awk' works.
`getline' returns 1 if it finds a record, and 0 if the end of the
file is encountered. If there is some error in getting a record, such
as a file that cannot be opened, then `getline' returns -1. In this
case, `gawk' sets the variable `ERRNO' to a string describing the error
that occurred.
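
As a minimal sketch of how a program might act on these three return values, the following reads from a file that is assumed not to exist and branches on the result. The path `/nonexistent/input.txt' and the variable names `rc' and `line' are illustrative only.

```shell
# The path below is a hypothetical file that is assumed not to exist.
status=$(awk 'BEGIN {
    file = "/nonexistent/input.txt"
    rc = (getline line < file)      # 1: record read, 0: end of file, -1: error
    if (rc > 0)
        print "got a record"
    else if (rc == 0)
        print "end of file"
    else
        print "error"
}')
echo "$status"
```

Comparing the result with `> 0' in a loop condition, as the later examples do, stops the loop on both end of file and error.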
In the following examples, COMMAND stands for a string value that
represents a shell command.
`getline'
The `getline' command can be used without arguments to read input
from the current input file. All it does in this case is read the
next input record and split it up into fields. This is useful if
you've finished processing the current record, but you want to do
some special processing *right now* on the next record. Here's an
example:
    awk '{
        if (t = index($0, "/*")) {
            if (t > 1)
                tmp = substr($0, 1, t - 1)
            else
                tmp = ""
            u = index(substr($0, t + 2), "*/")
            while (u == 0) {
                getline
                t = -1
                u = index($0, "*/")
            }
            if (u <= length($0) - 2)
                $0 = tmp substr($0, t + u + 3)
            else
                $0 = tmp
        }
        print $0
    }'
This `awk' program deletes all C-style comments, `/* ... */',
from the input. By replacing the `print $0' with other
statements, you could perform more complicated processing on the
decommented input, like searching for matches of a regular
expression. (This program has a subtle problem--can you spot it?)
This form of the `getline' command sets `NF' (the number of
fields; see Examining Fields), `NR' (the number of records read
so far; see How Input is Split into Records), `FNR' (the number
of records read from this input file), and the value of `$0'.
*Note:* the new value of `$0' is used in testing the patterns of
any subsequent rules. The original value of `$0' that triggered
the rule which executed `getline' is lost. By contrast, the
`next' statement reads a new record but immediately begins
processing it normally, starting with the first rule in the
program. See The `next' Statement.
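
A smaller sketch of the same form, assuming a hypothetical input in which a label line is followed by its value on the next line. The rule matches the label, and the bare `getline' then replaces `$0' with the following record and advances `NR':

```shell
# Hypothetical input: a label line followed by its value on the next line.
result=$(printf 'name:\nalice\nage:\n30\n' | awk '
$0 == "name:" {
    # Read the record after the label; this replaces $0 and bumps NR.
    if ((getline) > 0)
        print "name=" $1 " NR=" NR
}')
echo "$result"
```

The label record itself is gone once `getline' returns, which is why the rule prints `$1' of the new record rather than anything saved from the old one.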
`getline VAR'
This form of `getline' reads a record into the variable VAR. This
is useful when you want your program to read the next record from
the current input file, but you don't want to subject the record
to the normal input processing.
For example, suppose the next line is a comment, or a special
string, and you want to read it, but you must make certain that it
won't trigger any rules. This version of `getline' allows you to
read that line and store it in a variable so that the main
read-a-line-and-check-each-rule loop of `awk' never sees it.
The following example swaps every two lines of input. For
example, given:
    wan
    tew
    free
    phore
it outputs:
    tew
    wan
    phore
    free
Here's the program:
    awk '{
        if ((getline tmp) > 0) {
            print tmp
            print $0
        } else
            print $0
    }'
The `getline' function used in this way sets only the variables
`NR' and `FNR' (and of course, VAR). The record is not split into
fields, so the values of the fields (including `$0') and the value
of `NF' do not change.
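
The effect on the variables can be checked with a short sketch. The two-line input and the variable name `nxt' are assumptions chosen for illustration:

```shell
# Read the second record into nxt while the first record stays in $0.
result=$(printf 'a b\nc d e\n' | awk '
NR == 1 {
    if ((getline nxt) > 0)
        print $0 "|" NF "|" nxt "|" NR
}')
echo "$result"
```

Here `$0' and `NF' still describe the first record, `nxt' holds the second, and `NR' has advanced to 2.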
`getline < FILE'
This form of the `getline' function takes its input from the file
FILE. Here FILE is a string-valued expression that specifies the
file name. `< FILE' is called a "redirection" since it directs
input to come from a different place.
This form is useful if you want to read your input from a
particular file, instead of from the main input stream. For
example, the following program reads its input record from the
file `foo.input' when it encounters a first field with a value
equal to 10 in the current input file.
    awk '{
        if ($1 == 10) {
            getline < "foo.input"
            print
        } else
            print
    }'
Since the main input stream is not used, the values of `NR' and
`FNR' are not changed. But the record read is split into fields in
the normal manner, so the values of `$0' and other fields are
changed. So is the value of `NF'.
This does not cause the record to be tested against all the
patterns in the `awk' program, in the way that would happen if the
record were read normally by the main processing loop of `awk'.
However, the new record is tested against any subsequent rules,
just as when `getline' is used without a redirection.
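
The following sketch illustrates which variables this form touches. It assumes the `mktemp' utility is available and uses a temporary file as a stand-in for the side file; the variable names `extra' and `f' are illustrative.

```shell
extra=$(mktemp)                 # temporary stand-in for a side file
printf 'x y z\n' > "$extra"
result=$(printf 'one\n' | awk -v f="$extra" '{
    # Replace the current record with the first record of the side file.
    if ((getline < f) > 0)
        print $0 "|" NF "|" NR
}')
rm -f "$extra"
echo "$result"
```

The record and field count come from the side file, while `NR' still counts only the single record read from the main input.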
`getline VAR < FILE'
This form of the `getline' function takes its input from the file
FILE and puts it in the variable VAR. As above, FILE is a
string-valued expression that specifies the file from which to
read.
In this version of `getline', none of the built-in variables are
changed, and the record is not split into fields. The only
variable changed is VAR.
For example, the following program copies all the input files to
the output, except for records that say `@include FILENAME'. Such
a record is replaced by the contents of the file FILENAME.
    awk '{
        if (NF == 2 && $1 == "@include") {
            while ((getline line < $2) > 0)
                print line
            close($2)
        } else
            print
    }'
Note here how the name of the extra input file is not built into
the program; it is taken from the data, from the second field on
the `@include' line.
The `close' function is called to ensure that if two identical
`@include' lines appear in the input, the entire specified file is
included twice. See Closing Input Files and Pipes.
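
To see why the `close' call matters, consider this sketch, which reads the same file three times and counts the records seen on each pass. It assumes `mktemp' is available; the counter names are illustrative.

```shell
side=$(mktemp)
printf 'l1\nl2\n' > "$side"
counts=$(awk -v f="$side" 'BEGIN {
    while ((getline line < f) > 0) first++
    while ((getline line < f) > 0) second++   # still at end of file
    close(f)
    while ((getline line < f) > 0) third++    # reopened from the top
    print first, second + 0, third
}')
rm -f "$side"
echo "$counts"
```

The second pass sees no records because the file is still open and positioned at its end; only after `close' does reading start again from the first record.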
One deficiency of this program is that it does not process nested
`@include' statements the way a true macro preprocessor would.
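
One possible way to handle nesting, offered only as a sketch, is to move the reading loop into a recursive user-defined function. The function name `include' and the temporary files are assumptions made for this illustration; file names are assumed to contain no whitespace, and a file that includes itself would recurse without end.

```shell
dir=$(mktemp -d)
printf 'inner line\n' > "$dir/inner.txt"
printf 'outer start\n@include %s\nouter end\n' "$dir/inner.txt" > "$dir/outer.txt"
result=$(printf 'top\n@include %s\nbottom\n' "$dir/outer.txt" | awk '
# include() is a hypothetical helper, not part of awk itself.
function include(file,    line, parts) {
    while ((getline line < file) > 0) {
        if (split(line, parts) == 2 && parts[1] == "@include")
            include(parts[2])      # recurse into the nested file
        else
            print line
    }
    close(file)
}
{
    if (NF == 2 && $1 == "@include")
        include($2)
    else
        print
}')
rm -rf "$dir"
echo "$result"
```

Because `line' and `parts' are declared as extra parameters, each level of recursion gets its own copies, so an inner file does not disturb the line being processed by the outer one.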
`COMMAND | getline'
You can "pipe" the output of a command into `getline'. A pipe is
simply a way to link the output of one program to the input of
another. In this case, the string COMMAND is run as a shell
command and its output is piped into `awk' to be used as input.
This form of `getline' reads one record from the pipe.
For example, the following program copies input to output, except
for lines that begin with `@execute', which are replaced by the
output produced by running the rest of the line as a shell command:
    awk '{
        if ($1 == "@execute") {
            tmp = substr($0, 10)
            while ((tmp | getline) > 0)
                print
            close(tmp)
        } else
            print
    }'
The `close' function is called to ensure that if two identical
`@execute' lines appear in the input, the command is run for each
one. See Closing Input Files and Pipes.
Given the input:
    foo
    bar
    baz
    @execute who
    bletch
the program might produce:
    foo
    bar
    baz
    hack ttyv0 Jul 13 14:22
    hack ttyp0 Jul 13 14:23 (gnu:0)
    hack ttyp1 Jul 13 14:23 (gnu:0)
    hack ttyp2 Jul 13 14:23 (gnu:0)
    hack ttyp3 Jul 13 14:23 (gnu:0)
    bletch
Notice that this program ran the command `who' and printed the
result. (If you try this program yourself, you will get different
results, showing you who is logged in on your system.)
This variation of `getline' splits the record into fields, sets the
value of `NF' and recomputes the value of `$0'. The values of
`NR' and `FNR' are not changed.
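
A short sketch of the field splitting, using `echo' as a stand-in command so that the output is predictable (the command string is an assumption for illustration):

```shell
result=$(awk 'BEGIN {
    cmd = "echo alpha beta gamma"
    # The record read from the pipe becomes $0 and is split into fields.
    if ((cmd | getline) > 0)
        print NF "|" $2
    close(cmd)
}')
echo "$result"
```

Keeping the command in a variable makes it easy to pass exactly the same string to `close'.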
`COMMAND | getline VAR'
The output of the command COMMAND is sent through a pipe to
`getline' and into the variable VAR. For example, the following
program reads the current date and time into the variable
`current_time', using the `date' utility, and then prints it.
    awk 'BEGIN {
        "date" | getline current_time
        close("date")
        print "Report printed on " current_time
    }'
In this version of `getline', none of the built-in variables are
changed, and the record is not split into fields.
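
When a command produces more than one line, this form is typically used in a loop. The sketch below assumes a stand-in command made of two `echo' calls and joins its output lines with commas; the names `out', `sep', and `line' are illustrative.

```shell
result=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    # Read every line the command prints, one record per iteration.
    while ((cmd | getline line) > 0) {
        out = out sep line
        sep = ","
    }
    close(cmd)
    print out
}')
echo "$result"
```

The loop ends when `getline' returns 0 at the end of the command's output, or -1 if the command could not be read.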