Input/Output to an external program from within awk
Reading output from an external program
Awk can read the output of an external program by piping a command string into getline. The example here uses only the BEGIN block to keep the discussion simple.
read_ext1.awk
#!/usr/bin/awk -f
BEGIN {
    # Run the command and read its standard output one line at a time
    while ("lsb_release -a" | getline) {
        print "awk:", $0
    }
}
$ ./read_ext1.awk
No LSB modules are available.
awk: Distributor ID: Ubuntu
awk: Description: Ubuntu 18.04.4 LTS
awk: Release: 18.04
awk: Codename: bionic
Note
The line "No LSB modules are available." was written to /dev/stderr by lsb_release, so awk never got to read it in the first place.
Warning
Keep in mind that getline reads one line and stores it in $0, replacing the record that awk read from its regular input. To avoid this, use getline variablename to store the line in a separate variable.
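The difference can be seen in a minimal sketch like the one here. The file name getline_var.awk, the echo command, and the two input lines are all made up for illustration.

```shell
cat > getline_var.awk <<'EOF'
#!/usr/bin/awk -f
# Hypothetical demo: read a command's output into a variable
# so that the current input record in $0 is left untouched.
{
    cmd = "echo external"
    cmd | getline line      # stores the line in "line"; $0 is unchanged
    close(cmd)              # close so the command runs again for the next record
    print "record:", $0, "| from command:", line
}
EOF

printf 'first\nsecond\n' | awk -f getline_var.awk
```

Each output line should still show the original record (first, then second) next to the text read from the command, which would not be the case with a plain cmd | getline.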
Info
Awk can also read a line directly from another file, instead of the one it is currently processing, with getline < filename.
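A small sketch of reading a side file this way, assuming a hypothetical file extra.txt created just for the demo. Note the explicit > 0 test: getline returns -1 when the file cannot be opened, and a bare while (getline ...) would then loop forever.

```shell
printf 'alpha\nbeta\n' > extra.txt    # hypothetical side file for the demo

cat > getline_file.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    file = "extra.txt"
    # getline returns 1 for a line, 0 at end of file, -1 on error
    while ((getline line < file) > 0) {
        n++
        print n ": " line
    }
    close(file)
}
EOF

awk -f getline_file.awk
```

This should print the two lines of extra.txt, numbered 1 and 2.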
This second variant produces the same result, but also illustrates the use of close().
read_ext2.awk
#!/usr/bin/awk -f
BEGIN {
    cmd = "lsb_release -a"
    while (cmd | getline) {
        print "awk:", $0
    }
    close(cmd)    # close the pipe once all output has been read
}
Question
What happens if you try to read the output a second time without closing?
What if we want only the value bionic from the Codename line? (Ignore for now that you can request the codename directly with lsb_release -c.)
This version prints only what we need.
read_ext3.awk
#!/usr/bin/awk -f
BEGIN {
    cmd = "lsb_release -a"
    while (cmd | getline) {
        # getline has split the line into fields; keep only the codename
        if ($1 == "Codename:") print $2
    }
    close(cmd)
}
Note
You need to redirect standard error to get clean output.
$ ./read_ext3.awk 2> /dev/null
bionic
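As an alternative, the redirection can be placed inside the command string, since awk hands the string to a shell. The sketch here uses a stand-in command (two echo calls, an assumption made so the example runs without lsb_release) that writes one line to standard error and one to standard output.

```shell
cat > stderr_in_cmd.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    # Stand-in for "lsb_release -a": one line on stderr, one on stdout.
    # The trailing 2>/dev/null is interpreted by the shell that awk starts.
    cmd = "{ echo 'No LSB modules are available.' >&2; echo 'Codename: bionic'; } 2>/dev/null"
    while ((cmd | getline) > 0) {
        if ($1 == "Codename:") print $2
    }
    close(cmd)
}
EOF

awk -f stderr_in_cmd.awk
```

With the real command, the same idea would read cmd = "lsb_release -a 2>/dev/null", and the script could then be run without any redirection on the command line.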
Sending data to external program (and reading the output)
These examples are perhaps not the best use case, but they illustrate how awk can send data to the standard input of an external program and read the output back for use in the script. Awk has no function for the greatest common divisor, but Python does (math.gcd), and we can use it by sending commands directly to Python.
gcd1.awk
$ ./gcd1.awk
4
This simply sends the commands to Python, and Python's output is printed straight to standard output without passing through awk.
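A minimal sketch of what such a script might look like. The file name gcd1_sketch.awk, the use of python3, and the sample arguments 12 and 16 are assumptions chosen to reproduce the output 4; they are not taken from the original listing.

```shell
cat > gcd1_sketch.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    cmd = "python3"     # assumes python3 is on the PATH
    # Everything printed into the pipe becomes Python's standard input
    print "import math" | cmd
    print "print(math.gcd(12, 16))" | cmd
    close(cmd)          # end of input: Python runs the code and prints the result
}
EOF

awk -f gcd1_sketch.awk
```

The number 4 is printed by Python itself; awk never sees it.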
What we want instead is to get the result back into awk.
gcd2.awk
There is a complication, though. Python, like many programs that read from a pipe, waits for the end of the input stream before it processes the data or flushes its buffers.
The solution is to call close(cmd, "to") on line 6, a feature buried deep in the gawk documentation. It closes only the sending end of the pipe, so Python sees end of input while awk can still read the result.
This last example covers more or less the most complicated situation; usually one can get away with fewer lines. Note also that getline is called only once, since we wanted only the first line of the output. When the program prints more than one line, use a while loop to read them all.
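The two-way exchange described above can be sketched as follows. This is a reconstruction under assumptions, not the original listing: it relies on gawk's |& coprocess operator, python3 on the PATH, and the sample arguments 12 and 16. The script is laid out so that close(cmd, "to") falls on line 6 of the script file.

```shell
cat > gcd2_sketch.awk <<'EOF'
#!/usr/bin/gawk -f
BEGIN {
    cmd = "python3"                          # assumes python3 is on the PATH
    print "import math" |& cmd               # |& opens a two-way pipe (gawk only)
    print "print(math.gcd(12, 16))" |& cmd
    close(cmd, "to")                         # send end of input, keep the read side open
    cmd |& getline result                    # read the first line of Python's output
    close(cmd)                               # shut down the coprocess completely
    print "gcd:", result
}
EOF

gawk -f gcd2_sketch.awk
```

With gawk and python3 installed, this should print gcd: 4, and this time the value is available in the awk variable result for further processing.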
Summary of the eight variants of getline, listing which predefined variables are set by each one, and whether the variant is standard or a gawk extension.
| Variant | Effect | awk / gawk |
|---|---|---|
| getline | Sets $0, NF, FNR, NR, and RT | awk |
| getline var | Sets var, FNR, NR, and RT | awk |
| getline < file | Sets $0, NF, and RT | awk |
| getline var < file | Sets var and RT | awk |
| command \| getline | Sets $0, NF, and RT | awk |
| command \| getline var | Sets var and RT | awk |
| command \|& getline | Sets $0, NF, and RT | gawk |
| command \|& getline var | Sets var and RT | gawk |
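Two rows of the table can be checked with a short sketch. The file name getline_nr.awk, the echo command, and the three input lines are made up for the demo.

```shell
cat > getline_nr.awk <<'EOF'
#!/usr/bin/awk -f
NR == 1 {
    cmd = "echo from-command"
    cmd | getline        # replaces $0 and NF, leaves NR alone
    close(cmd)
    print "after command | getline: NR=" NR, "$0=" $0
    getline              # reads the next input record and increments NR
    print "after plain getline: NR=" NR, "$0=" $0
}
EOF

printf 'one\ntwo\nthree\n' | awk -f getline_nr.awk
```

The first line of output should report NR=1 with $0 set to from-command, and the second NR=2 with $0 set to two, in line with the table.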