AWK is named after the initials of its authors (Aho, Weinberger and Kernighan). It is a powerful scripting language for processing text data. There are several implementations: awk (the original), nawk (new awk), mawk (the default in Ubuntu 12.04) and gawk (GNU awk). I recommend the last one because it handles Unicode characters correctly, for example:
$ echo юникод | gawk "{res = toupper(\$1); print res;}"
ЮНИКОД
The most useful feature is the ability to save a script in a file and load it into awk later. A script file is executed with:
gawk [options] -f script_file.awk input_file
If no input file is given, awk reads from the standard input stream.
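As a small sketch of the two input modes, the same program can be fed from a named file or from standard input. The helper name first_fields and the temporary sample file are made up for illustration; plain awk is used here, and gawk can be substituted.

```shell
# Hypothetical helper: print the first field of every input line.
# File names given as arguments are passed on to awk; with none, awk reads stdin.
first_fields() {
    awk '{ print $1 }' "$@"
}

# Throwaway sample file for the demonstration.
sample=$(mktemp)
printf 'alpha 1\nbeta 2\n' > "$sample"

first_fields "$sample"       # input taken from the named file
first_fields < "$sample"     # no file argument: input taken from stdin

rm -f "$sample"
```

Both calls print the same two lines, because only the source of the input differs.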
Let’s take a look at an example. It reads the input stream line by line, stores the first field of each line in the history array, increments a counter, and prints the counter followed by that field. Code of script.awk:
#!/usr/bin/gawk -f
# The BEGIN block executes once, before any input is read
BEGIN {
    print "\nBegin printing args\n";
    i = 0;
}
# The main block executes for every input line
{
    i++;
    history[i] = $1;
    print i, $1;
    # stop reading when the first field is 0 (END still runs)
    if ($1 == 0)
        exit(0);
}
# The END block executes once, when awk finishes
END {
    print "\nArguments were: ";
    for (n=1; n<=i; ++n)
        print history[n]," ";
    print "\nEnd printing args\n"
}
Then make the script executable and run it:
$ chmod +x ./script.awk
$ ./script.awk
The session will look like this (typing “one”, “cat”, “dog” and “0”, each followed by Enter):
Begin printing args
one
1 one
cat
2 cat
dog
3 dog
0
4 0
Arguments were:
one
cat
dog
0
End printing args
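The session above ends as soon as “0” is typed, yet the END block still prints the history. A minimal sketch of that behaviour, assuming a POSIX awk and using a hypothetical helper name collect_until_zero:

```shell
# Hypothetical helper: remember first fields until a line whose first field is 0.
collect_until_zero() {
    awk '
        $1 == 0 { exit }             # stop reading input; control passes to END
                { seen[++n] = $1 }   # otherwise store the first field
        END {
            # print the stored fields on one line, separated by spaces
            for (i = 1; i <= n; i++)
                printf "%s%s", seen[i], (i < n ? " " : "\n")
        }
    ' "$@"
}

printf 'one\ncat\n0\ndog\n' | collect_until_zero   # prints "one cat"
```

The line “dog” is never processed, because exit skips the remaining input and runs END directly.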
Awk can also be run with the script given inline on the command line:
gawk [options] 'script_text' file(s)
The following example counts the lines containing “block” in the script listed above:
awk "BEGIN{blocks=0} /block/{blocks++} END{ print blocks}" script.awk
Here the /regular expression/ pattern controls whether the block after it is executed: the block runs only for input lines that match the expression.
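A pattern does not have to be a regular expression; any expression can guard a block. The sketch below, with a hypothetical helper name classify, counts comment lines with a regular expression and blank lines with a comparison on the built-in NF (number of fields), then prints both counts together with NR (the total number of lines read):

```shell
# Hypothetical helper: count comment lines, blank lines and all lines.
classify() {
    awk '
        /^#/    { comments++ }   # regex pattern: line starts with "#"
        NF == 0 { blank++ }      # expression pattern: line has no fields
        END     { print comments + 0, blank + 0, NR }
    ' "$@"
}

printf '# note\n\ncode here\n# again\n' | classify   # prints "2 1 4"
```

Adding 0 to each counter forces a numeric result, so a counter that was never incremented prints as 0 rather than as an empty string.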
User-defined functions can be added to an awk script as follows:
#!/usr/bin/gawk -f
# returns the sum of three numbers
# (res is an extra parameter, which keeps it local to the function)
function sum(a, b, c,    res) {
    res = a + b + c;
    return res;
}
# main program, for testing
{
    print sum($1, $2, $3);
}
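To try the function without creating a script file, it can be embedded in an inline program. This is only a sketch: the wrapper name sum3 is hypothetical, and res is declared as an extra parameter so that it stays local to the function.

```shell
# Hypothetical wrapper: print the sum of the first three fields of each line.
sum3() {
    awk '
        # parameters listed after the real arguments act as local variables
        function sum(a, b, c,    res) {
            res = a + b + c
            return res
        }
        { print sum($1, $2, $3) }
    ' "$@"
}

printf '1 2 3\n10 20\n' | sum3   # prints 6, then 30
```

The second input line has only two fields; the missing third field is treated as 0 in the addition.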