AWK is a general purpose computer language that is designed for processing text-based data, either in files or data streams. The name AWK is derived from the surnames of its authors — Alfred Aho, Peter Weinberger, and Brian Kernighan; however, it is commonly pronounced "awk" and not as a string of separate letters. awk, when written in all lowercase letters, refers to the Unix or Plan 9 program that runs other programs written in the AWK programming language.
AWK is an example of a programming language that extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions. The power, terseness, and limitations of AWK programs and sed scripts inspired Larry Wall to write Perl. Because of their dense notation, all these languages are often used for writing one-liner programs.
AWK is one of the early tools to appear in Version 7 Unix and gained popularity as a way to add computational features to a Unix pipeline. A version of the AWK language is a standard feature of nearly every modern Unix-like operating system available today. AWK is mentioned in the Single UNIX Specification as one of the mandatory utilities of a Unix operating system. Besides the Bourne shell, AWK is the only other scripting language available in a standard Unix environment. Implementations of AWK exist as installed software for almost all other operating systems.
/pattern/ { action }
where pattern is a regular expression and action is a command. Most implementations of AWK use extended regular expressions by default. AWK looks through the input file; when it finds a line that matches pattern, it executes the command(s) specified in action. Alternate line forms include:
Each of these forms can be included multiple times in the command file. Lines in the command file are executed in order, so if there are two "BEGIN" statements, the first is executed, then the second, and then the rest of the lines. BEGIN and END statements do not have to be located before and after (respectively) the other lines in the command file.
AWK was created as a broadbased replacement to C algorithmic approaches developed to integrate text parsing methods.
For brevity, the enclosing curly braces ( { } ) will be omitted from these examples.
This displays the contents of the current line. In AWK, lines are broken down into fields, and these can be displayed separately:
Although these fields ($X) may bear resemblance to variables (the $ symbol indicates variables in perl), they actually refer to the fields of the current line. A special case, $0, refers to the entire line. In fact, the commands "print" and "print $0" are identical in functionality.
The print command can also display the results of calculations and/or function calls:
print 3+2 print foobar(3) print foobar(variable) print sin(3-2)
Output may be sent to a file:
print "expression" > "file name"
function add_three (number, temp) { temp = number + 3 return temp }
This statement can be invoked as follows:
print add_three(36) # prints 39
Functions can have variables that are in the local scope. The names of these are added to the end of the argument list, though values for these should be omitted when calling the function. It is convention to add some whitespace in the argument list before the local variables, in order to indicate where the parameters end and the local variables begin.
BEGIN { print "Hello, world!"; exit }
length > 80
{ w += NF; c += length} END { print NR, w, c }
{ s += $1 } END { print s }
BEGIN { FS="*+"} { for (i=1; i<=NF; i++) words*++ } END { for (i in words) print i, words* }
For example, a UNIX command called hello.awk that prints the string "Hello, world!" may be built by going first creating a file named hello.awk containing the following lines:
#!/usr/bin/awk -f BEGIN { print "Hello, world!"; exit }
In 1985 its authors started expanding the language, most significantly by adding user-defined functions. The language is described in the book The AWK Programming Language, published 1988, and its implementation was made available in releases of UNIX System V. To avoid confusion with the incompatible older version, this version was sometimes known as "new awk" or nawk. This implementation was released under a free software license in 1996, and is still maintained by Brian Kernighan. (see external links below)
GNU awk, or gawk, is another free software implementation. It was written before the original implementation became freely available, and is still widely used. Almost every Linux distribution comes with a recent version of gawk and gawk is widely recognized as the de-facto standard implementation in the Linux world, and gawk version 3.0 was included as awk in FreeBSD up to version 5.0. (Subsequent versions of FreeBSD use an awk re-written from scratch in order to avoid the GPL, a more restrictive license than the BSD license.)
xgawk is a SourceForge project based on gawk. It extends gawk with dynamically loadable libraries.
mawk is a very fast AWK implementation by Mike Brennan based on a byte code interpreter.
Downloads and further information about these versions are available from the sites listed below.
Thompson AWK or TAWK is an AWK compiler for DOS and Windows, previously sold by Thompson Automation Software (which has ceased its activities).
Curly bracket programming languages | Domain-specific programming languages | Text-oriented programming languages | Scripting languages | Unix shells | Unix software | Free compilers and interpreters
Awk | Awk | AWK | Awk | AWK | Awk | AWK 프로그래밍 언어 | Awk | Awk | Awk programozási nyelv | AWK | AWK | Awk | Awk | AWK | AWK | AWK (programovací jazyk) | AWK | AWK | AWK (мова програмування) | AWK
This article is licensed under the GNU Free Documentation License.
It uses material from the
"AWK programming language".
Home Page • arts • business • computers • games • health • hospitals • home • kids & teens • news • physicians • recreation• reference • regional • science • shopping • society • sports • world