02. Awk's Workflow: BEGIN, BODY, and END Blocks

Let's begin by looking at how AWK works, step by step. AWK starts by executing the BEGIN block. It then enters the BODY block, where it reads a record, runs the commands on it, and repeats until the input is exhausted. Finally, the END block is executed.

  1. Execute awk commands from BEGIN block.
  2. Read in a line from the input stream (a file or standard input) and store it in memory.
  3. Execute the awk commands on that line.
  4. Repeat steps 2 and 3 until the end of the input.
  5. Execute awk commands from END block.
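
The five steps above can be seen in a short sketch. The three-line input is made up for illustration, and the command is assumed to be any POSIX awk installed as `awk` (gawk should behave the same way). The built-in variable NR holds the number of records read so far.

```shell
# Feed three lines to awk and watch the order in which the blocks run.
out=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start" }                     # step 1: runs once, before any input
      { print "line " NR ": " $0 }          # steps 2-4: runs once per input line
END   { print "done after " NR " lines" }   # step 5: runs once, after the last line
')
printf '%s\n' "$out"
# start
# line 1: a
# line 2: b
# line 3: c
# done after 3 lines
```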

Awk runs in three blocks: BEGIN, BODY, and END. The BEGIN and END blocks provide the startup and cleanup actions of our program. The BODY block consists of pattern-action pairs.

Awk's workflow: BEGIN, BODY, END. Notice that you don't need the BODY keyword before its block.

BEGIN Block

The BEGIN block executes just once, before any input is read, and initializes the program. Here we can set built-in variables such as FS, RS, and ORS, which otherwise keep their defaults (a single space for FS, a newline for RS and ORS). We can also print a header for a data table if the input does not already have one.

BEGIN {
    # initialize variables and run other startup commands
}
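
As a minimal sketch of a BEGIN block in use, the snippet below sets the input field separator FS and prints a header before any line is read. The colon-separated input is invented for this example, and OFS (the output field separator, a standard awk variable that the text above does not mention) is set only so the columns come out tab-separated.

```shell
# BEGIN runs before the first line is read, so FS is already in effect
# by the time awk splits "alice:30" into fields.
out=$(printf 'alice:30\nbob:25\n' | awk '
BEGIN {
    FS = ":"              # split input fields on colons
    OFS = "\t"            # join output fields with a tab
    print "Name", "Age"   # header, printed once
}
{ print $1, $2 }          # body: reprint each record with the new separator
')
printf '%s\n' "$out"
```

Setting FS inside the body block instead would be too late for the first record, which is why separators are usually assigned in BEGIN.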

BODY Block

The BODY block runs on every input line that matches its pattern. The pattern is optional; if it is omitted, the actions run on every line. Note that you don't need any keyword before the opening curly brace of the BODY block.

/pattern/ { actions }
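
To make the pattern and action pairing concrete, here is a small sketch with three body rules: one guarded by a regular expression, one guarded by a comparison, and one with no pattern at all. The fruit data is made up for illustration, and `awk` is assumed to be any POSIX awk.

```shell
# Each rule is tested against every input line, in the order written.
out=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
/an/    { print "regex match: " $1 }   # only lines containing "an"
$2 > 5  { print "big count: " $1 }     # only lines whose 2nd field exceeds 5
        { total += $2 }                # no pattern: every line
END     { print "total = " total }
')
printf '%s\n' "$out"
# regex match: banana
# big count: banana
# big count: cherry
# total = 22
```

A single line can trigger several rules, as the banana line does here.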

END Block

The END block is the last block of code to be executed, once the input is exhausted. It is often used to produce summary reports. Precede the block with the END keyword.

END {
    # cleanup
}

The BEGIN and END patterns can occur in any order within the awk program, but by convention BEGIN comes first and END comes last. If there are multiple BEGIN or END blocks, awk runs all the BEGIN blocks in the order they appear in the program, and likewise all the END blocks.
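
The ordering rule can be checked with a deliberately scrambled program. This is only a sketch with a one-line made-up input; it puts an END block first and splits BEGIN into two blocks to show that position in the file does not change when each kind of block runs.

```shell
# Blocks are written out of the conventional order on purpose.
out=$(printf 'x\n' | awk '
END   { print "end 1" }
BEGIN { print "begin 1" }
      { print "body: " $0 }
BEGIN { print "begin 2" }
END   { print "end 2" }
')
printf '%s\n' "$out"
# begin 1
# begin 2
# body: x
# end 1
# end 2
```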

Example

Let's now look at an example AWK script. The syntax and variables have not been covered yet, but we wanted to give you a brief gist of what a basic AWK script would look like.

Assume we want to perform two tasks on the grades.txt data file below: 1) create a header, and 2) find out how many students received a B in the class.

# grades.txt
Gil Conrad 98 93 94 A
Vern Wynne 85 78 93 B
Ingram Dannie 84 85 94 B+
Wright Morty 75 76 79 C+
Johnnie Adair 78 94 87 B

Now we can write our awk script test.awk.

# test.awk
BEGIN {
    # Print the header out before starting anything
    printf "FName\tLName\tExam1\tExam2\tFinal\tGrade\n";
 
    # Initialize any variables
    n = 0;
}
 
{
    # Print each line (called a "record")
    print $0
 
    # If the sixth column (called "field") is a B, then increment n
    if($6 == "B") { ++n }
}
 
END {
    # Wrap things up and print out summary variables
    print "Number of students with a B in the class = " n;
}

To apply our awk script via the command line, use the -f option.

$ gawk -f test.awk grades.txt
FName LName Exam1 Exam2 Final Grade
Gil Conrad 98 93 94 A
Vern Wynne 85 78 93 B
Ingram Dannie 84 85 94 B+
Wright Morty 75 76 79 C+
Johnnie Adair 78 94 87 B
Number of students with a B in the class = 2

Now let's move on to records and fields, the backbone of AWK.