Mastering the AWK Command: The Magician of Text Processing

AWK is a scripting language and command-line tool that searches text for patterns and processes or reformats the data that matches them. Typical uses include extracting fields, transforming data, and generating reports. This guide covers the basic usage and the more advanced features of AWK.

AWK Overview

AWK is an acronym formed from the initials of its three developers: A. V. Aho, P. J. Weinberger, and B. W. Kernighan. It is a widely used data manipulation language in Unix-like systems. It reads input line by line from files or standard input and processes data based on specified patterns and actions.

How AWK Works

AWK follows a basic structure: `pattern { action }`. For each line read, if the `pattern` matches, the `action` is performed. If there is no pattern, the action is performed for every line. If there is no action, matching lines are printed as is.
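The three forms described above (pattern with action, action only, pattern only) can be sketched on a small inline input. The names and scores are made-up sample data, and the shell variables `both`, `action_only`, and `pattern_only` exist only for this illustration.

```shell
# Sample input: three lines of "name score" (made-up data for illustration).
input='alice 90
bob 45
carol 72'

# Pattern and action: print the name only when the score exceeds 50.
both=$(printf '%s\n' "$input" | awk '$2 > 50 { print $1 }')

# Action without a pattern: runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern without an action: matching lines are printed unchanged.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 50')

printf '%s\n' "$both" "---" "$action_only" "---" "$pattern_only"
```

The first command prints `alice` and `carol`, the second prints all three names, and the third prints the full lines `alice 90` and `carol 72`.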

Features of AWK

  • Line-by-line processing: Processes input files one line (record) at a time.
  • Field-level access: Divides each line into fields ($1, $2, ...) separated by whitespace (default delimiter) and allows access to them.
  • Pattern matching: Can perform operations only on lines that match a specific pattern.
  • Programming capabilities: Provides basic programming features such as variables, conditional statements, loops, and functions.
  • Report generation: Can generate formatted text reports, including complex layouts.
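As a rough sketch of how these features combine, the following script uses field access, an array variable, a loop, a user-defined function, and `printf` formatting to build a small report. The sales data, the function name `label`, and the threshold of 100 are all hypothetical choices made for this illustration.

```shell
# Made-up sales records: region, then amount.
sales='north 120
south 80
north 30
east 50'

report=$(printf '%s\n' "$sales" | awk '
# User-defined function: classify a total (the threshold 100 is arbitrary).
function label(total) { return (total >= 100) ? "high" : "low" }

# Runs for every record: accumulate the amount per region in an array.
{ sum[$1] += $2 }

# Runs once after the last record: loop over regions and print the report.
END {
    for (region in sum)
        printf "%-6s %5d %s\n", region, sum[region], label(sum[region])
}' | sort)

printf '%s\n' "$report"
```

The output is piped through `sort` because the order in which `for (region in sum)` visits array keys is not specified.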

Key AWK Commands and Options

Precisely process text data using AWK's various options, built-in variables, and special patterns.

1. Basic Usage and Input/Output Options

2. Built-in Variables

3. Special Patterns
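The page names these three groups without listing their contents. As a minimal sketch, assuming they refer to the `-F` option, the standard built-in variables `NR`, `NF`, and `OFS`, and the special patterns `BEGIN` and `END`, one command can exercise all of them. The colon-separated records are made-up sample data.

```shell
# Made-up colon-separated records for illustration.
records='root:0:/root
daemon:1:/usr/sbin
guest:1000'

summary=$(printf '%s\n' "$records" | awk -F':' '
BEGIN { OFS = " | " }                  # special pattern: runs before any input is read
      { print NR, NF, $1 }             # NR = record number, NF = field count of this record
END   { print "records read: " NR }    # special pattern: runs after the last record
')

printf '%s\n' "$summary"
```

Here `-F':'` sets the input field separator, `OFS` controls what `print` places between comma-separated arguments, and `NR` still holds the last record number inside `END`.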

Usage Examples

Experience the magic of text data processing through various examples of AWK command usage.

Print only the second column from a file

awk '{print $2}' data.txt

Extracts and prints only the second field (column) of each line from the `data.txt` file. By default, fields are separated by any run of spaces or tabs.
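With the default separator, awk treats any run of spaces and tabs as one delimiter and ignores leading whitespace, so irregular spacing does not shift the columns. The sketch below contrasts that with an explicit single-space separator written as the regular expression `[ ]`. The input strings are made up for illustration.

```shell
# Made-up input with irregular spacing: leading spaces, a tab, repeated spaces.
second=$(printf '  a   b c\nd\te  f\n' | awk '{ print $2 }')

# Contrast: with -F'[ ]' every single space is a delimiter,
# so the two spaces in "a  b" enclose an empty second field.
strict=$(printf 'a  b\n' | awk -F'[ ]' '{ print $2 }')

printf 'default: %s\n' "$second"
printf 'strict:  [%s]\n' "$strict"
```

The default split yields `b` and `e`, while the strict split yields an empty second field.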

Print specific columns from a CSV file

awk -F',' '{print "Name: " $1 ", Score: " $3}' scores.csv

Prints the name (first field) and score (third field) from the `scores.csv` file, which is comma (`,`) delimited.
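One limitation worth checking before relying on `-F','`: it splits on every comma, including commas inside quoted values. The sketch below runs the same command on two made-up rows, one plain and one with a quoted name that contains a comma.

```shell
# Made-up rows: the second has a quoted name that itself contains a comma.
plain=$(printf 'Jane,math,90\n' | awk -F',' '{ print "Name: " $1 ", Score: " $3 }')
quoted=$(printf '"Doe, Jane",math,90\n' | awk -F',' '{ print "Name: " $1 ", Score: " $3 }')

printf '%s\n' "$plain" "$quoted"
```

The plain row prints `Name: Jane, Score: 90`, but the quoted row prints `Name: "Doe, Score: math` because the comma inside the quotes shifts every later field by one. The approach shown in the example is therefore safe only for files without quoted commas.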

Print only lines containing a specific pattern

awk '/ERROR/{print}' log.txt

Prints all lines containing the string 'ERROR' from the `log.txt` file.

Print with line numbers

awk '{print NR ": " $0}' names.txt

Prints each line of the `names.txt` file prefixed with its line number.

Change the second field value in lines where the first field is 'apple'

awk '$1 == "apple" {$2 = "fruit"; print}' inventory.txt

Finds lines in `inventory.txt` whose first field is exactly 'apple', changes the second field to 'fruit', and prints the modified line. Lines that do not match are not printed.
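Two details of this command are easy to miss: assigning to a field makes awk rebuild the line with single spaces between fields (the default output separator), and only the matching lines reach the output. The sketch below shows both, plus a variant that keeps the other lines. The inventory data is made up for illustration.

```shell
# Made-up inventory with irregular spacing between fields.
inventory='apple    red   10
pear     green 4'

# As in the example above: only the modified apple line is printed.
only_apple=$(printf '%s\n' "$inventory" | awk '$1 == "apple" { $2 = "fruit"; print }')

# Variant: the trailing pattern 1 is always true, so every line is printed.
all_lines=$(printf '%s\n' "$inventory" | awk '$1 == "apple" { $2 = "fruit" } 1')

printf '%s\n' "$only_apple" "---" "$all_lines"
```

The modified line comes out as `apple fruit 10` with its original spacing collapsed, while the untouched `pear` line in the variant keeps its spacing.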

Print messages before and after file processing and calculate total sum

awk 'BEGIN {total = 0; print "Calculation Starts!"} {total += $1} END {print "Total: " total}' numbers.txt

Adds all numbers in `numbers.txt` and prints messages at the beginning and end. (Assumes each line contains only one number)
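Building on the same `END` technique, a slightly extended sketch can also report the average and guard against empty input. The `total = 0` initialization is omitted here because awk variables start out as zero. The function name `sum_avg` and the sample numbers are hypothetical.

```shell
# Hypothetical helper: sums the first field of each line read from standard input.
sum_avg() {
    awk '
    { total += $1 }                  # uninitialized variables start at zero
    END {
        if (NR > 0) {
            printf "Total: %d, Average: %.1f\n", total, total / NR
        } else {
            print "Total: 0 (no input)"
        }
    }'
}

# Made-up numbers, one per line.
result=$(printf '10\n20\n30\n' | sum_avg)

# Empty input: the END block still runs, with NR equal to zero.
empty=$(sum_avg </dev/null)

printf '%s\n' "$result" "$empty"
```

The `NR > 0` check avoids a division by zero when the input has no lines.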

