cut: Extracting Specific Characters or Fields from Text Files

The `cut` command extracts selected portions (characters, bytes, or fields) from each line of a file or standard input and writes them to standard output. It is particularly useful for pulling specific columns out of data during processing and report generation.

Overview

`cut` selects by character (`-c`), byte (`-b`), or field (`-f`) units. It is most effective on structured text in which columns sit at fixed positions or are separated by a consistent delimiter.

Key Features

  • Character unit extraction (-c)
  • Byte unit extraction (-b)
  • Field unit extraction (-f)
  • Specifying custom delimiters (-d)

Main Options

The `cut` command offers various options to specify the units for text extraction. The `-c` option, in particular, is used for extracting data by character units.
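
As a minimal sketch of the three extraction units, the following runs each mode against the same line. The sample string and the variable names are illustrative assumptions, not part of the original page.

```shell
#!/bin/sh
# Sample line (hypothetical data): three tab-separated fields, ASCII only.
line=$(printf 'alpha\tbeta\tgamma')

# -c selects by character position: characters 1 through 3.
by_char=$(printf '%s\n' "$line" | cut -c 1-3)

# -b selects by byte position; for ASCII text this matches -c.
by_byte=$(printf '%s\n' "$line" | cut -b 1-3)

# -f selects by field; the default field delimiter is a tab.
by_field=$(printf '%s\n' "$line" | cut -f 2)

printf '%s\n' "$by_char" "$by_byte" "$by_field"
```

With ASCII-only input, `-c` and `-b` select the same text; they can differ only when a character occupies more than one byte.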

Usage Examples

Learn how to extract characters and fields through various usage examples of the `cut` command.

Extract First 5 Characters of Each Line

echo "Hello World" | cut -c 1-5

Extracts characters from the 1st to the 5th position of the input string.

Extract the 7th Character of Each Line

echo "Hello World" | cut -c 7

Extracts only the 7th character from the input string, which is `W` here because the space counts as position 6.

Extract Multiple Character Positions

echo "Hello World" | cut -c 1,5,7

Extracts characters at positions 1, 5, and 7 from the input string.

Extract Specific Character Range from a File

head -n 3 /etc/passwd | cut -c 6-

Extracts characters from the 6th position to the end of the line for each of the first three lines of the `/etc/passwd` file.

Extract First Field from a Tab-Delimited File

echo -e "apple\torange\tbanana" | cut -f 1

Extracts only the first field from a tab-delimited string. Tab is the default field delimiter, so no `-d` option is needed.

Extract Username from a Colon-Delimited File

head -n 3 /etc/passwd | cut -d: -f1

Uses the `-d:` option to set a colon as the delimiter and `-f1` to extract the first field (the username) from the first three lines of the `/etc/passwd` file.
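
The same delimiter option can select more than one field at a time. The sketch below uses a made-up passwd-style record (the `alice` entry and the variable names are hypothetical) so that the result does not depend on the local `/etc/passwd`.

```shell
#!/bin/sh
# Hypothetical passwd-style record; real entries vary by system.
record='alice:x:1001:1001:Alice Example:/home/alice:/bin/sh'

# -d: sets the delimiter to a colon; -f1,7 keeps fields 1 and 7.
# Selected fields are joined with the same delimiter on output.
name_and_shell=$(printf '%s\n' "$record" | cut -d: -f1,7)

# A range works for fields too: fields 3 through 4 (UID and GID).
ids=$(printf '%s\n' "$record" | cut -d: -f3-4)

printf '%s\n' "$name_and_shell" "$ids"
```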

Tips & Precautions

Useful tips and points to note when using the `cut` command.

Tips

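  • **Handling Multi-byte Characters**: `cut -b` counts bytes, so it can split a multi-byte character (such as Korean text in UTF-8) and produce corrupted output. `cut -c` is not always safer: some implementations, including GNU coreutils historically, treat `-c` the same as `-b`. When the input may contain multi-byte characters, a locale-aware `awk` or `sed` is the safer choice.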
  • **Comparison with `awk`**: `cut` is fast and efficient for simple column/field extraction. However, for more complex conditional processing, data manipulation, or handling multiple delimiters, `awk` is a more powerful alternative.
  • **Specifying Ranges**: `-c 1-5` selects the 1st through 5th characters, `-c -5` is shorthand for the same range (from the start through the 5th character), and `-c 5-` selects from the 5th character to the end of the line. The same range syntax applies to `-b` and `-f`.
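
The range shorthand and the `awk` comparison from the tips above can be sketched as follows. The sample strings and variable names are assumptions chosen for illustration.

```shell
#!/bin/sh
text='Hello World'

# -5 is shorthand for 1-5: from the start through position 5.
head_part=$(printf '%s\n' "$text" | cut -c -5)

# 7- runs from position 7 to the end of the line.
tail_part=$(printf '%s\n' "$text" | cut -c 7-)

# cut treats every delimiter as a field boundary, so a run of spaces
# produces empty fields; awk's default splitting collapses the run.
spaced='one   two'   # three spaces between the words
cut_field=$(printf '%s\n' "$spaced" | cut -d ' ' -f 2)
awk_field=$(printf '%s\n' "$spaced" | awk '{ print $2 }')

printf '[%s] [%s] [%s] [%s]\n' "$head_part" "$tail_part" "$cut_field" "$awk_field"
```

Here `cut` returns an empty second field while `awk` returns `two`, which is one reason `awk` tends to suit input aligned with variable amounts of whitespace.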
