Overview
The `cut` command extracts selected portions of each line of text. It can select by character (`-c`), byte (`-b`), or field (`-f`), and is especially useful for pulling specific columns out of delimited text files.
Key Features
- Character unit extraction (-c)
- Byte unit extraction (-b)
- Field unit extraction (-f)
- Specifying custom delimiters (-d)
Main Options
The `cut` command offers several options that set the unit of extraction. The `-c` option selects by character position, `-b` by byte position, and `-f` by field, with `-d` setting the field delimiter.
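As a rough illustration of how the three units differ, the sketch below runs each mode against the same line. The sample string is invented for this example and is pure ASCII, so character and byte positions coincide.

```shell
# Sample line invented for illustration: three colon-separated fields, ASCII only.
line='red:green:blue'

# -c selects by character position (1-based).
printf '%s\n' "$line" | cut -c 1-3      # prints: red

# -b selects by byte position; same positions as -c here because the text is ASCII.
printf '%s\n' "$line" | cut -b 5-9      # prints: green

# -f selects whole fields; -d sets the single-character delimiter.
printf '%s\n' "$line" | cut -d: -f 3    # prints: blue
```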
Usage Examples
Learn how to extract characters and fields through various usage examples of the `cut` command.
Extract First 5 Characters of Each Line
echo "Hello World" | cut -c 1-5
Extracts characters from the 1st to the 5th position of the input string.
Extract the 7th Character of Each Line
echo "Hello World" | cut -c 7
Extracts only the 7th character of the input string. The space counts as the 6th character, so the result is `W`.
Extract Multiple Character Positions
echo "Hello World" | cut -c 1,5,7
Extracts the characters at positions 1, 5, and 7 of the input string, giving `HoW`.
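Single positions and ranges can also be mixed in one list. The short sketch below shows that, and also shows that `cut` emits the selected characters in input order, whatever order the list is written in.

```shell
# Ranges and single positions can be mixed in one comma-separated list.
echo "Hello World" | cut -c 1-5,7       # prints: HelloW

# Output follows input order, so listing 7 before 1 does not reorder anything.
echo "Hello World" | cut -c 7,1         # prints: HW
```

If the selected pieces need to be reordered, a tool such as `awk` is a better fit than `cut`.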
Extract Specific Character Range from a File
head -n 3 /etc/passwd | cut -c 6-
Extracts characters from the 6th position to the end of the line, for each of the first three lines of `/etc/passwd`.
Extract First Field from a Tab-Delimited File
echo -e "apple\torange\tbanana" | cut -f 1
Extracts only the first field from a tab-delimited string. Tab is the default delimiter for `-f`, so no `-d` option is needed.
Extract Username from a Colon-Delimited File
head -n 3 /etc/passwd | cut -d: -f1
Uses the `-d:` option to set a colon as the delimiter and extracts the first field (the username) from the first three lines of `/etc/passwd`.
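Field selection can be sketched a little further. The record below is a hypothetical passwd-style line, used so the output does not depend on the host system; it shows selecting several fields at once and how lines without the delimiter are treated.

```shell
# Hypothetical passwd-style record (not read from a real system).
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# Select several fields at once: username (1) and home directory (6).
printf '%s\n' "$record" | cut -d: -f1,6        # prints: alice:/home/alice

# A line that does not contain the delimiter is passed through unchanged by default...
printf 'no delimiter here\n' | cut -d: -f1     # prints: no delimiter here

# ...unless -s is given, which suppresses such lines entirely.
printf 'no delimiter here\n' | cut -d: -s -f1  # prints nothing
```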
Tips & Precautions
Useful tips and points to note when using the `cut` command.
Tips
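- **Handling Multi-byte Characters**: `cut -b` counts bytes, so a range can split a multi-byte character (such as Korean text in UTF-8) and produce corrupted output. `cut -c` is not a reliable fix, because some implementations, including GNU coreutils, have treated `-c` the same as `-b`. When the input may contain multi-byte characters, a locale-aware tool such as `awk` or `sed` is usually safer.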
- **Comparison with `awk`**: `cut` is fast and efficient for simple column or field extraction. It accepts only a single-character delimiter and does not merge consecutive delimiters, so for conditional processing, data manipulation, or input with multiple or repeated delimiters, `awk` is the more capable tool.
- **Specifying Ranges**: `-c 1-5` selects the 1st through 5th characters, `-c -5` is shorthand for the same range (start of line through the 5th character), and `-c 5-` selects from the 5th character to the end of the line. The same list syntax works with `-b` and `-f`.
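The tips above can be tried out with a few one-liners. This is only a sketch: the sample strings are made up, and the last command assumes the script and terminal use UTF-8, where each Hangul syllable occupies three bytes.

```shell
# Open-ended ranges: "-N" starts at position 1, "N-" runs to the end of the line.
echo "Hello World" | cut -c -5          # prints: Hello
echo "Hello World" | cut -c 7-          # prints: World
echo "a:b:c:d" | cut -d: -f 2-          # prints: b:c:d

# cut treats every delimiter as a separator, so two spaces produce an empty field 2.
echo "one  two" | cut -d' ' -f 2        # prints an empty line

# awk splits on runs of whitespace by default and returns the second word.
echo "one  two" | awk '{print $2}'      # prints: two

# Byte ranges must line up with character boundaries (UTF-8 assumed here).
# Bytes 1-3 are exactly the first syllable; "-b 1-2" would cut it in half.
printf '한글\n' | cut -b 1-3             # prints: 한
```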