
uniq Command Guide: Remove and Identify Duplicate Lines

The `uniq` command removes consecutive duplicate lines from text files or from data passed through a pipe, or counts how many times each line appears. Its true value is realized when it is combined with the `sort` command, making it an essential tool for data cleaning and analysis.

uniq Command Overview

`uniq` is short for 'unique', and it finds and processes duplicate lines within a file. The crucial point is that `uniq` only processes **consecutively duplicated** lines. Therefore, to remove duplicates from an entire file, you must first sort the data with the `sort` command.

How uniq Works

The `uniq` command reads input line by line and compares each line with the one immediately before it. If the two lines are identical, the later one is treated as a duplicate; otherwise, it starts a new unique group. Because only adjacent lines are compared, duplicates scattered throughout a file go undetected, which is why sorting the file with `sort` first is essential.
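The adjacency rule is easy to see with a tiny example (the file name `fruits.txt` is invented for illustration):

```shell
# "apple" appears twice, but not on adjacent lines.
printf 'apple\nbanana\napple\nbanana\nbanana\n' > fruits.txt

# uniq alone only collapses the adjacent "banana" pair at the end;
# the second "apple" survives because it is not next to the first.
uniq fruits.txt
# apple
# banana
# apple
# banana

# Sorting first makes every duplicate adjacent, so all are removed.
sort fruits.txt | uniq
# apple
# banana
```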

Key Options

By utilizing the various options of the `uniq` command, you can perform detailed tasks such as duplicate removal, counting, and specific line output.

1. Basic Functions

- `-c` : prefix each line of output with the number of times it occurred
- `-d` : print only the lines that appear more than once (one copy each)
- `-u` : print only the lines that appear exactly once

2. Control Comparison Method

- `-i` : ignore case when comparing lines
- `-f N` : skip the first N fields of each line before comparing
- `-s N` : skip the first N characters of each line before comparing
- `-w N` : compare no more than the first N characters of each line
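A quick sketch of the comparison-control options, using throwaway input invented for illustration:

```shell
# -i: adjacent lines that differ only in case count as duplicates,
# and the first line of the group is the one kept.
printf 'Apple\napple\nAPPLE\n' | uniq -i
# Apple

# -w 5: compare only the first 5 characters of each line,
# so both "error ..." lines are treated as duplicates.
printf 'error 101\nerror 202\nwarn 303\n' | uniq -w 5
# error 101
# warn 303
```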

Commonly Used Examples

Learn how to effectively process data by using `uniq` with `sort`.

Remove Duplicate Lines from Entire File

sort data.txt | uniq

Sorts `data.txt` with `sort`, then removes duplicate lines from the entire file using `uniq`. This combination is the most common usage.
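For example, with a small unsorted file (contents invented here):

```shell
printf 'b\na\nb\nc\na\n' > data.txt

# Sort first so every duplicate becomes adjacent, then collapse.
sort data.txt | uniq
# a
# b
# c

# sort can also do the same in one step with -u:
sort -u data.txt
# a
# b
# c
```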

Output Duplicated Lines with Counts

sort data.txt | uniq -c

Sorts `data.txt`, collapses duplicate lines, and prefixes each output line with the number of times it appeared.
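A common pattern builds a frequency table and then ranks it with a second `sort` (the request lines below are made up):

```shell
# Count identical lines, then sort numerically in descending order
# to get a "most frequent first" report:
# "GET /" appears with count 3, "GET /login" with count 1.
printf 'GET /\nGET /login\nGET /\nGET /\n' | sort | uniq -c | sort -rn
```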

Output Only Duplicated Lines from Entire File

sort data.txt | uniq -d

Prints one copy of each line that appears more than once in `data.txt`.
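For instance, on already-sorted input (values invented):

```shell
# Only "a" and "c" occur more than once; each is printed one time.
printf 'a\na\nb\nc\nc\nc\n' | uniq -d
# a
# c
```

GNU `uniq` additionally provides `-D` (`--all-repeated`) to print every copy of each duplicated line rather than one per group.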

Output Only Unique Lines Appearing Once in Entire File

sort data.txt | uniq -u

Outputs only the lines that appear exactly once in `data.txt`.
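For instance, on already-sorted input (values invented):

```shell
# "b" is the only line that occurs exactly once.
printf 'a\na\nb\nc\nc\n' | uniq -u
# b
```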

Remove Duplicates Ignoring Specific Fields

sort log.txt | uniq -f 1

Skips the first whitespace-separated field of each line (the timestamp in this log) and removes duplicates by comparing only the remaining content.
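A sketch with a fabricated log where only the timestamp differs between two lines:

```shell
# -f 1 skips the first whitespace-separated field (the timestamp here),
# so the first two lines compare equal; the first line of the pair is kept.
printf '12:00 login ok\n12:05 login ok\n12:09 logout\n' | uniq -f 1
# 12:00 login ok
# 12:09 logout
```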

