Overview
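`uniq` filters or reports adjacent duplicate lines in its input. Because it only compares neighboring lines, it is usually fed the output of `sort` through a pipe (`|`) so that identical lines end up next to each other. The `-c` option prefixes each output line with the number of times it occurred, which makes duplicate frequencies easy to read.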
Key Features
- Processes consecutive duplicate lines
- Counts duplicate lines (-c)
- Case-insensitive comparison option (-i)
- Skips leading fields (-f) or leading characters (-s) when comparing lines
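As a minimal sketch of the character-skipping feature, the example below uses `-s`, which tells `uniq` to skip a given number of leading characters before comparing. The input and the variable names (`input`, `skip_result`) are hypothetical and only serve the illustration.

```shell
# Hypothetical input: a 3-character tag, a space, then the message.
input='A01 disk full
B07 disk full
C12 login ok'

# -s 4 skips the first 4 characters of each line before comparing,
# so only the message text decides whether two adjacent lines match.
skip_result=$(printf '%s\n' "$input" | uniq -s 4 -c)
printf '%s\n' "$skip_result"
```

The first two lines differ only in their tag, so they are reported once with a count of 2, and the first line of the group is the one that is printed. The width of the padding before each count varies between implementations.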
Usage Examples
Calculate Word Frequency in a File
sort words.txt | uniq -c
Counts the occurrences of each word (line) in the `words.txt` file. Since `uniq` only processes consecutive duplicates, `sort` is used first to make all duplicates adjacent.
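To see why the `sort` step matters, the following sketch runs `uniq -c` on the same input with and without sorting. The sample words and variable names are made up for the illustration and stand in for `words.txt`.

```shell
# Hypothetical sample: "apple" appears twice, but not on adjacent lines.
words='apple
banana
apple'

# Without sorting, the two "apple" lines are not adjacent,
# so uniq reports each of the three lines separately.
unsorted_counts=$(printf '%s\n' "$words" | uniq -c)

# After sorting, identical lines sit next to each other and are merged.
sorted_counts=$(printf '%s\n' "$words" | sort | uniq -c)

printf 'unsorted:\n%s\n\nsorted:\n%s\n' "$unsorted_counts" "$sorted_counts"
```

The unsorted run prints three lines, each with a count of 1, while the sorted run prints two lines and reports `apple` with a count of 2.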
Find Most Frequent Lines in a Log File
sort log.txt | uniq -c | sort -nr
Counts duplicate lines in a log file and then sorts the results numerically in descending order to show the most frequent lines first.
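When only the top few entries matter, the same pipeline can be cut short with `head`. The sketch below is one way to do that; the log lines are invented sample data standing in for `log.txt`, and the limit of 2 is arbitrary.

```shell
# Hypothetical log lines standing in for log.txt.
log='GET /index
GET /about
GET /index
POST /login
GET /index
GET /about'

# Count each distinct line, order by count (highest first), keep the top 2.
top_lines=$(printf '%s\n' "$log" | sort | uniq -c | sort -nr | head -n 2)
printf '%s\n' "$top_lines"
```

Here `GET /index` (3 occurrences) comes first, followed by `GET /about` (2 occurrences). Lines with equal counts may appear in either order depending on the `sort` implementation and locale.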
Count Duplicate Lines Ignoring Case
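printf 'Apple\napple\nBanana\napple\n' | sort -f | uniq -ci
Counts duplicate lines from standard input, treating 'Apple' and 'apple' as the same, so the three apple lines are reported once with a count of 3. `sort -f` folds case while sorting; with a plain `sort`, some locales (such as `C`) place all uppercase lines before lowercase ones, which would separate 'Apple' from 'apple' and split the count. `printf` is used instead of `echo -e` because `echo -e` does not behave the same way in every shell.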
Count Duplicates Ignoring Specific Fields
sort -k2 data.txt | uniq -f 1 -c
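Counts duplicate lines by skipping the first field and comparing from the second field onwards. `sort -k2` orders the lines by that same portion so that matching lines are adjacent. For example, if `data.txt` contains `ID1 apple` and `ID2 apple`, the two lines are treated as duplicates and reported once, with a count of 2 in front of the first line of the group.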
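The field-skipping behavior can be tried without a file by piping sample data in. This is a small sketch with hypothetical contents for `data.txt`, assuming fields are separated by single spaces.

```shell
# Hypothetical contents of data.txt: an ID in field 1, a value in field 2.
data='ID1 apple
ID3 banana
ID2 apple'

# sort -k2 orders lines by field 2 onward, making equal values adjacent;
# uniq -f 1 then skips field 1 when comparing, and -c prefixes the count.
field_counts=$(printf '%s\n' "$data" | sort -k2 | uniq -f 1 -c)
printf '%s\n' "$field_counts"
```

The two `apple` lines collapse into one entry with a count of 2, and `ID3 banana` is reported with a count of 1. Only the first line of each group is printed, so the ID shown for `apple` is whichever one sorted first.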
Tips & Notes
The `uniq` command fundamentally operates on 'consecutive' duplicate lines. To remove or count duplicates across an entire file, you must first sort the lines using the `sort` command.
Usage Tips
- Use with sort: `uniq` only processes consecutive duplicates. To handle duplicates throughout the entire file, sort it first with `sort`. Example: `sort file.txt | uniq -c`
- Find Most Frequent Items: Pipe the output of `uniq -c` to `sort -nr` to sort the most frequent items in descending order. Example: `sort file.txt | uniq -c | sort -nr`
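- Performance Considerations: For very large files, `sort` is the expensive step. `uniq` only compares each line with the one before it, so its memory use stays small, whereas `sort` may need to write intermediate data to temporary files. If the default temporary directory is short on space, you can point `sort` at another one with its `-T` option.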
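The `-T` tip can be sketched as follows. The scratch directory created with `mktemp -d` is only a placeholder for a location with enough free space, and the tiny input stands in for a large file; both are assumptions made for the example.

```shell
# Hypothetical scratch directory for sort's temporary files; in practice,
# point this at a filesystem with enough free space.
tmp_dir=$(mktemp -d)

# Small stand-in for a large input file.
big_input='b
a
b
c
b'

# -T tells sort where to write intermediate files if the input
# does not fit in its memory buffer.
counts=$(printf '%s\n' "$big_input" | sort -T "$tmp_dir" | uniq -c | sort -nr)
printf '%s\n' "$counts"

# sort removes its own temporary files, so the directory should be empty again.
rmdir "$tmp_dir"
```

With input this small, `sort` is unlikely to touch the directory at all; the option only matters once the data outgrows the in-memory buffer.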