comm: Compare common and unique lines of two sorted files

The comm command compares two sorted files line by line and prints three columns: lines unique to the first file, lines unique to the second file, and lines common to both. It is useful for finding what two lists share and where they differ.

Overview

comm, short for 'common', is specialized in comparing two files that are already sorted. It reads both files in a single pass and prints the result in three columns, distinguished by leading tabs. The first column shows lines unique to the first file, the second column (indented by one tab) shows lines unique to the second file, and the third column (indented by two tabs) shows lines common to both files. It is particularly handy for sorted text data such as database exports or user ID lists.
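The three-column layout is easier to see on a small input. The following is a minimal sketch; the file names `a.txt` and `b.txt` and their contents are made up for illustration, and `LC_ALL=C` is set only so that the ordering does not depend on the local locale.

```shell
# Illustrative sample data; a.txt and b.txt are hypothetical file names.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/a.txt"
printf '%s\n' banana date > "$workdir/b.txt"

# LC_ALL=C pins the collation order used for the comparison.
result=$(LC_ALL=C comm "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$result"
# Expected output (<TAB> marks a literal tab character):
#   apple
#   <TAB><TAB>banana
#   cherry
#   <TAB>date

rm -rf "$workdir"
```

Here `apple` and `cherry` exist only in the first file, `date` only in the second, and `banana` in both.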

Key Features

The main features of the comm command are as follows:

  • Used to compare two sorted files. (If files are not sorted, you must use the `sort` command first.)
  • Outputs comparison results neatly separated into three columns.
  • Facilitates quick identification of commonalities and differences in text data.
  • Allows selective hiding of output columns, making it versatile for various uses.

comm vs diff

Both comm and diff are file comparison tools, but they differ in their operation and purpose.

  • comm: Specialized for sorted files, outputs common and unique lines in three columns. It has no notion of a modified line: an edited line simply appears as one unique line in each file.
  • diff: Can compare unsorted files and outputs all line-by-line changes (additions, deletions, modifications) in detail.
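The difference in how the two tools treat an edited line can be sketched as follows. The file names `old.txt` and `new.txt` and their contents are assumptions made for this example only.

```shell
# Illustrative sample data: the second line was edited from beta to beta2.
workdir=$(mktemp -d)
printf '%s\n' alpha beta > "$workdir/old.txt"
printf '%s\n' alpha beta2 > "$workdir/new.txt"

# comm -3 hides the common column; the edited line shows up twice,
# once as unique to each file.
comm_out=$(LC_ALL=C comm -3 "$workdir/old.txt" "$workdir/new.txt")
printf '%s\n' "$comm_out"

# diff reports the same edit as a change to line 2.
# diff exits non-zero when the files differ, hence the "|| true".
diff_out=$(diff "$workdir/old.txt" "$workdir/new.txt" || true)
printf '%s\n' "$diff_out"

rm -rf "$workdir"
```

comm lists `beta` and `beta2` as two unrelated lines, while diff ties them together as a single change.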

Key Options

Options for the comm command are primarily used to hide specific columns.

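1) Output Column Control

  • `-1`: Hide column 1 (lines unique to the first file).
  • `-2`: Hide column 2 (lines unique to the second file).
  • `-3`: Hide column 3 (lines common to both files).

2) Help

  • `--help`: Display usage information and exit (GNU coreutils).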

Usage Examples

Learn the functionality of the comm command through various usage examples.

Compare Common and Unique Lines of Two Files

comm file1.txt file2.txt

Compares the contents of two sorted files in three columns.

Output Only Lines Common to Both Files

comm -12 file1.txt file2.txt

Uses the `-1` and `-2` options to hide lines unique to the first and second files, outputting only common lines.
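As a rough illustration with user ID lists, the sketch below finds the IDs present in both of two hypothetical files, `team_a.txt` and `team_b.txt`. The names and contents are invented for the example.

```shell
# Illustrative sample data; both lists are already sorted.
workdir=$(mktemp -d)
printf '%s\n' alice bob carol > "$workdir/team_a.txt"
printf '%s\n' bob carol dave > "$workdir/team_b.txt"

# -12 hides columns 1 and 2, leaving only the common lines (no leading tabs).
common=$(LC_ALL=C comm -12 "$workdir/team_a.txt" "$workdir/team_b.txt")
printf '%s\n' "$common"
# Expected output:
#   bob
#   carol

rm -rf "$workdir"
```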

Output Only Unique Lines from Both Files

comm -3 file1.txt file2.txt

Uses the `-3` option to hide common lines, outputting only lines unique to each file.
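The column-hiding options can also be combined to isolate one side only. The sketch below assumes two made-up package lists, `installed.txt` and `wanted.txt`, and shows `-3` next to `-23` (only in the first file) and `-13` (only in the second file).

```shell
# Illustrative sample data; both lists are already sorted.
workdir=$(mktemp -d)
printf '%s\n' curl git vim > "$workdir/installed.txt"
printf '%s\n' git htop vim > "$workdir/wanted.txt"

# -3: hide common lines; column 2 is still indented by one tab.
both_unique=$(LC_ALL=C comm -3 "$workdir/installed.txt" "$workdir/wanted.txt")
# -23: keep only lines unique to the first file.
only_first=$(LC_ALL=C comm -23 "$workdir/installed.txt" "$workdir/wanted.txt")
# -13: keep only lines unique to the second file.
only_second=$(LC_ALL=C comm -13 "$workdir/installed.txt" "$workdir/wanted.txt")

printf 'comm -3:\n%s\n' "$both_unique"
printf 'comm -23: %s\n' "$only_first"     # curl
printf 'comm -13: %s\n' "$only_second"    # htop

rm -rf "$workdir"
```

Because `-23` and `-13` leave a single column, their output has no leading tabs and can be fed directly to other commands.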

Compare Unsorted Files

comm <(sort file1.txt) <(sort file2.txt)

You can compare unsorted files by sorting each one on the fly with process substitution (`<(...)`), which hands the output of each `sort` command to `comm` as if it were a file.
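Process substitution is a feature of shells such as bash, zsh, and ksh and is not available in plain POSIX sh. A minimal sketch of a portable alternative is to sort into temporary files first; the `.sorted` file names and the sample contents below are assumptions for illustration.

```shell
# Illustrative unsorted sample data.
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/file1.txt"
printf '%s\n' fig kiwi apple > "$workdir/file2.txt"

# Sort each file into a temporary copy, using the same collation
# order for sort and comm.
LC_ALL=C sort "$workdir/file1.txt" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2.txt" > "$workdir/file2.sorted"

result=$(LC_ALL=C comm "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$result"
# Expected output (<TAB> marks a literal tab character):
#   <TAB><TAB>apple
#   <TAB><TAB>fig
#   <TAB>kiwi
#   pear

rm -rf "$workdir"
```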

Installation

comm is included by default in most Linux distributions as part of the `coreutils` package. No separate installation is required.

Tips & Cautions

Here are some points to note when using the comm command.

Tips

  • Before using comm, **you must sort the contents of the files.** If the files are not sorted, you will not get correct comparison results.
  • To check whether a file is sorted, run `sort -c file1.txt`. It prints nothing and exits with status 0 if the file is in order, and reports the first out-of-order line otherwise. GNU comm also warns when it detects unsorted input.
  • The `<(...)` syntax is process substitution, which passes the output of the `sort` command to `comm` as if it were a file. It lets you compare unsorted files without creating temporary files yourself, but it requires a shell that supports it, such as bash, zsh, or ksh.
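One way to test sortedness before calling comm is `sort -c`, which checks the order without printing the sorted data. The sketch below wraps it in a helper; `is_sorted` is a hypothetical name chosen for this example, and the sample files are made up.

```shell
# is_sorted is a hypothetical helper, not part of comm or sort.
is_sorted() {
    # sort -c exits with status 0 if the file is already sorted and
    # non-zero otherwise; its diagnostic on stderr is discarded here.
    LC_ALL=C sort -c "$1" 2>/dev/null
}

# Illustrative sample data.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/sorted.txt"
printf '%s\n' cherry apple banana > "$workdir/unsorted.txt"

for f in "$workdir/sorted.txt" "$workdir/unsorted.txt"; do
    if is_sorted "$f"; then
        echo "${f##*/}: sorted, safe to pass to comm"
    else
        echo "${f##*/}: not sorted, run sort first"
    fi
done

rm -rf "$workdir"
```

The check uses the same `LC_ALL=C` setting as the other sketches; sort and comm should agree on the collation order, or comm may treat correctly sorted input as unsorted.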
