join: Merge Common Fields of Two Files

The join command merges lines of two sorted text files based on a specified common field and outputs the result to standard output. It functions similarly to a JOIN operation in databases, combining corresponding lines from each file to create new lines.

Overview

join compares a chosen field in each of two files and combines the lines whose values match. It works correctly only if both input files are sorted on that common field. On unsorted input, matching lines can be silently skipped, and some implementations (GNU join, for example) may also report that a file is not sorted.
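The sorting requirement can be sketched as follows. This is a minimal illustration, not the only way to prepare input: the file names and sample data are hypothetical, and `LC_ALL=C` is set on both commands on the assumption that sort and join should agree on collation order.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data; fruits.txt is deliberately out of order.
printf '3 lime\n1 apple\n2 banana\n' > fruits.txt
printf '1 red\n2 yellow\n3 green\n' > colors.txt

# Sort both inputs on the join field (field 1) under the same locale
# that join will run in, so both tools use the same ordering.
LC_ALL=C sort -k 1,1 fruits.txt > fruits.sorted
LC_ALL=C sort -k 1,1 colors.txt > colors.sorted

# Join the sorted copies on field 1 (the default join field).
joined=$(LC_ALL=C join fruits.sorted colors.sorted)
printf '%s\n' "$joined"
```

Each output line holds the join field followed by the remaining fields of the first file and then those of the second.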

Key Features

  • Merges lines based on common fields from two files
  • Requires input files to be sorted
  • Controls output format with various options
  • Useful for data integration and report generation

Key Options

The join command allows fine-grained control over the merge criteria, output format, and handling of non-matching lines through various options.

Field Specification and Delimiters
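Options commonly used for this purpose are `-1 FIELD` and `-2 FIELD` (join field in the first and second file), `-j FIELD` (the same field number in both files), and `-t CHAR` (field delimiter for input and output). The sketch below assumes hypothetical colon-delimited files in which the shared key sits in a different column of each file.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: the shared key (a user ID) is field 2 of users.txt
# and field 1 of cities.txt. Both files are already sorted on that key.
printf 'alice:101\nbob:102\ncarol:103\n' > users.txt
printf '101:Seoul\n103:Paris\n' > cities.txt

# -t ':'  use a colon as the field delimiter
# -1 2    join on field 2 of the first file
# -2 1    join on field 1 of the second file
result=$(LC_ALL=C join -t ':' -1 2 -2 1 users.txt cities.txt)
printf '%s\n' "$result"
```

The join field is printed first, so the ID moves to the front of each output line; bob (102) has no match in cities.txt and is omitted.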

Output Control
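Options that typically fall under this heading include `-a FILENUM` (also print unpairable lines), `-v FILENUM` (print only unpairable lines), `-e STRING` (placeholder for missing fields), and `-o FORMAT` (choose output fields). As a rough sketch with hypothetical files, combining `-a`, `-e`, and `-o` gives something close to a full outer join:

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data, both files sorted on field 1.
printf '1 apple\n2 banana\n3 cherry\n' > left.txt
printf '1 red\n4 purple\n' > right.txt

# -a 1 -a 2      keep unpairable lines from both files
# -e NONE        print NONE where a requested field is missing
# -o 0,1.2,2.2   print the join field, then field 2 of each file
report=$(LC_ALL=C join -a 1 -a 2 -e NONE -o 0,1.2,2.2 left.txt right.txt)
printf '%s\n' "$report"
```

In the `-o` list, `0` stands for the join field itself, which keeps the key visible even on lines that come from only one file.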

Usage Examples

Learn how to effectively merge data from two files through various usage examples of the join command.

Basic Join

printf '1 apple\n2 banana\n' > file1.txt && printf '1 red\n2 yellow\n' > file2.txt && join file1.txt file2.txt

Creates the two files, then joins them on the first field of each file, which is the default join field.

Join on Specific Field

printf 'apple 1\nbanana 2\n' > file3.txt && printf 'red 1\nyellow 2\n' > file4.txt && join -j 2 file3.txt file4.txt

Creates the two files, then joins them on the second field of each file. The join field is printed first in each output line.

Using Tab Delimiter

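printf 'id\tname\n1\tAlice\n2\tBob\n' > users.tsv && printf 'id\tcity\n1\tSeoul\n3\tParis\n' > cities.tsv && join --header -t $'\t' users.tsv cities.tsv

Creates two tab-delimited files, then joins them on the first field. Each file starts with a header row that is not in sorted order relative to the data, so the GNU-specific `--header` option is used to print the header row without trying to pair it. Only ID 1 appears in both files, so it is the only data row in the output.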

Including Unmatched Lines

printf '1 apple\n2 banana\n3 orange\n' > file5.txt && printf '1 red\n2 yellow\n' > file6.txt && join -a 1 file5.txt file6.txt

Creates the two files, then joins them and also prints lines that exist only in the first file (file5.txt). Here, `3 orange` has no match and is printed as is.

Outputting Specific Fields Only

printf '1 apple\n2 banana\n' > file7.txt && printf '1 red\n2 yellow\n' > file8.txt && join -o 1.1,1.2,2.2 file7.txt file8.txt

Creates the two files, then outputs the first and second fields from the first file and the second field from the second file.

Tips & Precautions

Useful tips and points to be aware of when using the join command.

Important Tips

  • **Input File Sorting**: The join command works correctly only if both input files are sorted on the join field, in the same collation order that join itself uses. Sort them beforehand with the `sort` command. Example: `sort -k 1,1 file1.txt > sorted_file1.txt`
  • **Field Delimiter**: By default, fields are separated by runs of blanks (spaces or tabs), and leading blanks are ignored. To use a different delimiter, use the `-t` option. For example, for simple comma-separated files, use `-t ','`. join does not interpret CSV quoting, so quoted fields that contain commas are split.
  • **Output Format Control**: The `-o` option allows precise control over the order and inclusion of fields in the output. Specify each field in the format `FILENUM.FIELDNUM` (e.g., `1.2` refers to the second field of the first file).
  • **Handling Unmatched Lines**: Use `-a FILENUM` to print unmatched lines from that file in addition to the joined lines, or `-v FILENUM` to print only the unmatched lines from that file and suppress the joined lines.
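
The difference between `-a` and `-v` from the last tip can be sketched with two small comma-separated files. The file names and contents are hypothetical, and the data is assumed to contain no quoted fields.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical comma-separated data, both files sorted on field 1.
printf '1,apple\n2,banana\n3,cherry\n' > stock.csv
printf '1,red\n2,yellow\n' > colors.csv

# -a 1: joined lines plus lines of stock.csv that have no match.
with_unmatched=$(LC_ALL=C join -t ',' -a 1 stock.csv colors.csv)

# -v 1: only the lines of stock.csv that have no match.
only_unmatched=$(LC_ALL=C join -t ',' -v 1 stock.csv colors.csv)

printf '%s\n' "$with_unmatched"
printf '%s\n' "$only_unmatched"
```

With `-v 1`, the result works as a quick check for keys that are missing from the second file.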