LinuxCommandLibrary

join

Join lines from two files by common field

TLDR

Join two files on the first (default) field

$ join [path/to/file1] [path/to/file2]

Join two files using a comma (instead of a space) as the field separator
$ join -t ',' [path/to/file1] [path/to/file2]

Join on field 3 of file1 and field 1 of file2
$ join -1 [3] -2 [1] [path/to/file1] [path/to/file2]

Also print each unpairable line from file1, in addition to the joined lines
$ join -a [1] [path/to/file1] [path/to/file2]

Join a file from stdin
$ cat [path/to/file1] | join - [path/to/file2]

SYNOPSIS

join [OPTION]... FILE1 FILE2

PARAMETERS

-1 FIELD
    Join on this field (1-based) of FILE1

-2 FIELD
    Join on this field (1-based) of FILE2

-j FIELD
    Equivalent to -1 FIELD -2 FIELD

-a FILENUM
    Also print unpairable lines from file FILENUM (1 or 2), in addition to the joined lines

-e STRING
    Replace missing input fields with STRING

-o FORMAT
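    Build each output line from FORMAT, a comma- or blank-separated list of FILENUM.FIELD entries (e.g., 1.2 2.1); 0 stands for the join field, and the keyword auto takes the field count from the first line of each file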

-t CHAR
    Use CHAR as the input and output field separator (default: runs of blanks on input, a single space on output)

-v FILENUM
    Like -a FILENUM, but suppress the joined lines, so only unpairable lines are printed

--check-order
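    Fail with an error if the input is not sorted on the join fields, even when every line is pairable (not the default; without it, GNU join only warns when unsorted input leaves lines unpaired)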

--nocheck-order
    Skip sorted input check

--header
    Treat the first line of each file as a header and print the combined header once, without trying to pair it

--help
    Display usage summary

--version
    Output version info

DESCRIPTION

The join command merges lines from two sorted text files based on matching keys in specified fields, performing an inner join by default.

It requires both input files to be sorted on their join fields, in the collation order of the current locale. Sort them first, for example with sort -k 1b,1 when joining on the first field.
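The sort-then-join requirement can be sketched as follows. The file names and sample rows are made up for illustration, and LC_ALL=C is an assumption used here so that sort and join agree on one collation order.

```shell
# Work in a scratch directory so nothing outside it is touched.
tmp=$(mktemp -d)

# Two small, unsorted sample files: "id name" and "id city" (made-up data).
printf '%s\n' '3 carol' '1 alice' '2 bob' > "$tmp/names"
printf '%s\n' '2 paris' '4 oslo' '1 rome' > "$tmp/cities"

# Sort each file on field 1 only; the 'b' modifier ignores leading blanks.
LC_ALL=C sort -k 1b,1 "$tmp/names"  > "$tmp/names.sorted"
LC_ALL=C sort -k 1b,1 "$tmp/cities" > "$tmp/cities.sorted"

# Inner join: only ids present in both files (1 and 2) appear.
inner=$(LC_ALL=C join "$tmp/names.sorted" "$tmp/cities.sorted")
printf '%s\n' "$inner"
# 1 alice rome
# 2 bob paris

rm -rf "$tmp"
```

Ids 3 and 4 are dropped because each occurs in only one of the two files.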

Fields are numbered from 1, separated by whitespace (blanks/tabs) by default; use -t for custom separators like commas.

By default, it outputs the join field once, followed by remaining fields from both files.

Three options control unpairable lines: -a FILENUM prints them in addition to the joined lines, -v FILENUM prints only them, and -e STRING fills in missing fields (most useful together with -o).
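A small sketch of how -a and -v differ, using made-up, already sorted sample files (the names and rows are hypothetical):

```shell
tmp=$(mktemp -d)

# Already sorted on field 1 (made-up data). Id 3 exists only in names,
# id 4 only in cities.
printf '%s\n' '1 alice' '2 bob' '3 carol' > "$tmp/names"
printf '%s\n' '1 rome' '2 paris' '4 oslo' > "$tmp/cities"

# -a 1: joined lines plus unpairable lines from file 1 (a left outer join).
left=$(LC_ALL=C join -a 1 "$tmp/names" "$tmp/cities")

# -v 2: only the lines of file 2 that found no partner.
only2=$(LC_ALL=C join -v 2 "$tmp/names" "$tmp/cities")

printf '%s\n' "$left" '--' "$only2"
# 1 alice rome
# 2 bob paris
# 3 carol
# --
# 4 oslo

rm -rf "$tmp"
```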

The -o option selects the output fields explicitly: each entry has the form FILENUM.FIELD (e.g., 1.2 2.1), and 0 stands for the join field.
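Combining -o with -a and -e gives outer joins with a fixed column layout. The sketch below uses hypothetical sample files, and NA is an arbitrary placeholder chosen for illustration.

```shell
tmp=$(mktemp -d)

# Already sorted on field 1 (made-up data).
printf '%s\n' '1 alice' '2 bob' '3 carol' > "$tmp/names"
printf '%s\n' '1 rome' '2 paris' '4 oslo' > "$tmp/cities"

# Left outer join: join field, name, city; a missing city becomes NA.
left_filled=$(LC_ALL=C join -a 1 -e NA -o 0,1.2,2.2 "$tmp/names" "$tmp/cities")

# Full outer join: keep unpairable lines from both files.
full=$(LC_ALL=C join -a 1 -a 2 -e NA -o 0,1.2,2.2 "$tmp/names" "$tmp/cities")

printf '%s\n' "$full"
# 1 alice rome
# 2 bob paris
# 3 carol NA
# 4 NA oslo

rm -rf "$tmp"
```

Using 0 rather than 1.1 for the key column matters in the full outer join: 1.1 would be missing for lines that come only from file 2.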

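join suits tabular data such as delimiter-separated files and logs, because it streams through both inputs instead of loading them into memory. It does not understand CSV quoting, so a quoted field that contains the separator is split. Either FILE1 or FILE2 (but not both) may be - to read standard input.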

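Common workflow: sort -k 1b,1 file1 > s1; sort -k 1b,1 file2 > s2; join s1 s2. (A bare -k1 makes the sort key run from field 1 to the end of the line, which can order lines differently from what join expects.)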

Limited to line-based, single-key joins; for complex logic, use awk.
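As one possible awk fallback, the sketch below joins on a two-field key without sorting. It assumes the first file fits in memory and has at most one line per key; the file names, the array name seen, and the sample rows are hypothetical.

```shell
tmp=$(mktemp -d)

# Made-up data keyed on two fields: "region id value". Neither file is sorted.
printf '%s\n' 'eu 2 bob' 'us 1 alice' 'eu 1 carol' > "$tmp/left"
printf '%s\n' 'us 1 rome' 'eu 1 oslo' 'us 2 paris' > "$tmp/right"

# First pass (NR == FNR) stores file 1 by composite key (fields 1 and 2).
# Second pass prints matches in the order they occur in file 2.
multi=$(awk 'NR == FNR { seen[$1, $2] = $3; next }
             ($1, $2) in seen { print $1, $2, seen[$1, $2], $3 }' \
        "$tmp/left" "$tmp/right")

printf '%s\n' "$multi"
# us 1 alice rome
# eu 1 carol oslo

rm -rf "$tmp"
```

Unlike join, this holds all of the first file in memory, so it trades the streaming behaviour for freedom from the sort requirement.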

CAVEATS

Both files must be sorted on their join fields, and sort and join must run under the same locale; with unsorted input, join can miss matches. Fields are numbered from 1. With the default separator, leading blanks on a line are ignored. There is no native support for multi-key joins.

EXAMPLES

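join -t$'\t' -1 2 -2 3 file1 file2
Joins on field 2 of file1 and field 3 of file2, using a tab as the field separator ($'\t' is bash/ksh/zsh quoting). Each file must already be sorted on its join field.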

sort -k 1b,1 file1 | join - file2
Sorts file1 on its first field and pipes it to join as FILE1 via stdin; file2 must already be sorted.
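
A further sketch combining -t with the GNU-specific --header option on comma-separated files. The file names and rows are made up, and the data lines are assumed to be sorted already below the header.

```shell
tmp=$(mktemp -d)

# Made-up CSV files with one header line each; data lines are sorted on field 1.
printf '%s\n' 'id,name' '1,alice' '2,bob' > "$tmp/names.csv"
printf '%s\n' 'id,city' '1,rome' '2,paris' > "$tmp/cities.csv"

# --header (a GNU extension) pairs the two header lines without comparing
# their keys, then joins the remaining lines as usual.
with_header=$(LC_ALL=C join --header -t ',' "$tmp/names.csv" "$tmp/cities.csv")

printf '%s\n' "$with_header"
# id,name,city
# 1,alice,rome
# 2,bob,paris

rm -rf "$tmp"
```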

EXIT STATUS

0: success
nonzero: an error occurred (GNU join exits with 1)

HISTORY

join first appeared in Version 7 Unix (1979) and is specified by POSIX. GNU coreutils adds extensions such as --header, --check-order and --nocheck-order.

SEE ALSO

sort(1), comm(1), cut(1), paste(1), uniq(1)
