comm
Compare two sorted files line by line
TLDR
Produce three tab-separated columns: lines only in first file, lines only in second file, and common lines
Print only lines common to both files
Print only lines common to both files, reading one file from stdin
Get lines only found in first file, saving the result to a third file
Print lines only found in second file, when the files aren't sorted
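The tasks listed above appear here without their commands, so the following is a sketch of how each one could be written. The file names and sample contents are hypothetical, and LC_ALL=C is an assumption made only so that sort and comm agree on ordering.

```shell
# Hypothetical sample data; both files are already sorted in the C locale.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'apple\nbanana\ncherry\n' > a.txt
printf 'banana\ncherry\ndate\n' > b.txt

# Three tab-separated columns: only in a.txt, only in b.txt, in both
comm a.txt b.txt

# Only lines common to both files
comm -12 a.txt b.txt

# Common lines, reading the first file from stdin
cat a.txt | comm -12 - b.txt

# Lines found only in the first file, saved to a third file
comm -23 a.txt b.txt > only_a.txt

# Lines found only in the second file when the inputs are not sorted:
# sort each input into a temporary file first.
# (In bash this can be written: comm -13 <(sort unsorted1.txt) <(sort unsorted2.txt))
printf 'cherry\napple\nbanana\n' > unsorted1.txt
printf 'date\ncherry\nbanana\n' > unsorted2.txt
sort unsorted1.txt > sorted1.txt
sort unsorted2.txt > sorted2.txt
comm -13 sorted1.txt sorted2.txt
```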
SYNOPSIS
comm [OPTION]... FILE1 FILE2
PARAMETERS
-1
suppress lines unique to FILE1 (column 1)
-2
suppress lines unique to FILE2 (column 2)
-3
suppress lines common to both files (column 3)
-z, --zero-terminated
use NUL instead of newline as the line delimiter; the column separator is unchanged (GNU extension)
--help
display usage information and exit
--version
output version information and exit
DESCRIPTION
comm compares two sorted text files line by line, outputting three tab-separated columns to stdout:
- Column 1: lines unique to FILE1
- Column 2: lines unique to FILE2
- Column 3: lines common to both
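Both files must already be sorted in the collating sequence of the current locale, for example with sort(1) run under the same locale settings; otherwise the output is unreliable. Because comm reads each input once, in order, it is an efficient way to compute differences, intersections, and symmetric differences of large line sets.
Each of -1, -2, and -3 suppresses one column: -1 hides lines unique to FILE1, -2 hides lines unique to FILE2, and -3 hides common lines. The options combine, so comm -12 prints only the common lines (the intersection), comm -23 prints only the lines unique to FILE1, and comm -3 prints the lines that appear in exactly one file (the symmetric difference).
If FILE1 or FILE2 is -, that input is read from stdin; only one of the two can be -. With -z (a GNU extension), input and output lines are terminated by NUL instead of newline.
Exit status: 0 on success, non-zero on error (GNU comm exits with 1, e.g., when a file cannot be read). comm does not fold case or normalize whitespace, so lines that differ only in case or in trailing spaces are treated as different.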
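The mapping from column suppression to set operations can be sketched as follows. The file names and contents are hypothetical, and LC_ALL=C is assumed so the sample data counts as sorted.

```shell
export LC_ALL=C
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/left"
printf 'b\nc\nd\n' > "$dir/right"

# Intersection: suppress columns 1 and 2
intersection=$(comm -12 "$dir/left" "$dir/right")

# Difference (left minus right): suppress columns 2 and 3
left_only=$(comm -23 "$dir/left" "$dir/right")

# Difference (right minus left): suppress columns 1 and 3
right_only=$(comm -13 "$dir/left" "$dir/right")

# Symmetric difference: -3 keeps columns 1 and 2. Column 2 lines still
# carry a leading tab, removed here with tr (this would also remove tabs
# inside the lines themselves, which the sample data does not contain).
sym_diff=$(comm -3 "$dir/left" "$dir/right" | tr -d '\t')

printf 'intersection: %s\n' "$(echo $intersection)"   # intersection: b c
printf 'left only:    %s\n' "$left_only"              # left only:    a
printf 'right only:   %s\n' "$right_only"             # right only:   d
printf 'symmetric:    %s\n' "$(echo $sym_diff)"       # symmetric:    a d
```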
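A minimal sketch of both input modes, using hypothetical files. The -z part assumes GNU comm and is skipped when the installed comm rejects the option; the output is converted back to newlines with tr only so it can be displayed.

```shell
export LC_ALL=C
dir=$(mktemp -d)
printf 'beta\nalpha\n' > "$dir/f1"      # deliberately unsorted
printf 'beta\ngamma\n' > "$dir/f2"      # already sorted

# FILE1 from stdin: sort upstream and pass - as the first operand
from_stdin=$(sort "$dir/f1" | comm -12 - "$dir/f2")
echo "$from_stdin"                      # beta

# NUL-terminated lines (GNU extension), guarded by a feature probe
if comm -z /dev/null /dev/null 2>/dev/null; then
    printf 'alpha\0beta\0' > "$dir/z1"
    printf 'beta\0gamma\0' > "$dir/z2"
    nul_common=$(comm -z -12 "$dir/z1" "$dir/z2" | tr '\0' '\n')
    echo "$nul_common"                  # beta
fi
```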
CAVEATS
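Both inputs must be pre-sorted with sort(1) under the same locale that comm runs in; unsorted input yields incorrect results. FILE1 and FILE2 cannot both be -, so stdin can supply only one of the inputs. Lines are matched by exact comparison only; there is no pattern or regex matching.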
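One way to guard against the sorting pitfall is to test each input with sort -c before calling comm. The sketch below uses a hypothetical helper named is_sorted and hypothetical file names; LC_ALL=C is assumed so that the check and the comparison use the same ordering.

```shell
export LC_ALL=C
dir=$(mktemp -d)
printf 'b\na\nc\n' > "$dir/unsorted"
printf 'a\nb\nc\n' > "$dir/sorted"

# Hypothetical helper: sort -c exits non-zero when the file is out of order
is_sorted() {
    sort -c "$1" 2>/dev/null
}

if is_sorted "$dir/unsorted"; then unsorted_ok=yes; else unsorted_ok=no; fi
if is_sorted "$dir/sorted"; then sorted_ok=yes; else sorted_ok=no; fi
echo "unsorted file passes check: $unsorted_ok"   # no
echo "sorted file passes check:   $sorted_ok"     # yes

# Sort the offending input into a new file, then compare
sort "$dir/unsorted" > "$dir/resorted"
common=$(comm -12 "$dir/resorted" "$dir/sorted")
```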
COLUMN DELIMITERS
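Default: tab. Column 1 lines have no prefix, column 2 lines are preceded by one tab, and column 3 lines by two tabs. When a column is suppressed, the columns to its right lose the corresponding tab. -z changes only the line terminator to NUL, not the column separator; GNU comm can replace the tab with --output-delimiter=STR.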
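The tab prefixes are easier to see when made visible. The sketch below uses hypothetical files and a hypothetical helper, show_tabs, that replaces each tab with the literal text <TAB>; it also shows how suppressing column 1 shifts the remaining columns left.

```shell
export LC_ALL=C
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/f1"
printf 'b\nc\n' > "$dir/f2"

# Hypothetical helper: render every tab as <TAB>
show_tabs() {
    awk '{ gsub(/\t/, "<TAB>"); print }'
}

full=$(comm "$dir/f1" "$dir/f2" | show_tabs)
echo "$full"
# a
# <TAB><TAB>b
# <TAB>c

no_col1=$(comm -1 "$dir/f1" "$dir/f2" | show_tabs)
echo "$no_col1"
# <TAB>b
# c
```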
EXAMPLES
sort file1 > f1; sort file2 > f2; comm f1 f2 # full comparison
comm -12 f1 f2 # common lines only
comm -23 f1 f2 # lines only in f1
HISTORY
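comm dates back to early Unix; it first appeared in Version 4 Unix (1973). It is specified by POSIX, including POSIX.1-2001 and later editions. The -z option is a GNU extension, added in coreutils 8.25 (2016). comm remains common in shell data-processing pipelines.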


