samtools
Manipulate and analyze sequence alignment data
TLDR
Convert a SAM input file to a BAM stream and save it to a file
Take input from stdin (-) and print the SAM header and any reads overlapping a specific region to stdout
Sort file and save to BAM (the output format is automatically determined from the output file's extension)
Index a sorted BAM file (creates sorted_input.bam.bai)
Print alignment statistics about a file
Count alignments to each index (chromosome/contig)
Merge multiple files
Split input file according to read groups
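The first few recipes above can be chained into one small helper. This is a minimal sketch, not an official samtools script: the function name sam_to_indexed_bam and the output file names are hypothetical, and it assumes a samtools 1.x installation on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: sam_to_indexed_bam INPUT.sam PREFIX
# Writes PREFIX.bam, PREFIX.sorted.bam and PREFIX.sorted.bam.bai,
# then prints summary statistics. All file names are placeholders.
sam_to_indexed_bam() {
    input_sam=$1
    prefix=$2

    # Convert SAM to BAM (-b selects BAM output, -o names the output file).
    samtools view -b -o "$prefix.bam" "$input_sam" || return

    # Sort by coordinate; the output format follows the .bam extension.
    samtools sort -o "$prefix.sorted.bam" "$prefix.bam" || return

    # Index the sorted file; this creates PREFIX.sorted.bam.bai.
    samtools index "$prefix.sorted.bam" || return

    # Alignment statistics, then alignment counts per chromosome/contig.
    samtools flagstat "$prefix.sorted.bam" || return
    samtools idxstats "$prefix.sorted.bam"
}

# Example call (assumes reads.sam exists in the current directory):
#   sam_to_indexed_bam reads.sam reads
```

Each step stops the helper on failure and passes the failing command's exit status back to the caller, so a broken conversion is not followed by a misleading sort or index step.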
SYNOPSIS
samtools command [options] [arguments]
PARAMETERS
view
Converts between SAM and BAM, extracts alignments in a specific region, and filters or prints alignments.
sort
Sorts a SAM or BAM file.
index
Indexes a coordinate-sorted BAM file for fast random access.
merge
Merges multiple SAM or BAM files into one.
mpileup
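Produces a text pileup of the alignments at each position. In recent releases, generating genotype likelihoods for variant calling is handled by bcftools mpileup.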
faidx
Indexes a FASTA file, or extracts subsequences from an indexed FASTA file.
tview
Text alignment viewer.
depth
Calculates read depth at each position.
flagstat
Prints simple statistics for an alignment file, based on the FLAG field of each record.
DESCRIPTION
samtools is a suite of programs for interacting with and manipulating sequence alignment data in the SAM (Sequence Alignment/Map), BAM (Binary Alignment/Map), and CRAM (Compressed Alignment/Map) formats. These formats are commonly used for storing and analyzing next-generation sequencing data. samtools provides a wide range of functionalities, including indexing alignment files for efficient random access, sorting and merging alignment data, filtering reads based on various criteria, generating summary statistics of alignment data, and converting between different alignment formats.
The toolset is a standard part of bioinformatics workflows involving read mapping, variant calling, and other downstream analyses of sequencing data. It is widely used and actively maintained.
CAVEATS
Commands that need random access, such as retrieving the alignments in a specific region, require an indexed BAM file, and a BAM file must be coordinate-sorted before it can be indexed. Sort and index BAM files before running such commands.
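As an illustration of the indexing requirement, the sketch below builds an index only when none is found next to the BAM file and then runs a region query. The function name view_region is hypothetical, the region string is a placeholder, and the check assumes the index sits beside the BAM file under its default .bai or .csi name.

```shell
#!/bin/sh
# Hypothetical helper: view_region FILE.bam REGION
# Assumes FILE.bam is already coordinate-sorted.
view_region() {
    bam=$1
    region=$2

    # Create the index if neither default index name exists yet.
    if [ ! -e "$bam.bai" ] && [ ! -e "$bam.csi" ]; then
        samtools index "$bam" || return
    fi

    # Print the header (-h) and the alignments overlapping REGION.
    samtools view -h "$bam" "$region"
}

# Example call (placeholder file and region):
#   view_region sorted_input.bam chr1:1000-2000
```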
EXIT STATUS
samtools returns 0 on successful completion, and non-zero on failure.
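In a script, the exit status can be tested directly. The following is a minimal sketch; the wrapper name checked_flagstat is hypothetical, and the only behavior it relies on is the zero/non-zero convention described above.

```shell
#!/bin/sh
# Hypothetical wrapper: checked_flagstat FILE
# Runs samtools flagstat and reports a failure on stderr.
checked_flagstat() {
    if samtools flagstat "$1"; then
        return 0
    else
        # In the else branch, $? still holds the status of samtools.
        rc=$?
        echo "samtools flagstat failed on $1 (exit status $rc)" >&2
        return "$rc"
    fi
}

# Example call (placeholder file name):
#   checked_flagstat sorted_input.bam || exit 1
```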
FORMAT SPECIFICATIONS
samtools primarily works with the SAM, BAM, and CRAM file formats. The specifications for each format are maintained in the hts-specs repository and are linked from the samtools documentation.
HISTORY
samtools was initially developed by Heng Li at the Sanger Institute. It has evolved significantly over time with contributions from many developers. It is written in C and designed for performance and efficiency. samtools is an essential component of many bioinformatics pipelines, used for processing and analyzing sequencing data from a variety of platforms.
SEE ALSO
bcftools(1), tabix(1)