LinuxCommandLibrary

bcftools

SYNOPSIS

bcftools [global-options] <command> [<command-args/options>]

PARAMETERS

-h, --help
    print help and exit

-v, --verbose [0-4]
    verbosity level, 0=quiet, 4=debug (default: 3)

--log-level [+|-]STRING
    control logging: error|warn|info|debug|genotype|all|off

--threads INT
    number of extra threads used to compress output; only relevant with -O b or -O z (default: 0)

-r, --regions STR
    comma-separated regions (chr:pos or chr:from-to)

-R, --regions-file FILE
    regions listed in FILE (BED or list)

-s, --samples [^]LIST
    comma-separated samples, ^ to exclude

-S, --samples-file [^]FILE
    samples listed in FILE, ^ to exclude

-t, --targets [^]STR
    like --regions, but streams through the whole file instead of using the index; ^ to exclude

-T, --targets-file [^]FILE
    sites listed in FILE (like --regions-file)

-o, --output FILE
    output file [stdout]

-O, --output-type b|u|z|v
    b: compressed BCF, u: uncompressed BCF, z: compressed VCF (vcf.gz), v: uncompressed VCF

-f, --force
    overwrite an existing index (bcftools index); in other subcommands -f has a different meaning

--no-version
    suppress version header

-h, --header-only
    print only the header (bcftools view)

-H, --no-header
    suppress the header in the output (bcftools view)

DESCRIPTION

BCFtools is a toolkit for processing genomic variant data in VCF (Variant Call Format) and its binary counterpart, BCF. Developed alongside SAMtools on top of HTSlib, it offers subcommands for viewing, filtering, annotating, merging, splitting, sorting, indexing, and querying variant files. Bgzip-compressed VCF and BCF files can be indexed (CSI, or TBI for compressed VCF) for random access by region, which matters for large next-generation sequencing datasets. The input format (VCF, bgzip-compressed VCF, or BCF, from a file or a stream) is detected automatically, and the output format is selected with -O. The --threads option adds worker threads for compressing output, and a plugin mechanism allows custom extensions. Common tasks include subsetting by region or sample, normalizing indels, generating consensus sequences, and computing summary statistics. For variant calling, bcftools mpileup reads alignments such as the BAM files handled by SAMtools and pipes its output into bcftools call.

CAVEATS

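Some subcommands are memory-intensive on whole-genome data; use --regions to restrict the work, which requires an indexed, bgzip-compressed VCF or a BCF file. BCF is generally preferable to VCF for large files, and uncompressed BCF (-Ou) is the usual choice when piping one bcftools command into another. Plugins require compilation. Not all options apply to every subcommand, and short option letters are reused: -f means --apply-filters in view, --fasta-ref in norm, and --format in query.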
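
The indexing requirement behind --regions can be sketched as follows. This is a minimal illustration, assuming bcftools is on PATH; the bcftools_demo directory, the sample names S1 and S2, and the two records are invented for the example.

```shell
#!/bin/sh
# Sketch: write a tiny VCF, convert it to BCF, index it, then subset by region.
# All data below is made up for illustration.
set -eu

mkdir -p bcftools_demo
vcf=bcftools_demo/demo.vcf

# VCF columns must be TAB-separated, hence printf with \t.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr1\t500\t.\tC\tT\t60\tPASS\t.\tGT\t0/0\t0/1\n'
} > "$vcf"

if command -v bcftools >/dev/null 2>&1; then
  # Compressed BCF output, then an index (CSI by default; -f overwrites).
  bcftools view -Ob -o bcftools_demo/demo.bcf "$vcf"
  bcftools index -f bcftools_demo/demo.bcf
  # With the index in place, -r jumps straight to the requested region.
  bcftools view -r chr1:1-200 -Ov -o bcftools_demo/region.vcf bcftools_demo/demo.bcf
else
  echo "bcftools not found; only the demo VCF was written" >&2
fi
```

Running -r against the plain-text demo.vcf instead would fail, because an uncompressed VCF cannot be indexed.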

KEY SUBCOMMANDS

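view: subset, filter, and convert VCF/BCF
index: create a CSI (default) or TBI index
sort: sort by chromosomal position
merge: merge multiple files with non-overlapping sample sets
call: SNP/indel calling from mpileup output
query: extract fields in a user-defined text format
norm: left-align and normalize indels; split or join multiallelic sites
annotate: add, remove, or edit annotations (INFO/FORMAT fields and other columns)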
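
Several of these subcommands are commonly chained in a pipeline. The sketch below is one possible arrangement, assuming bcftools is on PATH; the file names and records are invented. Left-aligning indels with norm needs a reference FASTA (-f), which is omitted here, so norm is only used to split a multiallelic site with -m.

```shell
#!/bin/sh
# Sketch: sort -> norm -> view -> index on invented, sites-only data.
set -eu

mkdir -p bcftools_demo
vcf=bcftools_demo/unsorted.vcf

# Records are deliberately out of order; the second one is multiallelic.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t500\t.\tC\tT\t60\tPASS\t.\n'
  printf 'chr1\t100\t.\tA\tG,T\t50\tPASS\t.\n'
} > "$vcf"

if command -v bcftools >/dev/null 2>&1; then
  # -Ou keeps the intermediate streams as uncompressed BCF, which avoids
  # repeated compression and text conversion between the steps.
  bcftools sort -Ou "$vcf" |
    bcftools norm -m-any -Ou - |
    bcftools view -Oz -o bcftools_demo/clean.vcf.gz -
  # -t requests a TBI index instead of the default CSI; -f overwrites.
  bcftools index -f -t bcftools_demo/clean.vcf.gz
else
  echo "bcftools not found; only the demo VCF was written" >&2
fi
```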

QUICK EXAMPLE

bcftools view -Oz -o output.vcf.gz input.bcf --threads 4
Convert a BCF file to bgzip-compressed VCF, using 4 threads for output compression.
bcftools query -f '%CHROM %POS %REF %ALT\n' input.vcf
Print chromosome, position, reference allele, and alternate alleles, one site per line.
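
The query format string can also emit tab-separated columns and loop over samples with square brackets. A minimal sketch, assuming bcftools is on PATH; the file names, sample names S1 and S2, and records are invented for illustration.

```shell
#!/bin/sh
# Sketch: per-sample genotypes as TSV via bcftools query. Data is made up.
set -eu

mkdir -p bcftools_demo
vcf=bcftools_demo/query_demo.vcf

{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr1\t500\t.\tC\tT\t60\tPASS\t.\tGT\t0/0\t0/1\n'
} > "$vcf"

# Site fields separated by tabs, then one GT column per sample: the part
# inside [ ] is repeated for every sample in the file.
fmt='%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n'

if command -v bcftools >/dev/null 2>&1; then
  bcftools query -f "$fmt" "$vcf" > bcftools_demo/genotypes.tsv
  # The same query restricted to sample S1, printed to stdout.
  bcftools query -s S1 -f "$fmt" "$vcf"
else
  echo "bcftools not found; only the demo VCF was written" >&2
fi
```

Quoting the format string in single quotes keeps the shell from interpreting the backslashes, so bcftools itself expands \t and \n.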

HISTORY

BCFtools originated within the SAMtools package, where Heng Li wrote the original implementation; it is now maintained chiefly by Petr Danecek and colleagues at the Wellcome Sanger Institute. With the 1.0 release (2014), SAMtools, BCFtools, and HTSlib became separately distributed packages, and BCFtools has since incorporated features from vcftools. Development remains active, with several releases a year; version 1.19 was released in 2023.

SEE ALSO

samtools(1), tabix(1), bgzip(1), vcf-validator(1)
