blastdbcmd
Retrieve sequences and metadata from a BLAST database
SYNOPSIS
blastdbcmd [options]
PARAMETERS
-db <String>
BLAST database name or path.
-dbtype {guess|nucl|prot}
Molecule type of the database: nucleotide (nucl), protein (prot), or auto-detected (guess, the default).
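-entry <String>
Comma-delimited sequence identifier(s), such as accessions or GIs; the keyword all selects every sequence in the database.
-entry_batch <File>
File with one entry per line: a sequence ID, optionally followed by space-delimited range, strand, or masking algorithm ID specifiers.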
-out <File>
Output file; defaults to stdout.
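-outfmt <String>
Output format string built from % specifiers (e.g., %f for FASTA, %s for sequence data, %a for accession, %t for title, %l for length, %T for TaxID); defaults to %f.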
-range <String>
Subsequence range in start-stop format with 1-based coordinates (e.g., 1-100).
-strand {plus|minus}
Strand of a nucleotide sequence to retrieve; defaults to plus.
-taxids <String>
Comma-separated TaxIDs; selects the sequences assigned to those taxa.
-taxidlist <File>
File with TaxIDs, one per line.
-info
Display database summary info.
-h / -help
-h prints a brief usage summary; -help also prints the description of each option.
-version
Print version information.
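Each line of an -entry_batch file holds a sequence ID, optionally followed by space-delimited range and strand specifiers. A minimal sketch of such a file follows; the second accession and the temporary file are illustrative assumptions, the nt database is assumed to be available locally, and the call is skipped when blastdbcmd is not on the PATH.

```shell
#!/bin/sh
# Hypothetical batch file: one entry per line, "ID [range] [strand]".
BATCH=$(mktemp)
cat > "$BATCH" <<'EOF'
NC_000001.11 1-100 plus
NC_000002.12 200-300 minus
EOF

# Sanity check: count the entries before querying.
N_ENTRIES=$(wc -l < "$BATCH" | tr -d ' ')
echo "entries: $N_ENTRIES"

# Run only if blastdbcmd is installed; "nt" is assumed to exist locally.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db nt -entry_batch "$BATCH" -outfmt "%f" \
        || echo "retrieval failed (database missing?)" >&2
fi
```

Keeping the coordinates in the batch file lets a single invocation extract a different region from each sequence.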
DESCRIPTION
blastdbcmd is a command-line utility from the NCBI BLAST+ toolkit for extracting sequence data and metadata from formatted BLAST databases. It retrieves sequences by identifier, such as an accession or GI number, either one at a time or in batches read from a file.
Common use cases include retrieving FASTA sequences for alignment preparation, inspecting database contents, generating custom sequence subsets, and extracting associated metadata such as titles and taxonomy IDs. Output is controlled by a format string (-outfmt), and retrieval can be restricted to a coordinate range, a strand, or a set of taxa.
blastdbcmd is useful in bioinformatics pipelines that need to query a database without running a full BLAST search. It works with both protein and nucleotide databases created with makeblastdb, as well as with the preformatted NCBI databases once they have been downloaded locally (for example with update_blastdb.pl). Typical settings include NGS analysis, phylogenetics, and functional genomics.
CAVEATS
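Requires a BLAST+ installation and a database formatted with makeblastdb. Retrieval by sequence ID generally requires that the database was built with the -parse_seqids option. Selecting every entry of a large database can produce very large output. Format specifiers are case-sensitive (e.g., %t is the title, %T the TaxID).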
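The dependency on makeblastdb can be illustrated with a throwaway database. This is a sketch under stated assumptions: the two sequences, the IDs seq1 and seq2, and the database name mydb are made up, and the BLAST+ commands run only if they are installed.

```shell
#!/bin/sh
# Build a tiny throwaway FASTA file (made-up sequences for illustration).
WORK=$(mktemp -d)
cat > "$WORK/seqs.fasta" <<'EOF'
>seq1 example one
ACGTACGTAC
>seq2 example two
GGGTTTAAAC
EOF
N_SEQS=$(grep -c '^>' "$WORK/seqs.fasta")
echo "sequences: $N_SEQS"

# Format and query the database only when BLAST+ is installed.
if command -v makeblastdb >/dev/null 2>&1 && command -v blastdbcmd >/dev/null 2>&1; then
    # -parse_seqids indexes the IDs so that -entry can look them up.
    makeblastdb -in "$WORK/seqs.fasta" -dbtype nucl -parse_seqids \
        -out "$WORK/mydb" >/dev/null
    blastdbcmd -db "$WORK/mydb" -entry seq1
    blastdbcmd -db "$WORK/mydb" -info
fi
```

Building with -parse_seqids is what usually makes lookups by the original identifiers possible.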
COMMON EXAMPLES
Single sequence:
blastdbcmd -db nt -entry NC_000001.11 -outfmt "%f"
Batch:
blastdbcmd -db nr -entry_batch ids.txt -outfmt "%f" > output.fasta
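A subsequence on a chosen strand can be requested by combining -range and -strand with -entry. The sketch below reuses the accession from the single-sequence example; the coordinates are arbitrary, the nt database is assumed to be available locally, and the command is only printed when blastdbcmd is not installed.

```shell
#!/bin/sh
# Database and accession taken from the example above; range is arbitrary.
DB="nt"
ACC="NC_000001.11"

# Build the argument list once so it can be inspected before running.
set -- -db "$DB" -entry "$ACC" -range 1-100 -strand minus -outfmt "%f"
CMD="blastdbcmd $*"
echo "$CMD"

# Execute only when blastdbcmd is available on the PATH.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd "$@" || echo "retrieval failed (database missing?)" >&2
fi
```

With -strand minus the reverse complement of the requested range is returned, which only applies to nucleotide databases.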
OUTPUT FORMATS
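Output is controlled by a format string passed to -outfmt. Common specifiers: %f = FASTA (default), %s = sequence data, %a = accession, %g = GI, %i = sequence ID, %t = title, %l = length, %T = TaxID, %S = scientific name, %d = ASN.1 defline. When %f is given, all other specifiers are ignored. See blastdbcmd -help for the full list.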
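The -outfmt argument is a format string of % specifiers, and for every format except %f each output line corresponds to one sequence, which makes the output easy to post-process. The sketch below assumes that literal commas in the format string are passed through unchanged; filter_by_length is a hypothetical helper and mydb a hypothetical local database name.

```shell
#!/bin/sh
# Keep accessions whose length (2nd comma-separated field) is >= $1.
filter_by_length() {
    awk -F',' -v min="$1" '$2 + 0 >= min + 0 { print $1 }'
}

# Offline demonstration on made-up "accession,length,taxid" lines.
SAMPLE=$(printf 'ACC_A,500,9606\nACC_B,1500,10090\n' | filter_by_length 1000)
echo "$SAMPLE"

# Against a real database ("mydb" is a hypothetical local database name).
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db mydb -entry all -outfmt "%a,%l,%T" | filter_by_length 1000
fi
```

The title specifier %t is left out on purpose, since titles may themselves contain the delimiter.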
HISTORY
Introduced in NCBI BLAST+ 2.2.22 (2010) as a replacement for fastacmd from the legacy BLAST toolkit. It has been updated with subsequent BLAST+ releases, which improved performance and added new output specifiers.
SEE ALSO
makeblastdb(1), blastn(1), blastp(1), blastx(1), update_blastdb.pl(1)


