nextalign
viral genome sequence alignment tool
SYNOPSIS
nextalign run [options]
DESCRIPTION
Nextalign is a viral genome sequence alignment tool. It performs pairwise alignment of viral sequences against a reference and identifies mutations, insertions, and deletions. Nextalign is part of the Nextclade suite, commonly used for SARS-CoV-2 analysis.
As of Nextclade v3, the standalone Nextalign CLI has been superseded by nextclade run, which provides the same alignment functionality plus additional analysis. Users are encouraged to migrate to nextclade.
PARAMETERS
--input-ref file
Reference sequence (FASTA). Required when not using --input-dataset.
-i, --input file
Input sequences (FASTA).
-o, --output-fasta file
Output aligned sequences.
--input-annotation file
Genome annotation (GFF3).
--input-dataset name
Use a Nextclade dataset (replaces individual --input-ref, --input-annotation).
--output-all dir
Write all output files to a directory.
--output-translations template
Output translated protein sequences.
-j, --jobs n
Number of threads.
--include-reference
Include reference sequence in output alignment.
--in-order
Output sequences in the same order as input.
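A typical invocation can be sketched as follows, using only the options listed above. The file names (reference.fasta, sequences.fasta, aligned.fasta) and the thread count are placeholders, and option names may differ between releases, so check nextalign run --help for the installed version. The script prints the command and runs it only when nextalign and both input files are present.

```shell
#!/bin/sh
# Sketch only: file names and thread count are hypothetical placeholders.
ref="reference.fasta"
seqs="sequences.fasta"
out="aligned.fasta"

# Assemble the command from the flags documented in PARAMETERS.
cmd="nextalign run --input-ref $ref --input $seqs --output-fasta $out --include-reference --in-order --jobs 4"

# Always show what would be executed.
echo "$cmd"

# Run only if the tool is installed and both inputs exist.
# $cmd is left unquoted on purpose so it splits into words;
# this assumes the file names contain no spaces.
if command -v nextalign >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$seqs" ]; then
    $cmd
fi
```

With --include-reference the reference appears in the output alignment, and --in-order keeps the aligned records in input order, which makes the result easier to compare against the original FASTA.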
OUTPUT FILES
insertions.csv - Insertion positions
translations/ - Translated proteins
CAVEATS
Nextalign is optimized for viral genomes with low divergence (less than 10% from the reference). For more divergent datasets, general-purpose aligners such as mafft or minimap2 are more robust. As of v3, the standalone nextalign CLI has been removed in favor of nextclade run.
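For installations that only ship Nextclade v3, an alignment-only call might look like the sketch below. It assumes that nextclade run takes the input FASTA as a positional argument and supports --input-ref, --output-fasta, and --jobs; the file names are placeholders. Confirm the options with nextclade run --help before relying on it.

```shell
#!/bin/sh
# Sketch only: file names and thread count are hypothetical placeholders.
ref="reference.fasta"
seqs="sequences.fasta"
out="aligned.fasta"

# Assumption: nextclade v3 accepts the input FASTA positionally.
cmd="nextclade run --input-ref $ref --output-fasta $out --jobs 4 $seqs"

# Always show what would be executed.
echo "$cmd"

# Run only if nextclade is installed and both inputs exist.
# $cmd is unquoted so it splits into words (no spaces in file names).
if command -v nextclade >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$seqs" ]; then
    $cmd
fi
```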
HISTORY
Nextalign was developed by the Nextstrain project, led by Trevor Bedford and Richard Neher, and gained prominence during the COVID-19 pandemic.
