
nextalign

viral genome sequence alignment tool

TLDR

Align sequences to reference
$ nextalign run --input-ref [reference.fasta] -i [sequences.fasta] -o [aligned.fasta]
Use genome annotation
$ nextalign run --input-ref [ref.fasta] --input-annotation [genemap.gff3] -i [seqs.fasta] -o [out.fasta]
Output all results to a directory
$ nextalign run --input-ref [ref.fasta] -i [seqs.fasta] --output-all [output_dir/]
Use a Nextclade dataset instead of individual files
$ nextalign run --input-dataset [nextstrain/sars-cov-2/wuhan-hu-1/orfs] -i [seqs.fasta] -o [out.fasta]
Set number of threads
$ nextalign run -j [8] --input-ref [ref.fasta] -i [seqs.fasta] -o [out.fasta]

SYNOPSIS

nextalign run [options]

DESCRIPTION

Nextalign is a viral genome sequence alignment tool. It performs pairwise alignment of viral sequences against a reference and identifies mutations, insertions, and deletions. Nextalign is part of the Nextclade suite, commonly used for SARS-CoV-2 analysis. As of Nextclade v3, the standalone Nextalign CLI has been superseded by nextclade run, which provides the same alignment functionality plus additional analysis. Users are encouraged to migrate to nextclade.
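
Because the standalone CLI is superseded in Nextclade v3, a script that has to run on machines with either version may want to check which binary is installed before doing anything else. The following is a minimal sketch of such a check; the function name pick_aligner is hypothetical and not part of either tool. It only reports which command is on PATH and does not translate options between the two, so consult nextclade run --help before reusing nextalign flags.

```shell
# pick_aligner: print the name of the alignment CLI found on PATH.
# Prefers nextclade (the successor) and falls back to nextalign.
# Hypothetical helper for illustration; not shipped with either tool.
pick_aligner() {
  if command -v nextclade >/dev/null 2>&1; then
    echo nextclade
  elif command -v nextalign >/dev/null 2>&1; then
    echo nextalign
  else
    echo "neither nextclade nor nextalign found on PATH" >&2
    return 1
  fi
}
```

A caller could use it as tool=$(pick_aligner) || exit 1 and then branch on the value of $tool.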

PARAMETERS

--input-ref file
Reference sequence (FASTA). Required when not using --input-dataset.
-i, --input file
Input sequences (FASTA).
-o, --output-fasta file
Output aligned sequences.
--input-annotation file
Genome annotation (GFF3).
--input-dataset name
Use a Nextclade dataset (replaces individual --input-ref, --input-annotation).
--output-all dir
Write all output files to a directory.
--output-translations template
Output translated protein sequences.
-j, --jobs n
Number of threads.
--include-reference
Include reference sequence in output alignment.
--in-order
Output sequences in the same order as input.
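
The options above can be combined in a single invocation. As a rough sketch, the helper below assembles a command line from the flags listed in this section and prints it instead of running it, so it can be reviewed first. The function name nextalign_cmd is hypothetical, and defaulting the thread count to the number of online CPUs (via getconf) is an assumption of this example, not documented nextalign behavior.

```shell
# nextalign_cmd REF SEQS OUT [JOBS]
# Print (do not run) a nextalign command built from the documented flags.
# If JOBS is omitted, use the number of online CPUs, or 1 if that is unknown.
nextalign_cmd() {
  ref=$1
  seqs=$2
  out=$3
  jobs=${4:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
  printf 'nextalign run -j %s --input-ref %s -i %s -o %s --include-reference --in-order\n' \
    "$jobs" "$ref" "$seqs" "$out"
}
```

For example, nextalign_cmd ref.fasta seqs.fasta out.fasta 4 prints a command using four threads. Paths containing spaces would need quoting before the printed line is executed.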

OUTPUT FILES

aligned.fasta        - Aligned sequences
insertions.csv       - Insertion positions
translations/        - Translated proteins
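
One simple sanity check on the output is to compare the number of FASTA records in the input with the number in aligned.fasta. The sketch below assumes, without confirmation from this page, that sequences which fail to align may be absent from the aligned output; the function names are hypothetical.

```shell
# count_records FILE: number of FASTA records (lines starting with ">").
count_records() {
  grep -c '^>' "$1"
}

# check_all_aligned INPUT ALIGNED
# Succeed only if both files contain the same number of records.
check_all_aligned() {
  in_n=$(count_records "$1")
  out_n=$(count_records "$2")
  if [ "$in_n" -ne "$out_n" ]; then
    echo "warning: $in_n input records but $out_n aligned records" >&2
    return 1
  fi
}
```

If --include-reference was used, the aligned file holds one extra record for the reference, so the comparison would need to allow for that.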

CAVEATS

Optimized for viral genomes with low divergence (less than 10% from reference). For more diverse datasets, tools like mafft or minimap2 are more robust. As of Nextclade v3, the standalone nextalign CLI has been removed in favor of nextclade run.
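
To get a rough idea of whether a dataset stays under the 10% figure, divergence can be estimated from an existing alignment. The sketch below is one possible definition, not the tool's own: it treats the first record of an aligned FASTA as the reference (for example, output written with --include-reference, assuming the reference comes first), assumes every record has the same length as the reference, skips positions where either sequence has a gap or an N, and reports the percentage of remaining positions that differ. The function name divergence is hypothetical.

```shell
# divergence ALIGNED_FASTA
# Print "name<TAB>percent" for every record after the first.
# Assumption: the first record is the reference and all records are aligned
# to the same length. Gaps ("-") and N are excluded from the comparison.
divergence() {
  awk '
    # Compare the finished record with the reference and print the result.
    function emit(   i, r, q, n, diff) {
      if (name == "") return
      if (!have_ref) { ref = seq; have_ref = 1; return }
      for (i = 1; i <= length(ref); i++) {
        r = toupper(substr(ref, i, 1))
        q = toupper(substr(seq, i, 1))
        if (r == "-" || q == "-" || r == "N" || q == "N") continue
        n++
        if (r != q) diff++
      }
      printf "%s\t%.2f\n", name, (n > 0 ? 100 * diff / n : 0)
    }
    /^>/ { emit(); name = substr($0, 2); seq = ""; next }
    { seq = seq $0 }
    END { emit() }
  ' "$1"
}
```

Records above the threshold can then be listed by filtering the second column, for example with awk -F'\t' '$2 > 10'.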

HISTORY

Nextalign was developed by the Nextstrain project, led by Trevor Bedford and Richard Neher, and gained prominence during the COVID-19 pandemic.

SEE ALSO

nextclade(1), mafft(1), minimap2(1)
