needle
Needleman-Wunsch global pairwise sequence alignment (EMBOSS)
SYNOPSIS
needle -asequence seqfile -bsequence seqfile -gapopen f -gapextend f -outfile file [options]
DESCRIPTION
needle computes the optimal global pairwise alignment of two sequences using the Needleman-Wunsch dynamic programming algorithm. It ships as part of EMBOSS (European Molecular Biology Open Software Suite) and is intended for nucleotide or protein sequences of comparable length where the entire sequences should be aligned end-to-end.
Gap-open and gap-extend penalties are mandatory parameters that shape the alignment, and a scoring matrix (BLOSUM, PAM, EDNAFULL, ...) determines how matches and mismatches are weighted. The output is a formatted alignment that reports score, length, percentage identity, similarity, and gap statistics; many alternative formats are available via -aformat3.
For local alignment of subsequences use water; for very long sequences where memory is a concern use stretcher, which implements a linear-space variant of the algorithm.
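The way gap-open and gap-extend penalties interact with match scores can be illustrated with a small score-only sketch of global alignment with affine gaps. This is not needle's implementation. It assumes flat match and mismatch scores (hypothetical stand-ins for a real scoring matrix), assumes a gap of length k costs gap_open + (k - 1) * gap_extend, and penalizes end gaps, whereas needle leaves them free by default. The function name is made up for illustration.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, gap_open=10.0, gap_extend=0.5,
                           match=5.0, mismatch=-4.0):
    """Score of the best global alignment of a and b with affine gaps.

    Illustrative only: flat match/mismatch scores replace a scoring
    matrix, and end gaps are penalized like internal gaps.
    """
    m, n = len(a), len(b)
    # M: last column aligns a[i-1] with b[j-1]
    # X: last column aligns a[i-1] with a gap
    # Y: last column aligns a gap with b[j-1]
    M = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    X = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    Y = [[NEG_INF] * (n + 1) for _ in range(m + 1)]

    M[0][0] = 0.0
    for i in range(1, m + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, n + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a gap costs gap_open; continuing one costs gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          Y[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          X[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend)

    return max(M[m][n], X[m][n], Y[m][n])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "ACT"))
```

Raising gap_open relative to the match score makes the optimal alignment prefer mismatches over gaps; lowering gap_extend makes long gaps cheap once they have been opened.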
PARAMETERS
-asequence file
First input sequence (single sequence, any EMBOSS-supported format).
-bsequence file
Second input sequence (one or many sequences to align against the first).
-gapopen float
Penalty for opening a gap (typical: 10.0 for both proteins and DNA).
-gapextend float
Penalty for extending an existing gap (typical: 0.5).
-datafile matrix
Scoring matrix name (e.g. EBLOSUM62, EDNAFULL).
-endweight
Apply end-gap penalties (default: false; end gaps are free).
-outfile file
Path to the alignment report.
-aformat3 format
Output alignment format (pair, markx0, markx1, markx2, markx3, markx10, msf, fasta, ...).
-brief
Print a brief alignment summary instead of the full pairwise view.
-auto
Skip all interactive prompts (suitable for scripts).
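For scripted use, the options above can be assembled into an argument list. The following is a minimal sketch: the helper name build_needle_command and the file names are hypothetical, only options listed in this section are used, and the function builds the command without running it.

```python
def build_needle_command(a_path, b_path, out_path,
                         gap_open=10.0, gap_extend=0.5,
                         aformat=None, end_weight=False):
    """Return an argv list for needle (hypothetical helper, not part of EMBOSS)."""
    cmd = [
        "needle",
        "-asequence", a_path,
        "-bsequence", b_path,
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
        "-outfile", out_path,
        "-auto",  # no interactive prompts, as suggested for scripts
    ]
    if aformat is not None:
        cmd += ["-aformat3", aformat]
    if end_weight:
        cmd.append("-endweight")
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with real sequence files.
    print(" ".join(build_needle_command("a.fasta", "b.fasta", "out.needle")))
```

Passing the resulting list to subprocess.run(cmd, check=True) would execute it, assuming EMBOSS is installed and needle is on PATH.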
CAVEATS
Time and memory complexity are O(m·n) in the lengths of the two sequences, so needle is not appropriate for very long sequences; use stretcher instead, which needs only linear memory. Option syntax is EMBOSS-specific (long names introduced by a single dash) and is not interchangeable with GNU-style flags. End gaps are free by default; enable -endweight if you want them penalized.
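To get a rough feel for the quadratic growth, the size of a full dynamic programming matrix can be estimated as below. This is back-of-the-envelope arithmetic only: the 4 bytes per cell is an assumed figure for illustration, not a statement about needle's actual memory layout, and both function names are hypothetical.

```python
def dp_cells(m, n):
    """Number of cells in a full (m + 1) x (n + 1) dynamic programming matrix."""
    return (m + 1) * (n + 1)


def estimated_bytes(m, n, bytes_per_cell=4):
    """Rough memory estimate; bytes_per_cell is an assumption, not measured."""
    return dp_cells(m, n) * bytes_per_cell


if __name__ == "__main__":
    for length in (1_000, 100_000):
        gib = estimated_bytes(length, length) / 2**30
        print(f"{length} x {length}: {dp_cells(length, length)} cells, "
              f"about {gib:.2f} GiB at 4 bytes per cell")
```

Multiplying both sequence lengths by 100 multiplies the cell count by roughly 10,000, which is why a linear-space tool is the better fit for very long inputs.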
HISTORY
needle was written by Alan Bleasby as part of EMBOSS, a project started in 1996 at the Sanger Centre / MRC to provide an open, integrated suite of bioinformatics tools. The Needleman-Wunsch algorithm itself was published in 1970 by Saul B. Needleman and Christian D. Wunsch.
