LinuxCommandLibrary

treetime

Infer time-resolved phylogenetic trees and reconstruct ancestral sequences by maximum likelihood

TLDR

Infer ancestral sequences maximizing the joint or marginal likelihood

$ treetime ancestral

Analyze patterns of recurrent mutations aka homoplasies
$ treetime homoplasy

Estimate molecular clock parameters and reroot the tree
$ treetime clock

Map a discrete character, such as host or country, onto the tree
$ treetime mugration

SYNOPSIS

treetime [--tree tree.nwk] [--aln alignment.fasta] [--dates dates.csv] [--outdir results/] [options]

treetime also provides subcommands for specific tasks:
treetime <command> [options] [arguments]

PARAMETERS

--tree <file>
    Path to the input phylogenetic tree file in Newick format.

--aln <file>
    Path to the input multiple sequence alignment file in FASTA format.

--dates <file>
    Path to a CSV or TSV file containing sequence names and their collection dates.

--outdir <directory>
    Specify the directory where all output files and plots will be stored.

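--reroot <method|tip_name ...>
    Reroot the tree using root-to-tip regression. Accepts a method (least-squares, min_dev, or oldest) or one or more tip names to use as the outgroup. If not specified, treetime picks the root that best fits the regression of root-to-tip distance against sampling date.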

--keep-root
    Prevent treetime from rerooting the input tree, keeping its original root.

--clock-rate <float>
    Set a fixed molecular clock rate instead of estimating it from the data.

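--coalescent <value>
    Incorporate a coalescent prior into the time tree inference, useful for population dynamics. The value is the coalescent time scale; the keywords opt and skyline ask treetime to estimate it instead.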

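--relax <slack> <coupling>
    Apply a relaxed (autocorrelated) molecular clock, allowing for rate variation across branches. The first value sets the strength of the prior on branch-specific rate deviations and the second the coupling between parent and child rates, e.g. --relax 1.0 0.5; a coupling of 0 gives an uncorrelated clock.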
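
As a rough illustration of how these options combine, the sketch below writes a small dates file and assembles a call with a fixed clock rate. The file names, tip names, dates, and rate value are all made up for the example, and the name,date header is an assumption based on the layout of treetime's example data. The command is only executed when treetime and the input files are actually present; otherwise it is printed.

```shell
#!/bin/sh
# Hypothetical tip names and collection dates; replace with your own metadata.
cat > dates.csv <<'EOF'
name,date
sample_A,2012-03-15
sample_B,2013-07-01
sample_C,2014-04-02
EOF

# Illustrative rate (substitutions per site per year), not a recommendation.
RATE=0.003
CMD="treetime --tree tree.nwk --aln alignment.fasta --dates dates.csv --clock-rate $RATE --outdir results"

if command -v treetime >/dev/null 2>&1 && [ -f tree.nwk ] && [ -f alignment.fasta ]; then
    $CMD
else
    # Dry run when treetime or the input files are missing.
    echo "would run: $CMD"
fi
```

Tip names in the dates file need to match the tip labels in the tree and the sequence names in the alignment.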

DESCRIPTION

treetime is a powerful Python-based bioinformatics tool designed for inferring time-stamped phylogenies from genetic sequence data. It takes a phylogenetic tree, a multiple sequence alignment, and collection dates for the sequences as input. The primary goal of treetime is to estimate the molecular clock, reconstruct ancestral sequences, and provide a time-resolved tree, allowing for insights into evolutionary rates and temporal relationships.

It employs a maximum-likelihood framework, efficiently accounting for varying substitution rates across the tree and along the genome. treetime is particularly useful in evolutionary biology and epidemiology for analyzing rapidly evolving pathogens, where understanding the timing of evolutionary events and transmission dynamics is crucial. It can handle noisy data, provide confidence intervals for age estimates, and supports various clock models, making it a flexible tool for robust phylogenetic dating.
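
The molecular clock estimate described above can be pictured as a regression of root-to-tip distance on sampling date. The sketch below is a deliberately simplified ordinary least-squares version written with awk, not treetime's actual implementation, and the data points are invented: the slope plays the role of the clock rate and the x-intercept the role of the root date.

```shell
#!/bin/sh
# Hypothetical root-to-tip data: sampling date (decimal year) and
# root-to-tip distance (substitutions per site) for each tip.
cat > root_to_tip.csv <<'EOF'
date,distance
2000,0.010
2005,0.015
2010,0.020
2015,0.025
EOF

# Ordinary least-squares fit: distance = rate * date + intercept.
# The slope is the clock rate; the x-intercept estimates the root date.
FIT=$(awk -F, 'NR > 1 {
        n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2
    }
    END {
        rate = (n * sxy - sx * sy) / (n * sxx - sx * sx)
        intercept = (sy - rate * sx) / n
        printf "%.4f %.1f", rate, -intercept / rate
    }' root_to_tip.csv)

echo "clock rate and root date: $FIT"
```

A plain regression like this treats tips as independent observations, which they are not on a real tree, so it is best read as intuition for what the clock estimate means rather than as a replacement for running treetime clock.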

CAVEATS

treetime relies on accurately dated tip sequences for reliable molecular clock estimation; inaccuracies in dates can lead to misleading results. Run time and memory grow with dataset size, so very large trees (many thousands of sequences) can require substantial resources. The quality and diversity of input data directly affect the accuracy of the temporal inferences, so it is crucial to ensure clean alignments and correct date formats. treetime is not a substitute for initial phylogenetic inference (e.g., building the input tree) or multiple sequence alignment; these steps must typically be performed beforehand.
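
Since malformed dates are an easy source of misleading results, a quick pre-flight check can help. The sketch below assumes a two-column name,date file whose dates are meant to be either YYYY-MM-DD or decimal years; the file name and entries are hypothetical, and one entry is deliberately malformed.

```shell
#!/bin/sh
# Hypothetical dates file containing one malformed entry (sample_C).
cat > dates_check.csv <<'EOF'
name,date
sample_A,2012-03-15
sample_B,2013.42
sample_C,15/03/2012
EOF

# Collect tips whose date is neither YYYY-MM-DD nor a decimal year.
BAD=$(awk -F, 'NR > 1 &&
    $2 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ &&
    $2 !~ /^[0-9][0-9][0-9][0-9](\.[0-9]+)?$/ { print $1 }' dates_check.csv)

if [ -n "$BAD" ]; then
    echo "check these dates before running treetime: $BAD"
fi
```

The check only looks at the shape of each date string; it does not catch plausible-looking but wrong dates, which still need to be verified against the sample metadata.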

SUBCOMMANDS AND WORKFLOWS

treetime can be run as a single comprehensive command that performs the full time-tree inference, or through individual subcommands for specific analytical tasks. For instance, treetime ancestral reconstructs ancestral sequences, treetime clock estimates molecular clock parameters, treetime homoplasy analyzes recurrent mutations, and treetime mugration maps discrete traits onto the tree. This modularity allows for fine-grained control over the analysis pipeline and makes it easier to troubleshoot specific steps.
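
One way to use this modularity is to check the clock signal first and reconstruct ancestral sequences as a separate step. The sketch below is only an outline under assumed file and directory names; the step helper is a hypothetical wrapper that prints each command instead of running it when treetime or the input tree is not available.

```shell
#!/bin/sh
# Hypothetical inputs shared by the steps below.
TREE=tree.nwk
ALN=alignment.fasta
DATES=dates.csv

# Run the command if treetime is installed and the tree exists, else print it.
step() {
    if command -v treetime >/dev/null 2>&1 && [ -f "$TREE" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# 1. Inspect the clock signal and rooting before a full analysis.
step treetime clock --tree "$TREE" --aln "$ALN" --dates "$DATES" --outdir clock_out
# 2. Reconstruct ancestral sequences on the same tree.
step treetime ancestral --tree "$TREE" --aln "$ALN" --outdir ancestral_out
```

Keeping each step in its own output directory makes it easier to rerun a single stage without overwriting earlier results.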

OUTPUT FILES

treetime writes its output files to the directory given by --outdir. Common outputs include a time-resolved tree file with dated nodes and branches (written in Nexus format in current releases), reconstructed ancestral sequences in FASTA format, a JSON file suitable for visualization platforms like Auspice, and plots such as the root-to-tip regression that help interpret the results of the molecular clock analysis.

HISTORY

treetime was developed by the Neherlab, a research group focusing on computational evolutionary biology and epidemiology, based at the Biozentrum, University of Basel. Its development was driven by the need for efficient and accurate phylogenetic dating tools to analyze large datasets of rapidly evolving viral pathogens such as influenza, and it has since been applied widely to SARS-CoV-2. As a Python-based open-source project, treetime builds on scientific libraries to provide a flexible and extensible framework for time-resolved phylogenetics, and it is widely used in epidemiological surveillance and evolutionary studies.

SEE ALSO

iqtree(1), muscle(1), mafft(1), fasttree(1)
