| name | fasttree |
| description | Use FastTree for rapid approximate maximum-likelihood phylogenetic tree construction. |
FastTree
FastTree builds approximately-maximum-likelihood phylogenetic trees from large alignments quickly.
Basic Usage
# Nucleotide alignment (default: JC+CAT)
fasttree -nt alignment.fasta > tree.nwk
# Protein alignment (default: JTT+CAT)
fasttree alignment.fasta > tree.nwk
# With GTR model for nucleotides
fasttree -gtr -nt alignment.fasta > tree.nwk
Model Options
# Nucleotide models
fasttree -nt alignment.fasta > tree.nwk # Jukes-Cantor
fasttree -gtr -nt alignment.fasta > tree.nwk # GTR
# Protein models
fasttree alignment.fasta > tree.nwk # JTT (default)
fasttree -wag alignment.fasta > tree.nwk # WAG
fasttree -lg alignment.fasta > tree.nwk # LG
# Rate variation
fasttree -gamma -nt alignment.fasta > tree.nwk # Rescale branch lengths and report the Gamma20 likelihood
Accuracy Options
# More accurate but slower
fasttree -slow -nt alignment.fasta > tree.nwk
# Exhaustive join search plus NNIs without the constant-subtree heuristic (slowest)
fasttree -slow -slownni -nt alignment.fasta > tree.nwk
# More SPR rounds (-spr 4) and more thorough ML NNI optimization (-mlacc 2)
fasttree -spr 4 -mlacc 2 -nt alignment.fasta > tree.nwk
Support Values
# SH-like local support values (Shimodaira-Hasegawa test, 1,000 resamples by default)
fasttree -nt alignment.fasta > tree.nwk
# Support values are automatically included in output
# Bootstrap (requires multiple alignments)
# Generate bootstrap alignments first, then:
for i in {1..100}; do
    fasttree -nt bootstrap_${i}.fasta > boot_${i}.nwk
done
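The loop above assumes the resampled files already exist. One way to produce them is to draw alignment columns with replacement. The following is a minimal sketch, assuming an aligned FASTA input in which all sequences have the same length. The helper names (`read_fasta`, `resample_columns`, `write_fasta`, `write_bootstraps`) are illustrative and not part of FastTree or Biopython.

```python
import random


def read_fasta(path):
    """Parse a FASTA file into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith('>'):
                if name is not None:
                    records.append((name, ''.join(chunks)))
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        records.append((name, ''.join(chunks)))
    return records


def resample_columns(records, rng):
    """Return one bootstrap replicate: columns drawn with replacement."""
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences are not aligned to the same length")
    # The same column indices are applied to every sequence
    columns = [rng.randrange(length) for _ in range(length)]
    return [(name, ''.join(seq[c] for c in columns)) for name, seq in records]


def write_fasta(records, path):
    """Write (name, sequence) pairs as FASTA."""
    with open(path, 'w') as handle:
        for name, seq in records:
            handle.write(f">{name}\n{seq}\n")


def write_bootstraps(alignment_file, n_replicates=100, seed=0):
    """Write bootstrap_1.fasta up to bootstrap_N.fasta in the working directory."""
    rng = random.Random(seed)  # fixed seed so replicates are reproducible
    records = read_fasta(alignment_file)
    for i in range(1, n_replicates + 1):
        write_fasta(resample_columns(records, rng), f"bootstrap_{i}.fasta")
```

Calling `write_bootstraps('alignment.fasta')` would produce the 100 files that the shell loop reads.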
Python Integration
import subprocess
from io import StringIO

from Bio import Phylo


def run_fasttree(alignment_file, model='gtr', is_nucleotide=True):
    """Run FastTree and return the tree as a Bio.Phylo tree object."""
    cmd = ['fasttree']
    model = model.lower()
    if is_nucleotide:
        cmd.append('-nt')
        # Nucleotide default is Jukes-Cantor; -gtr switches to GTR
        if model == 'gtr':
            cmd.append('-gtr')
    else:
        # Protein default is JTT; -wag and -lg select the alternatives
        if model == 'wag':
            cmd.append('-wag')
        elif model == 'lg':
            cmd.append('-lg')
    cmd.append(alignment_file)

    # FastTree writes the tree to stdout and progress messages to stderr
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"FastTree failed: {result.stderr}")
    return Phylo.read(StringIO(result.stdout), 'newick')


# Run FastTree
tree = run_fasttree('alignment.fasta', model='gtr', is_nucleotide=True)
Phylo.draw_ascii(tree)

# Save tree
Phylo.write(tree, 'output_tree.nwk', 'newick')
Handling Large Datasets
# For highly similar sequences with very short branches, use the double-precision build (FastTreeDbl, if installed)
FastTreeDbl -gtr -nt large_alignment.fasta > tree.nwk
# Speed up the neighbor-joining phase and reduce memory usage (recommended for >50,000 sequences)
fasttree -fastest -nt alignment.fasta > tree.nwk
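When driving FastTree from Python on large inputs, it can help to send the tree straight to a file rather than holding the Newick text in memory. A minimal sketch using only `subprocess`: the function name `run_to_file` and its `log_path` argument are illustrative assumptions, and the command list is whatever FastTree invocation you would otherwise type in the shell.

```python
import subprocess


def run_to_file(cmd, tree_path, log_path=None):
    """Run a command, sending stdout to tree_path and stderr to log_path."""
    with open(tree_path, 'w') as tree_out:
        if log_path is None:
            # Leave stderr attached to the terminal (FastTree reports progress there)
            result = subprocess.run(cmd, stdout=tree_out)
        else:
            with open(log_path, 'w') as log_out:
                result = subprocess.run(cmd, stdout=tree_out, stderr=log_out)
    if result.returncode != 0:
        raise RuntimeError(f"command failed with exit code {result.returncode}: {cmd}")
    return tree_path


# Intended use, equivalent to: fasttree -fastest -nt alignment.fasta > tree.nwk
# run_to_file(['fasttree', '-fastest', '-nt', 'alignment.fasta'], 'tree.nwk')
```

The tree file can then be read with `Phylo.read('tree.nwk', 'newick')` only when it is actually needed.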
Comparing FastTree with RAxML
from Bio import Phylo
from ete3 import Tree  # third-party; provides the Robinson-Foulds distance


def compare_trees(tree1_file, tree2_file):
    """Compare two phylogenetic trees by taxa and Robinson-Foulds distance."""
    tree1 = Phylo.read(tree1_file, 'newick')
    tree2 = Phylo.read(tree2_file, 'newick')

    # Get terminal names
    taxa1 = {t.name for t in tree1.get_terminals()}
    taxa2 = {t.name for t in tree2.get_terminals()}

    # Check for same taxa
    if taxa1 != taxa2:
        print("Warning: different taxa sets")

    # Calculate Robinson-Foulds distance
    # FastTree and RAxML trees are unrooted, so compare them as unrooted
    t1 = Tree(tree1_file)
    t2 = Tree(tree2_file)
    rf, max_rf, *_ = t1.robinson_foulds(t2, unrooted_trees=True)

    return {
        'rf_distance': rf,
        'max_rf': max_rf,
        'normalized_rf': rf / max_rf if max_rf > 0 else 0
    }
Output Format
FastTree outputs Newick format with:
- Branch lengths (substitutions per site)
- Support values (SH-like local support)
# Example output
((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.2)0.99:0.3);
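To act on the support values, they can be read back from the Newick text. A minimal sketch with a regular expression, assuming FastTree-style output in which each internal node label is a bare number followed by `:` and a branch length. The function name `internal_supports` and the 0.9 cutoff are illustrative, and a full Newick parser such as `Bio.Phylo` is the safer choice for quoted or unusual labels.

```python
import re

# An internal node label: ")" then a number, then ":" starting the branch length
_SUPPORT = re.compile(r'\)(\d+(?:\.\d+)?):')


def internal_supports(newick):
    """Return the support values attached to internal nodes, in text order."""
    return [float(value) for value in _SUPPORT.findall(newick)]


example = "((A:0.1,B:0.2)0.95:0.3,(C:0.1,D:0.2)0.99:0.3);"
supports = internal_supports(example)
print(supports)  # [0.95, 0.99]

# Count branches below an illustrative cutoff of 0.9
weak = [s for s in supports if s < 0.9]
print(len(weak))  # 0
```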
Performance Comparison
| Tool | Speed | Accuracy | Memory |
|---|---|---|---|
| FastTree | Very fast | Good | Low |
| RAxML | Slow | Best | High |
| IQ-TREE | Medium | Very good | Medium |
| PhyML | Medium | Good | Medium |
FastTree is ideal for:
- Very large datasets (>10,000 sequences)
- Quick exploratory analyses
- When RAxML is too slow