| name | sequence-alignment |
| description | Pairwise and multiple sequence alignment algorithms. Use when comparing biological sequences, finding conserved regions, or computing sequence similarity and identity scores using tools like MUSCLE, ClustalW, or pairwise alignment algorithms. |
Sequence Alignment
Tools for aligning biological sequences.
Pairwise Alignment
from Bio import pairwise2
from Bio.pairwise2 import format_alignment
seq1 = "ACCGGT"
seq2 = "ACGT"
# Global alignment (Needleman-Wunsch)
alignments = pairwise2.align.globalxx(seq1, seq2)
print(format_alignment(*alignments[0]))
# Local alignment (Smith-Waterman)
alignments = pairwise2.align.localxx(seq1, seq2)
# With scoring parameters
alignments = pairwise2.align.globalms(seq1, seq2,
match=2,
mismatch=-1,
open=-0.5,
extend=-0.1)
Substitution Matrices
from Bio.Align import substitution_matrices
# Load BLOSUM62
blosum62 = substitution_matrices.load("BLOSUM62")
# Access scores
score = blosum62["A", "A"] # Match score
score = blosum62["A", "R"] # Mismatch score
Multiple Sequence Alignment
from Bio.Align.Applications import MuscleCommandline
from Bio import AlignIO
# Run MUSCLE alignment
muscle_cline = MuscleCommandline(input="sequences.fasta",
out="aligned.fasta")
muscle_cline()
# Read alignment
alignment = AlignIO.read("aligned.fasta", "fasta")
print(f"Alignment length: {alignment.get_alignment_length()}")
print(f"Number of sequences: {len(alignment)}")
# Calculate conservation
from Bio.Align import AlignInfo
summary = AlignInfo.SummaryInfo(alignment)
consensus = summary.dumb_consensus()