| name | biopython |
| description | BioPython library for computational molecular biology. Use when working with biological sequence data, parsing bioinformatics file formats (FASTA, GenBank, PDB), performing sequence analysis, or accessing biological databases programmatically. |
BioPython
BioPython provides tools for biological computation in Python.
Core Modules
from Bio import SeqIO # Sequence I/O
from Bio.Seq import Seq # Sequence objects
from Bio.SeqRecord import SeqRecord
from Bio import Entrez # NCBI database access
from Bio import AlignIO # Alignment I/O
from Bio.Blast import NCBIWWW # BLAST searches
Reading Sequences
# Parse FASTA file
for record in SeqIO.parse("sequences.fasta", "fasta"):
print(f"ID: {record.id}")
print(f"Sequence: {record.seq}")
print(f"Length: {len(record.seq)}")
# Parse GenBank file
for record in SeqIO.parse("sequence.gb", "genbank"):
print(f"Features: {len(record.features)}")
Sequence Operations
from Bio.Seq import Seq
seq = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
# Transcription and translation
mrna = seq.transcribe()
protein = seq.translate()
# Complement and reverse complement
complement = seq.complement()
rev_complement = seq.reverse_complement()
Writing Sequences
records = [
SeqRecord(Seq("MKQHKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVF"),
id="seq1", description="Example protein")
]
SeqIO.write(records, "output.fasta", "fasta")