Primary Python tool for 40+ bioinformatics services. Preferred for multi-database workflows: UniProt, KEGG, ChEMBL, PubChem, Reactome, QuickGO. Unified API for queries, ID mapping, pathway analysis. For direct REST control, use individual database skills (uniprot-database, kegg-database).

Install Skill

  1. Download skill
  2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
  3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Verify the skill by reading through its instructions before using it.

SKILL.md

name: bioservices
description Primary Python tool for 40+ bioinformatics services. Preferred for multi-database workflows: UniProt, KEGG, ChEMBL, PubChem, Reactome, QuickGO. Unified API for queries, ID mapping, pathway analysis. For direct REST control, use individual database skills (uniprot-database, kegg-database).

BioServices

Overview

BioServices is a Python package providing programmatic access to approximately 40 bioinformatics web services and databases. Retrieve biological data, perform cross-database queries, map identifiers, analyze sequences, and integrate multiple biological resources in Python workflows. The package handles both REST and SOAP/WSDL protocols transparently.

When to Use This Skill

This skill should be used when:

  • Retrieving protein sequences, annotations, or structures from UniProt, PDB, Pfam
  • Analyzing metabolic pathways and gene functions via KEGG or Reactome
  • Searching compound databases (ChEBI, ChEMBL, PubChem) for chemical information
  • Converting identifiers between different biological databases (KEGG↔UniProt, compound IDs)
  • Running sequence similarity searches (BLAST) and multiple sequence alignments (MUSCLE)
  • Querying gene ontology terms (QuickGO, GO annotations)
  • Accessing protein-protein interaction data (PSICQUIC, IntactComplex)
  • Mining genomic data (BioMart, ArrayExpress, ENA)
  • Integrating data from multiple bioinformatics resources in a single workflow

Core Capabilities

1. Protein Analysis

Retrieve protein information, sequences, and functional annotations:

from bioservices import UniProt

u = UniProt(verbose=False)

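# Search for a protein by its UniProt entry name
# (recent bioservices releases use the current UniProt REST format and column names)
results = u.search("ZAP70_HUMAN", frmt="tsv", columns="id,gene_names,organism_name")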

# Retrieve FASTA sequence
sequence = u.retrieve("P43403", "fasta")

# Map identifiers between databases
kegg_ids = u.mapping(fr="UniProtKB_AC-ID", to="KEGG", query="P43403")

Key methods:

  • search(): Query UniProt with flexible search terms
  • retrieve(): Get protein entries in various formats (FASTA, XML, tab)
  • mapping(): Convert identifiers between databases

Reference: references/services_reference.md for complete UniProt API details.
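
Tabular search results come back as a single string. The sketch below shows one way to turn such text into a list of dictionaries using only the standard library. The sample text and its column headers are illustrative assumptions; the real headers depend on the `columns` argument and on the installed bioservices version.

```python
import csv
import io


def tsv_to_records(text):
    """Parse tab-separated text (header row first) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Illustrative text only; in practice this string would come from u.search(...)
sample = (
    "Entry Name\tGene Names\tOrganism\n"
    "ZAP70_HUMAN\tZAP70 SRK\tHomo sapiens (Human)\n"
)
records = tsv_to_records(sample)
print(records[0]["Gene Names"])
```

Each record can then be filtered or passed to other tools without re-splitting the raw string.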

2. Pathway Discovery and Analysis

Access KEGG pathway information for genes and organisms:

from bioservices import KEGG

k = KEGG()
k.organism = "hsa"  # Set to human

# Search for organisms
k.lookfor_organism("droso")  # Find Drosophila species

# Find pathways by name
k.lookfor_pathway("B cell")  # Returns matching pathway IDs

# Get pathways containing specific genes
pathways = k.get_pathway_by_gene("7535", "hsa")  # ZAP70 gene

# Retrieve and parse pathway data
data = k.get("hsa04660")
parsed = k.parse(data)

# Extract pathway interactions
interactions = k.parse_kgml_pathway("hsa04660")
relations = interactions['relations']  # Relations between pathway entries (e.g., activation, inhibition)

# Convert to Simple Interaction Format
sif_data = k.pathway2sif("hsa04660")

Key methods:

  • lookfor_organism(), lookfor_pathway(): Search by name
  • get_pathway_by_gene(): Find pathways containing genes
  • parse_kgml_pathway(): Extract structured pathway data
  • pathway2sif(): Get protein interaction networks

Reference: references/workflow_patterns.md for complete pathway analysis workflows.
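
Once pathway relations have been extracted, a quick summary of interaction types is often the first thing to look at. The sketch below assumes each relation is a dictionary with `entry1`, `entry2`, and `name` keys; that shape is an assumption about the parsed KGML output and should be checked against the installed version. The example relations are hypothetical.

```python
from collections import Counter


def relation_type_counts(relations):
    """Count relations by their 'name' field (e.g. activation, inhibition)."""
    return Counter(rel.get("name", "unknown") for rel in relations)


# Hypothetical relations shaped like interactions['relations']
example_relations = [
    {"entry1": "61", "entry2": "63", "name": "activation"},
    {"entry1": "61", "entry2": "64", "name": "activation"},
    {"entry1": "70", "entry2": "61", "name": "inhibition"},
]
counts = relation_type_counts(example_relations)
print(counts.most_common())
```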

3. Compound Database Searches

Search and cross-reference compounds across multiple databases:

from bioservices import KEGG, UniChem

k = KEGG()

# Search compounds by name
results = k.find("compound", "Geldanamycin")  # Returns cpd:C11222

# Get compound information with database links
compound_info = k.get("cpd:C11222")  # Includes ChEBI links

# Cross-reference KEGG → ChEMBL using UniChem
u = UniChem()
chembl_id = u.get_compound_id_from_kegg("C11222")  # Returns CHEMBL278315

Common workflow:

  1. Search compound by name in KEGG
  2. Extract KEGG compound ID
  3. Use UniChem for KEGG → ChEMBL mapping
  4. ChEBI IDs are often provided in KEGG entries

Reference: references/identifier_mapping.md for complete cross-database mapping guide.
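
Step 2 of the workflow above (extracting the KEGG compound ID) can be sketched as follows. KEGG `find` results are plain text with one hit per line, an identifier and a semicolon-separated name list divided by a tab. `parse_kegg_find` is a hypothetical helper written for this example, not part of bioservices.

```python
def parse_kegg_find(text):
    """Split KEGG find output into (compound_id, [names]) tuples."""
    hits = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        identifier, _, names = line.partition("\t")
        compound_id = identifier.split(":", 1)[-1]  # drop the "cpd:" prefix
        name_list = [n.strip() for n in names.split(";") if n.strip()]
        hits.append((compound_id, name_list))
    return hits


hits = parse_kegg_find("cpd:C11222\tGeldanamycin\n")
print(hits)
```

The bare ID (without the `cpd:` prefix) is the form used in the UniChem call shown earlier.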

4. Sequence Analysis

Run BLAST searches and sequence alignments:

from bioservices import NCBIblast

s = NCBIblast(verbose=False)

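# protein_sequence holds an amino-acid sequence (plain or FASTA text),
# for example the string returned by UniProt.retrieve(..., "fasta")
# Run BLASTP against UniProtKB
jobid = s.run(
    program="blastp",
    sequence=protein_sequence,
    stype="protein",
    database="uniprotkb",
    email="your.email@example.com"  # A valid address is required by the EBI-hosted BLAST service
)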

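# Check job status and retrieve results
# (recent bioservices releases use snake_case names; older ones used getStatus/getResult)
status = s.get_status(jobid)
results = s.get_result(jobid, "out")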

Note: BLAST jobs are asynchronous. Check status before retrieving results.
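
Because jobs are asynchronous, a small polling loop is usually needed between submission and retrieval. The sketch below is a generic helper that takes the status function as an argument, so it does not depend on a particular bioservices version. The set of "still running" status strings is an assumption and should be adjusted to what the service actually reports.

```python
import time


def wait_for_job(get_status, jobid, interval=5.0, max_checks=60, sleep=time.sleep):
    """Poll get_status(jobid) until the job leaves the pending states.

    Returns the final status string, or raises TimeoutError after max_checks.
    """
    pending = {"RUNNING", "PENDING", "QUEUED"}  # assumed status strings
    for _ in range(max_checks):
        status = get_status(jobid)
        if status not in pending:
            return status
        sleep(interval)
    raise TimeoutError(f"Job {jobid} still pending after {max_checks} checks")


# Demonstration with a fake status function instead of a live service
_fake_statuses = iter(["RUNNING", "RUNNING", "FINISHED"])
final = wait_for_job(lambda _jobid: next(_fake_statuses), "demo-job", sleep=lambda _s: None)
print(final)
```

With a live service, pass the service's own status method as `get_status` and fetch results only when the returned status indicates success.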

5. Identifier Mapping

Convert identifiers between different biological databases:

from bioservices import UniProt, UniChem

# UniProt mapping (many database pairs supported)
u = UniProt()
results = u.mapping(
    fr="UniProtKB_AC-ID",  # Source database
    to="KEGG",              # Target database
    query="P43403"          # Identifier(s) to convert
)

# KEGG gene ID → UniProt
kegg_to_uniprot = u.mapping(fr="KEGG", to="UniProtKB_AC-ID", query="hsa:7535")

# For compounds, use UniChem (separate variable so the UniProt client is not overwritten)
uc = UniChem()
chembl_from_kegg = uc.get_compound_id_from_kegg("C11222")

Supported mappings (UniProt):

  • UniProtKB ↔ KEGG
  • UniProtKB ↔ Ensembl
  • UniProtKB ↔ PDB
  • UniProtKB ↔ RefSeq
  • And many more (see references/identifier_mapping.md)

6. Gene Ontology Queries

Access GO terms and annotations:

from bioservices import QuickGO

g = QuickGO(verbose=False)

# Retrieve GO term information
term_info = g.Term("GO:0003824", frmt="obo")

# Search annotations
annotations = g.Annotation(protein="P43403", format="tsv")

7. Protein-Protein Interactions

Query interaction databases via PSICQUIC:

from bioservices import PSICQUIC

s = PSICQUIC(verbose=False)

# Query specific database (e.g., MINT)
interactions = s.query("mint", "ZAP70 AND species:9606")

# List available interaction databases
databases = s.activeDBs

Available databases: MINT, IntAct, BioGRID, DIP, and 30+ others.
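
PSICQUIC results follow the PSI-MI TAB (MITAB) layout, in which the first two columns hold the identifiers of interactors A and B, typically prefixed with a database name such as `uniprotkb:`. The sketch below reduces a result set to unique, unordered identifier pairs. It assumes each row is either a list of columns or a tab-separated string; the example rows are hypothetical.

```python
def interactor_pairs(rows):
    """Return a sorted list of unique (A, B) identifier pairs from MITAB rows."""
    pairs = set()
    for row in rows:
        cols = row.split("\t") if isinstance(row, str) else list(row)
        if len(cols) < 2:
            continue  # skip malformed rows
        a = cols[0].split(":", 1)[-1]  # drop the database prefix
        b = cols[1].split(":", 1)[-1]
        pairs.add(tuple(sorted((a, b))))  # treat A-B and B-A as the same pair
    return sorted(pairs)


# Hypothetical rows; real rows carry many more columns
example_rows = [
    ["uniprotkb:P43403", "uniprotkb:P06239"],
    "uniprotkb:P06239\tuniprotkb:P43403\tother-column",
    ["too-short"],
]
pairs = interactor_pairs(example_rows)
print(pairs)
```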

Multi-Service Integration Workflows

BioServices excels at combining multiple services for comprehensive analysis. Common integration patterns:

Complete Protein Analysis Pipeline

Execute a full protein characterization workflow:

python scripts/protein_analysis_workflow.py ZAP70_HUMAN your.email@example.com

This script demonstrates:

  1. UniProt search for protein entry
  2. FASTA sequence retrieval
  3. BLAST similarity search
  4. KEGG pathway discovery
  5. PSICQUIC interaction mapping

Pathway Network Analysis

Analyze all pathways for an organism:

python scripts/pathway_analysis.py hsa output_directory/

Extracts and analyzes:

  • All pathway IDs for organism
  • Protein-protein interactions per pathway
  • Interaction type distributions
  • Exports to CSV/SIF formats

Cross-Database Compound Search

Map compound identifiers across databases:

python scripts/compound_cross_reference.py Geldanamycin

Retrieves:

  • KEGG compound ID
  • ChEBI identifier
  • ChEMBL identifier
  • Basic compound properties

Batch Identifier Conversion

Convert multiple identifiers at once:

python scripts/batch_id_converter.py input_ids.txt --from UniProtKB_AC-ID --to KEGG
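
Before a bulk conversion, the input list is usually cleaned and split into batches so that no single request grows too large. The sketch below illustrates that preparation step only; it is not the implementation of `batch_id_converter.py`, and the helper names and batch size are assumptions.

```python
def read_ids(lines):
    """Strip whitespace, skip blanks and '#' comments, drop duplicates (order kept)."""
    seen = {}
    for line in lines:
        item = line.strip()
        if item and not item.startswith("#"):
            seen.setdefault(item, None)
    return list(seen)


def chunked(ids, size):
    """Split a list of identifiers into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


# In practice `lines` would come from open("input_ids.txt")
lines = ["P43403\n", "\n", "# comment\n", "P06239\n", "P43403\n"]
ids = read_ids(lines)
batches = chunked(ids, 1)
print(batches)
```

Each batch can then be submitted as one mapping request and the results merged afterwards.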

Best Practices

Output Format Handling

Different services return data in various formats:

  • XML: Parse using BeautifulSoup (most SOAP services)
  • Tab-separated (TSV): Pandas DataFrames for tabular data
  • Dictionary/JSON: Direct Python manipulation
  • FASTA: BioPython integration for sequence analysis
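
As an illustration of handling one of these formats without extra dependencies, the sketch below splits FASTA text into header and sequence pairs. It is a minimal parser for well-formed input; BioPython remains the better choice for real sequence work. The sample records are made up.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up records; real text would come from a retrieve() call
sample_fasta = ">demo1 first example\nMPDPAAH\nLPFFYG\n>demo2\nMKV\n"
fasta_records = parse_fasta(sample_fasta)
print(fasta_records)
```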

Rate Limiting and Verbosity

Control API request behavior:

from bioservices import KEGG

k = KEGG(verbose=False)  # Suppress HTTP request details
k.TIMEOUT = 30  # Adjust timeout for slow connections
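
The settings above cover verbosity and timeouts; spacing out requests has to be done by the caller. The sketch below is one simple client-side approach, a helper that enforces a minimum delay between calls. The `Throttle` class is hypothetical and not part of bioservices, and a suitable interval depends on each provider's usage policy.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep just long enough to keep calls min_interval seconds apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: call throttle.wait() before each service request
throttle = Throttle(0.01)
for _ in range(3):
    throttle.wait()
    # a service call such as k.get(...) would go here
```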

Error Handling

Wrap service calls in try-except blocks:

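from bioservices import UniProt

u = UniProt(verbose=False)

try:
    results = u.search("ambiguous_query")
    if results:
        # Process results
        pass
    else:
        print("No results returned")
except Exception as e:  # Broad catch for illustration; narrow it where the failure modes are known
    print(f"Search failed: {e}")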

Organism Codes

Use standard organism abbreviations:

  • hsa: Homo sapiens (human)
  • mmu: Mus musculus (mouse)
  • dme: Drosophila melanogaster
  • sce: Saccharomyces cerevisiae (yeast)

List all organisms: k.list("organism") or k.organismIds

Integration with Other Tools

BioServices works well with:

  • BioPython: Sequence analysis on retrieved FASTA data
  • Pandas: Tabular data manipulation
  • PyMOL: 3D structure visualization (retrieve PDB IDs)
  • NetworkX: Network analysis of pathway interactions
  • Galaxy: Custom tool wrappers for workflow platforms

Resources

scripts/

Executable Python scripts demonstrating complete workflows:

  • protein_analysis_workflow.py: End-to-end protein characterization
  • pathway_analysis.py: KEGG pathway discovery and network extraction
  • compound_cross_reference.py: Multi-database compound searching
  • batch_id_converter.py: Bulk identifier mapping utility

Scripts can be executed directly or adapted for specific use cases.

references/

Detailed documentation loaded as needed:

  • services_reference.md: Comprehensive list of all 40+ services with methods
  • workflow_patterns.md: Detailed multi-step analysis workflows
  • identifier_mapping.md: Complete guide to cross-database ID conversion

Load references when working with specific services or complex integration tasks.

Installation

pip install bioservices

pip installs the required dependencies automatically. The package is tested on Python 3.9-3.12.

Additional Information

For detailed API documentation and advanced features, refer to: