bulk-rna-seq-differential-expression-with-omicverse

@Starlitnightly/omicverse

Guide Claude through omicverse's bulk RNA-seq DEG pipeline, from gene ID mapping and DESeq2 normalization to statistical testing, visualization, and pathway enrichment. Use when a user has bulk count matrices and needs differential expression analysis in omicverse.

Bulk RNA-seq differential expression with omicverse

Overview

Follow this skill to run the end-to-end differential expression (DEG) workflow showcased in t_deg.ipynb. It assumes the user provides a raw gene-level count matrix (e.g., from featureCounts) and wants to analyse bulk RNA-seq cohorts inside omicverse.

Instructions

  1. Set up the session
    • Import omicverse as ov, scanpy as sc, and matplotlib.pyplot as plt.
    • Call ov.plot_set() so downstream plots adopt omicverse styling.
  2. Prepare ID mapping assets
    • When gene IDs must be converted to gene symbols, instruct the user to download mapping pairs via ov.utils.download_geneid_annotation_pair() and store them under genesets/.
    • Mention the available prebuilt genomes (T2T-CHM13, GRCh38, GRCh37, GRCm39, danRer7, danRer11) and that users can generate their own mapping from GTF files if needed.
  3. Load the raw counts
    • Read tab-delimited featureCounts output with ov.pd.read_csv(..., sep='\t', header=1, index_col=0).
    • Strip the trailing .bam suffix from column names with a list comprehension so sample IDs are clean.
  4. Map gene identifiers
    • Run ov.bulk.Matrix_ID_mapping(counts_df, 'genesets/pair_<GENOME>.tsv') to replace gene_id entries with gene symbols.
  5. Initialise the DEG object
    • Create dds = ov.bulk.pyDEG(mapped_counts).
    • Handle duplicate gene symbols with dds.drop_duplicates_index(), which keeps the most highly expressed entry for each symbol.
  6. Normalise and estimate size factors
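    • Execute dds.normalize() to calculate DESeq2-style size factors, which correct for differences in sequencing depth (library size) between samples. Size-factor normalisation alone does not remove batch effects.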
  7. Run differential testing
    • Collect treatment and control replicate labels into lists.
    • Call dds.deg_analysis(treatment_groups, control_groups, method='ttest') for the default Welch t-test.
    • Offer optional alternatives: method='edgepy' for edgeR-like tests and method='limma' for limma-style modelling.
  8. Filter and threshold results
    • Note that lowly expressed genes are retained by default; filter using dds.result.loc[dds.result['log2(BaseMean)'] > 1] when needed.
    • Set fold-change and significance cutoffs via dds.foldchange_set(fc_threshold=-1, pval_threshold=0.05, logp_max=6); fc_threshold=-1 selects the fold-change cutoff automatically from the log2FC distribution.
  9. Visualise differential expression
    • Produce volcano plots with dds.plot_volcano(title=..., figsize=..., plot_genes=... or plot_genes_num=...) to highlight key genes.
    • Generate per-gene boxplots using dds.plot_boxplot(genes=[...], treatment_groups=..., control_groups=..., figsize=..., legend_bbox=...); adjust y-axis tick labels if required.
  10. Perform pathway enrichment (optional)
    • Download curated pathway libraries through ov.utils.download_pathway_database().
    • Load genesets with ov.utils.geneset_prepare(<path>, organism='Mouse'|'Human'|...).
    • Build the DEG gene list from dds.result.loc[dds.result['sig'] != 'normal'].index.
    • Run enrichment with ov.bulk.geneset_enrichment(gene_list=deg_genes, pathways_dict=..., pvalue_type='auto', organism=...). Encourage users without internet access to provide a background gene list.
    • Visualise single-library results via ov.bulk.geneset_plot(...) and combine multiple ontologies using ov.bulk.geneset_plot_multi(enr_dict, colors_dict, num=...).
  11. Document outputs
    • Suggest exporting dds.result and enrichment tables to CSV for downstream reporting.
    • Encourage users to save figures generated by matplotlib (plt.savefig(...)) when running outside notebooks.
  12. Troubleshooting tips
    • Ensure sample labels in treatment_groups/control_groups exactly match column names post-cleanup.
    • Verify required packages (omicverse, pyComplexHeatmap, gseapy) are installed for enrichment visualisations.
    • Remind users that internet access is required the first time they download gene mappings or pathway databases.
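
The column cleanup in step 3 and the label check in the troubleshooting tips can be sketched without omicverse. This is a minimal illustration rather than the library's own code: the helper names clean_sample_names and missing_labels, the sample names, and the assumption that featureCounts columns carry a directory prefix are all hypothetical.

```python
def clean_sample_names(columns):
    """Drop any directory prefix and a trailing '.bam' from each column name."""
    cleaned = []
    for name in columns:
        base = name.split("/")[-1]       # assumed: columns may be file paths
        if base.endswith(".bam"):
            base = base[: -len(".bam")]  # remove the alignment-file suffix
        cleaned.append(base)
    return cleaned


def missing_labels(columns, treatment_groups, control_groups):
    """Return group labels that do not exactly match any column name."""
    known = set(columns)
    return [
        label
        for label in list(treatment_groups) + list(control_groups)
        if label not in known
    ]


# Hypothetical featureCounts-style column names, for illustration only.
raw_columns = [
    "aligned/4-3.bam",
    "aligned/4-4.bam",
    "aligned/1--1.bam",
    "aligned/1--2.bam",
]
columns = clean_sample_names(raw_columns)

treatment_groups = ["4-3", "4-4"]
control_groups = ["1--1", "1-2"]  # deliberate typo: the column is "1--2"

unmatched = missing_labels(columns, treatment_groups, control_groups)
print("cleaned columns:", columns)
print("labels with no matching column:", unmatched)
```

Running a check like this before the differential test surfaces mismatched labels (here the mistyped control label) as an explicit list instead of a downstream lookup error.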

Examples

  • "I have a featureCounts matrix for mouse tumour samples—normalize it with DESeq2, run t-test DEG, and highlight the top 8 genes in a volcano plot."
  • "Use omicverse to compute edgeR-style differential expression between treated and control replicates, then run GO enrichment on significant genes."
  • "Guide me through converting Ensembl IDs to symbols, performing limma DEG, and plotting boxplots for Krtap9-5 and Lef1."
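
The first request above combines DESeq2-style normalisation with a t-test. As a rough sketch of the arithmetic behind those two steps, the block below computes median-of-ratios size factors and a Welch t statistic in plain Python. It makes several assumptions: the function names size_factors and welch_t and the toy counts are invented for illustration, no p-value is computed, and omicverse's internal implementation may differ in detail.

```python
import math
from statistics import mean, median, variance


def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps gene -> list of raw counts, one entry per sample.
    Genes with a zero count in any sample are excluded from the reference.
    """
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean of each usable gene across samples.
    log_geo_mean = {
        gene: mean([math.log(c) for c in row])
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        log_ratios = [
            math.log(counts[gene][j]) - ref for gene, ref in log_geo_mean.items()
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, dof


# Toy counts: sample 2 and sample 3 were sequenced 2x and 3x as deeply as sample 1.
counts = {
    "GeneA": [10, 20, 30],
    "GeneB": [100, 200, 300],
    "GeneC": [5, 10, 15],
    "GeneD": [0, 4, 6],  # contains a zero, so it is skipped as a reference gene
}
factors = size_factors(counts)
normalised = {
    gene: [c / f for c, f in zip(row, factors)] for gene, row in counts.items()
}

# Toy normalised expression for one gene in two groups of replicates.
treated = [12.0, 15.0, 14.0]
control = [8.0, 9.0, 7.5]
t_stat, dof = welch_t(treated, control)
log2_fc = math.log2(mean(treated) / mean(control))
print("size factors:", factors)
print("t statistic:", t_stat, "degrees of freedom:", dof, "log2FC:", log2_fc)
```

Dividing each sample by its size factor puts the depth-scaled toy samples on a common scale, so after normalisation GeneA has the same value in all three samples. The Welch form of the test does not assume equal variances between groups, which is why its degrees of freedom are generally non-integer.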

References