| name | lifesciences-crispr |
| description | Validates synthetic lethality claims from CRISPR knockout screens using BioGRID ORCS 5-phase workflow. This skill should be used when the user asks to "validate synthetic lethality", "query CRISPR essentiality data", "find gene dependencies", "compare cell line screens", or mentions BioGRID ORCS, gene knockout data, essentiality scores (CERES, MAGeCK, BAGEL), or asks to validate claims from published CRISPR papers. |
CRISPR Essentiality & Synthetic Lethality Validation
Validate synthetic lethality hypotheses using BioGRID ORCS CRISPR screen data via curl.
Quick Reference
| Task | Endpoint |
|---|---|
| Get gene essentiality data | /gene/{entrez_id} |
| Find cell line screens | /screens/?cellLine={name} |
| List all screens | /screens/ |
BioGRID ORCS API
IMPORTANT: Use orcsws.thebiogrid.org (NOT orcs.thebiogrid.org)
- Base URL: https://orcsws.thebiogrid.org
- Auth: Requires BIOGRID_API_KEY (free: https://webservice.thebiogrid.org/)
- Rate Limit: ~10 req/s
- Format: Tab-delimited (NOT JSON; don't use the format=json parameter)
Data Format
Tab-delimited with key columns:
- Column 1: Screen ID
- Column 2: Entrez Gene ID
- Column 4: Gene Symbol
- Column 8: Essentiality Score (CERES/MAGeCK/etc.)
- Column 9: FDR (False Discovery Rate)
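As a minimal sketch of reading this layout from Python instead of awk, the parser below picks out the five columns listed above. The names `GeneScreenRow` and `parse_gene_row` are hypothetical, the sample row is made up for illustration (unlisted columns are filled with "x"), and the column positions are taken on trust from the list above.

```python
from typing import NamedTuple, Optional


class GeneScreenRow(NamedTuple):
    screen_id: str
    entrez_id: str
    symbol: str
    score: Optional[float]
    fdr: Optional[float]


def _to_float(value: str) -> Optional[float]:
    """Return a float, or None for blank or non-numeric placeholders such as '-'."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gene_row(line: str) -> GeneScreenRow:
    """Split one tab-delimited line and keep columns 1, 2, 4, 8 and 9 (1-indexed)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9:
        raise ValueError(f"expected at least 9 columns, got {len(cols)}")
    return GeneScreenRow(
        screen_id=cols[0],
        entrez_id=cols[1],
        symbol=cols[3],
        score=_to_float(cols[7]),
        fdr=_to_float(cols[8]),
    )


# Hypothetical sample row; only columns 1, 2, 4, 8 and 9 carry meaning here.
sample = "213\t1723\tx\tDHODH\tx\tx\tx\t-0.377\t0.0452"
row = parse_gene_row(sample)
print(row)
```

Returning `None` for non-numeric cells keeps rows without a reported score or FDR from crashing the parse; callers can then decide whether to skip them.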
5-Phase Synthetic Lethality Validation Workflow
Phase 1: Resolve Gene Identifiers
Use Life Sciences MCPs to get Entrez IDs:
# Using HGNC MCP: search_genes → get_gene
# Extract "entrez" from cross_references object
# Example: DHODH → {"cross_references": {"entrez": "1723"}}
# Using Entrez MCP: search_genes
# Top result format: "NCBIGene:1723"
# Extract numeric ID: 1723
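The two extraction steps above could be sketched as follows. The helper names are hypothetical, and the input shapes are assumed to match the examples shown (a `cross_references` object from HGNC, an `NCBIGene:` prefixed string from Entrez); real MCP responses may wrap these differently.

```python
import re


def entrez_from_hgnc(gene: dict) -> str:
    """Pull the Entrez ID out of an HGNC-style record and check that it is numeric."""
    entrez = str(gene["cross_references"]["entrez"])
    if not entrez.isdigit():
        raise ValueError(f"unexpected Entrez ID: {entrez!r}")
    return entrez


def entrez_from_curie(curie: str) -> str:
    """Strip the 'NCBIGene:' prefix and return the numeric part."""
    match = re.fullmatch(r"NCBIGene:(\d+)", curie.strip())
    if match is None:
        raise ValueError(f"not an NCBIGene identifier: {curie!r}")
    return match.group(1)


print(entrez_from_hgnc({"cross_references": {"entrez": "1723"}}))
print(entrez_from_curie("NCBIGene:1723"))
```

Validating that the ID is numeric at this point catches the case where a gene symbol slips through to the /gene/{entrez_id} endpoint.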
Phase 2: Query ORCS for Essentiality Data
# Get all CRISPR screens for gene (using Entrez ID)
curl -s "https://orcsws.thebiogrid.org/gene/1723?accesskey=${BIOGRID_API_KEY}" > gene_screens.tsv
# Count screens
wc -l gene_screens.tsv
# Output: ~1400 screens for popular genes like DHODH
Phase 3: Identify Relevant Cell Line Screens
# Find screens for specific cell line
curl -s "https://orcsws.thebiogrid.org/screens/?cellLine=786-O&accesskey=${BIOGRID_API_KEY}" \
| awk 'BEGIN{FS="\t"} {print "Screen " $1 ": " $25}'
# Output: Screen 213: 786-O
# Find screens for another cell line
curl -s "https://orcsws.thebiogrid.org/screens/?cellLine=Caki-1&accesskey=${BIOGRID_API_KEY}" \
| awk 'BEGIN{FS="\t"} {print "Screen " $1 ": " $25}'
# Output: Screen 511: Caki-1
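For scripted use, the request URLs from Phases 2 and 3 can be assembled with the standard library so that cell line names are encoded safely. This is a sketch only: `gene_url` and `screens_url` are hypothetical helpers, "KEY" is a placeholder, and no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://orcsws.thebiogrid.org"


def gene_url(entrez_id, access_key: str) -> str:
    """Build the /gene/{entrez_id} URL; rejects non-numeric identifiers."""
    if not str(entrez_id).isdigit():
        raise ValueError("the /gene/ endpoint expects a numeric Entrez ID")
    return f"{BASE_URL}/gene/{entrez_id}?{urlencode({'accesskey': access_key})}"


def screens_url(cell_line: str, access_key: str) -> str:
    """Build the /screens/ URL filtered by cell line name."""
    query = urlencode({"cellLine": cell_line, "accesskey": access_key})
    return f"{BASE_URL}/screens/?{query}"


print(gene_url(1723, "KEY"))
print(screens_url("786-O", "KEY"))
```

In practice the key would come from the environment, for example `os.environ["BIOGRID_API_KEY"]`, rather than a literal.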
Phase 4: Extract Dependency Scores
# Filter gene data by Screen ID (exact match on column 1; avoids grep's non-portable \s)
curl -s "https://orcsws.thebiogrid.org/gene/1723?accesskey=${BIOGRID_API_KEY}" | \
awk 'BEGIN{FS="\t"} $1 == 213 || $1 == 511 {print "Screen " $1 ": score=" $8 ", FDR=" $9}'
# Output (cell line names and significance calls annotated by hand):
# Screen 213 (786-O): score=-0.377, FDR=0.0452 ✅ Significant
# Screen 511 (Caki-1): score=-0.556, FDR=0.036 ✅ Significant
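Interpretation:
- Negative score = gene is essential (cells are depleted upon knockout); this holds for CERES and MAGeCK, whereas BAGEL Bayes Factors are positive for essential genes (see Scoring Methods)
- FDR < 0.05 = statistically significant dependency
- CERES score: Most common, corrects for copy number effects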
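The rule of thumb above can be written as a small predicate. This is a sketch, assuming a score type where negative means essential (CERES or MAGeCK) and the 0.05 FDR cutoff stated above; `is_significant_dependency` is a hypothetical name.

```python
def is_significant_dependency(score: float, fdr: float, fdr_cutoff: float = 0.05) -> bool:
    """True when the knockout depletes cells (score < 0) and the FDR is below the cutoff."""
    return score < 0 and fdr < fdr_cutoff


# Values from the Phase 4 output above.
print(is_significant_dependency(-0.377, 0.0452))
print(is_significant_dependency(-0.556, 0.036))
```

Both conditions are required: a negative score with a high FDR is not treated as a dependency, and a low FDR on a positive score points to enrichment rather than essentiality.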
Phase 5: Compare Across Genetic Backgrounds
# Find all renal cancer cell line screens
curl -s "https://orcsws.thebiogrid.org/screens/?accesskey=${BIOGRID_API_KEY}" | \
grep -i "renal" | awk 'BEGIN{FS="\t"} {print $1 "\t" $25}' | sort -u
# Extract scores for all renal lines (exact match on the Screen ID in column 1)
curl -s "https://orcsws.thebiogrid.org/gene/1723?accesskey=${BIOGRID_API_KEY}" | \
awk 'BEGIN{FS="\t"} $1 ~ /^(213|511|512|204|205)$/ {printf "%-10s Score: %s FDR: %s\n", $1, $8, $9}'
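Once scores are extracted, the cross-line comparison reduces to counting how many screens pass the significance rule. The sketch below assumes negative-is-essential scores and an FDR cutoff of 0.05. The first two rows reuse the Phase 4 values; the rows for screens 512 and 204 are invented placeholders, not real ORCS data.

```python
from typing import List, Tuple

Row = Tuple[str, float, float]  # (screen_id, score, fdr)


def summarize(rows: List[Row], fdr_cutoff: float = 0.05) -> Tuple[List[str], str]:
    """Return the screen IDs that pass, plus a 'hits/total' penetrance string."""
    hits = [sid for sid, score, fdr in rows if score < 0 and fdr < fdr_cutoff]
    return hits, f"{len(hits)}/{len(rows)}"


rows = [
    ("213", -0.377, 0.0452),  # from the Phase 4 output
    ("511", -0.556, 0.036),   # from the Phase 4 output
    ("512", -0.10, 0.40),     # hypothetical placeholder
    ("204", 0.05, 0.90),      # hypothetical placeholder
]
hits, penetrance = summarize(rows)
print(hits, penetrance)
```

Reporting the fraction rather than a yes/no verdict makes context-dependent penetrance visible when only some lines of a given genetic background show the dependency.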
Scoring Methods
| Method | Description | Interpretation |
|---|---|---|
| CERES | Computational correction for copy number effects | Most common, negative = essential |
| Kolmogorov-Smirnov | Statistical enrichment test | Log p-value, higher = more significant |
| MAGeCK | Model-based Analysis of Genome-wide CRISPR | Negative = depletion = essential |
| BAGEL | Bayesian Analysis of Gene EssentiaLity | Bayes Factor, positive = essential |
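Because the sign convention differs between methods, a comparison across screens needs to know which direction means "essential". A minimal sketch of that lookup follows; the mapping mirrors the table above, Kolmogorov-Smirnov is left out because its log p-value expresses significance rather than direction, and the names are hypothetical.

```python
# Sign of the score that indicates essentiality, per the table above.
ESSENTIAL_DIRECTION = {"CERES": -1, "MAGeCK": -1, "BAGEL": +1}


def points_to_essential(method: str, score: float) -> bool:
    """True when the score lies on the 'essential' side of zero for this method."""
    direction = ESSENTIAL_DIRECTION.get(method)
    if direction is None:
        raise ValueError(f"no sign convention recorded for method {method!r}")
    return score * direction > 0


print(points_to_essential("CERES", -0.556))
print(points_to_essential("BAGEL", 5.0))
```

Raising on unknown methods is deliberate: silently assuming negative-is-essential would invert the call for BAGEL-scored screens.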
Common Pitfalls
- Wrong endpoint: Use orcsws.thebiogrid.org (NOT orcs.thebiogrid.org)
- Missing API key: Check the .env file first: grep BIOGRID_API_KEY .env
- JSON format error: Don't use format=json; the API returns tab-delimited data by default
- Gene identifiers: Always use Entrez IDs (not gene symbols) for the /gene/{id} endpoint
- Screen metadata: Use /screens/?cellLine={name} to find Screen IDs before querying gene data
- Column numbers: Score = column 8, FDR = column 9
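Several of these pitfalls can be caught before a request is sent. The sketch below is a hypothetical `preflight` helper that inspects a URL for the host, access key, format parameter, and gene identifier issues listed above; it does not contact the API.

```python
from typing import List
from urllib.parse import parse_qs, urlsplit


def preflight(url: str) -> List[str]:
    """Return a list of problems found in an ORCS request URL (empty if none)."""
    problems = []
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # blank values are dropped by default
    if parts.hostname != "orcsws.thebiogrid.org":
        problems.append("wrong host: use orcsws.thebiogrid.org")
    if not query.get("accesskey"):
        problems.append("missing accesskey parameter")
    if query.get("format") == ["json"]:
        problems.append("format=json should not be used")
    if parts.path.startswith("/gene/"):
        gene_id = parts.path[len("/gene/"):].strip("/")
        if not gene_id.isdigit():
            problems.append("the /gene/ endpoint needs a numeric Entrez ID")
    return problems


print(preflight("https://orcsws.thebiogrid.org/gene/1723?accesskey=KEY"))
print(preflight("https://orcs.thebiogrid.org/gene/DHODH?format=json"))
```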
Complete Example
See references/biogrid-orcs-validation.md for a complete worked example validating DHODH/VHL synthetic lethality from a Science Advances paper.
Result: 2/4 VHL-mutant lines show significant DHODH dependency (context-dependent penetrance).
Integration with Life Sciences MCPs
Workflow:
- Use HGNC or Entrez MCP to resolve gene symbols to Entrez IDs
- Query ORCS for essentiality data using Entrez IDs
- Analyze dependency scores across cell lines
- Validate synthetic lethality hypotheses
Example:
# Phase 1: Use MCP to get Entrez ID
# (run inside an async function; `client` is assumed to be an already-connected MCP client
# whose call_tool results can be indexed like dictionaries)
hgnc_result = await client.call_tool("hgnc_search_genes", {"query": "DHODH"})
gene = await client.call_tool("hgnc_get_gene", {"hgnc_id": hgnc_result["items"][0]["id"]})
entrez_id = gene["cross_references"]["entrez"]  # "1723"
# Phases 2-5: Use curl with ORCS (see workflow above)
References
- BioGRID ORCS: https://orcs.thebiogrid.org/
- API Key: https://webservice.thebiogrid.org/
- Meyers RM et al. 2017: CERES methodology (PMID: 29083409)
- Complete Example: references/biogrid-orcs-validation.md