---
name: scholarag
description: Build PRISMA 2020-compliant systematic literature review systems with RAG-powered analysis in VS Code. Use when researcher needs automated paper retrieval (Semantic Scholar, OpenAlex, arXiv), AI-assisted PRISMA screening (50% or 90% threshold), vector database creation (ChromaDB), or research conversation interface. Supports knowledge_repository (comprehensive, 15K+ papers, teaching/exploration) and systematic_review (publication-quality, 50-300 papers, meta-analysis) modes. Conversation-first workflow with 7 stages.
---
# ScholaRAG: Systematic Review Automation Skill

**For**: Claude Code (AI assistant in VS Code)
**Purpose**: Guide researchers through PRISMA 2020 systematic literature review + RAG-powered analysis
## Quick Start (5 minutes)

### For Researchers
1. **Initialize project**: `python scholarag_cli.py init`
2. **Paste the Stage 1 prompt**: Copy it from https://www.scholarag.com/guide/01-introduction
3. Answer Claude's questions → config created automatically
4. Proceed through the 7 stages conversationally
### For AI Assistants (Claude Code)
When a researcher provides a ScholaRAG prompt:

1. Check for the HTML metadata block (`<!-- METADATA ... -->` at the top of the prompt)
2. Identify the current stage (1-7) from the metadata `stage` field
3. Follow the conversation pattern (from metadata `conversation_flow`)
4. Validate completion (using metadata `validation_rules`)
5. Auto-execute commands (when `auto_execute: true`)
6. Update `.claude/context.json` (track progress)
7. Show the next stage prompt (from metadata `next_stage.prompt_file`)
**The researcher should NEVER touch the terminal. You execute all scripts automatically.**
## 7-Stage Workflow Overview
| Stage | Name | Read This File | Duration | Auto-Execute |
|---|---|---|---|---|
| 1 | Research Setup | `skills/claude_only/stage1_research_setup.md` | 15-20 min | ✅ `scholarag init` |
| 2 | Query Strategy | `skills/claude_only/stage2_query_strategy.md` | 15-25 min | ❌ Manual review |
| 3 | PRISMA Config | `skills/claude_only/stage3_prisma_config.md` | 20-30 min | ❌ Manual review |
| 4 | RAG Design | `skills/claude_only/stage4_rag_design.md` | 10-15 min | ❌ Manual review |
| 5 | Execution | `skills/claude_only/stage5_execution.md` | 2-4 hours | ✅ Run all 5 scripts |
| 6 | Research Conversation | `skills/claude_only/stage6_research_conversation.md` | Ongoing | ❌ Interactive |
| 7 | Documentation | `skills/claude_only/stage7_documentation.md` | 30-60 min | ✅ Generate PRISMA |
**Progressive disclosure**: Load a stage file only when the researcher enters that stage. Don't preload all 7 stages (token waste).
## Critical Branching Points

### 🔴 `project_type` (Stage 1 Decision)
Two modes available:
| Mode | Threshold | Output | Best For |
|---|---|---|---|
| `knowledge_repository` | 50% (lenient) | 15K-20K papers | Teaching, AI assistant, exploration |
| `systematic_review` | 90% (strict) | 50-300 papers | Meta-analysis, publication |
**Quick decision**:
- Publishing a systematic review? → `systematic_review` ✅
- Comprehensive domain coverage? → `knowledge_repository` ✅
**Detailed decision tree**: `skills/reference/project_type_decision_tree.md`
**When to read the decision tree**:
- Researcher asks: "Which project_type should I choose?"
- Researcher says: "I'm unsure about my research goals"
- Stage 1 initialization (proactively offer decision helper)
### 🔴 Stage 6 Scenarios (7 Research Modes)
Stage 6 branches into 7 specialized conversation scenarios:
- **overview** (Context Scanning): High-level themes, methods, findings
- **hypothesis** (Hypothesis Validation): Evidence for/against with effect sizes
- **statistics** (Statistical Extraction): RCT data table (tools, Cohen's d, samples)
- **methods** (Methodology Comparison): RCT vs quasi-experimental vs mixed methods
- **contradictions** (Contradiction Detection): Conflicting results + analysis
- **policy** (Policy Translation): Actionable recommendations for stakeholders
- **grant** (Future Research Design): Follow-up study design + hypotheses
**Details**: `skills/claude_only/stage6_research_conversation.md`
**When to read**: Stage 6 entry (researcher asks "What can I query?")
## Error Recovery

**When errors occur**: read `skills/reference/error_recovery.md`
**Quick fixes (common issues)**:

| Error | Quick Fix | Detailed Guide |
|---|---|---|
| Too many papers (>30K) | Refine the query in Stage 2, re-run fetch | `error_recovery.md` §2.1 |
| API key missing | Add `ANTHROPIC_API_KEY` to `.env` | `error_recovery.md` §3.1 |
| Low PDF success (<30%) | Filter for `open_access` in Stage 1 | `error_recovery.md` §4.1 |
| All papers excluded (0 papers) | Lower the threshold or broaden the query | `error_recovery.md` §3.2 |
## Reference Materials (Load Only When Needed)

**Progressive disclosure**: Don't preload these. Read them only when the researcher asks specific questions.
| Topic | File | When to Read |
|---|---|---|
| API endpoints | `skills/reference/api_reference.md` | Researcher asks about Semantic Scholar, OpenAlex, arXiv |
| Config schema | `skills/reference/config_schema.md` | Researcher asks "What fields are in config.yaml?" |
| PRISMA checklist | `skills/reference/prisma_guidelines.md` | Researcher asks about PRISMA 2020 compliance |
| Troubleshooting | `skills/reference/troubleshooting.md` | Researcher reports errors not in Quick Fixes |
## Architecture Overview

**File dependencies**: https://www.scholarag.com/codebook/architecture

**Key principle**: Scripts read from `config.yaml` (single source of truth) and never hardcode values.

**Critical scripts** (both read `project_type` from config):
- `03_screen_papers.py`: Sets the screening threshold (50% or 90%)
- `07_generate_prisma.py`: Changes the diagram title ("Knowledge Repository" vs "Systematic Review")
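As a minimal sketch of this pattern (the config path and field name here are illustrative assumptions; see `skills/reference/config_schema.md` for the actual schema):

```python
import yaml  # assumes PyYAML is available

# Hypothetical example: the project path is a placeholder, and the
# field name is an assumption based on the branching table above.
with open("projects/2025-10-24_MyProject/config.yaml") as f:
    config = yaml.safe_load(f)

# project_type drives the screening threshold (50% vs 90%)
threshold = 0.90 if config["project_type"] == "systematic_review" else 0.50
print(f"Screening threshold: {threshold:.0%}")
```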
## For Codex Users

If the researcher is using OpenAI Codex instead of Claude Code, see `AGENTS.md` for bash-based task workflows.
**The Codex workflow differs**:
- Task-oriented (not conversation-oriented)
- Bash commands (not validation rules)
- Exit codes (not metadata parsing)
**Universal reference files** (Claude and Codex both use these):
- `skills/reference/project_type_decision_tree.md`
- `skills/reference/api_reference.md`
- `skills/reference/config_schema.md`
## Token Optimization Notes

- This file: ~400 lines (loaded once per conversation)
- Stage-specific files: ~300-500 lines each (loaded on demand)
- Total per conversation: ~700-900 lines (this file + current stage file)
- Previous approach: ~2,000 lines (all context upfront)
- Token reduction: ~65% ✅
**How it works**:
1. Researcher starts Stage 1 → you load this file + `stage1_research_setup.md`
2. Researcher moves to Stage 2 → you load `stage2_query_strategy.md` (the Stage 1 file is unloaded)
3. Reference files are loaded only when the researcher asks (e.g., "How does the Semantic Scholar API work?")
## Metadata Block Format

All prompts in `prompts/*.md` contain HTML comment metadata at the top:
```html
<!-- METADATA
stage: 1
stage_name: "Research Domain Setup"
expected_duration: "15-20 minutes"
conversation_mode: "interactive"
expected_turns: "6-10"
outputs:
  required:
    - project_name: "Descriptive name"
    - research_question: "Clear, answerable question"
    - project_type: "knowledge_repository OR systematic_review"
validation_rules:
  project_type:
    required: true
    allowed_values: ["knowledge_repository", "systematic_review"]
cli_commands:
  - command: "python scholarag_cli.py init ..."
    auto_execute: true
next_stage:
  stage: 2
  prompt_file: "prompts/02_query_strategy.md"
-->
```
**How to use it**:
1. **Parse**: read the YAML inside the HTML comment (the lines between `<!-- METADATA` and `-->`); see the sketch below
2. **Extract fields**: `stage`, `expected_turns`, `validation_rules`, `cli_commands`, `next_stage`
3. **Follow the conversation pattern**: ask questions matching the `expected_turns` count
4. **Validate**: check user inputs against `validation_rules`
5. **Execute**: run `cli_commands` when the conversation is complete
6. **Transition**: show the prompt from `next_stage.prompt_file`
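A minimal parsing sketch (assumes PyYAML is installed; `parse_metadata` is a hypothetical helper, not part of the ScholaRAG codebase):

```python
import re
import yaml  # assumes PyYAML is installed


def parse_metadata(prompt_text: str) -> dict:
    """Return the YAML body of the <!-- METADATA ... --> block as a dict."""
    match = re.search(r"<!--\s*METADATA\s*\n(.*?)-->", prompt_text, re.DOTALL)
    if match is None:
        raise ValueError("No METADATA block found at the top of the prompt")
    return yaml.safe_load(match.group(1))


with open("prompts/02_query_strategy.md") as f:
    metadata = parse_metadata(f.read())

print(metadata["stage"], metadata["next_stage"]["prompt_file"])
```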
## Divergence Handling

Common researcher confusions (from the metadata `divergence_handling` field):
**Divergence 1**: "Can you help me download PDFs?" (in Stage 1)

Response: "PDF downloading happens in Stage 5 (Execution), after screening is configured in Stage 3. Right now in Stage 1, let's first define your research scope and choose a project_type. We'll design queries in Stage 2, configure PRISMA in Stage 3, design the RAG pipeline in Stage 4, then download PDFs in Stage 5."
**Divergence 2**: "I want to skip systematic review" (in Stage 1)

Response: "If you don't need a publication-quality systematic review, choose `project_type: knowledge_repository` in the next question. This mode uses lenient filtering (50% threshold) for comprehensive domain coverage (15K-20K papers). It's perfect for teaching materials, AI assistants, or exploratory research."
**Divergence 3**: "What's the difference between the two modes?" (in Stage 1)

Response: "Let me explain:

`knowledge_repository`:
- 50% threshold (lenient, removes only spam)
- 15,000-20,000 papers output
- For: teaching, exploration, AI assistant

`systematic_review`:
- 90% threshold (strict, PRISMA 2020)
- 50-300 papers output
- For: meta-analysis, publication

See the full decision tree: `skills/reference/project_type_decision_tree.md`"
## Conversation Flow Example (Stage 1)

Typical pattern (6-10 turns):
**Turn 1**: Researcher provides a research topic
- You ask: "Is this for exploratory domain mapping or a publication-quality systematic review?"

**Turns 2-3**: Researcher answers scope questions
- You suggest: a `project_type` based on the answers, explaining the threshold implications
- Example: "Based on your goal of meta-analysis, I recommend `systematic_review` mode with a 90% screening threshold."

**Turns 4-5**: Researcher confirms the project_type choice
- You suggest: year range, publication types, expected databases
- Example: "For language learning studies, I recommend 2015-2025 (10 years) focusing on Semantic Scholar and ERIC."

**Turns 6-8**: Researcher provides final details (domain, year range)
- You summarize: all decisions, and ask for confirmation
- Example: "Here's what I'll create: [summary]. Ready to initialize?"

**Turns 9-10**: Researcher confirms initialization
- You execute: `scholarag_cli.py init`, create `config.yaml`, show next steps
- Example: "✅ Project initialized! Next, let's design your search query in Stage 2."
## Completion Checklist (Stage-Specific)

Stage 1 example (from metadata `completion_checklist`):

- [ ] `project_name` is descriptive and unique (≥10 chars)
- [ ] `research_question` is specific and answerable (≥20 chars)
- [ ] `project_type` chosen with understanding of the implications (50% vs 90%)
- [ ] `year_range` is realistic for the scope (≤25 years, not before 2000)
- [ ] `config.yaml` created successfully (file exists, valid YAML)

When all items are checked → auto-execute `scholarag_cli.py init` → show the Stage 2 prompt.
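A minimal validation sketch of these checks (`validate_stage1` is a hypothetical helper, not part of `scholarag_cli.py`; the thresholds come straight from the checklist above):

```python
# Hypothetical checklist validator mirroring the Stage 1 criteria.
def validate_stage1(outputs: dict) -> list[str]:
    """Return human-readable problems; an empty list means ready to init."""
    errors = []
    if len(outputs.get("project_name", "")) < 10:
        errors.append("project_name should be descriptive (>= 10 chars)")
    if len(outputs.get("research_question", "")) < 20:
        errors.append("research_question should be specific (>= 20 chars)")
    if outputs.get("project_type") not in ("knowledge_repository", "systematic_review"):
        errors.append("project_type must be knowledge_repository or systematic_review")
    start, end = outputs.get("year_range", (2015, 2025))
    if start < 2000 or end - start > 25:
        errors.append("year_range should start in 2000 or later and span <= 25 years")
    return errors
```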
## Example Commands You Will Execute

### Stage 1: Initialize

```bash
python scholarag_cli.py init \
  --name "AI-Chatbots-Language-Learning" \
  --question "How do AI chatbots improve speaking proficiency in EFL learners?" \
  --domain education
```
### Stage 5: Run Pipeline (All 5 Scripts)

```bash
# Fetch papers
python scripts/01_fetch_papers.py --project projects/YYYY-MM-DD_ProjectName

# Deduplicate
python scripts/02_deduplicate.py --project projects/YYYY-MM-DD_ProjectName

# Screen with AI
python scripts/03_screen_papers.py --project projects/YYYY-MM-DD_ProjectName

# Download PDFs
python scripts/04_download_pdfs.py --project projects/YYYY-MM-DD_ProjectName

# Build RAG
python scripts/05_build_rag.py --project projects/YYYY-MM-DD_ProjectName
```
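If you prefer to drive these five steps from one place, here is a minimal sketch of a sequential runner (a hypothetical convenience, not a ScholaRAG script) that stops at the first failure:

```python
# Hypothetical runner for the Stage 5 pipeline: runs the five scripts
# in order and aborts on the first nonzero exit code.
import subprocess
import sys

PROJECT = "projects/YYYY-MM-DD_ProjectName"  # placeholder path from above
SCRIPTS = [
    "scripts/01_fetch_papers.py",
    "scripts/02_deduplicate.py",
    "scripts/03_screen_papers.py",
    "scripts/04_download_pdfs.py",
    "scripts/05_build_rag.py",
]

for script in SCRIPTS:
    result = subprocess.run([sys.executable, script, "--project", PROJECT])
    if result.returncode != 0:
        sys.exit(f"{script} failed with exit code {result.returncode}")
```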
### Stage 7: Generate PRISMA

```bash
python scripts/07_generate_prisma.py --project projects/YYYY-MM-DD_ProjectName
```
## Integration with `.claude/context.json`

You should update this file after each stage:
```json
{
  "current_stage": {
    "stage": 2,
    "name": "Query Strategy",
    "status": "in_progress",
    "started_at": "2025-10-24T10:30:00Z"
  },
  "completed_stages": [
    {
      "stage": 1,
      "name": "Research Setup",
      "completed_at": "2025-10-24T10:25:00Z",
      "outputs": {
        "project_name": "AI-Chatbots-Language-Learning",
        "research_question": "How do AI chatbots improve speaking proficiency?",
        "project_type": "systematic_review"
      }
    }
  ],
  "project": {
    "name": "AI-Chatbots-Language-Learning",
    "created": "2025-10-24",
    "research_question": "How do AI chatbots improve speaking proficiency in EFL learners?",
    "project_type": "systematic_review"
  }
}
```
**Purpose**: Track progress and enable the `scholarag status` command to show the current stage.
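A minimal update sketch (`complete_stage` is a hypothetical helper, not part of `scholarag_cli.py`; the JSON shape follows the example above):

```python
import json
from datetime import datetime, timezone


def complete_stage(context_path: str, stage: int, name: str, outputs: dict) -> None:
    """Append a finished stage to completed_stages, matching the shape above."""
    with open(context_path) as f:
        context = json.load(f)
    context.setdefault("completed_stages", []).append({
        "stage": stage,
        "name": name,
        "completed_at": datetime.now(timezone.utc).isoformat(),
        "outputs": outputs,
    })
    with open(context_path, "w") as f:
        json.dump(context, f, indent=2)


complete_stage(".claude/context.json", 1, "Research Setup",
               {"project_type": "systematic_review"})
```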
## FAQ for AI Assistants

**Q: Should I always read stage files in order (1→2→3...)?**

A: No! Read only the file for the stage the researcher is currently in. Use progressive disclosure.
**Q: What if the researcher jumps to Stage 5 without completing Stages 1-4?**

A: Check `.claude/context.json` for completed stages. If prerequisites are missing, politely redirect:

"Stage 5 requires `config.yaml` from Stage 1, a `search_query` from Stage 2, and PRISMA criteria from Stage 3. Let's complete those first."
**Q: When should I read `skills/reference/` files?**

A: Only when the researcher explicitly asks. Examples:
- "How does the Semantic Scholar API work?" → read `api_reference.md`
- "What are all the config.yaml fields?" → read `config_schema.md`
- "Why should I choose systematic_review?" → read `project_type_decision_tree.md`
**Q: What if I don't understand the metadata in `prompts/*.md`?**

A: All metadata fields are documented in `skills/claude_only/metadata_spec.md`. Read that file if you encounter unknown fields.
## Additional Resources

**Detailed implementation guide**: see `CLAUDE.md` for:
- 🎓 User profile (researchers with limited coding experience)
- How Claude Code should behave (DO/DON'T guidelines)
- Auto-execution patterns (echo pipes, CLI arguments)
- Full CLI reference and troubleshooting

**For Codex/Cursor users**: see `AGENTS.md` for task-based bash workflows.
**Last Updated**: 2025-10-24 (v2.0 - Agent Skills Integration)
**Companion files**: `CLAUDE.md` (detailed guide), `AGENTS.md` (Codex workflows)
**Compatible with**: Claude Code v1.0+, Anthropic API
**Token Budget**: ~380 lines (this file) + ~300-500 lines (stage file) = ~700-900 lines per conversation