Claude Code Plugins

Community-maintained marketplace

Install Skill

  1. Download the skill.
  2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section.
  3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file.

Note: Verify the skill by reading through its instructions before using it.

SKILL.md

name: recursive-knowledge
description: Process large document corpora (1000+ docs, millions of tokens) through knowledge graph construction and stateful multi-hop reasoning. Use when (1) User provides a large corpus exceeding context limits, (2) Questions require connections across multiple documents, (3) Multi-hop reasoning needed for complex queries, (4) User wants persistent queryable knowledge from documents. Replaces brute-force document stuffing with intelligent graph traversal.

Recursive Knowledge Processing

Process arbitrarily large document sets through knowledge graph construction and stateful multi-hop queries. The approach is based on RLM research and adds explicit state management and termination logic.

Core Concept

Instead of stuffing documents into context (which causes degradation), this skill:

  1. Indexes documents into a knowledge graph (entities, relationships)
  2. Answers queries by traversing the graph
  3. Tracks state to avoid redundant exploration
  4. Uses confidence thresholds to know when to stop

Workflow

Phase 1: Indexing

For a new corpus, run the indexer:

python3 scripts/index_corpus.py --input /path/to/documents --output /path/to/graph.json

This extracts:

  • Entities: People, organizations, concepts, dates, locations
  • Relationships: References, mentions, contradicts, supports, relates_to
  • Metadata: Source document, position, extraction confidence

For details on entity/relationship schema, see references/graph-schema.md.
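
As a rough illustration of what the extracted records could look like, the sketch below builds two entities and one relationship and serializes them to JSON. All field names (`id`, `type`, `source_doc`, `position`, `confidence`) and the top-level layout are assumptions made for this example; the actual schema is the one described in references/graph-schema.md.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical record shapes; the real schema lives in references/graph-schema.md.
@dataclass
class Entity:
    id: str
    type: str          # e.g. "person", "organization", "concept", "date", "location"
    name: str
    source_doc: str    # document the entity was extracted from
    position: int      # offset within the source document (assumed unit: characters)
    confidence: float  # extraction confidence in [0, 1]

@dataclass
class Relationship:
    source: str        # id of the source entity
    target: str        # id of the target entity
    type: str          # e.g. "references", "mentions", "contradicts", "supports", "relates_to"
    source_doc: str
    confidence: float

def build_graph(entities, relationships):
    """Return a JSON-serializable dict holding entities and relationships."""
    return {
        "entities": {e.id: asdict(e) for e in entities},
        "relationships": [asdict(r) for r in relationships],
    }

alice = Entity("e1", "person", "Alice", "doc_001.txt", 120, 0.93)
acme = Entity("e2", "organization", "Acme", "doc_001.txt", 140, 0.88)
rel = Relationship("e1", "e2", "mentions", "doc_001.txt", 0.8)

graph = build_graph([alice, acme], [rel])
graph_json = json.dumps(graph, indent=2)
```

Keying entities by id keeps lookups cheap during traversal, while relationships stay a flat list that is easy to append to.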

Phase 2: Querying

For user queries against an indexed corpus:

python3 scripts/query.py --graph /path/to/graph.json --query "user question here"

The query engine:

  1. Parses query into target entities/relationships
  2. Finds entry points in graph
  3. Traverses with state tracking
  4. Stops when confidence threshold met
  5. Returns answer with provenance
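
The steps above can be sketched as a breadth-first traversal that records visited nodes and stops once enough evidence has accumulated. This is a minimal sketch, not the logic of scripts/query.py: the `adjacency` and `evidence` inputs are hypothetical, and combining confidences as 1 minus the product of (1 - c) is an assumed rule chosen only to make the example concrete.

```python
from collections import deque

def traverse(adjacency, evidence, entry_points, threshold=0.85, max_depth=3):
    """Breadth-first traversal that tracks visited nodes and stops early.

    adjacency: dict mapping node id -> list of neighbor ids
    evidence:  dict mapping node id -> (confidence, source_doc) for nodes
               that bear on the query (hypothetical scoring input)
    """
    visited = set(entry_points)
    queue = deque((node, 0) for node in entry_points)
    findings = []   # (node, confidence, source_doc) tuples, i.e. provenance
    miss = 1.0      # assumed model: chance that no finding so far is correct

    while queue:
        node, depth = queue.popleft()
        if node in evidence:
            conf, source = evidence[node]
            findings.append((node, conf, source))
            miss *= 1.0 - conf
            if 1.0 - miss >= threshold:
                break           # confident enough, stop exploring
        if depth >= max_depth:
            continue            # examine this node but do not expand it
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))

    return {"confidence": 1.0 - miss, "findings": findings, "visited": visited}

adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
evidence = {"b": (0.6, "doc1"), "c": (0.7, "doc2"), "e": (0.9, "doc3")}
result = traverse(adjacency, evidence, ["a"])
```

In this toy graph the traversal stops after examining "b" and "c", because their combined confidence passes the threshold before "e" is ever reached.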

Phase 3: Incremental Updates

Add new documents to existing graph:

python3 scripts/index_corpus.py --input /path/to/new_docs --output /path/to/graph.json --append
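
What `--append` does internally is not documented here. One plausible approach, sketched below under that caveat, is to merge the newly extracted records into the existing graph while skipping duplicates. The `merge_graphs` helper and the graph layout (an `entities` dict keyed by id plus a `relationships` list) are hypothetical.

```python
def merge_graphs(existing, new):
    """Merge `new` into a copy of `existing` without duplicating records."""
    merged = {
        "entities": dict(existing["entities"]),
        "relationships": list(existing["relationships"]),
    }
    for entity_id, entity in new["entities"].items():
        # Keep the existing record when an id is already present.
        merged["entities"].setdefault(entity_id, entity)

    # A relationship is treated as a duplicate if source, type and target match.
    seen = {(r["source"], r["type"], r["target"]) for r in merged["relationships"]}
    for rel in new["relationships"]:
        key = (rel["source"], rel["type"], rel["target"])
        if key not in seen:
            seen.add(key)
            merged["relationships"].append(rel)
    return merged

existing = {
    "entities": {"e1": {"name": "Alice"}},
    "relationships": [{"source": "e1", "type": "mentions", "target": "e2"}],
}
new = {
    "entities": {"e1": {"name": "Alice (dup)"}, "e3": {"name": "Bob"}},
    "relationships": [
        {"source": "e1", "type": "mentions", "target": "e2"},
        {"source": "e3", "type": "supports", "target": "e1"},
    ],
}
merged = merge_graphs(existing, new)
```

Returning a copy instead of mutating the input makes it easy to write the merged graph to a temporary file first and replace the original only after the write succeeds.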

State Management (Critical)

The key improvement over naive recursive approaches is stateful traversal. See references/state-management.md for full details.

During query execution, track:

State          Purpose
visited_nodes  Prevent re-exploring the same entities
visited_edges  Prevent re-traversing the same relationships
findings       Accumulated evidence with sources
confidence     Current certainty level (0-1)
depth          Current traversal depth
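
The tracked fields map naturally onto a small container object. The sketch below is one possible shape, not the structure used by scripts/query.py: the class name `TraversalState`, its methods, and the "keep the highest confidence" update rule are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class TraversalState:
    visited_nodes: set = field(default_factory=set)
    visited_edges: set = field(default_factory=set)
    findings: list = field(default_factory=list)
    confidence: float = 0.0
    depth: int = 0

    def visit(self, source, relation, target):
        """Record an edge traversal; return False if it was already explored."""
        edge = (source, relation, target)
        if edge in self.visited_edges:
            return False
        self.visited_edges.add(edge)
        self.visited_nodes.update((source, target))
        return True

    def add_finding(self, claim, source_doc, confidence):
        """Store a piece of evidence together with its source document."""
        self.findings.append({"claim": claim, "source": source_doc})
        # Assumed, deliberately simple rule: keep the highest confidence seen.
        self.confidence = max(self.confidence, confidence)

state = TraversalState()
first = state.visit("X", "worked_on", "Project Y")
second = state.visit("X", "worked_on", "Project Y")  # same edge again
state.add_finding("X worked on Project Y", "doc_017.txt", 0.7)
```

Because `visit` reports whether an edge is new, the caller can skip the already-explored branch without a separate lookup.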

Termination conditions:

STOP if any of the following holds:
  - confidence >= 0.85 (high certainty)
  - len(corroborating_sources) >= 3 (multiple agreement)
  - depth > max_depth (prevent infinite exploration)
  - all relevant paths exhausted
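
These conditions can be expressed as a single check that returns as soon as one of them holds. The sketch below mirrors the thresholds listed above; the function name, its parameters, the `frontier` argument (nodes still waiting to be explored), and the choice to count only distinct sources are assumptions.

```python
def should_stop(confidence, corroborating_sources, depth, max_depth, frontier,
                confidence_threshold=0.85, min_sources=3):
    """Return (stop, reason) for the termination conditions listed above."""
    if confidence >= confidence_threshold:
        return True, "high certainty"
    # Assumption: the same source cited repeatedly counts only once.
    if len(set(corroborating_sources)) >= min_sources:
        return True, "multiple sources agree"
    if depth > max_depth:
        return True, "max depth exceeded"
    if not frontier:
        return True, "all relevant paths exhausted"
    return False, "continue"

decision = should_stop(0.5, ["doc1", "doc2", "doc3"], depth=1, max_depth=5,
                       frontier=["n1"])
```

Returning the reason alongside the decision makes it possible to report why a query ended, which helps when tuning `max_depth` or the confidence threshold.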

Multi-Hop Reasoning

For questions requiring connections across documents:

  1. Identify query components (which entities and facts are needed)
  2. Find entry points for each component
  3. Traverse from each entry point
  4. Look for path intersections
  5. Synthesize findings at intersection points

Example: "Who worked with X on project Y?"

  • Entry point 1: Entity "X" → relationships → projects
  • Entry point 2: Entity "Project Y" → relationships → people
  • Intersection: People connected to both X and Project Y

See references/traversal-patterns.md for patterns.
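
The "Who worked with X on project Y?" example reduces to intersecting the neighborhoods of the two entry points. The sketch below shows this on a toy edge list; the entity names, relation labels, and the `neighbors` helper are made up for illustration and are not taken from references/traversal-patterns.md.

```python
def neighbors(edges, node):
    """Return every entity directly connected to `node`, in either direction."""
    linked = set()
    for source, _relation, target in edges:
        if source == node:
            linked.add(target)
        elif target == node:
            linked.add(source)
    return linked

# Hypothetical toy edges: (source, relation, target)
edges = [
    ("X", "worked_on", "Project Y"),
    ("Dana", "worked_on", "Project Y"),
    ("Dana", "worked_with", "X"),
    ("Lee", "worked_on", "Project Y"),
    ("Sam", "worked_with", "X"),
]

near_x = neighbors(edges, "X")                  # entry point 1
near_project = neighbors(edges, "Project Y")    # entry point 2

# People connected to both entry points, excluding the entry points themselves.
collaborators = (near_x & near_project) - {"X", "Project Y"}
```

Here only "Dana" is linked to both X and Project Y. "Lee" and "Sam" each touch one entry point, so they fall outside the intersection.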

When NOT to Use This Skill

  • Small document sets that fit in context (<50k tokens) - just use direct context
  • Simple keyword search - use grep/search tools instead
  • No multi-hop reasoning needed - simpler approaches work
  • Real-time streaming data - this is for static corpora

File Reference

  • scripts/index_corpus.py - Build graph from documents
  • scripts/query.py - Execute queries with state management
  • scripts/graph_ops.py - Graph CRUD utilities
  • references/graph-schema.md - Entity and relationship types
  • references/state-management.md - Termination and confidence logic
  • references/traversal-patterns.md - Multi-hop query patterns