| name | claudemem-search |
| description | ⚡ PRIMARY TOOL for semantic code understanding with LLM enrichment. ANTI-PATTERNS: Reading 5+ files sequentially, Glob then read all, Grep for 'how does X work'. CORRECT: claudemem search first (use --use-case navigation for agents), Read specific lines after. |
| allowed-tools | Bash, Task, AskUserQuestion |
Claudemem Semantic Code Search Expert (v0.2.0)
This Skill provides comprehensive guidance on leveraging claudemem v0.2.0 with LLM enrichment for intelligent, context-aware semantic code search.
What's New in v0.2.0
┌─────────────────────────────────────────────────────────────┐
│ CLAUDEMEM v0.2.0 ARCHITECTURE │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ SEARCH LAYER ││
│ │ Query → Embed → Vector Search + BM25 → Ranked Results ││
│ └─────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ ENRICHMENT LAYER (LLM) ⭐NEW ││
│ │ file_summary │ symbol_summary │ idiom │ usage_example ││
│ │ (1 call/file)│ (batched/file) │ │ ││
│ └─────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ INDEX LAYER ││
│ │ AST Parse → Chunk (functions/classes) → Embed → LanceDB ││
│ └─────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────┘
Key Innovation: Dual Matching
Search queries now match BOTH:
- Raw code chunks (exact implementation, syntax)
- LLM-enriched summaries (semantic meaning, purpose, behavior)
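This improves semantic matching over v0.1.x, which returned only raw code chunks: a query phrased in terms of purpose or behavior can now match a summary even when the code itself shares few terms with the query.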
Document Types (NEW in v0.2.0)
1. code_chunk (Raw AST Code)
Source: Tree-sitter AST parsing
Content: Actual code blocks (functions, classes, methods)
Best for: Implementation details, signatures, exact syntax
| Field | Description |
|---|---|
| id | SHA256 hash |
| content | Raw code |
| filePath | File location |
| startLine / endLine | Line numbers (1-indexed) |
| chunkType | function, class, method, module, block |
| name | Function/class name |
| signature | Extracted signature |
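The fields above can be modeled as a small type. This is a sketch for readers who consume search results programmatically, not claudemem's actual source: the `CodeChunk` and `ChunkType` names, the `chunkLineCount` helper, the example path, and the assumption that `endLine` is inclusive and that every field is always present are all hypothetical.

```typescript
// Hypothetical type mirroring the code_chunk fields listed in the table.
type ChunkType = "function" | "class" | "method" | "module" | "block";

interface CodeChunk {
  id: string;        // SHA256 hash (what exactly is hashed is not stated here)
  content: string;   // raw code
  filePath: string;  // file location
  startLine: number; // 1-indexed
  endLine: number;   // 1-indexed, ASSUMED inclusive
  chunkType: ChunkType;
  name: string;      // function/class name
  signature: string; // extracted signature
}

// Number of source lines a chunk spans, under the inclusive-endLine assumption.
function chunkLineCount(chunk: CodeChunk): number {
  return chunk.endLine - chunk.startLine + 1;
}

const exampleChunk: CodeChunk = {
  id: "<sha256 hex digest>",
  content: "export function enrichFiles(files, options) { /* ... */ }",
  filePath: "src/core/enricher.ts", // illustrative path only
  startLine: 10,
  endLine: 12,
  chunkType: "function",
  name: "enrichFiles",
  signature: "enrichFiles(files, options)",
};

console.log(chunkLineCount(exampleChunk)); // 3
```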
When to prioritize:
- Finding exact implementations
- Looking up function signatures
- Code completion (FIM)
- Syntax-level understanding
2. file_summary (LLM-Enriched) ⭐NEW
Source: LLM analysis (1 call per file)
Content: File purpose, exports, dependencies, patterns
Best for: Architecture discovery, understanding file roles
Example enriched content:
File: src/core/indexer.ts
Purpose: Core indexing orchestrator for claudemem
Responsibilities:
- Coordinates file scanning, parsing, and embedding
- Manages incremental updates via content hashing
- Integrates with enrichment pipeline for LLM summaries
Exports: CodebaseIndexer, IndexStatus
Dependencies: VectorStore, FileTracker, Enricher
Patterns: Factory pattern, progress callbacks
When to prioritize:
- Understanding codebase structure
- Finding entry points
- Mapping dependencies
- Architecture analysis
3. symbol_summary (LLM-Enriched, Batched) ⭐NEW
Source: LLM analysis (1 call for ALL symbols in file)
Content: Function/class documentation
Best for: API understanding, finding by behavior
Example enriched content:
function: enrichFiles
Summary: Enriches multiple files using batched LLM calls for efficiency
Parameters:
- files: Array of files with content and code chunks
- options: Concurrency and progress callback settings
Returns: EnrichmentResult with document counts and errors
Side effects: Stores documents in vector store, updates tracker
Usage: Called during index --enrich or standalone enrich command
When to prioritize:
- Finding functions by behavior (not name)
- Understanding parameters and returns
- Identifying side effects
- API exploration
Search Use Cases & Weight Presets ⭐NEW
claudemem v0.2.0 provides three optimized search modes:
1. FIM (Fill-in-Middle) Completion
Use case: Code completion, autocomplete
Optimizes for: Exact code patterns
claudemem search "async function handle" --use-case fim
Weight distribution:
| Document Type | Weight |
|---|---|
| code_chunk | 50% |
| usage_example | 25% |
| idiom | 15% |
| symbol_summary | 10% |
2. Search (Human Queries) - DEFAULT
Use case: Developer searching codebase
Optimizes for: Balanced understanding
claudemem search "authentication flow" # default mode
Weight distribution:
| Document Type | Weight |
|---|---|
| file_summary | 25% |
| symbol_summary | 25% |
| code_chunk | 20% |
| idiom | 15% |
| usage_example | 10% |
3. Navigation (Agent Discovery) ⭐RECOMMENDED FOR AGENTS
Use case: AI agent exploring codebase
Optimizes for: Understanding structure
claudemem search "authentication middleware" --use-case navigation
Weight distribution:
| Document Type | Weight |
|---|---|
| symbol_summary | 35% |
| file_summary | 30% |
| code_chunk | 20% |
| idiom | 10% |
| project_doc | 5% |
⚠️ IMPORTANT: When using claudemem in detective agents, ALWAYS use --use-case navigation for optimal results.
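To make the presets concrete, the sketch below encodes the three weight tables as data and reranks hits with them. The weights are copied from the tables above; everything else is an assumption for illustration. In particular, this document does not state how claudemem combines a weight with a match score, so the `similarity * weight` rule, the `Hit` shape, and the function names are hypothetical. The default preset as listed sums to 95%, so the sketch normalizes each preset before use.

```typescript
type DocType =
  | "code_chunk" | "file_summary" | "symbol_summary"
  | "idiom" | "usage_example" | "project_doc";
type UseCase = "fim" | "search" | "navigation";

// Weights copied from the three tables (as fractions).
const PRESETS: Record<UseCase, Partial<Record<DocType, number>>> = {
  fim: { code_chunk: 0.5, usage_example: 0.25, idiom: 0.15, symbol_summary: 0.1 },
  search: { file_summary: 0.25, symbol_summary: 0.25, code_chunk: 0.2, idiom: 0.15, usage_example: 0.1 },
  navigation: { symbol_summary: 0.35, file_summary: 0.3, code_chunk: 0.2, idiom: 0.1, project_doc: 0.05 },
};

// Sum of the listed weights for one preset.
function totalWeight(useCase: UseCase): number {
  const preset = PRESETS[useCase];
  let sum = 0;
  for (const key of Object.keys(preset)) {
    sum += preset[key as DocType] ?? 0;
  }
  return sum;
}

// Weight of a document type, rescaled so each preset sums to 1.
function normalizedWeight(useCase: UseCase, docType: DocType): number {
  return (PRESETS[useCase][docType] ?? 0) / totalWeight(useCase);
}

// Hypothetical hit shape: a document type plus a raw similarity in [0, 1].
interface Hit { docType: DocType; similarity: number; }

// ASSUMPTION: rank by similarity * weight. The real ranking may differ.
function rerank(hits: Hit[], useCase: UseCase): Hit[] {
  return hits.slice().sort(
    (a, b) =>
      b.similarity * normalizedWeight(useCase, b.docType) -
      a.similarity * normalizedWeight(useCase, a.docType),
  );
}

const hits: Hit[] = [
  { docType: "code_chunk", similarity: 0.9 },
  { docType: "symbol_summary", similarity: 0.8 },
];
console.log(rerank(hits, "navigation")[0].docType); // symbol_summary
console.log(rerank(hits, "fim")[0].docType);        // code_chunk
```

Under this assumed rule, the same two hits swap places between navigation and fim, which is the behavior the presets are meant to produce.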
CLI Commands Reference (Updated for v0.2.0)
Index Codebase
# Basic indexing (AST + embeddings only)
claudemem index [path]
# Force full re-index
claudemem index -f
# Index with LLM enrichment ⭐NEW
claudemem index --enrich
# Force re-index with enrichment
claudemem index -f --enrich
Enrich Indexed Files ⭐NEW
# Run enrichment on indexed files
claudemem enrich [path]
# Control parallelism (default: 10)
claudemem enrich --concurrency 5
# Enrich specific path
claudemem enrich ./src/core
Search
# Semantic search (default: search use case)
claudemem search "authentication middleware"
# Limit results
claudemem search "error handling" -n 20
# Filter by language
claudemem search "class definition" -l typescript
# Specific use case ⭐NEW
claudemem search "validate input" --use-case navigation
claudemem search "async handler" --use-case fim
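When an agent or script assembles these commands, building the argument list in one place avoids quoting mistakes. The helper below is a minimal sketch: `buildSearchArgs` and `SearchOptions` are hypothetical names, and it uses only the flags shown above (`-n`, `-l`, `--use-case`). It omits `--use-case` for the default mode, matching the examples, because this document never shows the flag being passed with the value `search`.

```typescript
type UseCase = "fim" | "search" | "navigation";

interface SearchOptions {
  limit?: number;    // maps to -n
  language?: string; // maps to -l
  useCase?: UseCase; // maps to --use-case
}

// Build the argv for `claudemem search` from structured options.
function buildSearchArgs(query: string, options: SearchOptions = {}): string[] {
  const args = ["search", query];
  if (options.limit !== undefined) args.push("-n", String(options.limit));
  if (options.language !== undefined) args.push("-l", options.language);
  // "search" is the default mode, so the flag is added only for the other two.
  if (options.useCase !== undefined && options.useCase !== "search") {
    args.push("--use-case", options.useCase);
  }
  return args;
}

console.log(buildSearchArgs("validate input", { useCase: "navigation", limit: 20 }).join(" "));
```

The resulting array can be passed as separate arguments to a process launcher such as Node's `execFileSync("claudemem", args)`, so the query never needs shell escaping.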
Status
# Show index and enrichment status
claudemem status
# Output includes:
# - Total files/chunks indexed
# - Document type counts (code_chunk, file_summary, symbol_summary) ⭐NEW
# - Enrichment progress (pending/complete) ⭐NEW
# - Embedding model used
AI Instructions
# Get role-specific instructions
claudemem ai architect # System design focus
claudemem ai developer # Implementation focus
claudemem ai tester # Test coverage focus
claudemem ai debugger # Error tracing focus
# Raw output for clipboard
claudemem ai developer --raw | pbcopy
When to Use This Skill
Claude should invoke this Skill when:
- User mentions: "claudemem", "tree-sitter search", "local semantic search"
- User wants semantic search WITHOUT cloud dependencies
- User asks: "install claudemem", "set up local code search"
- User has OpenRouter API key but not OpenAI/Zilliz
- Before launching codebase-detective when claudemem is preferred
- User asks about alternatives to claude-context
- User asks about enrichment, document types, or search modes
Phase 1: Installation Validation (REQUIRED)
Step 1: Check if claudemem is Installed
# Check if claudemem CLI is available
which claudemem || command -v claudemem
# Check version (must be 0.2.0+)
claudemem --version
If NOT installed, present installation options:
AskUserQuestion({
  questions: [{
    question: "claudemem CLI not found. How would you like to install it?",
    header: "Install",
    multiSelect: false,
    options: [
      { label: "npm (Recommended)", description: "npm install -g claude-codemem" },
      { label: "Homebrew (macOS)", description: "brew tap MadAppGang/claude-mem && brew install --cask claudemem" },
      { label: "Shell script", description: "curl -fsSL https://raw.githubusercontent.com/MadAppGang/claudemem/main/install.sh | bash" },
      { label: "Skip installation", description: "I'll install it manually later" }
    ]
  }]
})
Step 2: Check Configuration and Enrichment Status ⭐UPDATED
# Check if initialized (looks for config)
ls ~/.claudemem/config.json 2>/dev/null || echo "Not configured"
# Check for project-local index
ls .claudemem/ 2>/dev/null || echo "No local index"
# Check full status including enrichment
claudemem status
Status output now includes:
- Total files/chunks indexed
- Document type breakdown (code_chunk, file_summary, symbol_summary)
- Enrichment status (complete, pending, not run)
Step 3: Index with Enrichment (Recommended)
# Full index with LLM enrichment (recommended)
claudemem index --enrich
# Or index first, then enrich separately
claudemem index
claudemem enrich
⚠️ Without enrichment, you only get code_chunk results (v0.1.x behavior).
✅ With enrichment, you also get file_summary + symbol_summary results for much better semantic understanding.
Phase 2: Indexing Best Practices (Updated)
2.1 Initial Indexing with Enrichment
# Index with enrichment (recommended for semantic search)
claudemem index --enrich
# Or index in stages
claudemem index # Fast: AST + embeddings
claudemem enrich # Slower: LLM enrichment
What happens during enrichment:
- LLM analyzes each file (1 call/file)
- Generates file_summary with purpose, exports, patterns
- Batches symbol analysis (1 call for all symbols in file)
- Stores enriched documents in LanceDB
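The steps above imply a simple upper bound on LLM usage: one file_summary call plus one batched symbol call per file. The sketch below turns that into arithmetic. `estimateEnrichmentCalls` is a hypothetical helper, and it assumes every file triggers both calls; files without symbols may need fewer.

```typescript
// Calls per file, as described above.
const FILE_SUMMARY_CALLS_PER_FILE = 1;
const SYMBOL_BATCH_CALLS_PER_FILE = 1;

// Upper-bound estimate of LLM calls needed to enrich `fileCount` files.
function estimateEnrichmentCalls(fileCount: number): number {
  if (!isFinite(fileCount) || fileCount < 0 || Math.floor(fileCount) !== fileCount) {
    throw new RangeError("fileCount must be a non-negative integer");
  }
  return fileCount * (FILE_SUMMARY_CALLS_PER_FILE + SYMBOL_BATCH_CALLS_PER_FILE);
}

console.log(estimateEnrichmentCalls(567)); // 1134
```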
2.2 Check Enrichment Status
claudemem status
Look for:
Document Types:
code_chunk: 1,234
file_summary: 567 ← Should match file count
symbol_summary: 890 ← Functions/classes documented
Enrichment: complete ← Ready for semantic search
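A script can check these fields instead of reading them by eye. The parser below is a sketch that assumes the real `claudemem status` output follows the `key: value` shape of the excerpt above; the actual format may differ, and `parseStatus` and `ParsedStatus` are hypothetical names. The `←` annotations in the excerpt are treated as commentary and stripped.

```typescript
interface ParsedStatus {
  counts: Record<string, number>; // e.g. code_chunk -> 1234
  enrichment: string | null;      // e.g. "complete", or null if not reported
}

// Parse "key: value" lines; ASSUMES the layout shown in the excerpt.
function parseStatus(output: string): ParsedStatus {
  const counts: Record<string, number> = {};
  let enrichment: string | null = null;
  for (const rawLine of output.split("\n")) {
    // Drop trailing annotations such as "← Should match file count".
    const line = rawLine.split("←")[0].trim();
    const match = /^([A-Za-z_]+):\s*(.+)$/.exec(line);
    if (!match) continue;
    const key = match[1];
    const value = match[2].trim();
    if (key === "Enrichment") {
      enrichment = value;
    } else if (/^[\d,]+$/.test(value)) {
      // Remove thousands separators before converting.
      counts[key] = parseInt(value.replace(/,/g, ""), 10);
    }
  }
  return { counts, enrichment };
}

const sampleStatus = [
  "Document Types:",
  "  code_chunk: 1,234",
  "  file_summary: 567 ← Should match file count",
  "  symbol_summary: 890",
  "Enrichment: complete",
].join("\n");

const parsed = parseStatus(sampleStatus);
console.log(parsed.counts["file_summary"], parsed.enrichment); // 567 complete
```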
2.3 Embedding Models
claudemem --models
Curated Picks:
| Model | Best For | Price | Context |
|---|---|---|---|
| voyage/voyage-code-3 | Best Quality (default) | $0.180/1M | 32K |
| qwen/qwen3-embedding-8b | Best Balanced | $0.010/1M | 33K |
| qwen/qwen3-embedding-0.6b | Best Value | $0.002/1M | 33K |
Recommendation: Use voyage/voyage-code-3 for best code understanding (default).
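The per-million prices make it easy to estimate embedding cost before indexing. The sketch below copies the prices from the table; `embeddingCostUSD` is a hypothetical helper, and the 5M-token figure is an invented example, not a measurement of any real codebase.

```typescript
// USD per 1M tokens, copied from the table above.
const PRICE_PER_MILLION_USD: Record<string, number> = {
  "voyage/voyage-code-3": 0.18,
  "qwen/qwen3-embedding-8b": 0.01,
  "qwen/qwen3-embedding-0.6b": 0.002,
};

// Estimated cost of embedding `tokens` tokens with the given model.
function embeddingCostUSD(model: string, tokens: number): number {
  const price = PRICE_PER_MILLION_USD[model];
  if (price === undefined) throw new Error("Unknown model: " + model);
  return (tokens / 1e6) * price;
}

// Hypothetical codebase of 5M tokens: roughly $0.90 vs $0.05 vs $0.01.
for (const model of Object.keys(PRICE_PER_MILLION_USD)) {
  console.log(model, embeddingCostUSD(model, 5e6).toFixed(2));
}
```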
Phase 3: Search Query Formulation (Updated)
3.1 Use Case Selection ⭐NEW
Choose the right use case for your task:
| Task | Use Case | Command |
|---|---|---|
| Developer searching | search (default) | claudemem search "query" |
| AI agent exploring | navigation | claudemem search "query" --use-case navigation |
| Code completion | fim | claudemem search "query" --use-case fim |
3.2 Effective Query Patterns
Concept-Based Queries (Best for enriched search):
claudemem search "user authentication login flow with JWT tokens"
claudemem search "database connection pooling initialization"
claudemem search "error handling middleware for HTTP requests"
Why These Work Better with Enrichment:
- Matches file_summary (file purpose, patterns)
- Matches symbol_summary (function behavior, side effects)
- Matches code_chunk (exact implementation)
- Triple-layer matching = much higher relevance
3.3 Query Templates by Use Case
Architecture Discovery (use file_summary):
claudemem search "main entry point application bootstrap" --use-case navigation
claudemem search "service layer business logic orchestration" --use-case navigation
claudemem search "repository data access pattern" --use-case navigation
API Exploration (use symbol_summary):
claudemem search "create user account function parameters" --use-case navigation
claudemem search "validate input before save" --use-case navigation
claudemem search "error response formatting" --use-case navigation
Implementation Details (use code_chunk):
claudemem search "JWT token generation implementation"
claudemem search "password hashing bcrypt"
claudemem search "database transaction commit"
Phase 4: Integration Patterns for Agents
4.1 Pattern: Semantic-First Discovery
Anti-pattern: Sequential file reads, grep for keywords
Best practice: Semantic search → targeted file reads
// WRONG: Read all files
const files = await glob("src/**/*.ts");
for (const file of files) {
const content = await read(file);
if (content.includes("auth")) { /* ... */ }
}
// RIGHT: Semantic search first
// 1. Check enrichment status
claudemem status // Verify enrichment complete
// 2. Search with navigation use case
claudemem search "authentication flow user login" --use-case navigation -n 10
// 3. Only read high-scoring matches
// Results are ranked by combined code_chunk + file_summary + symbol_summary
4.2 Pattern: Document Type Selection
Match document type to your needs:
| Task | Primary Types | Why |
|---|---|---|
| Architecture discovery | file_summary | Understands file purposes |
| API exploration | symbol_summary | Has params, returns, side effects |
| Code completion | code_chunk | Exact syntax needed |
| Understanding behavior | symbol_summary | LLM-analyzed purpose |
| Finding patterns | file_summary | Contains detected patterns |
4.3 Pattern: Progressive Discovery
Start broad with file_summary, narrow down to symbol_summary, then code_chunk:
# Step 1: Broad architecture search (file_summary weighted)
claudemem search "authentication" --use-case navigation -n 5
# Step 2: Specific function search (symbol_summary weighted)
claudemem search "validate JWT token function" --use-case navigation -n 10
# Step 3: Implementation details (code_chunk weighted)
claudemem search "JWT verification implementation" -n 3
4.4 Pattern: Check Enrichment Before Relying on It
# ALWAYS check status first
claudemem status
# If enrichment not complete, run it
claudemem enrich
# Then search with confidence
claudemem search "auth flow" --use-case navigation
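The same check-then-search sequence can be written as a small decision function for an agent loop. This is a sketch under stated assumptions: `IndexStatus` and `nextCommand` are hypothetical, the enrichment states are the three named in this document (complete, pending, not run), and an unindexed project is assumed to be handled with `index --enrich`.

```typescript
type EnrichmentState = "complete" | "pending" | "not run";

// Hypothetical summary of what `claudemem status` reports.
interface IndexStatus {
  indexedFiles: number;
  enrichment: EnrichmentState;
}

// Decide which claudemem arguments to run next for a given query.
function nextCommand(status: IndexStatus, query: string): string[] {
  if (status.indexedFiles === 0) return ["index", "--enrich"]; // nothing indexed yet
  if (status.enrichment !== "complete") return ["enrich"];     // summaries missing
  return ["search", query, "--use-case", "navigation"];        // safe to search
}

console.log(nextCommand({ indexedFiles: 567, enrichment: "pending" }, "auth flow").join(" "));
// enrich
```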
Phase 5: MCP Server Integration
5.1 Available Tools
// Semantic search
search_code(
  query: string,
  limit?: number,      // Default: 10
  language?: string,   // Filter by language
  autoIndex?: boolean  // Auto-index changes (default: true)
)

// Index codebase
index_codebase(
  path?: string,   // Default: current directory
  force?: boolean, // Force re-index
  model?: string   // Override embedding model
)

// Get status
get_status(path?: string)

// Clear index
clear_index(path?: string)

// List models
list_embedding_models(freeOnly?: boolean)
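For callers that build tool arguments in code, the `search_code` signature above can be expressed as a typed parameter object. The sketch below is illustrative only: the interface and function names are hypothetical, and the defaults (limit 10, autoIndex true) are the ones stated in the comments above, applied client-side here purely to show them. The server presumably applies its own defaults when a field is omitted.

```typescript
// Parameters of search_code as listed above.
interface SearchCodeParams {
  query: string;
  limit?: number;
  language?: string;
  autoIndex?: boolean;
}

interface ResolvedSearchCodeParams {
  query: string;
  limit: number;
  language?: string;
  autoIndex: boolean;
}

// Fill in the documented defaults for any omitted optional field.
function resolveSearchCodeParams(params: SearchCodeParams): ResolvedSearchCodeParams {
  const resolved: ResolvedSearchCodeParams = {
    query: params.query,
    limit: params.limit ?? 10,
    autoIndex: params.autoIndex ?? true,
  };
  if (params.language !== undefined) resolved.language = params.language;
  return resolved;
}

console.log(JSON.stringify(resolveSearchCodeParams({ query: "authentication middleware" })));
```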
5.2 MCP Configuration
Add to .mcp.json:
{
  "mcpServers": {
    "claudemem": {
      "command": "claudemem",
      "args": ["--mcp"]
    }
  }
}
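If a project already has a .mcp.json with other servers, the claudemem entry should be merged in rather than overwriting the file. The function below is a minimal sketch of that merge on an in-memory object; `withClaudemem`, `McpConfig`, and `McpServer` are hypothetical names, and reading or writing the file is left to the caller.

```typescript
interface McpServer { command: string; args: string[]; }
interface McpConfig { mcpServers?: Record<string, McpServer>; }

// Return a copy of `config` with the claudemem server entry added.
function withClaudemem(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      claudemem: { command: "claudemem", args: ["--mcp"] },
    },
  };
}

// "other" is a placeholder for whatever server is already configured.
const existingConfig: McpConfig = {
  mcpServers: { other: { command: "other-server", args: [] } },
};
console.log(JSON.stringify(withClaudemem(existingConfig), null, 2));
```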
Phase 6: Score Interpretation
Understanding Search Scores
| Score | Meaning | Action |
|---|---|---|
| > 0.85 | Strong match | Use directly |
| 0.70-0.85 | Good match | Review briefly |
| 0.50-0.70 | Partial match | Verify manually |
| < 0.50 | Weak match | Refine query |
With enrichment, scores are generally higher because:
- file_summary matches purpose/intent
- symbol_summary matches behavior description
- code_chunk matches implementation
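The score bands in the table above translate directly into a triage function. This is a sketch: `actionForScore` is a hypothetical helper, and because the table's ranges share their endpoints, it assumes a boundary value falls into the higher band (0.70 counts as a good match, 0.50 as a partial match), while exactly 0.85 is not "> 0.85" and so stays in the good-match band.

```typescript
type Action = "use directly" | "review briefly" | "verify manually" | "refine query";

// Map a search score to the action suggested by the table.
function actionForScore(score: number): Action {
  if (score > 0.85) return "use directly";     // strong match
  if (score >= 0.7) return "review briefly";   // good match
  if (score >= 0.5) return "verify manually";  // partial match
  return "refine query";                       // weak match
}

console.log(actionForScore(0.91)); // use directly
console.log(actionForScore(0.62)); // verify manually
```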
Phase 7: Troubleshooting
Problem: No enriched results
# Check enrichment status
claudemem status
# Look for:
# Enrichment: not run OR incomplete
# Run enrichment if needed
claudemem enrich
Problem: Slow enrichment
# Reduce concurrency
claudemem enrich --concurrency 3
# Or enrich specific directories
claudemem enrich ./src/core
claudemem enrich ./src/services
Problem: Low search scores
- Use more descriptive queries
- Check if files are indexed AND enriched: claudemem status
- Try --use-case navigation for agent tasks
- Use language filter: -l typescript
Problem: Missing file_summary or symbol_summary
# Check document type counts
claudemem status
# If file_summary count is 0, enrichment hasn't run
claudemem enrich
# Force re-enrichment
claudemem index -f --enrich
Quality Checklist (Updated for v0.2.0)
Before completing a claudemem workflow, ensure:
- claudemem CLI is installed (v0.2.0+)
- OpenRouter API key is configured
- Codebase is indexed (check with claudemem status)
- Enrichment is complete (file_summary + symbol_summary counts > 0) ⭐NEW
- Search queries use natural language concepts
- Using appropriate use case (--use-case navigation for agents) ⭐NEW
- Results are relevant and actionable
- File locations are documented for follow-up
🔴 ANTI-PATTERNS (DO NOT DO)
╔══════════════════════════════════════════════════════════════════════════════╗
║ COMMON MISTAKES TO AVOID ║
╠══════════════════════════════════════════════════════════════════════════════╣
║ ║
║ ❌ Reading 5+ files sequentially when investigating a feature ║
║ → WHY WRONG: Token waste, no ranking, no context ║
║ → DO INSTEAD: claudemem search "feature concept" --use-case navigation ║
║ ║
║ ❌ Using Glob to find all files, then reading them one-by-one ║
║ → WHY WRONG: Gets ALL files, not RELEVANT files ║
║ → DO INSTEAD: claudemem search "what you're looking for" ║
║ ║
║ ❌ Using Grep for architectural questions like "how does X work" ║
║ → WHY WRONG: Text match ≠ semantic understanding ║
║ → DO INSTEAD: claudemem search "X functionality flow" --use-case navigation ║
║ ║
║ ❌ Searching without checking enrichment status ║
║ → WHY WRONG: Missing file_summary and symbol_summary matches ║
║ → DO INSTEAD: claudemem status first, enrich if needed ║
║ ║
║ ❌ Using default search mode for agent exploration ║
║ → WHY WRONG: Default weights optimized for humans, not agents ║
║ → DO INSTEAD: --use-case navigation for agent tasks ║
║ ║
╚══════════════════════════════════════════════════════════════════════════════╝
Anti-Pattern vs Correct Pattern
| Anti-Pattern | Why It's Wrong | Correct Pattern |
|---|---|---|
| claudemem search "auth" (no enrichment) | Missing LLM summaries | claudemem status → enrich → search |
| claudemem search "auth flow" (agent) | Wrong use case | claudemem search "auth flow" --use-case navigation |
| Read auth/login.ts then Read auth/session.ts... | No ranking, token waste | claudemem search "auth login session" |
| grep -r "auth" src/ | No semantic understanding | claudemem search "authentication flow" |
| Assume enrichment is done | May miss summaries | Check claudemem status first |
The Correct Workflow (v0.2.0)
┌─────────────────────────────────────────────────────────────────┐
│ CORRECT INVESTIGATION FLOW (v0.2.0) │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. claudemem status → Check index AND enrichment │
│ 2. claudemem enrich → Run if enrichment incomplete │
│ 3. claudemem search "..." → Use --use-case navigation │
│ --use-case navigation │
│ 4. Review results → See ranked file/symbol/code │
│ 5. Read specific lines → ONLY from search results │
│ │
│ ⚠️ NEVER: Start with Read/Glob for semantic questions │
│ ⚠️ NEVER: Search without verifying enrichment │
│ │
└─────────────────────────────────────────────────────────────────┘
Notes
- Requires OpenRouter API key (https://openrouter.ai) - all embedding models are paid
- NEW: Enrichment requires additional LLM calls (1 per file + 1 batch per file for symbols)
- Default model: voyage/voyage-code-3 (best code understanding, $0.180/1M tokens)
- Run claudemem --models to see all available models and choose based on budget/quality
- All data stored locally in .claudemem/ directory (no cloud storage)
- Tree-sitter provides excellent parsing for TypeScript, Go, Python, Rust
- Hybrid search combines keyword (BM25) + semantic (embeddings)
- Can run as MCP server with --mcp flag
- Initial indexing takes ~1-2 minutes for typical projects
- NEW: Enrichment adds ~5-10 minutes depending on codebase size
- Automatic change detection re-indexes modified files on search
- NEW: Use --use-case navigation for AI agent exploration
Maintained by: Jack Rudenko @ MadAppGang
Plugin: code-analysis v2.4.0
Last Updated: December 2025