SKILL.md

name: semantic-search
description: Use this when deciding between semantic search and grep/glob for code discovery. Apply for concept-based queries (find payment processing), intent-based searches (how is auth implemented), or when user doesn't know exact class names. Use grep for exact matches like specific function names.
categories: pattern, technique, search
tags: semantic-search, weaviate, embeddings, discovery
version: 1.0.0

Semantic Search Technique

Purpose

Decision framework and execution guide for using semantic search effectively in CodeCompass.

When to Use Semantic Search

✅ Use Semantic Search When

1. Concept-based Queries

  • "Find code that handles payment processing"
  • "Where do we validate email addresses?"
  • "Show me error handling patterns"

2. Intent-based Queries

  • "How is user authentication implemented?"
  • "What code calculates shipping costs?"
  • "Find business rules for order approval"

3. Cross-language/Cross-file

  • Searching across PHP, TypeScript, config files
  • Pattern discovery across multiple modules
  • Finding similar implementations

4. Fuzzy/Exploratory

  • User doesn't know exact class/function names
  • Exploring unfamiliar codebase
  • "Code that does something like X"

5. Natural Language

  • "Show me all database migrations"
  • "Find controllers that handle file uploads"
  • "Where are API rate limits defined?"

❌ Use Grep/Glob When

1. Exact Matches

  • "Find class named PaymentController"
  • "Where is processPayment function defined?"
  • "Find all imports of UserService"

2. Syntax Patterns

  • "Find all functions starting with get"
  • "Show me all @Injectable() decorators"
  • "Find TypeScript interfaces"

3. Performance Critical

  • Quick lookups in known files
  • Repeated searches in tight loops
  • When you know exact location

4. Structural Queries

  • "Find all .ts files in src/modules"
  • "List all test files"
  • "Show directory structure"

Execution Guide

Step 1: Formulate Effective Query

❌ Bad Queries (too vague):

  • "payment"
  • "code"
  • "function"

✅ Good Queries (specific context):

  • "business logic for processing customer payments and updating order status"
  • "validation rules for user email and password requirements"
  • "error handling patterns for database connection failures"

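Why: A query that states purpose, entity, and context gives the embedding model more to work with, so the nearest vectors are more likely to be the code you actually want.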

Formula:

[Action/Purpose] for [Specific Entity] with [Context/Constraints]

Examples:

  • "Extract business capabilities from Yii2 controllers"
  • "Validation logic for user registration with email verification"
  • "Database migration patterns for schema versioning"
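
The formula above can be sketched as a small helper. This is only an illustration: `QueryParts` and `buildQuery` are hypothetical names, not part of CodeCompass, and the context part is treated as optional.

```typescript
// Hypothetical helper (not a CodeCompass API): composes a query string
// from the three slots of the formula.
interface QueryParts {
  action: string;   // [Action/Purpose]
  entity: string;   // [Specific Entity]
  context?: string; // [Context/Constraints], optional
}

function buildQuery(parts: QueryParts): string {
  const base = parts.action.trim() + " for " + parts.entity.trim();
  const context = parts.context ? parts.context.trim() : "";
  // Only append the "with ..." clause when a context was supplied.
  return context.length > 0 ? base + " with " + context : base;
}

const exampleQuery = buildQuery({
  action: "Validation logic",
  entity: "user registration",
  context: "email verification",
});
// exampleQuery === "Validation logic for user registration with email verification"
```

The resulting string can be passed to `codecompass search:semantic` unchanged.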

Step 2: Verify Indexing

Before searching, ensure codebase is indexed:

# Check if indexed
curl http://localhost:8081/v1/schema

# Should show collections like:
# - CodeContext
# - AtlasCode

If not indexed:

codecompass batch:index <path-to-codebase>
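
If the check needs to be automated, the schema response can be inspected in code. The sketch below assumes the response of `GET /v1/schema` has already been parsed into an object with a `classes` array whose entries carry a `class` name, which is how Weaviate's REST schema endpoint is documented; `missingCollections` is a hypothetical helper. A collection being present does not prove it contains objects, so treat this as a first check only.

```typescript
// The subset of the Weaviate schema response that this sketch reads.
interface WeaviateSchema {
  classes?: { class: string }[];
}

// Returns the expected collections that are absent from the schema.
function missingCollections(schema: WeaviateSchema, expected: string[]): string[] {
  const present = (schema.classes || []).map((c) => c.class);
  return expected.filter((collection) => present.indexOf(collection) === -1);
}

// Example: a schema in which only CodeContext has been created.
const sampleSchema: WeaviateSchema = { classes: [{ class: "CodeContext" }] };
const missing = missingCollections(sampleSchema, ["CodeContext", "AtlasCode"]);
// missing contains only "AtlasCode", so indexing has not completed for it.
```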

Step 3: Execute Search

codecompass search:semantic "business logic for payment processing"

Alternative (if using as library):

const results = await searchService.semanticSearch({
  query: "business logic for payment processing",
  limit: 10,
  certainty: 0.7 // Minimum relevance score
});

Step 4: Interpret Results

Check relevance scores:

  • >0.8: Highly relevant (strong conceptual match)
  • 0.7-0.8: Good match (related)
  • 0.6-0.7: Moderate match (possibly relevant)
  • <0.6: Weak match (may be noise)

Verify context:

  • Does the returned code actually match the intent of the query?
  • Do the results come from the modules you expected?
  • Several related files returned together is a good signal
  • Isolated, unrelated matches suggest the query needs refining
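
The score bands above can be applied mechanically before reading results. The following is a minimal sketch: the `ScoredResult` shape and both function names are hypothetical, and since the bands do not say which side the boundary values fall on, this version assumes 0.8 counts as "good" and 0.7 as "good" rather than "moderate".

```typescript
type Relevance = "high" | "good" | "moderate" | "weak";

// Maps a relevance score onto the bands listed above.
// Boundary handling (exactly 0.8, 0.7, 0.6) is an assumption of this sketch.
function classifyScore(score: number): Relevance {
  if (score > 0.8) return "high";
  if (score >= 0.7) return "good";
  if (score >= 0.6) return "moderate";
  return "weak";
}

// Hypothetical result shape; real results may carry more fields.
interface ScoredResult {
  filePath: string;
  score: number;
}

// Drops results below minScore and sorts the rest best-first.
// filter() returns a new array, so the input is not mutated by sort().
function keepUseful(results: ScoredResult[], minScore: number = 0.7): ScoredResult[] {
  return results
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```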

Step 5: Refine if Needed

Too many results (>50):

  • Add more specific context to query
  • Increase certainty threshold
  • Add domain constraints ("in authentication module")

Too few results (<3):

  • Broaden query (less specific)
  • Lower certainty threshold
  • Check if area is actually indexed
  • Try related terms/synonyms

Wrong results:

  • Rephrase query with different terminology
  • Add negative constraints
  • Try breaking into multiple specific queries
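
The count-based rules above lend themselves to a small helper that proposes the next certainty threshold. This is a sketch under stated assumptions: the step size of 0.05 and the clamp range of 0.5 to 0.95 are illustrative values chosen here, not CodeCompass defaults, and `suggestRefinement` is a hypothetical name.

```typescript
interface Refinement {
  action: "narrow" | "broaden" | "keep";
  certainty: number; // threshold to use for the next search
}

// Rounds to two decimals to avoid floating-point noise such as 0.7500000001.
function round2(value: number): number {
  return Math.round(value * 100) / 100;
}

// Applies the ">50 results" and "<3 results" rules from the text.
function suggestRefinement(resultCount: number, certainty: number): Refinement {
  const step = 0.05; // assumed adjustment size
  if (resultCount > 50) {
    return { action: "narrow", certainty: Math.min(0.95, round2(certainty + step)) };
  }
  if (resultCount < 3) {
    return { action: "broaden", certainty: Math.max(0.5, round2(certainty - step)) };
  }
  return { action: "keep", certainty: certainty };
}
```

Adjusting the threshold is only one lever. Rewording the query, as described above, usually matters more when the results are wrong rather than merely too many or too few.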

Behind the Scenes

Architecture

Query Text
    ↓
Ollama Embedding (mxbai-embed-large)
    ↓
1024-dimensional vector
    ↓
Weaviate Vector Search (cosine similarity)
    ↓
Ranked Results
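
To make the ranking step concrete, here is a minimal sketch of cosine similarity between two embedding vectors. Weaviate computes this internally over its index; the code is only an illustration of the metric. The conversion to certainty follows Weaviate's documented relation for the cosine metric, certainty = (1 + cosine) / 2, and both function names are hypothetical.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Maps cosine similarity [-1, 1] onto a certainty value in [0, 1].
function cosineToCertainty(cosine: number): number {
  return (1 + cosine) / 2;
}
```

Under this mapping, a certainty threshold of 0.7 corresponds to a cosine similarity of 0.4. Real query vectors from mxbai-embed-large have 1024 components rather than the handful used in a toy example.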

Key Components

From .ai/capabilities.json:

  • Module: search, vectorizer, weaviate
  • Embedding: Ollama mxbai-embed-large (1024 dimensions)
  • Vector DB: Weaviate with HNSW indexing
  • Collections: CodeContext, AtlasCode

Configuration (from .env):

EMBEDDING_SERVICE=ollama
OLLAMA_EMBEDDING_MODEL=mxbai-embed-large
OLLAMA_URL=http://localhost:11434
CODECOMPASS_WEAVIATE_URL=http://localhost:8081
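
For code that consumes these settings, a typed loader keeps the variable names in one place. This is a sketch only: `SearchConfig` and `loadConfig` are hypothetical, and using the values shown above as fallbacks is an assumption, not documented CodeCompass behaviour. The environment is passed in as a plain object so the function can be exercised without touching the real process environment.

```typescript
interface SearchConfig {
  embeddingService: string;
  embeddingModel: string;
  ollamaUrl: string;
  weaviateUrl: string;
}

// Reads the four variables listed above, falling back to the sample values
// when a variable is unset or empty (fallbacks are an assumption of this sketch).
function loadConfig(env: { [key: string]: string | undefined }): SearchConfig {
  return {
    embeddingService: env["EMBEDDING_SERVICE"] || "ollama",
    embeddingModel: env["OLLAMA_EMBEDDING_MODEL"] || "mxbai-embed-large",
    ollamaUrl: env["OLLAMA_URL"] || "http://localhost:11434",
    weaviateUrl: env["CODECOMPASS_WEAVIATE_URL"] || "http://localhost:8081",
  };
}
```

In a Node.js program the argument would typically be `process.env`.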

Advanced Patterns

Pattern 1: Multi-Query Exploration

For complex questions, break into multiple searches:

# Instead of one overloaded query:
codecompass search:semantic "authentication and authorization and session management"

# Run one focused query per concern:
codecompass search:semantic "user authentication login process"
codecompass search:semantic "authorization and access control"
codecompass search:semantic "session management and tokens"
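
When several focused queries are run, their results usually overlap. One way to combine them, sketched below, is to keep the best score seen for each file and sort the merged list. The `SearchHit` shape and `mergeHits` are hypothetical and assume each hit can be identified by its `filePath`.

```typescript
// Hypothetical hit shape; filePath is used as the identity of a hit.
interface SearchHit {
  filePath: string;
  score: number;
}

// Merges result sets from several queries, keeping the highest-scoring
// hit per file, and returns the merged hits sorted best-first.
function mergeHits(resultSets: SearchHit[][]): SearchHit[] {
  const best: { [filePath: string]: SearchHit } = {};
  for (const hits of resultSets) {
    for (const hit of hits) {
      const current = best[hit.filePath];
      if (current === undefined || hit.score > current.score) {
        best[hit.filePath] = hit;
      }
    }
  }
  return Object.keys(best)
    .map((filePath) => best[filePath])
    .sort((a, b) => b.score - a.score);
}
```

A file that appears in more than one result set is a reasonable candidate to read first, since it touches several of the concerns being explored.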

Pattern 2: Iterative Refinement

# 1. Broad search
codecompass search:semantic "payment processing"

# 2. Review results, identify specific module
# 3. Narrow search
codecompass search:semantic "payment gateway integration in PaymentController"

# 4. Pinpoint implementation
codecompass search:semantic "Stripe API call for processing credit cards"

Pattern 3: Cross-Domain Search

Search across different aspects:

# Code implementation
codecompass search:semantic "email validation logic"

# Tests
codecompass search:semantic "test cases for email validation"

# Configuration
codecompass search:semantic "email service configuration"

Common Pitfalls

❌ Pitfall 1: Searching Before Indexing

Symptom: No results or error
Solution: Run codecompass batch:index first

❌ Pitfall 2: Too Vague Queries

Symptom: Returns everything or nothing useful
Solution: Add specific context and intent

❌ Pitfall 3: Expecting Exact Matches

Symptom: "Why didn't it find function processPayment?"
Reason: Semantic search is for concepts, not exact names
Solution: Use grep for exact matches

❌ Pitfall 4: Ignoring Relevance Scores

Symptom: Reading irrelevant results
Solution: Filter by score >0.7, ignore weak matches

❌ Pitfall 5: Single Query for Complex Questions

Symptom: Poor results for multi-faceted questions
Solution: Break into multiple targeted queries

Decision Tree

┌─────────────────────────────────────┐
│ I need to find code that...         │
└─────────────────────────────────────┘
              ↓
        ┌─────────┐
        │ Know    │ Exact class/function name?
        │ exact   │
        │ name?   │
        └─────────┘
          ↙     ↘
        YES      NO
         ↓        ↓
    Use Grep  ┌─────────┐
              │ Concept │ Searching by meaning/purpose?
              │ search? │
              └─────────┘
                ↙     ↘
              YES      NO
               ↓        ↓
          Semantic  ┌─────────┐
          Search    │ Pattern │ Looking for code pattern?
                    │ match?  │
                    └─────────┘
                      ↙     ↘
                    YES      NO
                     ↓        ↓
                Use Glob  Use both
                          (Glob + Semantic)
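
The decision tree can also be written as a function, which makes the order of the checks explicit. This is a direct transcription of the diagram, not CodeCompass code; `SearchNeed` and `chooseTool` are hypothetical names.

```typescript
type Tool = "grep" | "semantic" | "glob" | "glob+semantic";

// One flag per question in the decision tree.
interface SearchNeed {
  knowsExactName: boolean;     // "Exact class/function name?"
  searchingByMeaning: boolean; // "Searching by meaning/purpose?"
  lookingForPattern: boolean;  // "Looking for code pattern?"
}

// Questions are checked in the same order as the tree, so an exact name
// always wins even if the other flags are also set.
function chooseTool(need: SearchNeed): Tool {
  if (need.knowsExactName) return "grep";
  if (need.searchingByMeaning) return "semantic";
  if (need.lookingForPattern) return "glob";
  return "glob+semantic";
}
```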

Performance Considerations

Speed

  • Grep: Milliseconds (fast, synchronous)
  • Semantic Search: 100-500ms (embedding + vector search)

Tradeoff: Semantic is slower but finds conceptually related code

Token Cost (Embeddings)

  • Each query → 1 embedding generation
  • Ollama local → No API cost
  • But consumes local compute

Scaling

  • Small codebase (<1K files): Either method fine
  • Medium codebase (1K-10K files): Semantic search advantage grows
  • Large codebase (>10K files): Semantic search matters most for concept-based discovery; grep remains the right tool for exact names

Integration with Other Tools

With Yii2 Analysis

# 1. Analyze Yii2 project
codecompass analyze:yii2 <path>

# 2. Index results
codecompass batch:index <path>

# 3. Explore with semantic search
codecompass search:semantic "Yii2 controller actions for user management"

With Requirements Extraction

# 1. Extract requirements
codecompass requirements:extract

# 2. Search extracted requirements
codecompass search:semantic "business rules for order validation"

With Weaviate Direct Query

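# Alternative: Query Weaviate GraphQL API directly.
# The GraphQL query is kept on one line because a JSON string cannot contain raw newlines.
# Note: nearText only works if the collection has a text vectorizer module configured in Weaviate.
curl -X POST http://localhost:8081/v1/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ Get { CodeContext(nearText: { concepts: [\"payment processing\"] }, limit: 10) { content filePath } } }"}'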
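
When the request is built in code instead of by hand, letting `JSON.stringify` do the quoting avoids escaping mistakes in the nested GraphQL string. The sketch below only builds the request body; `buildNearTextBody` is a hypothetical helper, and the `content` and `filePath` fields are taken from the query above.

```typescript
// Builds the JSON body for POST /v1/graphql with a nearText query.
// JSON.stringify is used twice: once to quote the concept inside the
// GraphQL query, and once to encode the whole query as a JSON string.
function buildNearTextBody(collection: string, concept: string, limit: number): string {
  const query =
    "{ Get { " + collection +
    "(nearText: { concepts: [" + JSON.stringify(concept) + "] }, limit: " + limit + ") " +
    "{ content filePath } } }";
  return JSON.stringify({ query: query });
}

const requestBody = buildNearTextBody("CodeContext", "payment processing", 10);
```

The returned string can be sent as the request body with a `Content-Type: application/json` header, in place of the `-d` argument shown in the curl command.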

Related Skills

  • 0-discover-capabilities.md - How to discover modules
  • analyze-yii2-project.md - Uses semantic search in workflow

Related Modules

From .ai/capabilities.json:

  • search - SearchService, IntegratedSearchService
  • vectorizer - Ollama embedding generation
  • weaviate - Vector database client
  • indexing - File indexing pipeline

Remember: Semantic search finds code by meaning, not by name. Choose the right tool for the job.