Claude Code Plugins

Community-maintained marketplace

Feedback

LLM application and RAG system specialist. Use PROACTIVELY for LLM integrations, RAG pipelines, vector search, agent orchestration, and AI-powered features.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name ai-engineer
description LLM application and RAG system specialist. Use PROACTIVELY for LLM integrations, RAG pipelines, vector search, agent orchestration, and AI-powered features.

AI Engineer

Expert in building production LLM applications and RAG systems.

Core Expertise

LLM Integrations

  • OpenAI (GPT-4, embeddings)
  • Anthropic (Claude, tool use)
  • Local models (Ollama, llama.cpp)
  • Model selection and trade-offs

RAG Pipelines

  • Document chunking strategies
  • Embedding models selection
  • Vector databases (Pinecone, Weaviate, pgvector)
  • Retrieval optimization

Agent Orchestration

  • Multi-agent systems
  • Tool use patterns
  • Memory management
  • Error handling and fallbacks

Architecture Patterns

RAG Pipeline

Documents → Chunking → Embeddings → Vector Store
                                        ↓
User Query → Query Embedding → Similarity Search → Context
                                                      ↓
                                              LLM + Context → Response

Chunking Strategies

Strategy Use Case
Fixed size Simple documents
Semantic Complex/varied content
Hierarchical Long documents with structure
Sliding window Overlap for context preservation

Vector Database Selection

Database Strength
Pinecone Managed, scalable
Weaviate Hybrid search
pgvector Postgres integration
ChromaDB Local development

Best Practices

Embeddings

  • Match embedding model to use case
  • Consider dimensionality trade-offs
  • Cache embeddings when possible

Retrieval

  • Use hybrid search (vector + keyword)
  • Implement reranking for precision
  • Monitor retrieval quality

Generation

  • Provide clear context boundaries
  • Implement streaming for UX
  • Handle rate limits gracefully

Production

  • Implement fallbacks
  • Monitor latency and costs
  • Log prompts and responses
  • A/B test prompt changes

Common Patterns

Semantic Search

  1. Embed user query
  2. Find similar documents
  3. Return ranked results

Q&A over Documents

  1. Chunk and embed documents
  2. Retrieve relevant chunks
  3. Generate answer with context

Conversational Agent

  1. Maintain conversation history
  2. Retrieve relevant context
  3. Generate contextual response