| name | using-llm-specialist |
| description | LLM specialist router to prompt engineering, fine-tuning, RAG, evaluation, and safety skills. |
| mode | true |
Using LLM Specialist
You are an LLM engineering specialist. This skill routes you to the right specialized skill based on the user's LLM-related task.
When to Use This Skill
Use this skill when the user needs help with:
- Prompt engineering and optimization
- Fine-tuning LLMs (full, LoRA, QLoRA)
- Building RAG systems
- Evaluating LLM outputs
- Managing context windows
- Optimizing LLM inference
- LLM safety and alignment
Routing Decision Tree
Step 1: Identify the task category
Prompt Engineering → Use prompt-engineering-patterns
- Writing effective prompts
- Few-shot learning
- Chain-of-thought prompting
- System message design
- Output formatting
- Prompt optimization
Fine-tuning → Use llm-finetuning-strategies
- When to fine-tune vs prompt engineering
- Full fine-tuning vs LoRA vs QLoRA
- Dataset preparation
- Hyperparameter selection
- Evaluation and validation
- Catastrophic forgetting prevention
RAG (Retrieval-Augmented Generation) → Use rag-architecture-patterns
- RAG system architecture
- Retrieval strategies (dense, sparse, hybrid)
- Chunking strategies
- Re-ranking
- Context injection
- RAG evaluation
Evaluation → Use llm-evaluation-metrics
- Task-specific metrics (classification, generation, summarization)
- Human evaluation
- LLM-as-judge
- Benchmark selection
- A/B testing
- Quality assurance
Context Management → Use context-window-management
- Context window limits (4k, 8k, 32k, 128k tokens)
- Summarization strategies
- Sliding window
- Hierarchical context
- Token counting
- Context pruning
Inference Optimization → Use llm-inference-optimization
- Reducing latency
- Increasing throughput
- Batching strategies
- KV cache optimization
- Quantization (INT8, INT4)
- Speculative decoding
Safety & Alignment → Use llm-safety-alignment
- Prompt injection prevention
- Jailbreak detection
- Content filtering
- Bias mitigation
- Hallucination reduction
- Guardrails
Routing Examples
Example 1: User asks about prompts
User: "My LLM isn't following instructions consistently. How can I improve my prompts?"
Route to: prompt-engineering-patterns
- Covers instruction clarity, few-shot examples, format specification
Example 2: User asks about fine-tuning
User: "I have 10,000 examples of customer support conversations. Should I fine-tune a model or use prompts?"
Route to: llm-finetuning-strategies
- Covers when to fine-tune vs prompt engineering
- Dataset preparation
- LoRA vs full fine-tuning
Example 3: User asks about RAG
User: "I want to build a Q&A system over my company's documentation. How do I give the LLM access to this information?"
Route to: rag-architecture-patterns
- Covers RAG architecture
- Chunking strategies
- Retrieval methods
Example 4: User asks about evaluation
User: "How do I measure if my LLM's summaries are good quality?"
Route to: llm-evaluation-metrics
- Covers summarization metrics (ROUGE, BERTScore)
- Human evaluation
- LLM-as-judge
Example 5: User asks about context limits
User: "My documents are 50,000 tokens but my model only supports 8k context. What do I do?"
Route to: context-window-management
- Covers summarization, chunking, hierarchical context
Example 6: User asks about speed
User: "My LLM inference is too slow (500ms per request). How can I make it faster?"
Route to: llm-inference-optimization
- Covers quantization, batching, KV cache, speculative decoding
Example 7: User asks about safety
User: "Users are trying to jailbreak my LLM to bypass content filters. How do I prevent this?"
Route to: llm-safety-alignment
- Covers prompt injection prevention, jailbreak detection, guardrails
Multiple Skills May Apply
Sometimes multiple skills are relevant:
Example: "I'm building a RAG system and need to evaluate retrieval quality."
- Primary:
rag-architecture-patterns(RAG architecture) - Secondary:
llm-evaluation-metrics(retrieval metrics: MRR, NDCG)
Example: "I'm fine-tuning an LLM but context exceeds 4k tokens."
- Primary:
llm-finetuning-strategies(fine-tuning process) - Secondary:
context-window-management(handling long contexts)
Example: "My RAG system is slow and I need better prompts for the generation step."
- Primary:
rag-architecture-patterns(RAG architecture) - Secondary:
llm-inference-optimization(speed optimization) - Tertiary:
prompt-engineering-patterns(generation prompts)
Approach: Start with the primary skill, then reference secondary skills as needed.
Common Task Patterns
Pattern 1: Building an LLM application
- Start with prompt-engineering-patterns (get prompt right first)
- If prompts insufficient → llm-finetuning-strategies (customize model)
- If need external knowledge → rag-architecture-patterns (add retrieval)
- Validate quality → llm-evaluation-metrics (measure performance)
- Optimize speed → llm-inference-optimization (reduce latency)
- Add safety → llm-safety-alignment (guardrails)
Pattern 2: Improving existing LLM system
- Identify bottleneck:
- Quality issue → prompt-engineering-patterns or llm-finetuning-strategies
- Knowledge gap → rag-architecture-patterns
- Context overflow → context-window-management
- Slow inference → llm-inference-optimization
- Safety concern → llm-safety-alignment
- Apply specialized skill
- Measure improvement → llm-evaluation-metrics
Pattern 3: LLM research/experimentation
- Design evaluation → llm-evaluation-metrics (metrics first!)
- Baseline: prompt engineering → prompt-engineering-patterns
- If insufficient: fine-tuning → llm-finetuning-strategies
- Compare: RAG vs fine-tuning → Both skills
- Optimize best approach → llm-inference-optimization
Quick Reference
| Task | Primary Skill | Common Secondary Skills |
|---|---|---|
| Better outputs | prompt-engineering-patterns | llm-evaluation-metrics |
| Customize behavior | llm-finetuning-strategies | prompt-engineering-patterns |
| External knowledge | rag-architecture-patterns | context-window-management |
| Quality measurement | llm-evaluation-metrics | - |
| Long documents | context-window-management | rag-architecture-patterns |
| Faster inference | llm-inference-optimization | - |
| Safety/security | llm-safety-alignment | prompt-engineering-patterns |
Default Routing Logic
If task is unclear, ask clarifying questions:
- "What are you trying to achieve with the LLM?" (goal)
- "What problem are you facing?" (bottleneck)
- "Have you tried prompt engineering?" (start simple)
Then route to the most relevant skill.
Summary
This is a meta-skill that routes to specialized LLM engineering skills.
The 7 specialized skills:
- prompt-engineering-patterns: Effective prompting techniques
- llm-finetuning-strategies: When and how to fine-tune
- rag-architecture-patterns: Building retrieval-augmented systems
- llm-evaluation-metrics: Measuring LLM quality
- context-window-management: Handling long contexts
- llm-inference-optimization: Speed and efficiency
- llm-safety-alignment: Safety, security, alignment
When multiple skills apply: Start with the primary skill, reference others as needed.
Default approach: Start simple (prompts), add complexity only when needed (fine-tuning, RAG, optimization).