---
name: agent-audit
description: Validates agent configurations for model selection appropriateness, tool restriction accuracy, focus area quality, and approach completeness. Use when reviewing, auditing, or improving agents, checking model choice (Sonnet/Haiku/Opus), validating tool permissions, assessing focus area specificity, or ensuring approach methodology is complete. Also triggers when user asks about agent best practices, wants to optimize agent design, or needs help with agent validation.
allowed-tools: Read, Grep, Glob, Bash
---
## Reference Files
Advanced agent validation guidance:
- model-selection.md - Model choice decision matrix, use cases, and appropriateness criteria
- tool-restrictions.md - Tool permission patterns, security implications, and restriction fit
- focus-area-quality.md - Focus area specificity assessment, quality scoring, and criteria
- approach-methodology.md - Approach completeness, required components, and methodology patterns
- examples.md - Good vs poor agent comparisons and full audit reports
- report-format.md - Standardized audit report template and structure
- common-issues.md - Frequent problems, fixes, and troubleshooting patterns
# Agent Auditor
Validates agent configurations for model selection, tool restrictions, focus areas, and approach methodology.
## Quick Start
Basic audit workflow:
- Read agent file
- Check model selection appropriateness
- Validate tool restrictions
- Assess focus area quality
- Review approach methodology
- Generate audit report
Example usage:

```
User: "Audit my bash-scripting skill"
→ Reads skills/bash-scripting/SKILL.md
→ Validates model (Sonnet), tools, focus areas, approach
→ Generates report with findings and recommendations
```
## Agent Audit Checklist

### Critical Issues

Must be fixed for the agent to function correctly (a shell sketch of these checks follows the list):
- Valid YAML frontmatter - Proper syntax, required fields present
- name field matches filename - Name consistency
- model field present and valid - Sonnet, Haiku, or Opus only
- At least 3 focus areas - Minimum viable expertise definition
- Tool restrictions present - allowed_tools or allowed-patterns specified
- No security vulnerabilities - Tools don't expose dangerous capabilities
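A minimal shell sketch of these critical checks, assuming the agent lives at agents/<name>.md with standard `---`-delimited YAML frontmatter and lists focus areas as bullets under a `## Focus Areas` heading (all of these are layout assumptions; adjust to your repo):

```bash
#!/usr/bin/env bash
# Sketch only: flag critical issues in one agent file under the assumed layout.
set -euo pipefail

file="${1:?usage: $0 agents/<name>.md}"

# Frontmatter present: the file should open with a '---' delimiter.
head -n 1 "$file" | grep -qx -- '---' || echo "CRITICAL: missing YAML frontmatter"

# name field matches the filename (assumed agents/<name>.md convention).
name=$(awk -F': *' '/^name:/ {print $2; exit}' "$file")
base=$(basename "$file" .md)
[ "$name" = "$base" ] || echo "CRITICAL: name '$name' does not match filename '$base'"

# model field present and one of the three valid values.
model=$(awk -F': *' '/^model:/ {print $2; exit}' "$file" | tr '[:upper:]' '[:lower:]')
case "$model" in
  haiku|sonnet|opus) ;;
  "") echo "CRITICAL: model field missing" ;;
  *)  echo "CRITICAL: invalid model '$model'" ;;
esac

# At least 3 focus areas (bullets under the assumed '## Focus Areas' heading).
focus=$(awk '/^## Focus Areas/{f=1;next} /^## /{f=0} f && /^- /' "$file" | wc -l | tr -d ' ')
[ "$focus" -ge 3 ] || echo "CRITICAL: only $focus focus areas (minimum 3)"

# Tool restrictions present in some form.
grep -Eq '^(allowed_tools|allowed-tools|allowed-patterns):' "$file" \
  || echo "CRITICAL: no tool restrictions declared"
```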
### Important Issues
Should be fixed for optimal agent performance:
- Model matches complexity - Haiku for simple, Sonnet default, Opus rare
- 5-15 focus areas - Not too few (vague) or too many (unfocused)
- Focus areas specific - Concrete, not generic statements
- Tools match usage - No missing or excessive permissions
- Approach section complete - Methodology defined, output format specified
- File size reasonable - <500 lines or uses progressive disclosure
### Nice-to-Have Improvements
Polish for excellent agent quality:
- Model choice justified - Clear reason for non-default model
- Focus areas have examples - Technology/framework specificity
- Approach has decision frameworks - If/then logic for complex tasks
- Tool restrictions documented - Why specific tools are allowed/restricted
- Context economy - Concise without sacrificing clarity
## Audit Workflow

### Step 1: Read Agent File

Identify the agent file to audit:

```
# Single agent
Read skills/bash-scripting/SKILL.md

# Find all agents
Glob agents/*.md
```
### Step 2: Validate Model Selection

Check model field:

```yaml
model: sonnet   # Good - default choice
model: haiku    # Check: Is agent simple enough?
model: opus     # Check: Is complexity justified?
```
Decision criteria:
- Haiku: Simple read-only analysis, fast response needed, low cost priority
- Sonnet: Default for most agents, balanced cost/capability
- Opus: Complex reasoning required, highest capability needed
Common issues (a survey sketch follows this list):
- Opus overuse: Using expensive model when Sonnet sufficient
- Haiku underperformance: Too simple for task complexity
- Missing model: No model field specified (defaults to Sonnet)
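To spot model-selection issues across a whole directory at once, a quick survey loop helps (assumes agents live under agents/; adjust the glob to your layout):

```bash
# List each agent's declared model; a blank value means the field is missing
# and the agent falls back to the Sonnet default.
for f in agents/*.md; do
  [ -f "$f" ] || continue
  model=$(awk -F': *' '/^model:/ {print $2; exit}' "$f")
  printf '%-45s %s\n' "$f" "${model:-<missing: defaults to sonnet>}"
done
```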
See model-selection.md for detailed decision matrix.
### Step 3: Validate Tool Restrictions

Check allowed_tools or allowed-patterns:

```yaml
allowed_tools:
  - Read
  - Grep
  - Glob
  - Bash
```
Validation checklist (a rough mismatch-spotting sketch follows this list):
- Tools specified: Has allowed_tools field (not unrestricted)
- Tools match usage: All mentioned tools are allowed
- No missing tools: All needed tools are included
- No excessive tools: No unnecessary permissions
- Security implications: No dangerous tool combinations
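A rough sketch of the "tools match usage" check, comparing tools declared in allowed_tools against tool names mentioned elsewhere in the file (assumes the block-list form shown above; the path and the tool list are illustrative, and this is a heuristic rather than a real parser):

```bash
file="agents/example-agent.md"   # hypothetical path

# Tools declared in the allowed_tools block (stops at the next key or the
# closing frontmatter delimiter).
declared=$(awk '/^allowed[-_]tools:/{f=1;next} /^(---|[A-Za-z_-]+:)/{f=0} f' "$file" \
             | sed 's/^[- ]*//')

# Flag common tools the body mentions but the frontmatter does not allow.
for tool in Read Write Edit Grep Glob Bash Task; do
  if grep -qw "$tool" "$file" && ! printf '%s\n' "$declared" | grep -qx "$tool"; then
    echo "Check: body mentions $tool but allowed_tools does not declare it"
  fi
done
```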
Common patterns:
- Read-only analyzer: [Read, Grep, Glob, Bash (read commands)]
- Code generator: [Read, Write, Edit, Grep, Glob, Bash]
- Orchestrator: [Task, Skill, Read, AskUserQuestion]
See tool-restrictions.md for security analysis.
### Step 4: Assess Focus Area Quality
Target: 5-15 focus areas that are specific, concrete, and comprehensive
Quality criteria:
Specific vs Generic:
- ✗ Generic: "Python programming"
- ✓ Specific: "FastAPI REST APIs with SQLAlchemy ORM"
Concrete vs Vague:
- ✗ Vague: "Best practices"
- ✓ Concrete: "Defensive programming with strict error handling"
Coverage (a counting sketch follows this list):
- Too few (<5): Expertise unclear or overly narrow
- Sweet spot (5-15): Comprehensive, focused expertise
- Too many (>15): Unfocused, trying to do everything
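A quick heuristic for the coverage check, again assuming focus areas are bullets under a `## Focus Areas` heading (the generic-phrase patterns are only examples of what to flag):

```bash
file="skills/bash-scripting/SKILL.md"

# Count bullets under the Focus Areas heading and classify the coverage.
count=$(awk '/^## Focus Areas/{f=1;next} /^## /{f=0} f && /^- /' "$file" | wc -l | tr -d ' ')
if   [ "$count" -lt 5 ];  then echo "$count focus areas: too few - expertise unclear"
elif [ "$count" -le 15 ]; then echo "$count focus areas: sweet spot"
else                           echo "$count focus areas: too many - unfocused"
fi

# Flag obviously generic bullets (illustrative patterns only).
awk '/^## Focus Areas/{f=1;next} /^## /{f=0} f && /^- /' "$file" \
  | grep -iE 'best practices|code quality' \
  && echo "Check: the areas above need concrete technology/framework detail"
```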
Example analysis:

```markdown
## Focus Areas
- Defensive programming with strict error handling ✓
- POSIX compliance and cross-platform portability ✓
- Safe argument parsing and input validation ✓
- Robust file operations and temporary resource management ✓
- Production-grade logging and error reporting ✓
```

Score: 5/5 areas, all specific and concrete → GOOD
See focus-area-quality.md for scoring methodology.
### Step 5: Review Approach Methodology

Check the approach section for completeness:
Required elements:
- Methodology defined - Step-by-step process
- Decision frameworks - How to handle different scenarios
- Output format - What the agent produces
- Integration with focus - How approach uses expertise
Example of a complete approach:

```markdown
## Approach
1. Analyze requirements and constraints
2. Design solution using defensive programming principles
3. Implement with POSIX compliance
4. Add comprehensive error handling
5. Test on multiple platforms
6. Document with inline comments

Output: Production-ready Bash script with full error handling
```

Incomplete approach (missing steps, no output format):

```markdown
## Approach
Write good Bash scripts following best practices.
```
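The completeness check can be partially automated with a couple of greps over the Approach section (this assumes a `## Approach` heading, numbered steps, and an explicit Output: line as the signals; they are heuristics that support the audit, not a substitute for reading the section):

```bash
file="skills/bash-scripting/SKILL.md"

# Pull out the Approach section (up to the next heading).
approach=$(awk '/^## Approach/{f=1;next} /^## /{f=0} f' "$file")

# Methodology defined: expect numbered steps.
printf '%s\n' "$approach" | grep -Eq '^[0-9]+\.' \
  || echo "Check: no numbered methodology steps found"

# Output format specified: expect an explicit Output: line.
printf '%s\n' "$approach" | grep -Eqi '^output:' \
  || echo "Check: no output format declared"
```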
See approach-methodology.md for templates.
### Step 6: Check Context Economy

File size assessment:

```bash
# Count lines
wc -l skills/bash-scripting/SKILL.md
```
Targets (a classification sketch follows this list):
- <300 lines: Excellent - concise and focused
- 300-500 lines: Good - comprehensive without bloat
- 500-800 lines: Consider progressive disclosure
- >800 lines: Should use references/ directory
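A small sketch that maps the line count onto these targets (the path is just an example):

```bash
lines=$(wc -l < skills/bash-scripting/SKILL.md | tr -d ' ')
if   [ "$lines" -lt 300 ]; then echo "$lines lines: excellent"
elif [ "$lines" -le 500 ]; then echo "$lines lines: good"
elif [ "$lines" -le 800 ]; then echo "$lines lines: consider progressive disclosure"
else                            echo "$lines lines: move detail into a references/ directory"
fi
```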
If oversized:
- Extract detailed content to references/ files
- Keep main file focused on core workflow
- Link to references from main file
- Maintain one-level-deep structure
### Step 7: Generate Audit Report

Compile findings into the standardized report format. See report-format.md for the complete template.
## Agent-Specific Validation
For detailed validation criteria in each area, see the reference files:
- Model Selection: See model-selection.md for appropriateness criteria, use cases, and red flags
- Tool Restrictions: See tool-restrictions.md for security implications and restriction fit analysis
- Focus Area Quality: See focus-area-quality.md for specificity assessment and scoring methodology
- Approach Completeness: See approach-methodology.md for required components and impact analysis
## Common Issues
For detailed troubleshooting guidance, see common-issues.md.
Common patterns include:
- Opus overuse: Expensive model for simple tasks
- Generic focus areas: Lack of specificity and concrete examples
- Missing tool restrictions: Unrestricted access creates security risks
- Overly restrictive tools: Missing tools the agent needs
- Incomplete approach: Vague methodology without clear steps
## Report Format
Use the standardized template in report-format.md for all agent audit reports.
## Integration with audit-coordinator

Invocation pattern:

```
User: "Audit my agent"
→ audit-coordinator invokes agent-audit
→ agent-audit performs specialized validation
→ Results returned to audit-coordinator
→ Consolidated with claude-code-evaluator findings
```
Sequence:
- agent-audit (primary) - Agent-specific validation
- claude-code-evaluator (secondary) - General structure validation
- claude-code-test-runner (optional) - Functional testing
Report compilation:
- agent-audit findings (model, tools, focus, approach)
- claude-code-evaluator findings (YAML, markdown, structure)
- Unified report with reconciled priorities
## Examples

### Example 1: Good Agent (claude-code-evaluator)
Status: PASS
Strengths:
- Model: Sonnet (appropriate for analysis tasks)
- Tools: Read, Grep, Glob, Bash (read-only pattern, secure)
- Focus: 12 specific areas covering evaluation expertise
- Approach: Complete methodology with output format
Score: 9/10 - Excellent agent design
### Example 2: Agent Needs Work
Status: NEEDS WORK
Issues:
- Model: Opus (expensive, Sonnet sufficient for task)
- Tools: No allowed_tools (unrestricted access)
- Focus: 3 generic areas ("best practices", "code quality")
- Approach: Missing (no methodology defined)
Critical fixes:
- Add allowed_tools field
- Change model to Sonnet
- Expand focus areas to 5-10 specific items
- Add complete approach section
Score: 4/10 - Requires significant improvement
See examples.md for complete audit reports.
For detailed guidance on each validation area, consult the reference files linked at the top of this document.