| name | artifact-validator |
| description | Validate and grade Claude Code Skills, Commands, Subagents, and Hooks for quality and correctness. Check YAML syntax, verify naming conventions, validate required fields, test activation patterns, assess description quality. Generate quality scores using Q = 0.40R + 0.30C + 0.20S + 0.10E framework with specific improvement recommendations. Use when validating artifacts, checking quality, troubleshooting activation issues, or ensuring artifact correctness before deployment. |
| allowed-tools | Read, Bash, Grep, Glob |
Artifact Validator: Quality Assurance for Claude Code Artifacts
Comprehensive validation and quality grading system for Claude Code Skills, Commands, Subagents, and Hooks. Ensures artifacts meet technical requirements, follow best practices, and will activate/execute reliably.
Core Capabilities
- YAML Validation: Check syntax, structure, and required fields
- Naming Verification: Ensure directory names match YAML names
- Description Quality: Grade activation potential for Skills
- Field Validation: Verify all required and optional fields
- Activation Testing: Simulate trigger scenarios for Skills
- Quality Grading: Apply Q = 0.40R + 0.30C + 0.20S + 0.10E framework
- Improvement Recommendations: Provide specific actionable feedback
- Cross-Artifact Checks: Detect conflicts and overlaps
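A minimal sketch of the cross-artifact check in Python, assuming the standard `.claude/` layout described in Step 1 (the helper name `find_name_conflicts` is illustrative):
import glob
import os
import re
from collections import defaultdict

def find_name_conflicts(root='.claude'):
    """Detect duplicate names across Skills, Commands, and Subagents."""
    owners = defaultdict(list)
    # Skills are named by their directory
    for skill in glob.glob(f'{root}/skills/*/SKILL.md'):
        owners[os.path.basename(os.path.dirname(skill))].append(skill)
    # Commands are named by their file stem
    for cmd in glob.glob(f'{root}/commands/*.md'):
        owners[os.path.splitext(os.path.basename(cmd))[0]].append(cmd)
    # Subagents are named by their YAML 'name' field
    for agent in glob.glob(f'{root}/agents/*.yaml'):
        with open(agent) as f:
            match = re.search(r'^name:\s*(.+)$', f.read(), re.MULTILINE)
        if match:
            owners[match.group(1).strip()].append(agent)
    return {name: paths for name, paths in owners.items() if len(paths) > 1}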
Methodology
Step 1: Identify Artifact Type
First, determine what type of artifact to validate:
Detection rules:
- Skill: Located in `.claude/skills/*/SKILL.md`
- Command: Located in `.claude/commands/*.md`
- Subagent: Located in `.claude/agents/*.yaml`
- Hook: Configuration in `settings.json` under the `hooks` key
Example:
# Auto-detect artifact type
if [[ -f ".claude/skills/my-skill/SKILL.md" ]]; then
TYPE="skill"
elif [[ -f ".claude/commands/my-command.md" ]]; then
TYPE="command"
elif [[ -f ".claude/agents/my-agent.yaml" ]]; then
TYPE="subagent"
fi
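Hooks have no standalone file to detect; a hedged Python sketch that simply confirms a `hooks` key exists, assuming project settings live at `.claude/settings.json` (the exact hook schema is not validated here):
import json
import os

def detect_hooks(settings_path='.claude/settings.json'):
    """Report whether Hooks are configured in settings.json."""
    if not os.path.exists(settings_path):
        return "No settings.json found - no Hooks configured"
    with open(settings_path) as f:
        settings = json.load(f)
    hooks = settings.get('hooks')
    if not hooks:
        return "No 'hooks' key in settings.json"
    return f"Found hooks configuration with {len(hooks)} entries"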
Step 2: YAML Syntax Validation
Validate YAML frontmatter is syntactically correct:
For Skills and Commands:
# Extract YAML frontmatter (between --- delimiters)
python3 -c "
import yaml
import sys
with open('SKILL.md') as f:
content = f.read()
parts = content.split('---')
if len(parts) < 3:
print('ERROR: Missing YAML frontmatter delimiters')
sys.exit(1)
try:
data = yaml.safe_load(parts[1])
print('✅ YAML syntax valid')
print(f'Fields: {list(data.keys())}')
except yaml.YAMLError as e:
print(f'❌ YAML syntax error: {e}')
sys.exit(1)
"
For Subagents:
# Full YAML file
python3 -c "
import yaml
with open('agent.yaml') as f:
try:
data = yaml.safe_load(f)
print('✅ YAML syntax valid')
except yaml.YAMLError as e:
print(f'❌ YAML error: {e}')
"
Common YAML errors:
- Missing closing `---` delimiter
- Tab characters (use spaces only)
- Smart quotes instead of straight quotes
- Array syntax for comma-separated fields
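A small Python sketch that flags these common problems before the YAML parser runs (the character lists are illustrative, not exhaustive):
def check_frontmatter_pitfalls(filepath):
    """Flag common YAML frontmatter problems before parsing."""
    with open(filepath, encoding='utf-8') as f:
        content = f.read()
    problems = []
    if content.count('---') < 2:
        problems.append("Missing opening or closing '---' delimiter")
    frontmatter = content.split('---')[1] if content.count('---') >= 2 else content
    if '\t' in frontmatter:
        problems.append("Tab characters found (use spaces only)")
    if any(ch in frontmatter for ch in '“”‘’'):
        problems.append("Smart quotes found (use straight quotes)")
    return problems or ["No common pitfalls detected"]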
Step 3: Required Fields Validation
Check that all required fields are present and valid:
Skill required fields:
- `name` (string, matches directory name)
- `description` (string, ≤1024 characters)
Skill optional fields:
- `allowed-tools` (comma-separated string)
Command required fields:
- `description` (string)
Command optional fields:
- `argument-hint` (string)
- `allowed-tools` (comma-separated string)
- `disable-model-invocation` (boolean)
Subagent required fields:
- `name` (string)
- `description` (string)
Subagent optional fields:
- `tools` (array)
- `model` (enum: sonnet, opus, haiku)
Validation example:
import yaml

with open('SKILL.md') as f:
    parts = f.read().split('---')
data = yaml.safe_load(parts[1])

# Check required fields
required = ['name', 'description']
missing = [field for field in required if field not in data]
if missing:
    print(f"❌ Missing required fields: {missing}")
else:
    print("✅ All required fields present")

# Validate field types and constraints
if 'name' in data and not isinstance(data['name'], str):
    print("❌ 'name' must be a string")
if 'description' in data:
    if not isinstance(data['description'], str):
        print("❌ 'description' must be a string")
    elif len(data['description']) > 1024:
        print(f"❌ 'description' exceeds 1024 chars: {len(data['description'])}")
    else:
        print(f"✅ description length: {len(data['description'])}/1024")
Step 4: Naming Convention Validation
Verify directory/file names match YAML fields:
For Skills:
# Directory name must match YAML 'name' field
SKILL_PATH=".claude/skills/my-skill/SKILL.md"
SKILL_DIR=$(basename "$(dirname "$SKILL_PATH")")
YAML_NAME=$(grep "^name:" "$SKILL_PATH" | cut -d: -f2 | xargs)
if [ "$SKILL_DIR" == "$YAML_NAME" ]; then
    echo "✅ Name matches: $SKILL_DIR"
else
    echo "❌ Mismatch: directory=$SKILL_DIR, YAML=$YAML_NAME"
fi
Naming rules:
- Lowercase only
- Hyphens for word separation (no underscores)
- No spaces
- Alphanumeric + hyphens only
- Max 64 characters
Validation:
import re

def validate_name(name, artifact_type):
    """Validate artifact name follows conventions."""
    # Check length
    if len(name) > 64:
        return f"❌ Name too long: {len(name)} chars (max 64)"
    # Check pattern
    if not re.match(r'^[a-z0-9-]+$', name):
        return "❌ Name must be lowercase alphanumeric with hyphens only"
    # Check doesn't start/end with hyphen
    if name.startswith('-') or name.endswith('-'):
        return "❌ Name cannot start or end with hyphen"
    # Check no consecutive hyphens
    if '--' in name:
        return "❌ No consecutive hyphens allowed"
    return f"✅ Name '{name}' is valid"
Step 5: Description Quality Assessment (Skills Only)
Grade Skill descriptions using activation quality criteria:
Quality dimensions:
Action Verbs (0-10 points)
- Count action verbs in description
- 5+ verbs = 10 points, scale down for fewer
Capabilities Specificity (0-10 points)
- 10 points: Specific capabilities with details
- 5 points: Generic capabilities
- 0 points: Vague "helps with" language
Technology Mentions (0-10 points)
- Count technologies/tools/file types mentioned
- 3+ = 10 points, scale down for fewer
Trigger Keywords (0-10 points)
- Count "Use when/for..." keywords
- 10+ keywords = 10 points, scale down
Character Efficiency (0-10 points)
- Optimal: 400-700 chars = 10 points
- Scale down for too short (<200) or too long (>900)
Scoring formula:
Description Score = (Verbs + Specificity + Tech + Keywords + Efficiency) / 50 * 100
Grade:
- A (≥90): Excellent, will activate reliably
- B (80-89): Good, minor improvements possible
- C (70-79): Acceptable, needs improvements
- D (60-69): Poor, significant revision needed
- F (<60): Failing, complete rewrite required
Example assessment:
def assess_description(description):
    """Grade Skill description quality."""
    # 1. Count action verbs
    action_verbs = ['extract', 'parse', 'test', 'validate', 'generate',
                    'optimize', 'convert', 'analyze', 'detect', 'compare']
    verb_count = sum(1 for verb in action_verbs if verb in description.lower())
    verb_score = min(10, verb_count * 2)

    # 2. Check specificity (heuristic: no vague terms)
    vague_terms = ['helps', 'various', 'multiple', 'different']
    has_vague = any(term in description.lower() for term in vague_terms)
    specificity_score = 5 if has_vague else 10

    # 3. Count technologies (file extensions, tool names); 3+ mentions = full score
    tech_indicators = ['.pdf', '.json', '.csv', 'api', 'sql', 'http',
                       'rest', 'openapi', 'yaml']
    tech_count = sum(1 for tech in tech_indicators if tech in description.lower())
    tech_score = 10 if tech_count >= 3 else tech_count * 3

    # 4. Check for "Use when/for" (simplified presence check of the trigger clause)
    has_triggers = 'use when' in description.lower() or 'use for' in description.lower()
    trigger_score = 10 if has_triggers else 0

    # 5. Character efficiency
    length = len(description)
    if 400 <= length <= 700:
        efficiency_score = 10
    elif 200 <= length < 400 or 700 < length <= 900:
        efficiency_score = 7
    else:  # too short (<200) or too long (>900)
        efficiency_score = 3

    total = verb_score + specificity_score + tech_score + trigger_score + efficiency_score
    percentage = (total / 50) * 100

    if percentage >= 90:
        grade = 'A'
    elif percentage >= 80:
        grade = 'B'
    elif percentage >= 70:
        grade = 'C'
    elif percentage >= 60:
        grade = 'D'
    else:
        grade = 'F'

    return {
        'grade': grade,
        'score': percentage,
        'breakdown': {
            'verbs': verb_score,
            'specificity': specificity_score,
            'technologies': tech_score,
            'triggers': trigger_score,
            'efficiency': efficiency_score
        }
    }
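For example, running the assessor on a short, trigger-less description (the sample text is illustrative; expect a low grade):
result = assess_description("Helps with PDF files and various documents.")
print(result['grade'], result['score'])   # vague wording, too short, no "Use when..." clause
print(result['breakdown'])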
Step 6: Content Quality Validation
Check the artifact's content beyond YAML frontmatter:
For Skills (SKILL.md):
- Overview paragraph exists and explains purpose
- Core Capabilities section with 3+ capabilities
- Methodology section with step-by-step workflow
- At least 3 examples (recommend 5)
- Examples have code blocks with language specified
- No placeholder text (TODO, FIXME, Lorem Ipsum)
- Troubleshooting section present
- No broken markdown syntax
For Commands:
- Description explains what command does
- If arguments used, they're documented
- Command body has executable content
- No placeholder prompts
For Subagents:
- System prompt is substantial (>100 words)
- Clear focus/specialization explained
- Instructions are actionable
Validation approach:
import re

def validate_skill_content(filepath):
    """Check Skill content quality."""
    with open(filepath) as f:
        content = f.read()

    issues = []

    # Check for required sections
    required_sections = [
        'Core Capabilities',
        'Methodology',
        'Examples'
    ]
    for section in required_sections:
        if section not in content:
            issues.append(f"Missing section: {section}")

    # Count examples
    example_count = content.count('### Example')
    if example_count < 3:
        issues.append(f"Only {example_count} examples (need 3+)")

    # Check for placeholders
    placeholders = ['TODO', 'FIXME', 'PLACEHOLDER', 'Lorem ipsum']
    for placeholder in placeholders:
        if placeholder in content:
            issues.append(f"Contains placeholder: {placeholder}")

    # Check code blocks have language
    code_blocks = re.findall(r'```(\w*)\n', content)
    unnamed_blocks = [i for i, lang in enumerate(code_blocks) if not lang]
    if unnamed_blocks:
        issues.append(f"{len(unnamed_blocks)} code blocks missing language")

    if not issues:
        return "✅ Content quality: PASS"
    return "⚠️ Content issues:\n" + "\n".join(f"  - {issue}" for issue in issues)
Step 7: Activation Pattern Testing (Skills Only)
Simulate activation scenarios to predict reliability:
Test categories:
- Direct keyword match
- Contextual trigger
- File extension trigger (if applicable)
Example test suite:
def test_activation_potential(description):
    """Predict if Skill will activate for common scenarios."""
    # Extract keywords from description
    keywords = set(description.lower().split())

    test_scenarios = [
        {
            'name': 'Direct keyword match',
            'request': 'extract tables from pdf',
            'expected_keywords': ['extract', 'tables', 'pdf']
        },
        {
            'name': 'Technology mention',
            'request': 'use pdfplumber to get data',
            'expected_keywords': ['pdfplumber', 'data']
        }
    ]

    results = []
    for scenario in test_scenarios:
        matches = sum(1 for kw in scenario['expected_keywords']
                      if kw in keywords)
        if matches >= 2:
            results.append(f"✅ {scenario['name']}: Likely activates")
        else:
            results.append(f"⚠️ {scenario['name']}: May not activate")
    return results
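Example run against a sample description (the text and output are illustrative):
description = ("Extract and parse tables from PDF files using pdfplumber and data cleaning. "
               "Use when working with PDF documents or data extraction.")
for line in test_activation_potential(description):
    print(line)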
Step 8: Generate Quality Report
Compile all validation results into a comprehensive report:
Report structure:
# Artifact Validation Report
**Artifact:** [name]
**Type:** [Skill/Command/Subagent/Hook]
**Date:** [YYYY-MM-DD]
## Overall Grade: [A/B/C/D/F]
### YAML Validation
- Syntax: [✅/❌]
- Required fields: [✅/❌]
- Optional fields: [list present fields]
### Naming Convention
- Directory/file name: [✅/❌]
- YAML name field: [✅/❌]
- Consistency: [✅/❌]
### Description Quality (Skills only)
- Grade: [A/B/C/D/F]
- Score: [X/100]
- Breakdown:
- Action verbs: [X/10]
- Specificity: [X/10]
- Technologies: [X/10]
- Triggers: [X/10]
- Efficiency: [X/10]
### Content Quality
- Structure: [✅/❌/⚠️]
- Examples: [count] ([✅ if ≥3, ❌ if <3])
- Placeholders: [✅ none / ❌ found]
- Code blocks: [✅/⚠️]
### Activation Potential (Skills only)
- Direct keywords: [✅/⚠️/❌]
- Contextual triggers: [✅/⚠️/❌]
- File extensions: [✅/⚠️/❌/N/A]
- Predicted success rate: [X%]
## Issues Found
[List of all issues with severity]
## Recommendations
[Specific, actionable improvements]
## Quality Framework Score
Using Q = 0.40·R + 0.30·C + 0.20·S + 0.10·E:
- Relevance (R): [score] - [explanation]
- Completeness (C): [score] - [explanation]
- Consistency (S): [score] - [explanation]
- Efficiency (E): [score] - [explanation]
**Final Q score: [X.XX]**
**Grade: [A/B+/B/C/D/F]**
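A short sketch of how earlier results can be stitched into this report (the field names in the `results` dict are illustrative):
def render_report(results):
    """Render the header of the validation report from collected results."""
    lines = [
        "# Artifact Validation Report",
        f"**Artifact:** {results['name']}",
        f"**Type:** {results['type']}",
        f"**Date:** {results['date']}",
        f"## Overall Grade: {results['grade']}",
        "### Description Quality",
        f"- Score: {results['description']['score']:.0f}/100",
    ]
    for dimension, score in results['description']['breakdown'].items():
        lines.append(f"  - {dimension}: {score}/10")
    return "\n".join(lines)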
Examples
Example 1: Validate a Skill with Issues
Scenario: Check pdf-processor Skill for quality
Validation steps:
# Step 1: Check YAML syntax
python3 -c "
import yaml
with open('.claude/skills/pdf-processor/SKILL.md') as f:
content = f.read()
parts = content.split('---')
yaml.safe_load(parts[1])
"
# Output: (no error = valid)
# Step 2: Verify name consistency
DIRNAME=$(basename .claude/skills/pdf-processor)
YAMLNAME=$(grep "^name:" .claude/skills/pdf-processor/SKILL.md | cut -d: -f2 | xargs)
if [ "$DIRNAME" == "$YAMLNAME" ]; then
echo "✅ Names match: $DIRNAME"
else
echo "❌ Mismatch: dir=$DIRNAME, yaml=$YAMLNAME"
fi
Issues found:
⚠️ Description only 45 characters (too short)
❌ Missing "Use when..." triggers
⚠️ Only 2 action verbs (need 5+)
❌ Missing Examples section
Recommendations:
1. Expand description to 400-700 characters
2. Add "Use when working with PDF files, document extraction..." clause
3. Add action verbs: parse, merge, split, convert
4. Create Examples section with 3-5 complete examples
Quality grade: D (65/100)
Example 2: Validate High-Quality Skill
Scenario: Validate artifact-advisor Skill
Results:
✅ YAML syntax: Valid
✅ Name consistency: artifact-advisor (matches)
✅ Description: 586 characters (optimal range)
✅ Required fields: All present
✅ Action verbs: 5 (analyze, recommend, advise, choose, justify)
✅ Technologies: 4 (Skills, Commands, Subagents, Hooks)
✅ Triggers: Present ("Use when user asks...")
✅ Examples: 7 examples found
✅ No placeholders
✅ Code blocks: All have language specified
Quality grade: A (94/100)
Quality framework:
- Relevance: 0.95 (highly focused on artifact selection)
- Completeness: 0.92 (comprehensive examples and decision trees)
- Consistency: 0.90 (consistent terminology and structure)
- Efficiency: 0.88 (concise yet thorough)
Q = 0.40(0.95) + 0.30(0.92) + 0.20(0.90) + 0.10(0.88) = 0.924
Final grade: A
Example 3: Validate Command
Scenario: Check deployment command
Command file: .claude/commands/deploy.md
---
description: Deploy application to specified environment with pre-deployment checks
argument-hint: <environment> [branch]
allowed-tools: Bash
disable-model-invocation: true
---
Deploy to $1 environment from ${2:-main} branch.
Run tests, build, and deploy with validation.
Validation:
✅ YAML syntax: Valid
✅ Description: Present and descriptive
✅ argument-hint: Properly formatted
✅ disable-model-invocation: Correctly set (prevents accidental deployment)
✅ Command body: Uses $1, $2 arguments correctly
✅ No placeholders
Quality grade: A (92/100)
Example 4: Batch Validation
Scenario: Validate all Skills in project
# Find all Skills
for skill_dir in .claude/skills/*/; do
    skill_name=$(basename "$skill_dir")
    skill_file="$skill_dir/SKILL.md"
    if [ -f "$skill_file" ]; then
        echo "Validating: $skill_name"
        # Run validation
        python3 validate.py "$skill_file"
        echo "---"
    fi
done
Output:
Validating: artifact-advisor
✅ Grade: A (94/100)
---
Validating: skill-builder
✅ Grade: A (92/100)
---
Validating: pdf-processor
⚠️ Grade: C (75/100)
Issues: Description too short, missing triggers
---
Summary:
Total Skills: 3
Grade A: 2 (67%)
Grade B: 0 (0%)
Grade C: 1 (33%)
Grade D or below: 0 (0%)
Recommendation: Fix pdf-processor description
Example 5: Pre-Deployment Validation
Scenario: Validate Skill before git commit
Workflow:
# 1. Run full validation
python3 .claude/scripts/validate-artifact.py .claude/skills/new-skill/SKILL.md
# 2. Check for critical issues
if [ $? -ne 0 ]; then
    echo "❌ Validation failed - fix issues before committing"
    exit 1
fi

# 3. Check grade threshold
GRADE=$(python3 validate.py SKILL.md | grep "Grade:" | cut -d: -f2 | xargs | cut -c1)
if [ "$GRADE" != "A" ] && [ "$GRADE" != "B" ]; then
    echo "⚠️ Grade $GRADE - consider improving before deployment"
    echo "Proceed anyway? (y/n)"
    read response
    if [ "$response" != "y" ]; then
        exit 1
    fi
fi
# 4. All checks passed
echo "✅ Validation passed - ready to commit"
Common Patterns
Pattern 1: Quick Validation Check
Run a fast check for critical issues only:
validate_quick() {
    local file=$1

    # Check YAML syntax
    python3 -c "import yaml; yaml.safe_load(open('$file').read().split('---')[1])" 2>&1

    # Check name field exists
    grep -q "^name:" "$file" || echo "❌ Missing 'name' field"

    # Check description exists
    grep -q "^description:" "$file" || echo "❌ Missing 'description' field"

    # Check description length
    DESC_LEN=$(grep "^description:" "$file" | cut -d: -f2- | wc -c)
    if [ "$DESC_LEN" -gt 1024 ]; then
        echo "❌ Description too long: $DESC_LEN chars"
    fi
}
Pattern 2: Continuous Integration Validation
Validate all artifacts in CI/CD pipeline:
# .github/workflows/validate-artifacts.yml
name: Validate Claude Code Artifacts

on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Validate Skills
        run: |
          for skill in .claude/skills/*/SKILL.md; do
            python3 scripts/validate.py "$skill" || exit 1
          done
      - name: Validate Commands
        run: |
          for cmd in .claude/commands/*.md; do
            python3 scripts/validate.py "$cmd" || exit 1
          done
      - name: Check grades
        run: |
          python3 scripts/grade-report.py
Pattern 3: Interactive Validation with Fixes
Validate and offer to fix common issues:
def validate_with_fixes(filepath):
    """Validate and optionally fix issues."""
    # validate_artifact() and apply_fix() are project-specific helpers that
    # return/consume issue dicts with 'severity', 'message', and 'fixable' keys.
    issues = validate_artifact(filepath)

    for issue in issues:
        print(f"\n{issue['severity']} {issue['message']}")
        if issue['fixable']:
            response = input("Apply automatic fix? (y/n): ")
            if response.lower() == 'y':
                apply_fix(filepath, issue)
                print("✅ Fixed")
Troubleshooting
Issue 1: "YAML syntax error" on valid YAML
Symptoms: Validator reports syntax error but YAML looks correct
Cause: Hidden characters (tabs, smart quotes, BOM)
Solution:
# Check for tabs
grep -P '\t' SKILL.md
# Replace tabs with spaces
sed -i 's/\t/ /g' SKILL.md
# Check for smart quotes
grep -P '[“”‘’]' SKILL.md
# Replace with straight quotes manually
# Remove BOM if present
sed -i '1s/^\xEF\xBB\xBF//' SKILL.md
Issue 2: Name mismatch false positive
Symptoms: Validator says names don't match but they appear identical
Cause: Whitespace in YAML name field
Solution:
# Check exact value
grep "^name:" SKILL.md | cat -A
# Should see: name: skill-name$
# If you see: name: skill-name $ (space before $)
# Fix by trimming:
sed -i '/^name:/s/[[:space:]]*$//' SKILL.md
Issue 3: Description passes validation but doesn't activate
Symptoms: Skill validated successfully but doesn't activate in practice
Cause: Validation checks structure, not real-world activation
Solution:
- Run activation testing protocol (separate from validation)
- Validation ≠ activation testing
- Use activation-test-protocol.md for real tests
- Validation ensures artifact CAN work, activation testing ensures it DOES work
Issue 4: Grade seems too low
Symptoms: Skill feels high quality but gets low grade
Cause: Grading is strict and focuses on specific metrics
Diagnosis:
Check breakdown:
- Verbs: Low score? Add more action verbs to description
- Technologies: Low score? Mention specific tools/file types
- Triggers: 0 points? Missing "Use when..." clause
- Efficiency: Low score? Description too short or too long
Fix: Address lowest-scoring dimension first
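A quick way to find the weakest dimension from the assessment breakdown, reusing `assess_description` from Step 5 (assuming `description` holds the Skill's description text):
result = assess_description(description)
weakest = min(result['breakdown'], key=result['breakdown'].get)
print(f"Lowest-scoring dimension: {weakest} ({result['breakdown'][weakest]}/10)")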
Best Practices
Do's ✅
- ✅ Run validation before committing artifacts to git
- ✅ Aim for Grade A (≥90) on all artifacts
- ✅ Fix critical issues (YAML syntax, missing fields) immediately
- ✅ Address warnings (low description quality) before deployment
- ✅ Validate after every significant edit
- ✅ Use validation in CI/CD pipelines
- ✅ Keep validation scripts up to date with latest specs
Don'ts ❌
- ❌ Don't deploy artifacts with validation errors
- ❌ Don't ignore warnings - they predict real problems
- ❌ Don't confuse validation with activation testing
- ❌ Don't over-optimize for validation score (natural quality matters)
- ❌ Don't skip validation because artifact "seems fine"
- ❌ Don't validate only new artifacts (validate existing ones too)
Quality Framework Reference
The validation system uses the Centauro quality framework:
Q = 0.40·Relevance + 0.30·Completeness + 0.20·Consistency + 0.10·Efficiency
Dimensions:
Relevance (R) - 40% weight:
- Does artifact solve the stated problem?
- Is description focused and specific?
- Are examples relevant to use cases?
Completeness (C) - 30% weight:
- Are all required fields present?
- Does it have sufficient examples?
- Is methodology explained?
Consistency (S) - 20% weight:
- Naming conventions followed?
- Terminology consistent?
- Structure matches spec?
Efficiency (E) - 10% weight:
- Description concise but informative?
- No redundant content?
- Optimal length?
Grading scale:
- A: ≥0.90 (Excellent)
- B+: 0.85-0.89 (Very Good)
- B: 0.80-0.84 (Good)
- C: 0.70-0.79 (Acceptable)
- D: 0.60-0.69 (Poor)
- F: <0.60 (Failing)
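A minimal sketch of the framework score and grade mapping, using the weights and thresholds above:
def quality_score(relevance, completeness, consistency, efficiency):
    """Compute Q = 0.40·R + 0.30·C + 0.20·S + 0.10·E and map it to a letter grade."""
    q = 0.40 * relevance + 0.30 * completeness + 0.20 * consistency + 0.10 * efficiency
    if q >= 0.90:
        grade = 'A'
    elif q >= 0.85:
        grade = 'B+'
    elif q >= 0.80:
        grade = 'B'
    elif q >= 0.70:
        grade = 'C'
    elif q >= 0.60:
        grade = 'D'
    else:
        grade = 'F'
    return round(q, 3), grade

# Matches Example 2 above: quality_score(0.95, 0.92, 0.90, 0.88) -> (0.924, 'A')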
Related Skills
Use with:
- skill-builder: Create Skills, then validate with artifact-validator
- artifact-advisor: Decide artifact type, create it, then validate
- artifact-troubleshooter: If validation finds issues, use for deep debugging
External tools:
- yamllint: Advanced YAML linting
- markdownlint: Markdown syntax checking
- pre-commit hooks: Automatic validation before commits
Validate early, validate often. Quality artifacts are the foundation of effective Claude Code workflows.