---
name: skill-builder
description: Create, evaluate, and improve Agent skills to production quality (100/100). Use when the user wants to create a new skill, review an existing skill, score a skill against best practices, or improve a skill's quality. Also use when the user mentions skill development, skill templates, or skill optimization.
---
# Skill Builder Workflow
Create, evaluate, and improve Agent skills to production quality.
## Quick Start
| Mode | When to Use | Starting Step |
|---|---|---|
| Create | Building a new skill from scratch | Step 1 |
| Evaluate | Scoring an existing skill | Step 4 |
| Improve | Upgrading a skill to 100/100 | Step 5 |
## Skill Files
| File | Purpose |
|---|---|
| SKILL.md | This workflow |
| SCORING.md | Structure + Efficacy rubrics (MUST READ before scoring) |
| TEMPLATES.md | Starter templates and patterns (MUST READ before creating) |
| EXAMPLES.md | Before/after improvement examples |
| CHECKLIST.md | 50-point validation checklist |
## Mode 1: Create a New Skill

### Step 1: Gather Requirements
Ask the user:
- What does the skill do? (core capability)
- When should it activate? (trigger contexts)
- What tools/scripts are needed? (dependencies)
- What's the expected output? (deliverables)
- What input quality issues are common? (see Input Decomposition below)
- What does this assume the user knows? (see User Capability Assumptions below)
#### Input Decomposition
[!IMPORTANT] Most real-world inputs are messy. If the domain typically has vague, incomplete, or poorly-structured input, the skill MUST include a transformation step.
Ask: "What does bad input look like in this domain?"
| Input Quality | Skill Must Include |
|---|---|
| Usually clean and structured | No transformation needed |
| Sometimes vague or incomplete | Validation step that asks for clarification |
| Often messy or ambiguous | Decomposition step with probing questions to transform input |
Decomposition step pattern:

```markdown
### Step N: Decompose Input

Transform raw input into structured form using these probes:

| Probe | Purpose |
|-------|---------|
| "What specifically happened?" | Extract concrete actions |
| "What was the outcome?" | Capture measurable results |
| "How often does this occur?" | Establish patterns |
```
#### User Capability Assumptions
List what the skill assumes the user can do. For each assumption, either:
- (a) Remove it by adding a compensating step, OR
- (b) Document it as a prerequisite
| Assumption | Compensation Strategy |
|---|---|
| User can provide structured input | Add decomposition step |
| User knows domain terminology | Add glossary or explain inline |
| User can make judgment calls | Add decision logic with explicit criteria |
| User knows quality standards | Add validation checklist |
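For example, the terminology assumption above can be compensated with an inline glossary step. A minimal sketch; the bracketed entries are placeholders in the same style as the other patterns in this workflow:

```markdown
### Step N: Define Terms First

Before gathering input, explain the domain terms the user will encounter:

| Term | Plain-language meaning |
|------|------------------------|
| [Domain term 1] | [One-sentence definition] |
| [Domain term 2] | [One-sentence definition] |
```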
### Step 1.5: Identify the Hardest Parts
[!CRITICAL] State-of-the-art skills solve the hard problems, not just the easy ones. Before designing the workflow, identify where experts struggle and novices get stuck.
Ask: "What are the 2-3 hardest judgment calls in this domain?"
Signs of a hard judgment call:
- Experts disagree on the right answer
- Multiple valid options exist
- Context determines the best choice
- Novices consistently get it wrong
For each hard part, the skill MUST include:
| Hard Part Type | Required Solution |
|---|---|
| Ambiguous categorization | Disambiguation logic with explicit criteria |
| Quality/intensity judgment | Calibration guidance with thresholds |
| Context-dependent choice | Decision matrix or if/then rules |
| Subjective evaluation | Rubric with concrete examples |
Example pattern for disambiguation:
| If X could be A or B... | Ask this to disambiguate |
|-------------------------|--------------------------|
| [Ambiguous situation 1] | Was the emphasis on [criterion]? → A. On [other criterion]? → B |
| [Ambiguous situation 2] | Did it primarily [test for A] or [test for B]? |
[!WARNING] A lookup table is not disambiguation. If your skill has a reference table but no logic for handling cases that match multiple entries, it's incomplete.
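Calibration guidance for quality/intensity judgments can follow the same table-driven pattern. A sketch with illustrative thresholds, assuming a hypothetical severity call; a real skill must substitute domain-specific criteria:

```markdown
| If the issue... | Calibrate intensity as... |
|-----------------|---------------------------|
| Blocks the user's primary goal outright | High |
| Slows the user, but a workaround exists | Medium |
| Is cosmetic or rarely encountered | Low |
```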
### Step 2: Assess Complexity & Choose Structure
[!CAUTION] Default to Simple. Only upgrade complexity if the skill genuinely needs it. Ask: "Would this skill work without this file?" If yes, don't add it.
Complexity Assessment:
| If the skill... | Then it's... |
|---|---|
| Does ONE thing, linear flow, no scripts, <5 decision points | Simple |
| Multi-step workflow, needs reference tables, moderate domain knowledge | Standard |
| Many conditionals, requires scripts, extensive domain expertise, high failure modes | Complex |
Structure by Complexity:
| Complexity | Structure |
|---|---|
| Simple | SKILL.md only |
| Standard | SKILL.md + REFERENCE.md or EXAMPLES.md |
| Complex | Above + TESTING.md + scripts/ |
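One possible on-disk layout across the tiers (the folder name `my-skill/` is illustrative):

```
my-skill/
├── SKILL.md         # always required; a Simple skill stops here
├── REFERENCE.md     # Standard: lookup tables (or EXAMPLES.md)
├── EXAMPLES.md      # Standard: before/after samples
├── TESTING.md       # Complex only: failure-mode scenarios
└── scripts/         # Complex only: helper scripts
```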
[!TIP] Signs you're over-engineering:
- Adding TESTING.md with obvious scenarios ("it should work")
- Creating REFERENCE.md that repeats the workflow
- Writing EXAMPLES.md when 2 inline examples suffice
Read TEMPLATES.md for starter templates.
### Step 3: Write the SKILL.md
Use templates from TEMPLATES.md. Ensure:
- Frontmatter — valid YAML with `name` (must match folder name) and `description`
- Description — includes BOTH what it does AND when to use it
- "Why?" line — one sentence after title explaining the problem this solves
- Workflow — clear, numbered steps
- Progressive disclosure — link to supporting files (only if needed)
[!TIP] Description is critical for discovery. Include multiple trigger keywords.
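Putting these requirements together, a minimal sketch of a new SKILL.md. The skill name `changelog-writer`, its description, and the workflow steps are all hypothetical, for illustration only:

```markdown
---
name: changelog-writer  # hypothetical; must match the skill's folder name
description: Generate release changelogs from commit history. Use when the user asks for a changelog, release notes, or a summary of changes between versions.
---

# Changelog Writer

Why? Raw commit logs are too noisy to hand to users; this skill turns them into readable release notes.

## Workflow

1. Collect commits since the last release tag
2. Group them by type (feature, fix, docs)
3. Draft entries and confirm wording with the user
```

Note that the description covers both the what and the when, and "changelog" and "release notes" double as trigger keywords.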
### Step 3.a: Register the Skill
[!CRITICAL] Do NOT edit AGENTS.md manually.
- Run the `skills-index-updater` skill or script: `python3 ~/.claude/skills/skills-index-updater/scripts/update_skill_index.py`
- Verify `AGENTS.md` contains your new/updated skill.
After creating, proceed to Step 4 to evaluate.
## Mode 2: Evaluate an Existing Skill

### Step 4: Score the Skill
[!CRITICAL] Read SCORING.md completely before scoring. It contains both rubrics and scoring worksheets.
Process:
1. Read all skill files (SKILL.md + supporting files)
2. Score Structure (0-100): 9 categories — documentation completeness
3. Score Efficacy (0-100): 6 categories — actual effectiveness
4. Use the Combined Score Matrix in SCORING.md for the verdict
5. Identify gaps in both dimensions
Present results using the format in SCORING.md.
If either score < 90, proceed to Step 5.
## Mode 3: Improve to 100/100

### Step 5: Plan Improvements
Based on evaluation, prioritize:
| Priority | Fixes | Target |
|---|---|---|
| P1 Critical | Missing frontmatter, invalid YAML, empty description | Required to function |
| P2 Important | Missing triggers, no examples, no progressive disclosure | Required for 95+ |
| P3 Polish | Missing troubleshooting, no quick start, terminology issues | Required for 100 |
### Step 6: Execute Improvements
[!CAUTION] Get user approval before making changes. Present the plan and wait for confirmation.
Work systematically:
1. Fix frontmatter first (skill won't load without valid YAML)
2. Enhance description with trigger keywords
3. Add progressive disclosure if SKILL.md > 200 lines (see the sketch below)
4. Create supporting files as needed
5. Add quality sections (Troubleshooting, Quick Start)
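The progressive-disclosure pointer in step 3 can be a single line that moves detail out of SKILL.md. A sketch; the file and section names are hypothetical:

```markdown
## Edge Cases

Rare input formats are covered in [REFERENCE.md](REFERENCE.md). Read it only when the standard probes fail.
```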
### Step 7: Verify Final Score
1. Re-read all skill files
2. Re-score against both rubrics
3. Confirm scores meet target
4. Present final structure and summary
## Validation Checklist (Quick)
Before declaring complete:
- `name` in frontmatter matches folder name
- `description` includes what AND when
- "Why?" line present after title
- SKILL.md under 500 lines
- Structure matches complexity (not over-engineered)
- Examples show concrete input/output
- Consistent terminology throughout
Full checklist: CHECKLIST.md
## Troubleshooting
| Problem | Solution |
|---|---|
| Skill not discovered | Check description has trigger keywords (see the before/after below) |
| Low Structure score | Add missing sections per SCORING.md rubric |
| Low Efficacy score | Simplify — skill may be doing too many things |
| Frontmatter errors | Validate YAML syntax, check for reserved words |
| User confused by skill | Add Quick Start, improve decision density |
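For the most common failure (skill not discovered), compare a weak description with a fixed one; both are invented for illustration:

```yaml
# Before: no "when", no trigger keywords
description: Helps with changelogs.

# After: what it does AND when to use it, with trigger keywords
description: Generate release changelogs from commit history. Use when the
  user asks for a changelog, release notes, or version summaries.
```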
## Reference
- SCORING.md — Structure + Efficacy rubrics with worksheets
- TEMPLATES.md — Starter templates and common patterns
- EXAMPLES.md — Before/after improvement examples
- CHECKLIST.md — 50-point validation checklist