---
name: agent-creator
version: 3.0.1
description: Creates specialized AI agents with optimized system prompts using the official 5-phase SOP methodology (v3.0 adds Phase 0.5 cognitive frame selection), combined with evidence-based prompting techniques and Claude Agent SDK implementation. Use this skill when creating production-ready agents for specific domains, workflows, or tasks requiring consistent high-quality performance with deeply embedded domain knowledge and cognitive frame optimization. Integrates with the recursive improvement loop.
---
Agent Creator - Enhanced with 5-Phase SOP Methodology (v3.0)
This skill provides the official, comprehensive framework for creating specialized AI agents. It integrates the proven 5-phase methodology from Desktop .claude-flow (v3.0 adds Phase 0.5, cognitive frame selection) with Claude Agent SDK implementation and evidence-based prompting techniques.
Trigger Keywords
USE WHEN user mentions:
- "create agent", "build agent", "new agent", "design agent"
- "agent for [domain]", "specialist agent", "domain expert agent"
- "rewrite agent", "optimize agent", "improve agent"
- "agent with [capability]", "agent that does [task]"
- "multi-agent workflow", "coordinating agents"
- "production-ready agent", "agent system prompt"
DO NOT USE when:
- User wants a simple SKILL (not agent) - use skill-creator-agent or micro-skill-creator
- User wants to improve a PROMPT (not system prompt) - use prompt-architect
- User wants to improve THIS skill itself - use skill-forge
- User wants quick automation without agent architecture - use micro-skill-creator
Instead use:
- skill-creator-agent when creating skills that spawn agents (higher-level)
- micro-skill-creator when creating atomic, focused skills
- prompt-architect when optimizing user prompts (not system prompts)
- skill-forge when improving agent-creator itself
When to Use This Skill
Use agent-creator for:
- Creating project-specialized agents with deeply embedded domain knowledge
- Building agents for recurring tasks requiring consistent behavior
- Rewriting existing agents to optimize performance
- Creating multi-agent workflows with sequential or parallel coordination
- Building agents that integrate with MCP servers and Claude Flow
MCP Requirements
This skill requires the following MCP servers for optimal functionality:
memory-mcp (6.0k tokens)
Purpose: Store agent specifications, design decisions, and metadata for cross-session persistence and pattern learning.
Tools Used:
- mcp__memory-mcp__memory_store: Store agent specs, cognitive frameworks, and design patterns
- mcp__memory-mcp__vector_search: Retrieve similar agent patterns for reuse
Activation (PowerShell):
```powershell
# Check if already active
claude mcp list

# Add if not present
claude mcp add memory-mcp node C:\Users\17175\memory-mcp\build\index.js
```
Usage Example:
```javascript
// Store agent specification
await mcp__memory-mcp__memory_store({
  text: `Agent: ${agentName}. Role: ${roleTitle}. Domains: ${expertiseDomains}. Capabilities: ${coreCapabilities}. Commands: ${specialistCommands}`,
  metadata: {
    key: `agents/${agentName}/specification`,
    namespace: "agent-creation",
    layer: "long-term",
    category: "agent-architecture",
    tags: {
      WHO: "agent-creator",
      WHEN: new Date().toISOString(),
      PROJECT: agentName,
      WHY: "agent-specification"
    }
  }
});

// Retrieve similar agent patterns
const similarAgents = await mcp__memory-mcp__vector_search({
  query: `Agent for ${domain} with capabilities ${capabilities}`,
  limit: 5
});
```
Token Cost: 6.0k tokens (3.0% of 200k context)
When to Load: When creating new agents or optimizing existing agent architectures
The 5-Phase Agent Creation Methodology (v3.0)
Source: Desktop .claude-flow/ official SOP documentation + Recursive Improvement System
Total Time (Phases 0-4): 2.5-4 hours per agent (first-time), 1.5-2 hours (speed-run)
This methodology was developed through systematic reverse engineering of fog-compute agent creation and validated through production use. v3.0 adds Phase 0.5 for cognitive frame selection with multi-lingual embedding.
Phase 0: Expertise Loading (5-10 minutes) [Added in v2.2]
Objective: Load domain expertise before beginning agent creation.
Activities:
Detect Domain
- What domain does this agent operate in?
- Examples: authentication, payments, ML, frontend, etc.
Check for Expertise File
```bash
# Check if expertise exists
ls .claude/expertise/{domain}.yaml
```
Load If Available
```text
if expertise_exists:
  - Run: /expertise-validate {domain}
  - Load: file_locations, patterns, known_issues
  - Context: Agent inherits domain knowledge
else:
  - Flag: Discovery mode - agent will learn
  - After: Generate expertise from agent creation
```
Apply to Agent Design
- Use expertise.file_locations for code references
- Use expertise.patterns for conventions
- Use expertise.known_issues to prevent bugs
Validation Gate:
- Checked for domain expertise
- Loaded expertise if available
- Flagged for discovery if not
Outputs:
- Domain expertise context (if available)
- Discovery mode flag (if not) - a minimal loader sketch follows below
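To make the check-and-load flow concrete, here is a minimal TypeScript sketch of Phase 0. The `ExpertiseFile` shape mirrors the fields named above (file_locations, patterns, known_issues); the `js-yaml` dependency and the helper names are illustrative assumptions, not part of the SOP.

```typescript
// Minimal sketch of Phase 0 expertise loading (assumed schema; adapt to your repo).
import { existsSync, readFileSync } from 'node:fs';
import { load } from 'js-yaml'; // assumed YAML parser dependency

interface ExpertiseFile {
  file_locations?: Record<string, string>; // where key code lives
  patterns?: string[];                     // conventions the agent should follow
  known_issues?: string[];                 // pitfalls to design guardrails around
}

function loadExpertise(domain: string): { expertise: ExpertiseFile | null; discoveryMode: boolean } {
  const path = `.claude/expertise/${domain}.yaml`;
  if (!existsSync(path)) {
    // No expertise file: flag discovery mode so the agent learns, and
    // expertise can be generated from this creation run afterwards.
    return { expertise: null, discoveryMode: true };
  }
  const expertise = load(readFileSync(path, 'utf8')) as ExpertiseFile;
  return { expertise, discoveryMode: false };
}
```

The returned discovery-mode flag maps directly onto the Outputs above.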
Phase 0.5: Cognitive Frame Selection (5-10 minutes) [NEW in v3.0]
Objective: Select optimal cognitive frame based on goal analysis.
Activities:
Goal Analysis
- 1st Order Goal: What will this agent DO immediately?
- 2nd Order Goal: WHY is this agent needed?
- 3rd Order Goal: What ULTIMATE outcome does this enable?
Frame Selection Checklist
| Question | If YES, Use Frame |
|---|---|
| Is completion tracking critical? | Aspectual (Russian) |
| Is source verification critical? | Evidential (Turkish) |
| Is audience calibration critical? | Hierarchical (Japanese) |
| Is semantic analysis needed? | Morphological (Arabic) |
| Is object comparison needed? | Classifier (Mandarin) |

Frame Embedding

Embed the frame activation phrase in the agent system prompt:
For Evidential Frame:
```markdown
## Kanitsal Cerceve (Evidential Mode)
Bu agent her iddia icin kaynak belirtir:
- DOGRUDAN: "I tested this directly"
- CIKARIM: "Evidence suggests..."
- BILDIRILEN: "Documentation states..."
```

For Aspectual Frame:

```markdown
## Aspektual'naya Ramka (Aspectual Mode)
Etot agent otslezhivaet zavershenie:
- [SV] Polnost'yu zaversheno - COMPLETED
- [NSV] V protsesse - IN_PROGRESS
- [BLOCKED] Ozhidaet - WAITING
```

Multi-Lingual Section

Include 3-5 lines in the target language within the agent template.
Validation Gate:
- Goal analysis completed (all 3 orders)
- Frame selection checklist run
- Frame activation phrase prepared
- Multi-lingual section ready
Outputs:
- Selected cognitive frame
- Frame activation phrase
- Multi-lingual embedding for system prompt (the checklist is sketched as code below)
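As a reference, the frame-selection checklist can be expressed as a small function. The question order is taken from the table above; treating the first YES as the winner is an assumption (the SOP does not state a tie-break rule), and all names are hypothetical.

```typescript
// Sketch of the Phase 0.5 frame-selection checklist (names are illustrative).
type CognitiveFrame = 'aspectual' | 'evidential' | 'hierarchical' | 'morphological' | 'classifier';

interface GoalAnalysis {
  completionTrackingCritical: boolean;  // -> Aspectual (Russian)
  sourceVerificationCritical: boolean;  // -> Evidential (Turkish)
  audienceCalibrationCritical: boolean; // -> Hierarchical (Japanese)
  semanticAnalysisNeeded: boolean;      // -> Morphological (Arabic)
  objectComparisonNeeded: boolean;      // -> Classifier (Mandarin)
}

function selectFrame(goals: GoalAnalysis): CognitiveFrame | null {
  // Assumption: first YES in checklist order wins; null if no frame applies.
  if (goals.completionTrackingCritical) return 'aspectual';
  if (goals.sourceVerificationCritical) return 'evidential';
  if (goals.audienceCalibrationCritical) return 'hierarchical';
  if (goals.semanticAnalysisNeeded) return 'morphological';
  if (goals.objectComparisonNeeded) return 'classifier';
  return null;
}
```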
Phase 1: Initial Analysis & Intent Decoding (30-60 minutes)
Objective: Deep domain understanding through systematic research, not assumptions.
Activities:
Domain Breakdown
- What problem does this agent solve?
- What are the key challenges in this domain?
- What patterns do human experts use?
- What are common failure modes?
Technology Stack Mapping
- What tools, frameworks, libraries are used?
- What file types, formats, protocols?
- What integrations or APIs?
- What configuration patterns?
Integration Points
- What MCP servers will this agent use?
- What other agents will it coordinate with?
- What data flows in/out?
- What memory patterns needed?
Validation Gate:
- Can describe domain in specific, technical terms
- Identified 5+ key challenges
- Mapped technology stack comprehensively
- Clear on integration requirements
Outputs:
- Domain analysis document
- Technology stack inventory
- Integration requirements list (one possible structure for these outputs follows below)
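One possible structure for these outputs, so that Phases 2-4 can consume them consistently. This TypeScript shape is a suggestion, not anything mandated by the SOP.

```typescript
// Suggested shape for Phase 1 outputs (illustrative, not mandated by the SOP).
interface DomainAnalysis {
  problem: string;            // what the agent solves
  keyChallenges: string[];    // aim for 5+ per the validation gate
  expertPatterns: string[];   // patterns human experts use
  failureModes: string[];     // common failure modes
  techStack: {
    tools: string[];          // frameworks, libraries
    fileTypes: string[];      // formats, protocols
    integrations: string[];   // APIs, configuration patterns
  };
  integrationPoints: {
    mcpServers: string[];
    coordinatingAgents: string[];
    memoryNamespaces: string[];
  };
}
```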
Phase 2: Meta-Cognitive Extraction (30-45 minutes)
Objective: Identify the cognitive expertise domains activated when you reason about this agent's tasks.
Activities:
Expertise Domain Identification
- What knowledge domains are activated when you think about this role?
- What heuristics, patterns, rules-of-thumb?
- What decision-making frameworks?
- What quality standards?
Agent Specification Creation
````markdown
# Agent Specification: [Name]

## Role & Expertise
- Primary role: [Specific title]
- Expertise domains: [List activated domains]
- Cognitive patterns: [Heuristics used]

## Cognitive Frame (NEW in v3.0)
```yaml
cognitive_frame:
  primary: evidential|aspectual|hierarchical|morphological|classifier
  goal_analysis:
    first_order: "..."
    second_order: "..."
    third_order: "..."
  frame_embedding: |
    [Multi-lingual activation phrase]
```

## Core Capabilities
- [Capability with specific examples]
- [Capability with specific examples]
- ...

## Decision Frameworks
- When X, do Y because Z
- Always check A before B
- Never skip validation of C

## Quality Standards
- Output must meet [criteria]
- Performance measured by [metrics]
- Failure modes to prevent: [list]
````
Supporting Artifacts
- Create examples of good vs bad outputs
- Document edge cases
- List common pitfalls
Validation Gate:
- Identified 3+ expertise domains
- Documented 5+ decision heuristics
- Created complete agent specification
- Examples demonstrate quality standards
Outputs:
- Agent specification document
- Example outputs (good/bad)
- Edge case inventory
Phase 3: Agent Architecture Design (45-60 minutes)
Objective: Transform specification into production-ready base system prompt.
Activities:
System Prompt Structure Design
````markdown
# [AGENT NAME] - SYSTEM PROMPT v1.0

## 🎭 CORE IDENTITY
I am a **[Role Title]** with comprehensive, deeply-ingrained knowledge of [domain]. Through systematic reverse engineering and domain expertise, I possess precision-level understanding of:
- **[Domain Area 1]** - [Specific capabilities from Phase 2]
- **[Domain Area 2]** - [Specific capabilities from Phase 2]
- **[Domain Area 3]** - [Specific capabilities from Phase 2]

My purpose is to [primary objective] by leveraging [unique expertise].

## 📋 UNIVERSAL COMMANDS I USE
**File Operations**:
- /file-read, /file-write, /glob-search, /grep-search
  WHEN: [Specific situations from domain analysis]
  HOW: [Exact patterns]

**Git Operations**:
- /git-status, /git-commit, /git-push
  WHEN: [Specific situations]
  HOW: [Exact patterns]

**Communication & Coordination**:
- /memory-store, /memory-retrieve
- /agent-delegate, /agent-escalate
  WHEN: [Specific situations]
  HOW: [Exact patterns with namespace conventions]

## 🎯 MY SPECIALIST COMMANDS
[List role-specific commands with exact syntax and examples]

## 🔧 MCP SERVER TOOLS I USE
**Claude Flow MCP**:
- mcp__claude-flow__agent_spawn
  WHEN: [Specific coordination scenarios]
  HOW: [Exact function call patterns]
- mcp__claude-flow__memory_store
  WHEN: [Cross-agent data sharing]
  HOW: [Namespace pattern: agent-role/task-id/data-type]

**[Other relevant MCP servers from Phase 1]**

## 🧠 COGNITIVE FRAMEWORK

### Self-Consistency Validation
Before finalizing deliverables, I validate from multiple angles:
1. [Domain-specific validation 1]
2. [Domain-specific validation 2]
3. [Cross-check with standards]

### Program-of-Thought Decomposition
For complex tasks, I decompose BEFORE execution:
1. [Domain-specific decomposition pattern]
2. [Dependency analysis]
3. [Risk assessment]

### Plan-and-Solve Execution
My standard workflow:
1. PLAN: [Domain-specific planning]
2. VALIDATE: [Domain-specific validation]
3. EXECUTE: [Domain-specific execution]
4. VERIFY: [Domain-specific verification]
5. DOCUMENT: [Memory storage patterns]

## 🚧 GUARDRAILS - WHAT I NEVER DO
[From Phase 2 failure modes and edge cases]

**[Failure Category 1]**:
❌ NEVER: [Dangerous pattern]
WHY: [Consequences from domain knowledge]
WRONG: [Bad example]
CORRECT: [Good example]

## ✅ SUCCESS CRITERIA
Task complete when:
- [ ] [Domain-specific criterion 1]
- [ ] [Domain-specific criterion 2]
- [ ] [Domain-specific criterion 3]
- [ ] Results stored in memory
- [ ] Relevant agents notified

## 📖 WORKFLOW EXAMPLES

### Workflow 1: [Common Task Name from Phase 1]
**Objective**: [What this achieves]

**Step-by-Step Commands**:
```yaml
Step 1: [Action]
  COMMANDS:
    - /[command-1] --params
    - /[command-2] --params
  OUTPUT: [Expected]
  VALIDATION: [Check]

Step 2: [Next Action]
  COMMANDS:
    - /[command-3] --params
  OUTPUT: [Expected]
  VALIDATION: [Check]
```

Timeline: [Duration]
Dependencies: [Prerequisites]
````
Evidence-Based Technique Integration
For each technique (from existing agent-creator skill):
- Self-consistency: When to use, how to apply
- Program-of-thought: Decomposition patterns
- Plan-and-solve: Planning frameworks
Integrate these naturally into the agent's methodology (a minimal self-consistency sketch follows).
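For reference, a minimal self-consistency sketch in TypeScript: sample the same question several times and keep the majority answer. The `ask` runner and exact-match voting are simplifying assumptions; a production agent would compare normalized or semantically equivalent answers.

```typescript
// Sketch of self-consistency: sample N answers, keep the majority (runner assumed).
async function selfConsistent(
  ask: (prompt: string) => Promise<string>, // assumed single-shot runner
  prompt: string,
  samples = 5
): Promise<string> {
  const answers = await Promise.all(
    Array.from({ length: samples }, () => ask(prompt))
  );
  // Tally identical answers and return the most frequent one as the consensus.
  const counts = new Map<string, number>();
  for (const a of answers) counts.set(a, (counts.get(a) ?? 0) + 1);
  return [...counts.entries()].sort((x, y) => y[1] - x[1])[0][0];
}
```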
Quality Standards & Guardrails
From Phase 2 failure modes, create explicit guardrails:
- What patterns to avoid
- What validations to always run
- When to escalate vs. retry
- Error handling protocols
Validation Gate:
- System prompt follows template structure
- All Phase 2 expertise embedded
- Evidence-based techniques integrated
- Guardrails cover identified failure modes
- 2+ workflow examples with exact commands
Outputs:
- Base system prompt (v1.0)
- Cognitive framework specification
- Guardrails documentation
Phase 4: Deep Technical Enhancement (60-90 minutes)
Objective: Reverse-engineer exact implementation patterns and document with precision.
Activities:
Code Pattern Extraction
For technical agents, extract EXACT patterns from codebase:
````markdown
## Code Patterns I Recognize

### Pattern: [Name]
**File**: `path/to/file.py:123-156`
```python
class ExamplePattern:
    def __init__(
        self,
        param1: Type = default,  # Line 125: Exact default
        param2: Type = default   # Line 126: Exact default
    ):
        # Extracted from actual implementation
        pass
```

When I see this pattern, I know:
- [Specific insight about architecture]
- [Specific constraint or requirement]
- [Common mistake to avoid]
````
Critical Failure Mode Documentation
From experience and domain knowledge:
````markdown
## Critical Failure Modes

### Failure: [Name]
**Severity**: Critical/High/Medium
**Symptoms**: [How to recognize]
**Root Cause**: [Why it happens]
**Prevention**:
❌ DON'T: [Bad pattern]
✅ DO: [Good pattern with exact code]
**Detection**:
```bash
# Exact command to detect this failure
[command]
```
````
Integration Patterns
Document exact MCP tool usage:
````markdown
## MCP Integration Patterns

### Pattern: Cross-Agent Data Sharing
```javascript
// Exact pattern for storing outputs
mcp__claude-flow__memory_store({
  key: "marketing-specialist/campaign-123/audience-analysis",
  value: {
    segments: [...],
    targeting: {...},
    confidence: 0.89
  },
  ttl: 86400
})
```

**Namespace Convention**:
- Format: `{agent-role}/{task-id}/{data-type}`
- Example: `backend-dev/api-v2/schema-design`
````
Performance Metrics
Define what to track:
````markdown
## Performance Metrics I Track
```yaml
Task Completion:
  - /memory-store --key "metrics/[my-role]/tasks-completed" --increment 1
  - /memory-store --key "metrics/[my-role]/task-[id]/duration" --value [ms]

Quality:
  - validation-passes: [count successful validations]
  - escalations: [count when needed help]
  - error-rate: [failures / attempts]

Efficiency:
  - commands-per-task: [avg commands used]
  - mcp-calls: [tool usage frequency]
```
````

These metrics enable continuous improvement.
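A small sketch of how an agent might roll these counters into derived metrics. The counter names mirror the YAML above; the `RawMetrics` shape and how the counters are fetched from memory are assumptions.

```typescript
// Illustrative metric roll-up; counter names mirror the YAML above.
interface RawMetrics {
  tasksCompleted: number;
  validationPasses: number;
  escalations: number;
  failures: number;
  attempts: number;
  commandsUsed: number;
}

function summarizeMetrics(m: RawMetrics) {
  // Guard against division by zero on a fresh agent with no history yet.
  return {
    errorRate: m.attempts > 0 ? m.failures / m.attempts : 0,
    commandsPerTask: m.tasksCompleted > 0 ? m.commandsUsed / m.tasksCompleted : 0,
    escalationRate: m.tasksCompleted > 0 ? m.escalations / m.tasksCompleted : 0,
  };
}
```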
Validation Gate:
- Code patterns include file/line references
- Failure modes have detection + prevention
- MCP patterns show exact syntax
- Performance metrics defined
- Agent can self-improve through metrics
Outputs:
- Enhanced system prompt (v2.0)
- Code pattern library
- Failure mode handbook
- Integration pattern guide
- Metrics specification
Integrated Agent Creation Process
Combining the 5-phase SOP (v3.0) with existing best practices:
Complete Workflow
Phase 0: Expertise Loading (5-10 min) [Added in v2.2]
- Detect domain from request
- Check for expertise file
- Load if available, flag discovery mode if not
- Output: Expertise context or discovery flag
Phase 0.5: Cognitive Frame Selection (5-10 min) [NEW in v3.0]
- Analyze 1st/2nd/3rd order goals
- Select frame via the checklist
- Output: Frame activation phrase for the system prompt
Phase 1: Domain Analysis (30-60 min)
- Research domain systematically
- Map technology stack
- Identify integration points
- Output: Domain analysis doc
Phase 2: Expertise Extraction (30-45 min)
- Identify cognitive domains
- Create agent specification
- Document decision frameworks
- Output: Agent spec + examples
Phase 3: Architecture Design (45-60 min)
- Draft base system prompt
- Integrate evidence-based techniques
- Add quality guardrails
- Output: Base prompt v1.0
Phase 4: Technical Enhancement (60-90 min)
- Extract code patterns
- Document failure modes
- Define MCP integrations
- Add performance metrics
- Output: Enhanced prompt v2.0
SDK Implementation (30-60 min)
- Implement with Claude Agent SDK
- Configure tools and permissions
- Set up MCP servers
- Output: Production agent
Testing & Validation (30-45 min)
- Test typical cases
- Test edge cases
- Test error handling
- Verify consistency
- Output: Test report
Documentation & Packaging (15-30 min)
- Create agent README
- Document usage examples
- Package supporting files
- Output: Complete agent package
Total Time: 3.5-5.5 hours (first-time), 2-3 hours (speed-run) [+5-10 min each for Phases 0 and 0.5]
Claude Agent SDK Implementation
Once system prompt is finalized, implement with SDK:
TypeScript Implementation
```typescript
import { query, tool } from '@anthropic-ai/claude-agent-sdk';
import { z } from 'zod';

// Custom domain-specific tools
const domainTool = tool({
  name: 'domain_operation',
  description: 'Performs domain-specific operation',
  parameters: z.object({
    param: z.string()
  }),
  handler: async ({ param }) => {
    // Implementation from Phase 4
    return { result: 'data' };
  }
});

// Agent configuration
for await (const message of query('Perform domain task', {
  model: 'claude-sonnet-4-5',
  systemPrompt: enhancedPromptV2, // From Phase 4
  permissionMode: 'acceptEdits',
  allowedTools: ['Read', 'Write', 'Bash', domainTool],
  mcpServers: [{
    command: 'npx',
    args: ['claude-flow@alpha', 'mcp', 'start'],
    env: { /* ... */ }
  }],
  settingSources: ['user', 'project']
})) {
  console.log(message);
}
```
Python Implementation
```python
from claude_agent_sdk import query, tool, ClaudeAgentOptions
import asyncio

@tool()
async def domain_operation(param: str) -> dict:
    """Domain-specific operation from Phase 4."""
    # Implementation
    return {"result": "data"}

async def run_agent():
    options = ClaudeAgentOptions(
        model='claude-sonnet-4-5',
        system_prompt=enhanced_prompt_v2,  # From Phase 4
        permission_mode='acceptEdits',
        allowed_tools=['Read', 'Write', 'Bash', domain_operation],
        mcp_servers=[{
            'command': 'npx',
            'args': ['claude-flow@alpha', 'mcp', 'start']
        }],
        setting_sources=['user', 'project']
    )
    async for message in query(prompt='Perform domain task', options=options):
        print(message)

asyncio.run(run_agent())
```
Agent Specialization Patterns
From the existing agent-creator skill, enhanced with the 5-phase methodology (v3.0):
Analytical Agents
- Phase 0 Focus: Load domain expertise for data patterns
- Phase 1 Focus: Evidence evaluation patterns, data quality standards
- Phase 2 Focus: Analytical heuristics, validation frameworks
- Phase 3 Focus: Self-consistency checking, confidence calibration
- Phase 4 Focus: Statistical validation code, error detection patterns
Generative Agents
- Phase 0 Focus: Load domain expertise for output conventions
- Phase 1 Focus: Quality criteria, template patterns
- Phase 2 Focus: Creative heuristics, refinement cycles
- Phase 3 Focus: Plan-and-solve frameworks, requirement tracking
- Phase 4 Focus: Generation patterns, quality validation code
Diagnostic Agents
- Phase 0 Focus: Load domain expertise for known issues
- Phase 1 Focus: Problem patterns, debugging workflows
- Phase 2 Focus: Hypothesis generation, systematic testing
- Phase 3 Focus: Program-of-thought decomposition, evidence tracking
- Phase 4 Focus: Detection scripts, root cause analysis patterns
Orchestration Agents
- Phase 0 Focus: Load domain expertise for workflow patterns
- Phase 1 Focus: Workflow patterns, dependency management
- Phase 2 Focus: Coordination heuristics, error recovery
- Phase 3 Focus: Plan-and-solve with dependencies, progress tracking
- Phase 4 Focus: Orchestration code, retry logic, escalation paths
Testing & Validation
From existing framework + SOP enhancements:
Test Suite Creation
- Typical Cases - Expected behavior on common tasks
- Edge Cases - Boundary conditions and unusual inputs
- Error Cases - Graceful handling and escalation
- Integration Cases - End-to-end workflow with other agents
- Performance Cases - Speed, efficiency, resource usage (a minimal harness sketch follows this list)
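A minimal sketch of how the five categories could be organized, including the repeat-run consistency check from the validation checklist below. The `runAgent` runner interface is an assumption.

```typescript
// Hypothetical test-suite structure for the five categories above.
type TestCategory = 'typical' | 'edge' | 'error' | 'integration' | 'performance';

interface AgentTestCase {
  category: TestCategory;
  prompt: string;
  validate: (output: string) => boolean; // domain-specific assertion
}

// Consistency check: run the same case N times and require every run to pass.
async function checkConsistency(
  runAgent: (prompt: string) => Promise<string>, // assumed runner
  testCase: AgentTestCase,
  runs = 3
): Promise<boolean> {
  const results = await Promise.all(
    Array.from({ length: runs }, () => runAgent(testCase.prompt))
  );
  return results.every((output) => testCase.validate(output));
}
```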
Validation Checklist
- Identity: Agent maintains consistent role
- Commands: Uses universal commands correctly
- Specialist Skills: Demonstrates domain expertise
- MCP Integration: Coordinates via memory and tools
- Guardrails: Prevents identified failure modes
- Workflows: Executes examples successfully
- Metrics: Tracks performance data
- Code Patterns: Applies exact patterns from Phase 4
- Error Handling: Escalates appropriately
- Consistency: Produces stable outputs on repeat
Quick Reference
When to Use Each Phase
Phase 0 (Expertise Loading) [Added in v2.2]:
- Always - Check for existing domain expertise first
- Skip search thrash if expertise available
- Enables discovery mode if expertise missing
Phase 0.5 (Frame Selection) [NEW in v3.0]:
- Run the goal analysis and checklist before drafting the prompt
- Embeds the multi-lingual frame activation phrase
Phase 1 (Analysis):
- Always - Required foundation
- Especially for domains you're less familiar with
Phase 2 (Expertise Extraction):
- Always - Captures cognitive patterns
- Essential for complex reasoning tasks
Phase 3 (Architecture):
- Always - Creates base system prompt
- Critical for clear behavioral specification
Phase 4 (Enhancement):
- For production agents
- For technical domains requiring exact patterns
- When precision and failure prevention are critical
Speed-Run Approach (Experienced Creators)
- Phase 0 (5 min): Quick expertise check
- Phase 0.5 (5 min): Quick frame selection
- Combined Phase 1+2 (30 min): Rapid domain analysis + spec
- Phase 3 (30 min): Base prompt from template
- Phase 4 (45 min): Code patterns + failure modes
- Testing (15 min): Quick validation suite
Total: ~2 hours 10 min for experienced creators with templates
Examples from Production
Example: Marketing Specialist Agent
See: docs/agent-architecture/agents-rewritten/MARKETING-SPECIALIST-AGENT.md
- Phase 0 Output: Loaded marketing domain expertise (if available)
- Phase 1 Output: Marketing domain analysis, tools (Google Analytics, SEMrush, etc.)
- Phase 2 Output: Marketing expertise (CAC, LTV, funnel optimization, attribution)
- Phase 3 Output: Base prompt with 9 specialist commands
- Phase 4 Output: Campaign workflow patterns, A/B test validation, ROI calculations
Result: Production-ready agent with deeply embedded marketing expertise
Maintenance & Iteration
Continuous Improvement
- Metrics Review: Weekly review of agent performance metrics
- Failure Analysis: Document and fix new failure modes
- Pattern Updates: Add newly discovered code patterns
- Workflow Optimization: Refine based on usage patterns
Version Control
- v1.0: Base prompt from Phase 3
- v1.x: Minor refinements from testing
- v2.0: Enhanced with Phase 4 patterns
- v2.x: Production iterations and improvements
Summary
This enhanced agent-creator skill combines:
- Phase 0: Expertise Loading (added in v2.2)
- Phase 0.5: Cognitive Frame Selection (added in v3.0)
- Phase 1-4: Official SOP methodology (Desktop .claude-flow)
- Evidence-based prompting techniques (self-consistency, PoT, plan-and-solve)
- Claude Agent SDK implementation (TypeScript + Python)
- Production validation and testing frameworks
- Continuous improvement through metrics
- Recursive improvement loop integration
Use this methodology to create agents with:
- Deeply embedded domain knowledge
- Exact command and MCP tool specifications
- Production-ready failure prevention
- Measurable performance tracking
Cross-Skill Coordination
Agent Creator works with:
- cognitive-lensing: Select optimal cognitive frames for agents (Phase 0.5 integration)
- skill-forge: Improve agent-creator itself through meta-prompting
- prompt-forge: Optimize agent system prompts using evidence-based techniques
- eval-harness: Validate created agents against benchmarks
Integration Points:
- cognitive-lensing provides frame selection during agent creation (goal-based analysis)
- prompt-forge optimizes agent prompts after Phase 3 architecture design
- skill-forge uses meta-loop to improve the agent creation process itself
- eval-harness validates agent quality through regression and benchmark tests
See: .claude/skills/META-SKILLS-COORDINATION.md for full coordination matrix.
GraphViz Diagram
Create agent-creator-process.dot to visualize the 5-phase workflow:
```dot
digraph AgentCreator {
  rankdir=TB;
  compound=true;
  node [shape=box, style=filled, fontname="Arial"];

  start [shape=ellipse, label="Start:\nAgent Request", fillcolor=lightgreen];
  end [shape=ellipse, label="Complete:\nProduction Agent", fillcolor=green, fontcolor=white];

  subgraph cluster_phase0 {
    label="Phase 0: Expertise Loading";
    fillcolor=lightyellow;
    style=filled;
    p0 [label="Load Domain\nExpertise"];
  }

  subgraph cluster_phase05 {
    label="Phase 0.5: Frame Selection";
    fillcolor=lightyellow;
    style=filled;
    p05 [label="Cognitive Frame\nSelection"];
  }

  subgraph cluster_phase1 {
    label="Phase 1: Analysis";
    fillcolor=lightblue;
    style=filled;
    p1 [label="Domain\nBreakdown"];
  }

  subgraph cluster_phase2 {
    label="Phase 2: Extraction";
    fillcolor=lightblue;
    style=filled;
    p2 [label="Meta-Cognitive\nExtraction"];
  }

  subgraph cluster_phase3 {
    label="Phase 3: Architecture";
    fillcolor=lightblue;
    style=filled;
    p3 [label="System Prompt\nDesign"];
  }

  subgraph cluster_phase4 {
    label="Phase 4: Enhancement";
    fillcolor=lightblue;
    style=filled;
    p4 [label="Technical\nPatterns"];
  }

  eval [shape=octagon, label="Eval Harness\nGate", fillcolor=orange];

  start -> p0;
  p0 -> p05;
  p05 -> p1;
  p1 -> p2;
  p2 -> p3;
  p3 -> p4;
  p4 -> eval;
  eval -> end [label="pass", color=green];
  eval -> p1 [label="fail", color=red, style=dashed];

  labelloc="t";
  label="Agent Creator: Phased Workflow (v3.0)";
  fontsize=16;
}
```
Next: Begin agent creation using this enhanced methodology.
Recursive Improvement Integration (v2.0)
Agent Creator is part of the recursive self-improvement loop:
Role in the Loop
```text
Agent Creator (FOUNDRY)
  |
  +--> Creates auditor agents (prompt, skill, expertise, output)
  +--> Creates domain experts
  +--> Can be improved BY the loop
```
Input/Output Contracts
```yaml
input_contract:
  required:
    - domain: string         # What domain the agent operates in
    - purpose: string        # What the agent should accomplish
  optional:
    - expertise_file: path   # Pre-loaded expertise
    - similar_agents: list   # Reference agents
    - constraints: list      # Specific requirements

output_contract:
  required:
    - agent_file: path       # Created agent markdown
    - test_cases: list       # Validation tests
    - version: semver        # Agent version
  optional:
    - expertise_delta: object  # Learnings to add to expertise
    - metrics: object          # Creation performance metrics
```
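The contracts translate directly into types. A TypeScript rendering for reference (field names from the YAML; the concrete types are assumptions):

```typescript
// Input/output contracts as types (mirrors the YAML; types are assumptions).
interface AgentCreatorInput {
  domain: string;           // what domain the agent operates in
  purpose: string;          // what the agent should accomplish
  expertiseFile?: string;   // path to pre-loaded expertise
  similarAgents?: string[]; // reference agents
  constraints?: string[];   // specific requirements
}

interface AgentCreatorOutput {
  agentFile: string;        // created agent markdown path
  testCases: string[];      // validation tests
  version: string;          // semver, e.g. "1.0.0"
  expertiseDelta?: Record<string, unknown>; // learnings to add to expertise
  metrics?: Record<string, number>;         // creation performance metrics
}
```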
Eval Harness Integration
Created agents are tested against:
```yaml
benchmark: agent-generation-benchmark-v1
tests:
  - has_identity_section
  - has_capabilities
  - has_guardrails
  - has_memory_integration
minimum_scores:
  completeness: 0.8
  specificity: 0.75
  integration: 0.7

regression: agent-creator-regression-v1
tests:
  - identity_section_present (must_pass)
  - capabilities_defined (must_pass)
  - guardrails_included (must_pass)
  - memory_integration_specified (must_pass)
```
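Since the four must-pass regression tests are structural checks on the generated agent markdown, they can be sketched as simple pattern matches. The section-name regexes below are assumptions about the Phase 3 template headings.

```typescript
// Sketch of the must-pass regression checks (regexes are illustrative).
const regressionChecks: Record<string, RegExp> = {
  identity_section_present: /CORE IDENTITY/i,
  capabilities_defined: /capabilit/i,
  guardrails_included: /GUARDRAILS/i,
  memory_integration_specified: /memory[-_]store/i,
};

function runRegression(agentMarkdown: string): { passed: boolean; failures: string[] } {
  // Collect the names of every check whose pattern is absent from the file.
  const failures = Object.entries(regressionChecks)
    .filter(([, pattern]) => !pattern.test(agentMarkdown))
    .map(([name]) => name);
  return { passed: failures.length === 0, failures };
}
```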
Memory Namespace
```yaml
namespaces:
  - agent-creator/specifications/{agent}: Agent specs
  - agent-creator/generations/{id}: Created agents
  - agent-creator/metrics: Performance tracking
  - improvement/audits/agent-creator: Audits of this skill
```
Uncertainty Handling
When requirements are unclear:
```yaml
confidence_check:
  if confidence >= 0.8:
    - Proceed with agent creation
    - Document assumptions
  if confidence 0.5-0.8:
    - Present 2-3 agent design options
    - Ask user to select approach
    - Document uncertainty areas
  if confidence < 0.5:
    - DO NOT proceed
    - List what is unclear
    - Ask specific clarifying questions
    - NEVER fabricate requirements
```
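The same gate expressed as control flow, for reference. The thresholds come from the YAML above; the action shapes and parameter names are illustrative.

```typescript
// Confidence gating from the YAML above (structure illustrative).
type UncertaintyAction =
  | { kind: 'proceed'; assumptions: string[] }
  | { kind: 'present_options'; options: string[] } // 2-3 design options
  | { kind: 'clarify'; questions: string[] };      // do NOT proceed

function gateOnConfidence(
  confidence: number,
  assumptions: string[],
  designOptions: string[],
  openQuestions: string[]
): UncertaintyAction {
  if (confidence >= 0.8) return { kind: 'proceed', assumptions };
  if (confidence >= 0.5) return { kind: 'present_options', options: designOptions.slice(0, 3) };
  // Below 0.5: never fabricate requirements; ask specific questions instead.
  return { kind: 'clarify', questions: openQuestions };
}
```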
!! SKILL COMPLETION VERIFICATION (MANDATORY) !!
After invoking this skill, you MUST complete ALL items below before proceeding:
Completion Checklist
1. Agent Spawning: Did you spawn at least 1 agent via Task()?
   - Example: `Task("Agent Name", "Task description", "agent-type-from-registry")`
2. Agent Registry Validation: Is your agent from the registry?
   - Registry location: `claude-code-plugins/ruv-sparc-three-loop-system/agents/`
   - Valid categories: delivery, foundry, operations, orchestration, platforms, quality, research, security, specialists, tooling
   - NOT valid: Made-up agent names
3. TodoWrite Called: Did you call TodoWrite with 5+ todos?
   - Example: `TodoWrite({ todos: [8-10 items covering all work] })`
4. Work Delegation: Did you delegate to agents (not do work yourself)?
   - CORRECT: Agents do the implementation via Task()
   - WRONG: You write the code directly after reading skill
Correct Pattern After Skill Invocation
```javascript
// After Skill("<skill-name>") is invoked:
// [Single Message - ALL in parallel]:
Task("Agent 1", "Description of task 1...", "agent-type-1")
Task("Agent 2", "Description of task 2...", "agent-type-2")
Task("Agent 3", "Description of task 3...", "agent-type-3")
TodoWrite({ todos: [
  {content: "Task 1 description", status: "in_progress", activeForm: "Working on task 1"},
  {content: "Task 2 description", status: "pending", activeForm: "Working on task 2"},
  {content: "Task 3 description", status: "pending", activeForm: "Working on task 3"},
]})
```
Wrong Pattern (DO NOT DO THIS)
```javascript
// WRONG - Reading skill and then doing work yourself:
Skill("<skill-name>")
// Then you write all the code yourself without Task() calls
// This defeats the purpose of the skill system!
```
The skill is NOT complete until all checklist items are checked.
Remember the pattern: Skill() -> Task() -> TodoWrite() - ALWAYS
Core Principles
Agent Creator operates on 3 fundamental principles:
Principle 1: Domain Expertise Embedding
Agent effectiveness stems from deeply embedded domain knowledge, not surface-level instructions. Through Phase 0 expertise loading and systematic domain analysis, agents inherit proven patterns and avoid known pitfalls before execution begins.
In practice:
- Load domain expertise files before agent creation to inherit institutional knowledge
- Extract meta-cognitive patterns from expert reasoning during Phase 2
- Document code patterns with exact file/line references in Phase 4 for precision
Principle 2: Evidence-Based Cognitive Frameworks
Research-validated prompting techniques dramatically improve agent reliability. Self-consistency reduces factual errors by 42%, program-of-thought improves logical accuracy by 37%, and plan-and-solve increases multi-step success rates by 53%.
In practice:
- Apply self-consistency for analytical agents requiring fact validation
- Use program-of-thought for agents performing systematic decomposition
- Implement plan-and-solve for orchestration agents managing complex workflows
Principle 3: Continuous Improvement Through Metrics
Agents that track performance metrics can self-improve over time. By measuring task completion rates, validation passes, escalations, and error rates, agents identify optimization opportunities and feed learnings back into expertise files.
In practice:
- Instrument agents to track metrics via memory-store operations
- Review weekly performance data to identify failure patterns
- Update agent prompts and expertise files based on production learnings
Common Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Generic Instructions Without Domain Context | Agent lacks critical domain knowledge, makes avoidable mistakes, reinvents wheels | Complete Phase 0 expertise loading and Phase 1 domain analysis before designing agent prompts |
| Skipping Phase 2 Meta-Cognitive Extraction | Agent follows instructions mechanically without understanding expert reasoning patterns | Identify expertise domains and decision heuristics that experts naturally apply |
| Vague Command Specifications | Agent receives "Process data" or "Handle errors" without concrete patterns | Provide exact command syntax, MCP tool usage patterns, and specific workflow examples |
| Missing Failure Mode Documentation | Agent encounters known edge cases without documented handling strategies | Document Phase 4 critical failure modes with detection scripts and prevention patterns |
| No Performance Tracking | Agent cannot self-improve because success/failure data is never captured | Add metrics tracking to agent prompts, store via memory-mcp, review weekly |
Conclusion
Agent Creator transforms agent development from ad-hoc prompt writing into systematic knowledge engineering. By progressing through its phases - expertise loading, cognitive frame selection, domain analysis, meta-cognitive extraction, architecture design, and technical enhancement - you create agents with deeply embedded domain knowledge rather than shallow instruction-following.
The investment in systematic agent creation compounds over time. Agents built with this methodology handle edge cases gracefully, avoid documented failure modes, and improve continuously through metrics tracking. When integrated with expertise files and recursive improvement loops, agents become institutional knowledge repositories that preserve and enhance organizational capabilities.
Use Agent Creator when building production-ready agents for domains requiring consistent high-quality performance. The 2.5-4 hour first-time investment becomes 1.5-2 hours for speed-runs, yielding agents that reliably execute complex workflows without constant supervision.
Version History
v3.0.1 (2025-12-19)
- Fixed typo: "n## Trigger Keywords" -> "## Trigger Keywords"
- Enhanced cross-skill coordination section with all four foundry skills
- Added integration points for cognitive-lensing, skill-forge, prompt-forge, eval-harness
- Clarified how skills integrate at different phases of agent creation
v3.0.0 (2025-12-18)
- Added Phase 0.5: Cognitive Frame Selection with multi-lingual embedding
- Integrated goal analysis framework (1st, 2nd, 3rd order goals)
- Added frame selection checklist (Aspectual, Evidential, Hierarchical, Morphological, Classifier)
- Added frame activation phrases for Evidential (Turkish) and Aspectual (Russian) modes
- Extended Agent Specification template with cognitive_frame YAML section
- Added multi-lingual embedding requirements for agent system prompts
v2.2.0 (2025-11-08)
- Added Phase 0: Expertise Loading
- Integrated expertise system for domain knowledge inheritance
- Added discovery mode for agents without pre-existing expertise
- Updated 5-phase workflow to 6 phases (Phase 0 + Phases 1-4)
- Added expertise validation gates
- Updated speed-run timelines to account for expertise loading
v2.1.0 (2025-10-15)
- Enhanced Meta-Cognitive Extraction with decision framework templates
- Added quality standards documentation
- Improved agent specification structure
- Added supporting artifacts guidelines
v2.0.0 (2025-09-20)
- Official 5-phase SOP methodology integration from Desktop .claude-flow
- Added systematic domain analysis (Phase 1)
- Added meta-cognitive extraction (Phase 2)
- Added architecture design phase (Phase 3)
- Added technical enhancement phase (Phase 4)
- Integrated evidence-based prompting techniques
- Added production validation frameworks
v1.0.0 (2025-08-01)
- Initial agent-creator skill
- Basic agent creation workflow
- Claude Agent SDK implementation
- Evidence-based prompting techniques (self-consistency, PoT, plan-and-solve)