---
name: multi-agent-autonomous-workflow
description: Orchestrates multi-agent workflow for feature implementation using specialized subagents. Use when implementing features, epics, or complex multi-story tasks that need analyst, architect, developer, tester, reviewer, and security agents.
---

# Multi-Agent Autonomous Workflow

Orchestrates specialized subagents for extended autonomous work (minutes to hours) with minimal human intervention.

## Execution Model: Long-Running Until Complete

This workflow implements the Ralph Wiggum pattern for persistent autonomous execution:

The workflow NEVER exits until the goal is complete or a true blocker requires human intervention. Iterate continuously, using previous failures as context for the next attempt. State persists on disk, and the Stop hook enforces completion before exit.
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LONG-RUNNING EXECUTION MODEL │
│ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ OUTER LOOP: PHASES │ │
│ │ │ │
│ │ Phase 0 ──▶ Phase 1 ──▶ Phase 2 ──▶ Phase 3 ──▶ WORKFLOW_COMPLETE │ │
│ │ (Setup) (Analyze) (Execute) (Finalize) │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ CLARIFY ┌───────────────────────────────┐ │ │
│ │ QUESTIONS │ MIDDLE LOOP: STORIES │ │ │
│ │ │ │ │ │
│ │ │ for each story: │ │ │
│ │ │ ┌────────────────────┐ │ │ │
│ │ │ │ INNER LOOP: ITERATE│ │ │ │
│ │ │ │ until ALL verified │ │ │ │
│ │ │ │ or max retries │ │ │ │
│ │ │ └────────────────────┘ │ │ │
│ │ └───────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ │
│ COMPLETION: Stop hook blocks exit until WORKFLOW_COMPLETE marker is output │
└─────────────────────────────────────────────────────────────────────────────────┘
```
## Phase 0: Setup & Platform Detection

Before any work, set up the execution context. This phase produces visible output.

### Step 0.1: Session Recovery Check

ALWAYS run at the start of every session:

```bash
# Check if resuming an existing workflow
python .claude/core/state.py recover
```
If resuming: Skip to the appropriate phase/story based on the recovered state.
Output:
```markdown
## Session Recovery
| Check | Result |
|-------|--------|
| Existing workflow | [Yes/No] |
| Workflow ID | [ID if resuming] |
| Current phase | [Phase name] |
| Stories completed | [X/Y] |
| Current story | [Story ID and title] |
[If resuming]: Continuing from [story/phase]. Skipping to Phase [N].
```
### Step 0.2: Discover Available Platforms (DYNAMIC)

IMPORTANT: Platform discovery is FULLY DYNAMIC. Never hardcode platform names or markers.

```bash
# Discover all platforms in the Workflows/platforms directory
# Each platform provides its own platform.json with markers
ls -d Workflows/platforms/*/ 2>/dev/null || ls -d ../Workflows/platforms/*/ 2>/dev/null
```

For each discovered platform directory, read its platform.json to get:
- `name`: Platform identifier
- `markers`: Files/patterns that identify this platform
- `matchMode`: "any" (match any marker) or "all" (match all markers)
- `priority`: Resolution priority when multiple platforms match
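For illustration, a minimal platform.json might look like the following; the field names come from the list above, but every value shown is a hypothetical example:

```json
{
  "name": "dotnet",
  "displayName": ".NET",
  "version": "1.0.0",
  "markers": ["*.sln", "*.csproj"],
  "matchMode": "any",
  "priority": 10,
  "commands": { "build": "dotnet build", "test": "dotnet test" },
  "skills": []
}
```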
Output the discovery:
```markdown
## Platform Discovery
Scanning for available platforms...
| Platform | Description | Markers | Match Mode |
|----------|-------------|---------|------------|
| {name} | {displayName} | {markers joined} | {matchMode} |
...
Found {N} platform configurations.
```
### Step 0.3: Match Platform to Codebase (DYNAMIC)

For each discovered platform, check if its markers exist in the codebase:

```python
# Platform matching: the highest-priority platform whose markers match wins
candidates = []
for platform in discovered_platforms:
    if check_markers(platform.markers, platform.matchMode):
        candidates.append((platform, platform.priority))
if not candidates:
    raise RuntimeError("No platform matched - ask the user to specify one")
selected = max(candidates, key=lambda c: c[1])[0]  # highest priority wins
```
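A minimal `check_markers` sketch, assuming markers are glob patterns matched against the repository root:

```python
from pathlib import Path

def check_markers(markers, match_mode, root="."):
    """True if the codebase contains files matching the platform's markers."""
    hits = [next(Path(root).rglob(m), None) is not None for m in markers]
    return any(hits) if match_mode == "any" else all(hits)
```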
Output the matching:
```markdown
## Platform Detection
Scanning codebase for platform markers...
| Platform | Markers Checked | Found | Match Result |
|----------|-----------------|-------|--------------|
| {name} | {marker} | [Yes/No] | [MATCH/NO MATCH] |
...
**Selected Platform:** {platform.name} (priority: {priority})
**Reason:** {why this platform was selected}
```
### Step 0.4: Load Platform Configuration
From selected platform.json, extract and cache:
- Commands (build, test, lint, coverage, etc.)
- Conventions (naming, patterns, commit format)
- Quality gates (coverage thresholds, required checks)
- Project structure patterns
- Skills to load
Output:
```markdown
## Platform Configuration Loaded
**Platform:** {displayName} (v{version})
### Commands
| Action | Command |
|--------|---------|
| {key} | `{command}` |
...
### Conventions
| Convention | Value |
|------------|-------|
| {key} | {value} |
...
### Quality Gates
| Gate | Threshold |
|------|-----------|
| Coverage (S) | {S}% |
| Coverage (M) | {M}% |
| Coverage (L) | {L}% |
| Coverage (XL) | {XL}% |
### Skills Loaded
{For each skill in platform.skills, list it}
---
Platform detection complete. Proceeding with workflow.
```
## Phase 1: Analysis & Planning

### Step 1.1: Pre-Analysis Clarification (REQUIRED CHECKPOINT)

BEFORE invoking the analyst agent, you MUST ask clarifying questions.
Use the AskUserQuestion tool to present questions about the goal:
## Pre-Analysis Clarification Required
Before I begin analyzing your request, I need to clarify:
Questions to ask (pick relevant ones):
- Scope boundaries: What is IN scope vs OUT of scope?
- Priority: If there are multiple features, which is most important?
- Technical constraints: Any specific technologies to use or avoid?
- Integration points: How should this integrate with existing code?
- User requirements: Any specific user-facing requirements mentioned?
- Data handling: How should data be stored/processed?
- Error handling: Any specific error handling requirements?
When to ask fewer questions:
- Goal is very specific and complete
- User provided detailed specifications
- Technical approach is obvious from context
Always offer: "Proceed with my best judgment" as an option.
Question Timeout Behavior:
- If no response within 5 minutes in autonomous mode: proceed with best judgment
- Log the assumed answers:

```bash
python .claude/core/state.py add-clarification "Question" "Assumed: best judgment" --phase pre-analysis --category scope
```

- Always persist clarifications for session recovery
Wait for user response before continuing to Step 1.2.
### Step 1.1.1: Persist Clarifications

After receiving user answers, ALWAYS persist them:

```bash
# For each question answered:
python .claude/core/state.py add-clarification "What payment provider?" "Stripe" --phase pre-analysis --category technical
python .claude/core/state.py add-clarification "Real-time tech?" "SignalR" --phase pre-analysis --category technical
```
This ensures clarifications survive session restarts.
### Step 1.2: Initialize Workflow State

```bash
python .claude/core/state.py init "Goal: {User's goal}"
```
### Step 1.3: Invoke Analyst Subagent
```markdown
## Task for Analyst
{PLATFORM CONTEXT BLOCK}
**Goal:** {User's goal, incorporating clarifications}
Break this down into user stories with:
- Clear title
- Size estimate (S/M/L/XL)
- Acceptance criteria (testable)
- Security sensitivity flag (if applicable)
```
Parse analyst output and add stories to state:
```bash
python .claude/core/state.py add-story "Story title" --size M
python .claude/core/state.py add-story "Story title" --size L --security
# Repeat for each story
```
### Step 1.4: Pre-Plan Clarification (REQUIRED CHECKPOINT)
BEFORE invoking the architect agent, you MUST review stories and ask for confirmation.
Use the AskUserQuestion tool:
## Pre-Plan Clarification Required
I've identified these stories from your request:
| # | Story | Size | Security |
|---|-------|------|----------|
| S1 | {title} | {size} | {Yes/No} |
| S2 | {title} | {size} | {Yes/No} |
...
Before I design the technical approach, please confirm:
Questions to ask:
- Story order: Is this the right priority order? Any changes?
- Technical preferences: Any specific patterns/libraries to use or avoid?
- Existing code concerns: Any areas I should be careful modifying?
- Missing stories: Anything I missed that should be included?
Wait for user response before continuing to Step 1.5.
### Step 1.5: Invoke Architect Subagent
```markdown
## Task for Architect
{PLATFORM CONTEXT BLOCK}
**Stories:** {List from analyst with user's priority adjustments}
Create technical design following:
- Platform project structure patterns
- Platform conventions
- Existing codebase patterns
For each story, identify:
- Files to create/modify
- Dependencies between stories
- Technical risks
```
### Step 1.6: Gate G1 - Verify Design Complete
Check:
- All stories have technical design
- File changes identified per story
- Dependencies mapped
- No open questions blocking implementation
If not complete, iterate with architect.
## Phase 2: Story Execution Loop
This is the main execution loop. It runs until ALL stories are completed.
CRITICAL RULES:
- Every iteration MUST run build and tests before completing
- No story can be marked complete until ALL verification checks pass
- The loop continues until there are no more incomplete stories
- The Stop hook will block exit if any stories remain incomplete
- TDD phases must be followed: RED → GREEN → REFACTOR → VERIFY
- Failures must be categorized for intelligent retry decisions
### TDD Phase Enforcement

Each story MUST follow the TDD cycle. Track phases explicitly:

```bash
# 1. RED: Write failing test FIRST
python .claude/core/state.py tdd-phase S1 red
# Developer writes a test that fails

# 2. GREEN: Minimum code to pass
python .claude/core/state.py tdd-phase S1 green
# Developer implements the minimum code

# 3. REFACTOR: Clean up (optional for small changes)
python .claude/core/state.py tdd-phase S1 refactor
# Developer refactors without changing behavior

# 4. VERIFY: Run all checks
python .claude/core/state.py tdd-phase S1 verify
# All gates must pass
```

Before moving to the GREEN phase, validate:

```bash
python .claude/core/state.py tdd-validate S1 green
# Returns: {"valid": true} or {"valid": false, "expected": "red"}
```

If the developer tries to write implementation before tests:

```
STOP. TDD violation detected.
Current phase: none
Expected: red (write failing test first)
```

The developer must write a failing test before implementation.
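For example, the orchestrator can gate each TDD phase transition on the JSON that tdd-validate prints; a minimal sketch, assuming the response shape shown above:

```python
import json
import subprocess

def can_enter_phase(story_id: str, phase: str) -> bool:
    """Ask state.py whether the story may enter the given TDD phase."""
    result = subprocess.run(
        ["python", ".claude/core/state.py", "tdd-validate", story_id, phase],
        capture_output=True, text=True, check=True,
    )
    verdict = json.loads(result.stdout)
    if not verdict.get("valid", False):
        # Mirrors the violation message above: report the expected phase
        print(f"TDD violation: expected phase '{verdict.get('expected')}'")
    return verdict.get("valid", False)
```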
### Failure Categorization

When a gate fails, categorize the failure for intelligent retry:

```bash
# Auto-categorize based on the error message:
python .claude/core/state.py add-failure S1 "Connection refused: database not available"
# Output: Added failure: F1 [infra]

# Or explicitly categorize:
python .claude/core/state.py add-failure S1 "API key not configured" --category external
```
Categories:

| Category | Description | Retry Strategy |
|---|---|---|
| code | Bug in implementation | Standard retry |
| test | Test itself is wrong | Fix test, retry |
| infra | DB, network, filesystem | Retry with backoff |
| external | API keys, external services | ESCALATE - needs human |
| timeout | Operation timed out | Retry with backoff |
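If state.py's auto-categorization is ever unavailable, a fallback matcher mirroring the table above could look like this; the keyword lists are illustrative assumptions, not the actual rules in state.py:

```python
# Hypothetical keyword map mirroring the categories table above
CATEGORY_KEYWORDS = {
    "infra": ["connection refused", "network", "filesystem", "database"],
    "external": ["api key", "credential", "unauthorized", "external service"],
    "timeout": ["timed out", "timeout"],
    "test": ["assertion", "expected", "flaky"],
}

def categorize_failure(message: str) -> str:
    """Best-effort failure category from an error message."""
    msg = message.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in msg for keyword in keywords):
            return category
    return "code"  # default: treat as an implementation bug
```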
Get a retry recommendation before retrying:

```bash
python .claude/core/state.py retry-recommendation S1
# Returns: {"should_retry": true, "backoff_seconds": 30}
# OR: {"should_retry": false, "escalate": true, "reason": "External service failures"}
```
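A sketch of how the loop might consume this recommendation, assuming the JSON shapes shown above:

```python
import json
import subprocess
import time

def should_retry(story_id: str) -> bool:
    """Honor state.py's retry recommendation, including backoff and escalation."""
    result = subprocess.run(
        ["python", ".claude/core/state.py", "retry-recommendation", story_id],
        capture_output=True, text=True, check=True,
    )
    rec = json.loads(result.stdout)
    if rec.get("escalate"):
        return False  # surface as a blocker instead of retrying
    time.sleep(rec.get("backoff_seconds", 0))
    return rec.get("should_retry", False)
```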
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ STORY EXECUTION LOOP │
│ │
│ while (incomplete_stories exist): │
│ story = get_next_incomplete_story() │
│ iteration = 0 │
│ previous_failures = [] │
│ │
│ while (story NOT completed AND iteration < MAX_RETRIES): │
│ iteration++ │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 1. DEVELOP │ │
│ │ Invoke developer agent with previous_failures context │ │
│ │ │ │
│ │ 2. BUILD VERIFICATION (MANDATORY - Gate G2) │ │
│ │ Run: {platform.commands.build} │ │
│ │ If FAIL: add to previous_failures, CONTINUE to next iteration │ │
│ │ │ │
│ │ 3. TEST VERIFICATION (MANDATORY - Gate G3) │ │
│ │ Run: {platform.commands.test} │ │
│ │ If FAIL: add to previous_failures, CONTINUE to next iteration │ │
│ │ │ │
│ │ 4. TESTER AGENT │ │
│ │ Verify acceptance criteria │ │
│ │ If FAIL: add issues to previous_failures, CONTINUE │ │
│ │ │ │
│ │ 5. REVIEWER AGENT │ │
│ │ Code review and architecture compliance │ │
│ │ If CHANGES_REQUESTED: add to previous_failures, CONTINUE │ │
│ │ │ │
│ │ 6. SECURITY AGENT (if story.securitySensitive) │ │
│ │ Security analysis and scanning │ │
│ │ If REMEDIATION_NEEDED: add to previous_failures, CONTINUE │ │
│ │ │ │
│ │ 7. FINAL VERIFICATION (MANDATORY - Gate G5) │ │
│ │ Run: {platform.commands.build} && {platform.commands.test} │ │
│ │ If FAIL: add to previous_failures, CONTINUE │ │
│ │ If PASS: Mark story COMPLETED, commit, break inner loop │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ if (story NOT completed after MAX_RETRIES): │
│ Escalate as blocker, continue to next story │
│ │
│ Every 3 stories OR 30 minutes: Generate progress report │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
```
### Mandatory Verification Commands

Before ANY loop iteration can complete, run these platform commands:

```bash
# Get commands from platform.json (resolved dynamically)
BUILD_CMD=$(python .claude/core/platform.py get-command build)
TEST_CMD=$(python .claude/core/platform.py get-command test)

# Execute
eval "$BUILD_CMD"
eval "$TEST_CMD"
```
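If the orchestration step is scripted in Python rather than shell, an equivalent helper might look like this; a sketch assuming get-command prints the resolved command string:

```python
import subprocess

def run_gate(action: str) -> bool:
    """Resolve a platform command (e.g. 'build', 'test') and run it."""
    resolved = subprocess.run(
        ["python", ".claude/core/platform.py", "get-command", action],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return subprocess.run(resolved, shell=True).returncode == 0
```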
If either command fails, DO NOT:
- Exit the inner loop
- Mark the story as completed
- Proceed to the next story
- Create a PR
Instead:
- Log the failure
- Add it to the `previous_failures` context
- Retry with the developer agent
### Retry Context Template
When retrying development, ALWAYS provide this context to the developer:
```markdown
## Retry Context
**Story:** {story.id} - {story.title}
**Iteration:** {n}/{MAX_RETRIES}
**Previous Attempts:** {n-1}
### What Failed:
{For each failure:}
- **{failure_type}:** {description}
- File: {file}:{line} (if applicable)
- Error: {error_message}
### What To Fix:
{Specific actionable instructions derived from failures}
### Files Changed So Far:
{List of files modified in previous iterations}
### Verification Status:
| Check | Status |
|-------|--------|
| testsPass | {PASS/PENDING} |
| coverageMet | {PASS/PENDING} |
| reviewApproved | {PASS/PENDING} |
| securityCleared | {PASS/PENDING/N/A} |
### Retry Recommendation:
{Output from: python .claude/core/state.py retry-recommendation {story_id}}
```
### Reviewer Severity Levels
When the Reviewer Agent identifies issues, they MUST be categorized by severity:
```markdown
## Code Review Results
### BLOCKING Issues (Must Fix)
{Issues that block completion - security, correctness, architecture violations}
- **[BLOCKING]** SQL injection vulnerability in UserRepository.cs:45
- **[BLOCKING]** Missing null check causes crash in PaymentService
### SUGGESTIONS (Recommended)
{Improvements that should be made but don't block completion}
- **[SUGGESTION]** Consider using MediatR pipeline for validation
- **[SUGGESTION]** Extract magic number to named constant
### INFO (For Future Reference)
{Non-actionable observations}
- **[INFO]** This pattern could be simplified in future refactoring
- **[INFO]** Consider adding integration tests for edge cases
```
Retry Logic Based on Severity:
- BLOCKING issues found: MUST retry; add them to `previous_failures`
- Only SUGGESTIONS: Mark review approved, note suggestions for the developer
- Only INFO: Mark review approved, continue
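In code, this severity rule reduces to a small decision function; a minimal sketch, where the issue dict shape is an assumption:

```python
def review_gate(issues: list[dict]) -> str:
    """Map reviewer severities to a gate decision per the rules above."""
    if any(issue["severity"] == "BLOCKING" for issue in issues):
        return "retry"     # add BLOCKING issues to previous_failures
    return "approved"      # SUGGESTION and INFO never block completion
```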
### External Dependency Detection

Before development, check for external dependencies that may require special handling:

```bash
# Check if the story requires external services
# Look for keywords in acceptance criteria and design
```
External Dependency Types:
| Type | Detection Keywords | Handling |
|---|---|---|
| Payment APIs | stripe, paypal, payment, checkout | Use mock/sandbox in tests |
| Email Services | email, smtp, sendgrid, mailgun | Mock email sending |
| SMS/Push | sms, twilio, push notification | Mock notification service |
| OAuth/SSO | oauth, sso, google auth, azure ad | Use test credentials |
| Cloud Storage | s3, azure blob, gcs, file upload | Use local storage mock |
| External APIs | api key, third-party, integration | Create mock responses |
When an external dependency is detected:

1. Log the dependency:

```bash
python .claude/core/state.py add-clarification "External dependency: Stripe API" "Using sandbox mode for tests" --phase development --category external
```

2. Add a mock strategy to the developer context:

```markdown
## External Dependencies
This story requires: Stripe Payment API

**Test Strategy:**
- Use Stripe test mode API keys
- Mock Stripe responses for unit tests
- Mark AC requiring live Stripe as "MANUAL_VERIFICATION_REQUIRED"
```

3. If no mock is possible:

```bash
python .claude/core/state.py add-blocker "Stripe API credentials not available for testing" --severity high
# Escalate to user
```
### Progress Reports
Generate a progress report every 3 completed stories OR every 30 minutes:
```markdown
## Progress Report
**Workflow:** {goal}
**Workflow ID:** {workflowId}
**Runtime:** {elapsed_time}
**Iteration:** {current_iteration}
### Stories Progress
| Status | Count |
|--------|-------|
| Completed | {X} |
| In Progress | {Y} |
| Pending | {Z} |
| Blocked | {B} |
### Recently Completed:
{For each recently completed story:}
- [{id}] {title} - {iterations} iterations
### Current Story:
- [{id}] {title}
- Status: {status}
- Iteration: {n}/{MAX_RETRIES}
- Verification: {checks summary}
### Blockers:
{List any blockers or "None"}
### Next Steps:
1. {Current action}
2. {Next story after this}
```
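The cadence itself is a simple either/or trigger; a minimal sketch with illustrative names:

```python
import time

REPORT_EVERY_STORIES = 3
REPORT_EVERY_SECONDS = 30 * 60  # 30 minutes

def report_due(stories_since_last: int, last_report_at: float) -> bool:
    """True when either progress-report trigger above fires."""
    return (stories_since_last >= REPORT_EVERY_STORIES
            or time.time() - last_report_at >= REPORT_EVERY_SECONDS)
```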
## Phase 2.5: Lint Fix Loop (Non-Story Fixes)
When Gate G7 fails on lint but build/tests pass, enter this special loop:
Purpose: Fix lint/formatting issues without creating a new story iteration.
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LINT FIX LOOP │
│ │
│ Gate G7 fails on lint (build + test passed) │
│ │
│ while (lint_fails AND attempts < 3): │
│ 1. Get lint errors: │
│ LINT_CMD=$(python .claude/core/platform.py get-command lint) │
│ eval "$LINT_CMD" 2>&1 | tee lint_errors.txt │
│ │
│ 2. Fix lint issues (DO NOT change logic): │
│ - Formatting only │
│ - No new features │
│ - No refactoring │
│ │
│ 3. Re-run verification: │
│ - Build (should still pass) │
│ - Test (should still pass) │
│ - Lint (check if fixed) │
│ │
│ 4. If lint passes: Exit loop, continue to Phase 3 │
│ If lint fails: Increment attempts, retry │
│ │
│ if (attempts >= 3): │
│ Log blocker: "Unable to fix lint issues after 3 attempts" │
│ Continue to Phase 3 anyway (lint is non-blocking for PR) │
│ Note in PR: "Lint issues remain - manual review needed" │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
```
Key Rules for Lint Fix Loop:
- DO NOT modify any logic or behavior
- DO NOT add new tests or features
- ONLY fix formatting, whitespace, import order, etc.
- Verify tests still pass after each fix
- If tests break: Revert lint fix, continue with lint warnings
```bash
# Run lint fix loop
python .claude/core/state.py update-phase lint-fix
# Get lint command and run
LINT_CMD=$(python .claude/core/platform.py get-command lint)
eval "$LINT_CMD" 2>&1
# If platform has auto-fix:
LINT_FIX=$(python .claude/core/platform.py get-command lintFix 2>/dev/null || echo "")
if [ -n "$LINT_FIX" ]; then
eval "$LINT_FIX"
fi
# Verify tests still pass after lint fix
TEST_CMD=$(python .claude/core/platform.py get-command test)
eval "$TEST_CMD"
```
## Phase 3: Completion & PR Creation

### Step 3.1: Verify All Stories Complete

```bash
python .claude/core/state.py status
```

All stories must have status `completed` with all verification checks passed.

If any story is not completed, return to Phase 2.
### Step 3.2: Final Build & Test Verification (MANDATORY - Gate G7)

Before creating a PR, run a COMPLETE verification cycle:

```bash
# Get commands from platform
BUILD_CMD=$(python .claude/core/platform.py get-command build)
TEST_CMD=$(python .claude/core/platform.py get-command test)
LINT_CMD=$(python .claude/core/platform.py get-command lint 2>/dev/null || echo "")

# Run all checks
eval "$BUILD_CMD"
eval "$TEST_CMD"
if [ -n "$LINT_CMD" ]; then
  eval "$LINT_CMD"
fi
```
Output verification results:
```markdown
## Pre-PR Verification (Gate G7)
| Check | Command | Result |
|-------|---------|--------|
| Build | `{build_cmd}` | {PASS/FAIL} |
| Tests | `{test_cmd}` | {PASS/FAIL} ({test_count} tests) |
| Lint | `{lint_cmd}` | {PASS/FAIL/N/A} |
{If ALL PASS}: All verification checks passed. Proceeding with PR creation.
{If ANY FAIL}: Verification failed. Returning to Phase 2 to fix issues.
```
If ANY check fails:
- DO NOT create the PR
- Return to Phase 2 to fix the issues
- Re-run verification after fixes
### Step 3.3: Invoke DevOps (if needed)

If deployment configuration is required:

```markdown
## Task for DevOps
{PLATFORM CONTEXT BLOCK}
Create/update deployment configuration for implemented features.
```
### Step 3.4: Create Pull Request
```bash
# Ensure clean working state
git status
# Push branch
git push -u origin HEAD
# Create PR with comprehensive body
gh pr create --title "[Feature] {Goal Summary}" --body "$(cat <<'EOF'
## Summary
{Brief description of implementation}
## Stories Implemented
{For each story:}
- **{story.id}:** {story.title}
## Key Changes
{Major files/features changed, grouped logically}
## Testing
- All {test_count} tests passing
- Coverage thresholds met for all stories
## Verification Checklist
- [x] Build passes ({platform.commands.build})
- [x] All tests pass ({platform.commands.test})
- [x] Lint checks pass (if applicable)
- [x] Code review approved by reviewer agent
- [x] Security review passed (if applicable)
- [x] All acceptance criteria verified by tester agent
## Workflow Metrics
- **Workflow ID:** {workflowId}
- **Runtime:** {total_time}
- **Stories:** {completed}/{total}
- **Total Iterations:** {sum of all story iterations}
---
Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
### Step 3.5: Complete Workflow State

```bash
python .claude/core/state.py complete
```
### Step 3.6: Output Completion Marker (CRITICAL)
This marker is REQUIRED for the Stop hook to allow exit:
```markdown
## WORKFLOW_COMPLETE
**Goal:** {goal}
**Status:** SUCCESS
**PR URL:** {pr_url}
**Workflow ID:** {workflowId}
**Runtime:** {total_time}
**Stories Completed:** {completed}/{total}
All stories verified. Pull request created. Workflow complete.
```
## Architecture
```
ORCHESTRATOR (You) ─── Drives workflow through all phases
│
│ Phase 0
├── Platform Detection (dynamic discovery)
│
│ Phase 1
├── CLARIFY ──────▶ (Pre-Analysis Questions - REQUIRED)
├── analyst ──────▶ Requirements & user stories
├── CLARIFY ──────▶ (Pre-Plan Questions - REQUIRED)
├── architect ────▶ Technical design
│
│ Phase 2 (Loop until all complete)
├── developer ────▶ Implementation (TDD)
├── BUILD ────────▶ {platform.commands.build} (MANDATORY)
├── TEST ─────────▶ {platform.commands.test} (MANDATORY)
├── tester ───────▶ Acceptance verification
├── reviewer ─────▶ Code review
├── security ─────▶ Security audit (if flagged)
│
│ Phase 3
├── VERIFY ───────▶ Final build/test/lint (MANDATORY)
├── devops ───────▶ Infrastructure & deployment
└── PR ───────────▶ Create pull request + WORKFLOW_COMPLETE
```
## State Management

### Persistence Commands
```bash
# Initialize workflow
python .claude/core/state.py init "Goal description"
# Session recovery (ALWAYS run at session start)
python .claude/core/state.py recover
# Story management
python .claude/core/state.py add-story "Title" --size M [--security]
python .claude/core/state.py update-story S1 in_progress
python .claude/core/state.py update-story S1 testing
python .claude/core/state.py update-story S1 review
python .claude/core/state.py update-story S1 completed
# Verification checks (fail-first pattern)
python .claude/core/state.py verify S1 testsPass --passed
python .claude/core/state.py verify S1 coverageMet --passed --details "85%"
python .claude/core/state.py verify S1 reviewApproved --passed
python .claude/core/state.py verify S1 securityCleared --passed
# Progress tracking
python .claude/core/state.py status
python .claude/core/state.py progress --lines 20
python .claude/core/state.py compact-context
# Context management (for long runs)
python .claude/core/state.py trim-progress --lines 100
# Git recovery
python .claude/core/state.py mark-working-state
python .claude/core/state.py rollback-to-checkpoint
# User intervention
python .claude/core/state.py await-user-fix <blocker_id> "description" [--check-command "cmd"]
python .claude/core/state.py check-user-fix
python .claude/core/state.py user-fix-complete [--notes "note"]
python .claude/core/state.py resume-context
# Dependency scanning
python .claude/core/state.py scan-dependencies S1
# Completion
python .claude/core/state.py complete
```
### State File Location

State persists in `.claude/workflow-state.json`. This enables:
- Session recovery after interruptions
- Progress tracking across long runs
- Stop hook completion detection
- Rollback capability per story
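For orientation only, an abridged workflow-state.json might look like this; the actual schema is owned by state.py, and these field names are illustrative:

```json
{
  "workflowId": "abc123",
  "goal": "Implement payment processing",
  "currentPhase": "development",
  "stories": [
    {"id": "S1", "title": "Checkout flow", "size": "M", "status": "completed"},
    {"id": "S2", "title": "Payment integration", "size": "L", "status": "in_progress", "tdd_phase": "green", "attempts": 2}
  ],
  "clarifications": [],
  "blockers": []
}
```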
## Long-Running Execution Strategies

### Stop Hook Behavior

The Stop hook (`stop.py`) implements the Ralph Wiggum pattern:
- On exit attempt: Hook intercepts and checks for completion
- If WORKFLOW_COMPLETE not found: Block exit, provide continuation context
- If WORKFLOW_COMPLETE found: Allow normal exit
- Safety limit: Max iterations (default 100) prevents infinite loops
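Conceptually, the decision order with the safety limit looks like this; a simplified sketch (the fuller hook logic appears later in this document):

```python
MAX_STOP_ITERATIONS = 100  # safety limit (default noted above)

def stop_decision(transcript: str, iteration: int) -> dict:
    """Allow exit on completion or at the safety limit; otherwise block."""
    if "WORKFLOW_COMPLETE" in transcript:
        return {"decision": "allow"}
    if iteration >= MAX_STOP_ITERATIONS:
        return {"decision": "allow",
                "userMessage": "Safety limit reached without completion"}
    return {"decision": "block",
            "continuePrompt": "Workflow incomplete - continue from current state."}
```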
### Context Management for Extended Runs
For Runs Under 1 Hour:
- Update state after each story completion
- Generate progress report every 3 stories
For Runs 1-4 Hours:
- Update state after each phase transition
- Generate progress report every 30 minutes
- Checkpoint review at 5-story intervals
- Trim progress file to 100 lines
For Runs 4+ Hours:
- Aggressive context management
- Keep only: current story (full) + completed stories (summary only)
- Drop: full implementation details, raw test output
- State sync every 15 minutes
- Consider natural breakpoints for session handoff
### Recovery After Interruption

When resuming:
- Run `python .claude/core/state.py recover`
- Check git status for uncommitted changes
- Validate build/test before continuing
- Jump to the current story/phase from state
## Quality Gates
| Gate | When | Check | On Fail |
|---|---|---|---|
| G1 | Before dev | Design complete, AC clear | Clarify with analyst/architect |
| G2 | After dev | `{platform.commands.build}` passes | Retry developer with build error |
| G3 | After build | `{platform.commands.test}` passes | Retry developer with test failures |
| G4 | After test | Coverage threshold met | Add tests, retry |
| G5 | Before story complete | Final build + test | DO NOT complete story, retry |
| G6 | If security-sensitive | Security scan passes | Remediate, retry |
| G7 | Before PR | Full build + test + lint | DO NOT create PR, return to Phase 2 |
## Clarifying Question Guidelines

### Pre-Analysis Questions (Step 1.1)
Always ask when:
- Goal is vague or multi-part without priority
- Technical constraints not specified
- Integration requirements unclear
- User data handling involved but not specified
- Multiple valid interpretations exist
Ask fewer questions when:
- Requirements are explicit and complete
- User provided detailed specifications
- Context makes approach obvious
### Pre-Plan Questions (Step 1.4)
Always ask when:
- Multiple valid technical approaches exist
- Story order could significantly impact development
- Dependencies between stories are complex
- User has shown strong opinions about technology
Ask fewer questions when:
- Standard patterns clearly apply
- Single obvious approach exists
- User explicitly requested autonomous execution
### How to Ask
- Be specific - ask about concrete decisions
- Offer options when possible (2-4 choices)
- Provide your recommendation with rationale
- Always include "Proceed with your judgment" as an option
- Use the AskUserQuestion tool for structured questions
## Escalation Triggers
| Trigger | Action |
|---|---|
| 3 failed iterations on same story | Log blocker, output ESCALATION_REQUIRED |
| Security vulnerability found | Immediate escalation |
| 5 stories completed | Optional checkpoint |
| External dependency unavailable | Log blocker, escalate |
| User intervention requested | Output HUMAN_INTERVENTION_NEEDED |
## User Intervention Patterns

### Mid-Workflow User Intervention (await_user_fix)

When the workflow encounters a blocker that requires manual user action (e.g., missing API keys, environment configuration, external service setup), use the `await_user_fix` pattern:
```
┌─────────────────────────────────────────────────────────────────────────────────┐
│ MID-WORKFLOW USER INTERVENTION │
│ │
│ 1. Workflow encounters blocker requiring user action │
│ 2. Call await_user_fix() to pause workflow │
│ 3. Stop hook allows exit with instructions for user │
│ 4. User fixes issue and signals completion │
│ 5. Workflow resumes from where it left off │
│ │
│ WORKFLOW USER WORKFLOW │
│ │ │ │ │
│ ├──▶ await_user_fix │ │ │
│ │ (exit allowed) │ │ │
│ │ │ │ │
│ │ ├──▶ Fix issue │ │
│ │ │ │ │
│ │ ├──▶ user-fix-complete │ │
│ │ │ │ │
│ │ │ (restart claude) │ │
│ │ │ ──────────────▶├──▶ Resume workflow │
│ │ │ │ │
└─────────────────────────────────────────────────────────────────────────────────┘
```
Invoking User Intervention:
```bash
# 1. Log the blocker first
python .claude/core/state.py add-blocker "Stripe API key not configured in .env" --severity high
# 2. Request user intervention with optional verification command
python .claude/core/state.py await-user-fix 0 "Add Stripe API key to .env file" --check-command "dotnet build" --timeout 60
# Output includes instructions for the user:
# {
# "status": "awaiting_user",
# "description": "Add Stripe API key to .env file",
# "check_command": "dotnet build",
# "instructions": [
# "1. Fix the issue: Add Stripe API key to .env file",
# "2. Run verification: dotnet build",
# "3. Resume workflow: python .claude/core/state.py user-fix-complete"
# ]
# }
```
User Signals Fix Complete:
```bash
# User runs this after fixing the issue
python .claude/core/state.py user-fix-complete --notes "Added test API key from Stripe dashboard"
```
Workflow Resumes:
On next session start, the workflow automatically detects the fix and continues:
```bash
# Session recovery detects resolved blocker
python .claude/core/state.py recover

# Get context for resuming
python .claude/core/state.py resume-context
```
### Automated External Dependency Scanning
Before starting development on each story, scan for external dependencies:
```bash
# Scan story for external dependency keywords
python .claude/core/state.py scan-dependencies S1
# Output:
# Detected 2 external dependencies:
# [payment] payment_processing
# Keyword: stripe
# Mock: Use test/sandbox API keys or mock payment service
# Requires secrets: True
#
# [email] email_service
# Keyword: sendgrid
# Mock: Use local mailhog/papercut or mock email sender
# Requires secrets: True
```
Dependency Categories and Mock Strategies:
| Type | Keywords | Mock Strategy |
|---|---|---|
| payment | stripe, paypal, braintree | Test/sandbox API keys |
| email | sendgrid, mailgun, smtp | Local mailhog/papercut |
| oauth | oauth, google auth, facebook login | Test OAuth credentials |
| storage | s3, azure blob, cloudinary | Local minio/azurite |
| sms | twilio, nexmo, vonage | SMS service test mode |
| ai_ml | openai, gpt-4, anthropic | Recorded responses |
| database_cloud | rds, cosmosdb, atlas | Local database container |
| messaging | rabbitmq, kafka, azure service bus | Local container |
When Dependencies Are Detected:
```markdown
## External Dependencies Detected for Story {story_id}
| Type | Category | Requires Secrets |
|------|----------|------------------|
| {type} | {category} | {yes/no} |
### Mock Strategy for Development/Testing:
{For each dependency:}
- **{type}:** {mock_strategy}
### If Secrets Required:
1. Check if test credentials exist in environment
2. If not available: Call await-user-fix to request setup
3. Never hardcode credentials in code
```
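A sketch of step 1's credential check, assuming test credentials arrive as environment variables; the variable name and helper are hypothetical:

```python
import os
import subprocess

def ensure_test_secret(env_var: str, description: str) -> bool:
    """Log a high-severity blocker when a required test credential is missing."""
    if os.environ.get(env_var):
        return True
    subprocess.run(
        ["python", ".claude/core/state.py", "add-blocker",
         f"{description} ({env_var} not set)", "--severity", "high"],
        check=True,
    )
    return False  # then pause via await-user-fix as described above
```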
### Resume-After-Blocker Protocol
When resuming after a user-fixed blocker, follow this protocol:
```bash
# 1. Get full resume context
python .claude/core/state.py resume-context
# Returns:
# {
# "workflow_id": "abc123",
# "goal": "Implement payment processing",
# "current_phase": "development",
# "recently_resolved_blockers": [
# {"index": 0, "description": "Stripe API key", "resolvedAt": "..."}
# ],
# "current_story": {
# "id": "S2",
# "title": "Payment integration",
# "status": "in_progress",
# "tdd_phase": "green",
# "attempts": 2
# },
# "resume_instructions": [...]
# }
# 2. Verify the fix if a check command was specified
python .claude/core/state.py check-user-fix
# Returns: {"is_fixed": true, "verified": true, "can_resume": true}
# 3. Resume from current TDD phase
python .claude/core/state.py tdd-status S2
# Returns: "green" (was in middle of making tests pass)
# 4. Re-run tests to verify fix didn't break anything
BUILD_CMD=$(python .claude/core/platform.py get-command build)
TEST_CMD=$(python .claude/core/platform.py get-command test)
eval "$BUILD_CMD" && eval "$TEST_CMD"
# 5. Continue with normal story execution loop
```
Resume Context in Developer Prompt:
When resuming, include this context in the developer agent prompt:
```markdown
## RESUMING AFTER BLOCKER
**Resolved Blocker:** {description}
**Resolution Notes:** {user's notes from user-fix-complete}
**Verification:** {PASSED/FAILED}
**Picking Up Where We Left Off:**
- Story: [{story_id}] {story_title}
- TDD Phase: {tdd_phase} (continue from here)
- Attempts: {attempts}
- Previous Failures: {summary of previous failures}
**First Step After Resume:**
1. Verify the fix works by running build/tests
2. If verified, continue from current TDD phase
3. If not verified, report new blocker
---
Continue implementing story {story_id}...
```
### Stop Hook User Intervention Handling

The Stop hook (`stop.py`) handles user intervention specially:
- Before checking completion: Check for `awaiting_user` status
- If awaiting user: Allow exit with clear instructions
- User message includes:
- What needs to be fixed
- How to verify the fix
- Command to signal completion
- How to restart workflow
```python
# Stop hook logic (simplified):
def main():
    # Priority 1: Check for user intervention
    is_awaiting, intervention = check_user_intervention(input_data)
    if is_awaiting:
        # Allow exit but provide instructions
        return {'decision': 'allow', 'userMessage': get_intervention_instructions()}

    # Priority 2: Check for completion
    is_complete, reason = check_completion(input_data)
    if is_complete:
        return {'decision': 'allow'}

    # Priority 3: Block exit, continue workflow
    return {'decision': 'block', 'continuePrompt': get_continuation_context()}
```
## Commands Reference

- `/workflow [goal]` - Start full autonomous workflow
- `/status` - Check current progress
- `/implement [story]` - Implement single story with iteration loop
- `/review [files]` - Run code review
## Summary: Key Behavioral Rules
- Never exit without WORKFLOW_COMPLETE marker (Stop hook enforces this)
- Always ask clarifying questions before analysis and before planning
- Never skip build/test verification - mandatory at every iteration
- Platform detection is dynamic - never hardcode platform names or commands
- Iterate on failures - use previous_failures context for retries
- Persist state - enable recovery from any interruption
- Progress reports - visibility every 3 stories or 30 minutes
- PR only after full verification - Gate G7 must pass
- Scan for external dependencies before developing each story
- Use await_user_fix for blockers requiring manual intervention
- Resume gracefully after user-fixed blockers with full context