---
name: spec-driven-development
description: Implement the complete spec-driven development workflow from instructions through requirements, design, and implementation planning. Use this skill when starting new features or major refactorings that benefit from structured planning before coding.
---
# Spec-Driven Development
A rigorous six-phase workflow that captures requirements, designs solutions, defines behavioral contracts through tests, and plans implementations before writing code. This approach ensures alignment, reduces rework, and creates living documentation.
## Critical Rule

**NEVER IMPLEMENT WITHOUT AUTHORIZATION**
After completing each phase, you MUST:
- Present the completed work
- Explicitly ask for user approval
- Wait for clear confirmation
- NEVER proceed automatically to the next phase
This is not optional. Each phase requires explicit user authorization.
## When to Use This Skill
Use spec-driven development for:

- **New features**: Any non-trivial feature requiring design decisions
- **Major refactorings**: Architectural changes affecting multiple components
- **API design**: Public interfaces that need careful consideration
- **Complex bugs**: Issues requiring investigation and design changes
- **Team coordination**: Work requiring clear communication and approval
Skip this process for:
- Trivial bug fixes (typos, simple logic errors)
- Documentation updates
- Configuration tweaks
- Minor refactorings with clear solutions
## Directory Structure
```
specs/
  README.md                # Feature directory listing
  [feature-name]/
    instructions.md        # Phase 1: Raw requirements
    requirements.md        # Phase 2: Structured requirements
    design.md              # Phase 3: Technical design
    behaviors.test.ts      # Phase 4: Behavioral tests (executable specs)
    plan.md                # Phase 5: Implementation plan
```
The `specs/README.md` maintains a simple checkbox list of all features:
```markdown
# Feature Specifications

- [x] **[payment-intents](./payment-intents/)** - Payment intent workflow
- [ ] **[user-authentication](./user-authentication/)** - User auth system
- [ ] **[data-sync](./data-sync/)** - Real-time data synchronization
```
## Six-Phase Workflow
### Phase 1: Capture Instructions

**Goal**: Document raw user requirements without interpretation.

**Create**: `specs/[feature-name]/instructions.md`

**Contents**:
- Raw user requirements (exactly as provided)
- User stories ("As a [role], I want [feature] so that [benefit]")
- Acceptance criteria (what defines success)
- Constraints and dependencies
- Out of scope items (what this does NOT include)
**Template**:

```markdown
# [Feature Name] - Instructions

## Overview

[Brief description of what the user wants]

## User Stories

- As a [role], I want [feature] so that [benefit]
- As a [role], I want [feature] so that [benefit]

## Acceptance Criteria

- [ ] [Concrete, testable criterion]
- [ ] [Concrete, testable criterion]

## Constraints

- [Technical constraint]
- [Business constraint]
- [Timeline constraint]

## Dependencies

- [Existing feature or system]
- [External service]

## Out of Scope

- [What this feature does NOT include]
```
**After completion**:

1. Add entry to `specs/README.md`
2. Present instructions to user
3. Ask: "Does this accurately capture your requirements? Should I proceed to Phase 2 (Requirements)?"
4. **STOP and wait for approval**

---
### Phase 2: Derive Requirements

**REQUIRES APPROVAL FROM PHASE 1**

**Goal**: Transform raw instructions into structured, technical requirements.

**Create**: `specs/[feature-name]/requirements.md`

**Contents**:
- Functional requirements (what the system must do)
- Non-functional requirements (performance, security, scalability)
- Technical constraints (libraries, patterns, compatibility)
- Dependencies on other features or systems
- Data requirements (schemas, storage, validation)
- Error handling requirements
**Template**:

````markdown
# [Feature Name] - Requirements

## Functional Requirements

### FR-1: [Requirement Name]

**Priority**: High | Medium | Low
**Description**: [What must happen]
**Acceptance**: [How to verify]

### FR-2: [Requirement Name]

**Priority**: High | Medium | Low
**Description**: [What must happen]
**Acceptance**: [How to verify]

## Non-Functional Requirements

### NFR-1: Performance

- [Specific metric, e.g., "Response time < 100ms"]
- [Throughput requirement]

### NFR-2: Security

- [Authentication requirement]
- [Authorization requirement]

### NFR-3: Scalability

- [Concurrent users]
- [Data volume]

## Technical Constraints

- Must use [specific library or pattern]
- Must integrate with [existing system]
- Must support [platform or environment]

## Dependencies

### Internal

- [Feature or service name]: [Why needed]

### External

- [Library or API]: [Why needed]

## Data Requirements

### Schema

```typescript
export interface DataModel {
  readonly id: string
  readonly name: string
  readonly value: number
  readonly createdAt: Date
}
```

### Validation

- [Field-level validation rules]
- [Cross-field validation rules]

### Storage

- [Where data is stored]
- [Persistence strategy]

## Error Handling

- [Error scenario 1]: [Required handling]
- [Error scenario 2]: [Required handling]

## Traceability

- Addresses instructions: [Section references]
````
**Ask Questions**: Use `AskUserQuestion` tool if:
- Requirements are ambiguous or unclear
- Multiple valid approaches exist
- Trade-offs need user input
- Domain knowledge is missing
- Priority conflicts arise
**After completion**:
1. Present requirements to user
2. Ask: "Do these requirements accurately reflect the system needs? Should I proceed to Phase 3 (Design)?"
3. **STOP and wait for approval**
---
### Phase 3: Create Design
**REQUIRES APPROVAL FROM PHASE 2**
**Goal**: Make architectural decisions and design the solution.
**Create**: `specs/[feature-name]/design.md`
**Contents**:
- Architecture decisions (patterns, structure)
- API design (functions, types, interfaces)
- Data models (schemas with Effect Schema)
- Effect patterns to use (services, layers, streams)
- Error handling strategy (error types, recovery)
- Testing strategy (unit, integration, property tests)
**Template**:
````markdown
# [Feature Name] - Design
## Architecture Overview
[High-level description of the solution]
Data flow: `ComponentA → ComponentB → ComponentC` (use `→` for dependencies, `||` for parallel)
## Architecture Decisions
### AD-1: [Decision Name]
**Context**: [Why this decision is needed]
**Decision**: [What was decided]
**Rationale**: [Why this approach]
**Alternatives**: [What was considered but rejected]
**Consequences**: [Implications of this decision]
### AD-2: [Decision Name]
[Same structure]
## API Design
### Public Interface
```typescript
import { Effect, Context } from "effect"
// Declare types used in examples
declare const Input1: unique symbol
declare const Output1: unique symbol
declare const Error1: unique symbol
declare const Deps1: unique symbol
declare const Input2: unique symbol
declare const Output2: unique symbol
declare const Error2: unique symbol
declare const Deps2: unique symbol
export interface FeatureService {
  readonly operation1: (input: typeof Input1) => Effect.Effect<typeof Output1, typeof Error1, typeof Deps1>
  readonly operation2: (input: typeof Input2) => Effect.Effect<typeof Output2, typeof Error2, typeof Deps2>
}
```

### Type Definitions

```typescript
import { Data } from "effect"

// Domain types
export interface DomainType {
  readonly field1: string
  readonly field2: number
}

// Error types
export class FeatureError extends Data.TaggedError("FeatureError")<{
  readonly reason: string
}> {}
```
## Data Models

### Schemas

```typescript
import { Schema } from "effect"

export const InputSchema = Schema.Struct({
  field1: Schema.String,
  field2: Schema.Number
})

export interface Input extends Schema.Schema.Type<typeof InputSchema> {}
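
// Decoding sketch (assumes standard effect Schema usage;
// Schema.decodeUnknownSync throws a ParseError on invalid input):
// const input: Input = Schema.decodeUnknownSync(InputSchema)({ field1: "a", field2: 1 })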
```

### Validation Rules

- `field1`: Must be non-empty, max 100 characters
- `field2`: Must be positive integer

## Effect Patterns

### Services

- `FeatureService`: Main service providing feature operations
- `RepositoryService`: Data access layer
- `ValidationService`: Input validation
### Layers

```typescript
// Layer dependencies: FeatureServiceLive → {RepositoryServiceLive → DatabaseLive, ValidationServiceLive → ConfigLive}
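//
// One possible composition for the arrows above (a sketch only; the
// layer names are assumed from this template, not a concrete API):
// const FeatureServiceLayer = FeatureServiceLive.pipe(
//   Layer.provide(RepositoryServiceLive.pipe(Layer.provide(DatabaseLive))),
//   Layer.provide(ValidationServiceLive.pipe(Layer.provide(ConfigLive)))
// )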
```

## Error Handling

```typescript
import { Effect, Data } from "effect"

// Error types
class ValidationError extends Data.TaggedError("ValidationError")<{
  readonly message: string
}> {}
class DatabaseError extends Data.TaggedError("DatabaseError")<{
  readonly message: string
}> {}
class BusinessRuleError extends Data.TaggedError("BusinessRuleError")<{
  readonly message: string
}> {}

// Error hierarchy
export type FeatureError =
  | ValidationError
  | DatabaseError
  | BusinessRuleError

// Recovery strategy example
declare const operation: Effect.Effect<string, FeatureError, never>

const recovered = operation.pipe(
  Effect.catchTags({
    ValidationError: (_e: ValidationError) => Effect.succeed("retry with corrected input"),
    DatabaseError: (_e: DatabaseError) => Effect.succeed("fallback to cache"),
    BusinessRuleError: (_e: BusinessRuleError) => Effect.succeed("notify user")
  })
)
```
## Streams (if applicable)

```typescript
import { Stream } from "effect"

// Declare types
declare const Update: unique symbol
declare type Update = typeof Update
declare const CustomError: unique symbol
declare type CustomError = typeof CustomError
declare const Deps: unique symbol
declare type Deps = typeof Deps
declare const source: EventSource
declare function parse(event: MessageEvent): Update
declare function validate(update: Update): boolean

// Real-time updates (Stream.fromEventListener subscribes to an EventTarget)
const updates: Stream.Stream<Update, CustomError, Deps> =
  Stream.fromEventListener<MessageEvent>(source, "message").pipe(
    Stream.map(parse),
    Stream.filter(validate)
  )
```
## Component Structure

Structure files for parallel implementation: one file = one task, interface separate from implementation.

File mapping by phase:

- **P1 (interfaces)**: `{Schema.ts, Error.ts, Service.ts, Repository.ts}`
- **P2 (implementations)**: `{ServiceLive.ts, RepositoryLive.ts, Validation.ts}`
- **P3 (tests)**: `{Service.test.ts, Repository.test.ts, Validation.test.ts}`

Execution: `P1 ; P2 ; P3` — interfaces first enables parallel implementation against stable types.
## Dependencies

```typescript
// Internal
import type { DatabaseService } from "../database"
import type { LoggerService } from "../logger"

// External
import { Schema, Effect, Layer, Stream } from "effect"
```
## Testing Strategy

### Unit Tests

- Test each service method in isolation
- Use test layers for dependencies
- Property-based testing for validation

### Integration Tests

- Test service with real database (test instance)
- Test error scenarios
- Test stream behavior

### Test Structure

```typescript
import { Effect, Layer, Context } from "effect"
import { describe, it, expect } from "vitest"

// Declare service interfaces and tags
interface FeatureService {
  readonly operation: (input: string) => Effect.Effect<string, never, never>
}
const FeatureService = Context.GenericTag<FeatureService>("FeatureService")

interface DatabaseService {}
interface ConfigService {}
declare const DatabaseService: Context.Tag<DatabaseService, DatabaseService>
declare const ConfigService: Context.Tag<ConfigService, ConfigService>

// Declare layers
declare const FeatureServiceLive: Layer.Layer<FeatureService, never, DatabaseService | ConfigService>
declare const DatabaseTest: Layer.Layer<DatabaseService, never, never>
declare const ConfigTest: Layer.Layer<ConfigService, never, never>

// Declare test fixtures
declare const validInput: string
declare const expectedOutput: string

describe("FeatureService", () => {
  const TestLive = FeatureServiceLive.pipe(
    Layer.provide(DatabaseTest),
    Layer.provide(ConfigTest)
  )

  it("should handle valid input", () =>
    Effect.gen(function* () {
      const service = yield* FeatureService
      const result = yield* service.operation(validInput)
      expect(result).toEqual(expectedOutput)
    }).pipe(Effect.provide(TestLive), Effect.runPromise)
  )
})
```
## Performance Considerations

- [Caching strategy]
- [Batch operations]
- [Resource pooling]

## Security Considerations

- [Authentication checks]
- [Authorization rules]
- [Data sanitization]

## Traceability

- Addresses requirements: [FR-1, FR-2, NFR-1, etc.]
- Implements instructions: [Section references]
````
**Ask Questions**: Use `AskUserQuestion` for:
- Architecture choices (monolithic vs modular)
- Technology selections (which libraries)
- Error handling approaches
- Performance trade-offs
- Security requirements
**After completion**:
1. Present design to user
2. Ask: "Does this design meet your expectations? Should I proceed to Phase 4 (Behavioral Tests)?"
3. **STOP and wait for approval**
---
### Phase 4: Define Behavioral Tests
**REQUIRES APPROVAL FROM PHASE 3**
**Goal**: Write tests that serve as executable specifications of expected behavior.
**Create**: `specs/[feature-name]/behaviors.test.ts`
**Philosophy**:
Tests are behavioral interfaces. They define the API surface and expected behaviors more precisely than prose. Unlike text descriptions, tests:
- Are executable specifications that can be incrementally implemented
- Define the exact API surface users will interact with
- Force concrete thinking about inputs, outputs, and edge cases
- Become passing tests as implementation progresses
- Serve as permanent, verifiable documentation
**Important**: These tests are NOT expected to run or type-check initially. They are written to define behavior, not to pass. Use `declare` statements for types/services that don't exist yet, and `Layer.mock` for partial layer implementations. The focus is clarity of intent, not correctness of implementation.
**Contents**:
- Happy path tests for each major operation
- Error scenario tests (what errors should occur when)
- Edge case tests (boundaries, empty inputs, limits)
- Integration behavior tests (how components interact)
**Guidelines**:
1. **Use `declare`** for all types, services, and layers that don't exist yet
2. **Use `Layer.mock`** for incremental test layer implementation:
- `Layer.mock(Tag, { method: impl })` creates a type-safe partial implementation
- Only implement the methods needed for current tests
- Unimplemented methods throw "not implemented" errors at runtime
- Add method implementations incrementally as you write more tests
- This enables layer creation using partials without type errors
3. **Focus on behavior, not implementation** - describe what should happen, not how
4. **Cover the essential paths**:
- Happy path (normal successful operation)
- Validation failures (bad input)
- Business rule violations (constraints, conflicts)
- Not found scenarios
- Edge cases specific to the domain
5. **Keep tests readable** - they serve as documentation for future implementers
6. **Don't over-specify** - test observable behavior, not internal details (a minimal example follows this list)
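
A minimal sketch of what such a behavioral test can look like, using `declare`/tag declarations and `Layer.mock` as described above. The `CounterService` name, its method, and the error type are illustrative assumptions, not part of any real API:

```typescript
import { Context, Effect, Layer } from "effect"
import { describe, expect, it } from "vitest"

// Declare the not-yet-implemented service surface
interface CounterService {
  readonly increment: (amount: number) => Effect.Effect<number, RangeError>
}
const CounterService = Context.GenericTag<CounterService>("CounterService")

describe("CounterService", () => {
  // Partial test layer: only the method exercised below is implemented
  const CounterTest = Layer.mock(CounterService, {
    increment: (amount: number) =>
      amount > 0
        ? Effect.succeed(amount)
        : Effect.fail(new RangeError("amount must be positive"))
  })

  it("increments by a positive amount", () =>
    Effect.gen(function* () {
      const counter = yield* CounterService
      const result = yield* counter.increment(5)
      expect(result).toBe(5)
    }).pipe(Effect.provide(CounterTest), Effect.runPromise))

  it("rejects non-positive amounts", () =>
    Effect.gen(function* () {
      const counter = yield* CounterService
      const failure = yield* Effect.flip(counter.increment(0))
      expect(failure).toBeInstanceOf(RangeError)
    }).pipe(Effect.provide(CounterTest), Effect.runPromise))
})
```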
**Ask Questions**: Use `AskUserQuestion` if:
- Expected behavior for edge cases is unclear
- Error handling semantics need clarification
- There are multiple valid ways to handle a scenario
- Business rules need user input
**After completion**:
1. Present behavioral tests to user
2. Ask: "Do these tests accurately capture the expected behaviors? Should I proceed to Phase 5 (Plan)?"
3. **STOP and wait for approval**
---
### Phase 5: Generate Plan
**REQUIRES APPROVAL FROM PHASE 4**
**Goal**: Break down implementation into concrete, ordered tasks optimized for parallel agent execution.
**Create**: `specs/[feature-name]/plan.md`
**Critical: Parallel Decomposition**
Tasks must be structured for concurrent agent execution. Use standard notation:
- `A || B` — A and B execute in parallel (no file overlap)
- `A ; B` — A completes before B starts (dependency)
- `{T1, T2, T3}` — task set (all parallel within set)
**Execution model**: `Phase1 ; Phase2 ; Phase3` where each phase is `{T1 || T2 || ... || Tn}`
**Constraints**:
- File isolation: `∀ Ti, Tj ∈ Phase: files(Ti) ∩ files(Tj) = ∅`
- Interface-first: Phase 1 defines interfaces, Phase 2+ implements against them
- One file = one task (split further if task spans multiple files)
- Target |Phase| ≥ 5 tasks for maximum parallelism
**Contents**: Task sets per phase, file ownership per task, phase dependencies, validation commands
**Template**:
```markdown
# [Feature Name] - Implementation Plan
## Execution Structure
P1 ; P2 ; P3 (sequential phases, parallel tasks within each)
## Phase 1: Interfaces {T1.1 || T1.2 || T1.3 || T1.4}
- T1.1: `FeatureSchema.ts` — domain types
- T1.2: `FeatureError.ts` — error types
- T1.3: `FeatureService.ts` — service interface + Tag
- T1.4: `FeatureRepository.ts` — repository interface + Tag
Gate: typechecks pass
## Phase 2: Implementations {T2.1 || T2.2 || T2.3}
- T2.1: `FeatureRepositoryLive.ts` — implements FeatureRepository
- T2.2: `FeatureServiceLive.ts` — implements FeatureService
- T2.3: `FeatureValidation.ts` — validation functions
Gate: typechecks pass
## Phase 3: Tests {T3.1 || T3.2 || T3.3}
- T3.1: `FeatureRepository.test.ts`
- T3.2: `FeatureService.test.ts`
- T3.3: `FeatureValidation.test.ts`
Gate: tests pass
```

**After completion**:

1. Present plan to user
2. Ask: "Does this implementation plan look correct? Should I proceed to Phase 6 (Implementation)?"
3. **STOP and wait for approval**

---
### Phase 6: Execute Implementation

**REQUIRES APPROVAL FROM PHASE 5**

**Goal**: Implement the solution exactly as planned.

**No new files**: Implementation follows the plan exactly.

**Process**:

1. Execute tasks per phase, following the plan
2. After each file: format, typecheck, fix errors
3. After each phase: run tests
4. Update `plan.md` progress markers

**Quality gates**: typechecks pass after each file, tests pass after each phase
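
One way the `plan.md` progress markers might look (a sketch reusing the checkbox convention from `specs/README.md`; the task names are taken from the plan template above):

```markdown
## Phase 2: Implementations {T2.1 || T2.2 || T2.3}

- [x] T2.1: `FeatureRepositoryLive.ts` — implements FeatureRepository
- [x] T2.2: `FeatureServiceLive.ts` — implements FeatureService
- [ ] T2.3: `FeatureValidation.ts` — validation functions
```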
**After completion**:

1. Present implementation summary
2. Show test results
3. Highlight any deviations from plan (with justification)
4. Ask: "Implementation complete. Would you like me to create a PR or make any changes?"
## Approval Checkpoints

Each phase requires explicit user approval before proceeding:

- **1→2**: "Does this capture your requirements? Proceed to Requirements?"
- **2→3**: "Do these requirements reflect system needs? Proceed to Design?"
- **3→4**: "Does this design meet expectations? Proceed to Behavioral Tests?"
- **4→5**: "Do these tests capture expected behaviors? Proceed to Plan?"
- **5→6**: "Is this plan correct? Proceed to Implementation?"
- **6→✓**: "Implementation complete. Create PR or make changes?"
Never skip checkpoints—changes cascade through dependent phases.
## When to Ask Questions

Use the `AskUserQuestion` tool liberally throughout:

**Phase 1 (Instructions)**:
- Clarify ambiguous requirements
- Resolve conflicting user stories
- Understand domain terminology
- Identify edge cases
**Phase 2 (Requirements)**:
- Prioritize requirements
- Resolve technical constraints
- Choose between valid approaches
- Define success metrics
**Phase 3 (Design)**:
- Select architecture patterns
- Choose libraries or frameworks
- Decide error handling strategies
- Resolve performance trade-offs
**Phase 4 (Behavioral Tests)**:
- Clarify expected behavior for edge cases
- Determine error handling semantics
- Resolve ambiguous business rules
- Define success/failure criteria
**Phase 5 (Plan)**:
- Identify parallelization opportunities
- Resolve file ownership conflicts between tasks
- Sequence only truly dependent phases
- Identify risks
- Plan contingencies
**Phase 6 (Implementation)**:
- Handle unexpected issues
- Adjust for missing dependencies
- Resolve test failures
- Document deviations
**Question Quality**:
- Provide context for why you're asking
- Offer 2-4 concrete options
- Explain trade-offs of each option
- Recommend a default choice
**Example**:

```typescript
// Example of tool call structure (not executable TypeScript)
declare function AskUserQuestion(params: {
  questions: Array<{
    question: string
    header: string
    multiSelect: boolean
    options: Array<{
      label: string
      description: string
    }>
  }>
}): void

// Usage:
AskUserQuestion({
  questions: [{
    question: "How should we handle concurrent updates to the counter?",
    header: "Concurrency",
    multiSelect: false,
    options: [
      {
        label: "Last-write-wins",
        description: "Simple but may lose updates under high concurrency"
      },
      {
        label: "Optimistic locking",
        description: "Retry on conflict, guarantees no lost updates"
      },
      {
        label: "CRDT merge",
        description: "Automatic conflict resolution, complex but robust"
      }
    ]
  }]
})
```
## Quality Standards

Each specification document must meet these standards:

### Clarity
- Use precise, unambiguous language
- Define all domain terms
- Include concrete examples
- Avoid vague words ("should", "might", "probably")
### Completeness
- Address all user requirements
- Cover error scenarios
- Document edge cases
- Include success criteria
### Traceability
- Link back to previous phases
- Reference source requirements
- Map to implementation tasks
- Enable impact analysis
### Effect Alignment
- Use Effect patterns (services, layers, streams)
- Leverage Effect error handling
- Design for composition
- Follow Effect best practices
### Testability
- Define measurable acceptance criteria
- Specify test scenarios
- Include performance benchmarks
- Enable automated validation
### Documentation
- Add inline code examples
- Include usage scenarios
- Document design rationale
- Explain trade-offs
## Common Pitfalls

### Skipping Phases

**Don't**: Jump straight to implementation.
**Do**: Follow all six phases in order.

### Assuming Requirements

**Don't**: Fill in gaps with assumptions.
**Do**: Ask questions using `AskUserQuestion`.

### Over-designing

**Don't**: Design for hypothetical future requirements.
**Do**: Design for stated requirements with room to extend.

### Under-planning

**Don't**: Create vague tasks like "implement feature".
**Do**: Break into concrete, testable subtasks.

### Poor Parallelization

**Don't**: Create tasks that touch overlapping files.
**Do**: Structure tasks so each owns distinct files, enabling 5+ parallel agents.

### Sequential When Parallel Is Possible

**Don't**: Chain tasks that could run concurrently, e.g., making Task 2 depend on Task 1 when they touch different files.
**Do**: Group truly independent tasks into phases that execute in parallel.

### Ignoring Feedback

**Don't**: Proceed when the user requests changes.
**Do**: Update specs and get re-approval.

### Poor Traceability

**Don't**: Lose the connection between phases.
**Do**: Explicitly reference previous phase decisions.
## Integration with Other Skills

Spec-driven development works with:

- **domain-modeling**: Use when designing domain types in Phase 3
- **service-implementation**: Apply during Phase 6 implementation
- **layer-design**: Reference when creating layers in Phase 3
- **typeclass-design**: Use for generic abstractions in Phase 3
- **effect-testing**: Apply test patterns in Phase 4 (behavioral tests) and Phase 6 (implementation)
## Examples

### Small Feature: Add Logging

- **Phase 1**: "Add debug logging to payment flow"
- **Phase 2**: Log levels, what to log, PII handling
- **Phase 3**: Logger service design, Effect integration
- **Phase 4**: Tests for log output, error scenarios
- **Phase 5**: Plan which files to update and where to add logger calls
- **Phase 6**: Implement, verify logs appear and tests pass
### Medium Feature: User Authentication

- **Phase 1**: Login, registration, password reset stories
- **Phase 2**: Security requirements, session management
- **Phase 3**: Service design, token strategy, error types
- **Phase 4**: Tests for login flows, token validation, error cases
- **Phase 5**: Multi-phase plan (auth service, session, middleware)
- **Phase 6**: Implement all components, make tests pass
### Large Feature: Real-time Sync

- **Phase 1**: Sync requirements across devices
- **Phase 2**: Conflict resolution, consistency guarantees
- **Phase 3**: CRDT design, stream architecture, error recovery
- **Phase 4**: Tests for conflict scenarios, sync behaviors, edge cases
- **Phase 5**: Phased rollout (local, network, UI integration)
- **Phase 6**: Iterative implementation with progress updates
## Success Criteria

A successful spec-driven development cycle delivers:

- **Alignment**: Final implementation matches original user intent
- **Quality**: All behavioral tests pass, code follows patterns
- **Documentation**: Specs accurately describe implementation
- **Traceability**: Clear path from instructions to code
- **Maintainability**: Future developers understand design rationale
- **Confidence**: User approved at each phase checkpoint
- **Executable Specs**: Behavioral tests serve as living documentation that verifies behavior
**Remember**: The goal is not perfect specs, but shared understanding and documented decisions that guide implementation and enable future maintenance. Behavioral tests bridge the gap between prose documentation and running code—they start as specifications and end as verified behavior.