typed-holes-refactor

@rand/cc-polymath

Refactor codebases using Design by Typed Holes methodology - iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development.

SKILL.md

name: typed-holes-refactor
description: Refactor codebases using Design by Typed Holes methodology - iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development.

Typed Holes Refactoring

Systematically refactor codebases using the Design by Typed Holes meta-framework: treat architectural unknowns as typed holes, resolve them iteratively with test-driven validation, and propagate constraints through dependency graphs.

Core Workflow

Phase 0: Hole Discovery & Setup

1. Create safe working branch:

git checkout -b refactor/typed-holes-v1
# CRITICAL: Never work in main, never touch .beads/ in main

2. Analyze current state and identify holes:

python scripts/discover_holes.py
# Creates REFACTOR_IR.md with hole catalog

The Refactor IR documents:

  • Current State Holes: What's unknown about the current system?
  • Refactor Holes: What needs resolution to reach the ideal state?
  • Constraints: What must be preserved/improved/maintained?
  • Dependencies: Which holes block which others?

3. Write baseline characterization tests:

Create tests/characterization/ to capture exact current behavior:

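# tests/characterization/test_current_behavior.py
# Template: discover_public_apis, run_current, test_inputs, save_baseline,
# measure_all_operations and save_json are placeholders to supply for your project.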
def test_api_contracts():
    """All public APIs must behave identically post-refactor"""
    for endpoint in discover_public_apis():
        old_result = run_current(endpoint, test_inputs)
        save_baseline(endpoint, old_result)

def test_performance_baselines():
    """Record current performance - don't regress"""
    baselines = measure_all_operations()
    save_json("baselines.json", baselines)

Run these tests against the unmodified code (the state of main) before changing anything; they should all pass. They are your safety net.
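
The characterization tests above call save_baseline and load_baseline without showing them. The following is a minimal sketch of what such helpers could look like, assuming results are JSON-serializable and stored one file per name. The BASELINE_DIR location, the directory parameter, and matches_baseline are assumptions for illustration, not part of the skill's scripts.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical default location; the skill does not specify where baselines live.
BASELINE_DIR = Path("tests/characterization/baselines")


def save_baseline(name, result, reason="", directory=BASELINE_DIR):
    """Record a JSON-serializable result, plus an optional reason, under `name`."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    payload = {"result": result, "reason": reason}
    path.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return path


def load_baseline(name, directory=BASELINE_DIR):
    """Return the result previously recorded under `name`."""
    return json.loads((Path(directory) / f"{name}.json").read_text())["result"]


def matches_baseline(name, new_result, directory=BASELINE_DIR):
    """True when the new result equals the recorded baseline."""
    return load_baseline(name, directory) == new_result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_baseline("get_user", {"status": 200}, directory=tmp)
        print(matches_baseline("get_user", {"status": 200}, directory=tmp))  # True
        print(matches_baseline("get_user", {"status": 500}, directory=tmp))  # False
```

Because the data round-trips through JSON, tuples come back as lists and dictionary keys come back as strings, so results may need normalizing before comparison.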

Phase 1-N: Iterative Hole Resolution

For each hole (in dependency order):

1. Select next ready hole:

python scripts/next_hole.py
# Shows holes whose dependencies are resolved

2. Write validation tests FIRST (test-driven):

# tests/refactor/test_h{N}_resolution.py
# Template: replace {N} with the hole number (for example, test_h3_resolved)
def test_h{N}_resolved():
    """Define what 'resolved correctly' means"""
    # This should FAIL initially
    assert desired_state_achieved()

def test_h{N}_equivalence():
    """Ensure no behavioral regressions"""
    old_behavior = load_baseline()
    new_behavior = run_refactored()
    assert old_behavior == new_behavior

3. Implement resolution:

  • Refactor code to make tests pass
  • Keep characterization tests passing
  • Commit incrementally with clear messages

4. Validate resolution:

python scripts/validate_resolution.py H{N}
# Checks: tests pass, constraints satisfied, main untouched

5. Propagate constraints:

python scripts/propagate.py H{N}
# Updates dependent holes based on resolution

6. Document and commit:

git add .
git commit -m "Resolve H{N}: {description}

- Tests: tests/refactor/test_h{N}_*.py pass
- Constraints: {constraints satisfied}
- Propagates to: {dependent holes}"
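
Step 1 selects a hole whose dependencies are all resolved. A minimal sketch of that selection rule follows; it is not the actual logic of next_hole.py, and the dictionary layout and hole IDs are hypothetical.

```python
def ready_holes(dependencies, resolved):
    """Return unresolved holes whose dependencies are all resolved, sorted by ID.

    dependencies maps each hole ID to the set of hole IDs it depends on.
    """
    return sorted(
        hole
        for hole, deps in dependencies.items()
        if hole not in resolved and deps <= resolved
    )


# Hypothetical hole graph: H2 needs H1, H3 needs H1 and H2, H4 is independent.
deps = {"H1": set(), "H2": {"H1"}, "H3": {"H1", "H2"}, "H4": set()}
print(ready_holes(deps, resolved=set()))   # ['H1', 'H4']
print(ready_holes(deps, resolved={"H1"}))  # ['H2', 'H4']
```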

Phase Final: Reporting

Generate comprehensive delta report:

python scripts/generate_report.py > REFACTOR_REPORT.md

Report includes:

  • Hole resolution summary with validation evidence
  • Metrics delta (LOC, complexity, coverage, performance)
  • Behavioral analysis (intentional changes documented)
  • Constraint validation (all satisfied)
  • Risk assessment and migration guide
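
The metrics delta in the report amounts to comparing two snapshots of the same measurements. A small sketch of that comparison, assuming metrics are plain numbers keyed by name; the metric names and values below are made up, and generate_report.py may compute its delta differently.

```python
def metrics_delta(before, after):
    """Return absolute and percent change for every metric present in both snapshots."""
    report = {}
    for name in sorted(before.keys() & after.keys()):
        change = after[name] - before[name]
        percent = None  # undefined when the starting value is zero
        if before[name] != 0:
            percent = round(change / before[name] * 100, 1)
        report[name] = {
            "before": before[name],
            "after": after[name],
            "change": change,
            "percent": percent,
        }
    return report


# Hypothetical snapshots taken before and after the refactor.
before = {"loc": 1000, "coverage": 60}
after = {"loc": 800, "coverage": 75}
for name, row in metrics_delta(before, after).items():
    print(name, row)
```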

Key Principles

1. Test-Driven Everything

  • Write validation criteria BEFORE implementing
  • Tests define "correct resolution"
  • Characterization tests are sacred - never let them fail

2. Hole-Driven Progress

  • Resolve holes in dependency order
  • Each resolution propagates constraints
  • Track everything formally in Refactor IR
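
Resolving holes in dependency order is a topological sort of the hole graph. The sketch below uses Python's standard graphlib module (Python 3.9+); the Hole record and its field names are assumptions for illustration and do not reflect the actual REFACTOR_IR.md schema.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Hole:
    """Hypothetical in-memory record for one entry of the Refactor IR."""
    id: str
    question: str
    depends_on: set = field(default_factory=set)
    resolved: bool = False


def resolution_order(holes):
    """Return hole IDs so that every hole appears after the holes it depends on."""
    graph = {hole.id: hole.depends_on for hole in holes}
    # TopologicalSorter expects a mapping of node -> predecessors,
    # and raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


holes = [
    Hole("R4", "What code should merge?", depends_on={"R2"}),
    Hole("R2", "How should modules be organized?", depends_on={"R1"}),
    Hole("R1", "What should the ideal structure be?"),
]
print(resolution_order(holes))  # ['R1', 'R2', 'R4']
```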

3. Continuous Validation

Every commit must validate:

  • ✅ Characterization tests pass (behavior preserved)
  • ✅ Resolution tests pass (hole resolved correctly)
  • ✅ Constraints satisfied
  • ✅ Main branch untouched
  • ✅ .beads/ intact in main

4. Safe by Construction

  • Work only in refactor branch
  • Main is read-only reference
  • Beads are untouchable historical artifacts

5. Formal Completeness

The design is complete when:

  • All holes resolved and validated
  • All constraints satisfied
  • All phase gates passed
  • Metrics improved or maintained

Hole Quality Framework

SMART Criteria for Good Holes

Every hole must be:

  • Specific: Clear, bounded question with concrete answer

    • ✓ Good: "How should error handling work in the API layer?"
    • ✗ Bad: "How to improve the code?"
  • Measurable: Has testable validation criteria

    • ✓ Good: "Reduce duplication from 60% to <15%"
    • ✗ Bad: "Make code better"
  • Achievable: Can be resolved with available information

    • ✓ Good: "Extract parsing logic to separate module"
    • ✗ Bad: "Predict all future requirements"
  • Relevant: Blocks meaningful progress on refactoring

    • ✓ Good: "Define core interface (blocks 5 other holes)"
    • ✗ Bad: "Decide variable naming convention"
  • Typed: Clear type/structure for resolution

    • ✓ Good: interface Architecture = { layers: Layer[], rules: Rule[] }
    • ✗ Bad: "Some kind of structure?"

Hole Estimation Framework

Size holes using these categories:

Size | Duration | Characteristics | Examples
Nano | 1-2 hours | Simple, mechanical changes | Rename files, update imports
Small | 4-8 hours | Single module refactor | Extract class, consolidate functions
Medium | 1-3 days | Cross-module changes | Define interfaces, reorganize packages
Large | 4-7 days | Architecture changes | Layer extraction, pattern implementation
Epic | >7 days | SPLIT THIS HOLE | Too large, break into smaller holes

Estimation Red Flags:

  • More than 3 dependencies → Likely Medium+
  • Unclear validation → Add time for discovery
  • New patterns/tools → Add learning overhead

Hole Splitting Guidelines

Split a hole when:

  1. Estimate exceeds 7 days
  2. More than 5 dependencies
  3. Validation criteria unclear
  4. Multiple distinct concerns mixed
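
The four split conditions can be checked mechanically once a hole has an estimate and a dependency list. A minimal sketch follows; the parameter names are hypothetical, and judging whether validation is clear or how many concerns are mixed remains a human call.

```python
def split_reasons(estimate_days, dependency_count, validation_is_clear, concern_count):
    """Return the split conditions that apply; an empty list means no split is needed."""
    reasons = []
    if estimate_days > 7:
        reasons.append("estimate exceeds 7 days")
    if dependency_count > 5:
        reasons.append("more than 5 dependencies")
    if not validation_is_clear:
        reasons.append("validation criteria unclear")
    if concern_count > 1:
        reasons.append("multiple distinct concerns mixed")
    return reasons


# Hypothetical epic hole: 10-day estimate, 2 dependencies, clear validation, 3 concerns.
print(split_reasons(10, 2, True, 3))
# ['estimate exceeds 7 days', 'multiple distinct concerns mixed']
```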

Splitting strategy:

Epic hole: "Refactor entire authentication system"
→ Split into:
  R10_auth_interface: Define new auth interface (Medium)
  R11_token_handling: Implement JWT tokens (Small)
  R12_session_management: Refactor sessions (Medium)
  R13_auth_middleware: Update middleware (Small)
  R14_auth_testing: Comprehensive test suite (Medium)

After splitting:

  • Update dependencies in REFACTOR_IR.md
  • Run python scripts/propagate.py to update graph
  • Re-sync with beads: python scripts/holes_to_beads.py

Common Hole Types

Architecture Holes

"?R1_target_architecture": "What should the ideal structure be?"
"?R2_module_boundaries": "How should modules be organized?"
"?R3_abstraction_layers": "What layers/interfaces are needed?"

Validation: Architecture tests, dependency analysis, layer violation checks

Implementation Holes

"?R4_consolidation_targets": "What code should merge?"
"?R5_extraction_targets": "What code should split out?"
"?R6_elimination_targets": "What code should be removed?"

Validation: Duplication detection, equivalence tests, dead code analysis

Quality Holes

"?R7_test_strategy": "How to validate equivalence?"
"?R8_migration_path": "How to safely transition?"
"?R9_rollback_mechanism": "How to undo if needed?"

Validation: Test coverage metrics, migration dry-runs, rollback tests

See HOLE_TYPES.md for complete catalog.

Constraint Propagation Rules

Rule 1: Interface Resolution → Type Constraints

When: Interface hole resolved with concrete types
Then: Propagate type requirements to all consumers

Example:
  Resolve R6: NodeInterface = BaseNode with async run()
  Propagates to:
    → R4: Parallel execution must handle async
    → R5: Error recovery must handle async exceptions

Rule 2: Implementation → Performance Constraints

When: Implementation resolved with resource usage
Then: Propagate limits to dependent holes

Example:
  Resolve R4: Parallelization with max_concurrent=3
  Propagates to:
    → R8: Rate limit = provider_limit / 3
    → R7: Memory budget = 3 * single_operation_memory

Rule 3: Validation → Test Requirements

When: Validation resolved with test requirements
Then: Propagate data needs upstream

Example:
  Resolve R9: Testing needs 50 examples
  Propagates to:
    → R7: Metrics must support batch evaluation
    → R8: Test data collection strategy needed

See CONSTRAINT_RULES.md for complete propagation rules.
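
As a worked instance of Rule 2, the sketch below derives the two downstream constraints from a resolved concurrency level. The function name, the result layout, and the numbers (60 requests per minute, 200 MB per operation) are hypothetical; propagate.py may represent constraints differently.

```python
def propagate_parallelism(max_concurrent, provider_limit, single_operation_memory_mb):
    """Derive constraints for dependent holes from a resolved concurrency level."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return {
        # R8: each concurrent worker gets an equal share of the provider's rate limit.
        "R8": {"rate_limit_per_worker": provider_limit / max_concurrent},
        # R7: memory must cover all concurrent operations at once.
        "R7": {"memory_budget_mb": max_concurrent * single_operation_memory_mb},
    }


constraints = propagate_parallelism(
    max_concurrent=3, provider_limit=60, single_operation_memory_mb=200
)
print(constraints["R8"])  # {'rate_limit_per_worker': 20.0}
print(constraints["R7"])  # {'memory_budget_mb': 600}
```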

Success Indicators

Weekly Progress

  • 2-4 holes resolved
  • All tests passing
  • Constraints satisfied
  • Measurable improvements

Red Flags (Stop & Reassess)

  • ❌ Characterization tests fail
  • ❌ Hole can't be resolved within constraints
  • ❌ Constraints contradict each other
  • ❌ No progress for 3+ days
  • ❌ Main branch accidentally modified

Validation Gates

Gate | Criteria | Check
Gate 1: Discovery Complete | All holes cataloged, dependencies mapped | python scripts/check_discovery.py
Gate 2: Foundation Holes | Core interfaces resolved, tests pass | python scripts/check_foundation.py
Gate 3: Implementation | All refactor holes resolved, metrics improved | python scripts/check_implementation.py
Gate 4: Production Ready | Migration tested, rollback verified | python scripts/check_production.py
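
One way to use the gates is to evaluate them in order and stop at the first one that fails. The sketch below assumes that ordering and represents each check as a Python callable returning True or False; in practice each callable would wrap one of the check scripts, whose exit-code behavior is not documented here.

```python
from typing import Callable, List, Optional, Tuple


def first_failing_gate(gates: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run gate checks in order; return the first failing gate's name, or None."""
    for name, check in gates:
        if not check():
            return name  # later gates are not evaluated
    return None


# Hypothetical stand-ins for the four check scripts.
gates = [
    ("Gate 1: Discovery Complete", lambda: True),
    ("Gate 2: Foundation Holes", lambda: True),
    ("Gate 3: Implementation", lambda: False),
    ("Gate 4: Production Ready", lambda: True),
]
print(first_failing_gate(gates))  # Gate 3: Implementation
```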

Claude-Assisted Workflow

This skill is designed for effective Claude/LLM collaboration. Here's how to divide work:

Phase 0: Discovery

Claude's Role:

  • Run discover_holes.py to analyze codebase
  • Suggest holes based on code analysis
  • Generate initial REFACTOR_IR.md structure
  • Write characterization tests to capture current behavior
  • Set up test infrastructure

Your Role:

  • Confirm holes are well-scoped
  • Prioritize which holes to tackle first
  • Review and approve REFACTOR_IR.md
  • Define critical constraints

Phase 1-N: Hole Resolution

Claude's Role:

  • Write resolution tests (TDD) BEFORE implementation
  • Implement hole resolution to make tests pass
  • Run validation scripts: validate_resolution.py, check_foundation.py
  • Update REFACTOR_IR.md with resolution details
  • Propagate constraints: python scripts/propagate.py H{N}
  • Generate commit messages documenting changes

Your Role:

  • Make architecture decisions (which pattern, which approach)
  • Assess risk and determine constraint priorities
  • Review code changes for correctness
  • Approve merge to main when complete

Phase Final: Reporting

Claude's Role:

  • Generate comprehensive REFACTOR_REPORT.md
  • Document all metrics deltas
  • List all validation evidence
  • Create migration guides
  • Prepare PR description

Your Role:

  • Final review of report accuracy
  • Approve for production deployment
  • Conduct post-refactor retrospective

Effective Prompting Patterns

Starting a session:

"I need to refactor [description]. Use typed-holes-refactor skill.
Start with discovery phase."

Resolving a hole:

"Resolve H3 (target_architecture). Write tests first, then implement.
Use [specific pattern/approach]."

Checking progress:

"Run check_completeness.py and show me the dashboard.
What's ready to work on next?"

Generating visualizations:

"Generate dependency graph showing bottlenecks and critical path.
Use visualize_graph.py with --analyze."

Claude's Limitations

Claude CANNOT:

  • Make subjective architecture decisions (you must decide)
  • Determine business-critical constraints (you must specify)
  • Run tests that require external services (mock them, or run them yourself)
  • Merge to main (you must approve and merge)

Claude CAN:

  • Analyze code and suggest holes
  • Write comprehensive test suites
  • Implement resolutions within your constraints
  • Generate reports and documentation
  • Track progress across sessions (via beads + REFACTOR_IR.md)

Multi-Session Continuity

At session start:

"Continue typed-holes refactoring. Import beads state and
show current status from REFACTOR_IR.md."

Claude will:

  • Read REFACTOR_IR.md to understand current state
  • Check which holes are resolved
  • Identify next ready holes
  • Resume where previous session left off

You should:

  • Keep REFACTOR_IR.md and .beads/ committed to git
  • Export beads state at session end: bd export -o .beads/issues.jsonl
  • Use /context before starting to ensure Claude has full context

Beads Integration

Why beads + typed holes?

  • Beads tracks issues across sessions (prevents lost work)
  • Holes track refactoring-specific state (dependencies, constraints)
  • Together: Complete continuity for long-running refactors

Setup

# Install beads (once)
go install github.com/steveyegge/beads/cmd/bd@latest

# After running discover_holes.py
python scripts/holes_to_beads.py

# Check what's ready
bd ready --json

Workflow Integration

During hole resolution:

# Start work on a hole
bd update bd-5 --status in_progress --json

# Implement resolution
# ... write tests, implement code ...

# Validate resolution
python scripts/validate_resolution.py H3

# Close bead
bd close bd-5 --reason "Resolved H3: target_architecture" --json

# Export state
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl REFACTOR_IR.md
git commit -m "Resolve H3: Define target architecture"

Syncing holes ↔ beads:

# After updating REFACTOR_IR.md manually
python scripts/holes_to_beads.py  # Sync changes to beads

# After resolving holes
python scripts/holes_to_beads.py  # Update bead statuses

Cross-session continuity:

# Session start
bd import -i .beads/issues.jsonl
bd ready --json  # Shows ready holes
python scripts/check_completeness.py  # Shows overall progress

# Session end
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl
git commit -m "Session checkpoint: 3 holes resolved"

Bead advantages:

  • Tracks work across days/weeks
  • Shows dependency graph: bd deps bd-5
  • Prevents context loss
  • Integrates with overall project management

Scripts Reference

All scripts are in scripts/:

  • discover_holes.py - Analyze codebase and generate REFACTOR_IR.md
  • next_hole.py - Show next resolvable holes based on dependencies
  • validate_resolution.py - Check if hole resolution satisfies constraints
  • propagate.py - Update dependent holes after resolution
  • generate_report.py - Create comprehensive delta report
  • check_discovery.py - Validate Phase 0 completeness (Gate 1)
  • check_foundation.py - Validate Phase 1 completeness (Gate 2)
  • check_implementation.py - Validate Phase 2 completeness (Gate 3)
  • check_production.py - Validate Phase 3 readiness (Gate 4)
  • check_completeness.py - Overall progress dashboard
  • visualize_graph.py - Generate hole dependency visualization
  • holes_to_beads.py - Sync holes with beads issues

Run any script with --help for detailed usage.

Meta-Consistency

This skill uses its own principles:

Typed Holes Principle | Application to Refactoring
Typed Holes | Architectural unknowns cataloged with types
Constraint Propagation | Design constraints flow through dependency graph
Iterative Refinement | Hole-by-hole resolution cycles
Test-Driven Validation | Tests define correctness
Formal Completeness | Gates verify design completeness

We use the system to refactor the system.

Advanced Topics

For complex scenarios, see:

Quick Start Example

# 1. Setup
git checkout -b refactor/typed-holes-v1
python scripts/discover_holes.py

# 2. Write baseline tests
# Create tests/characterization/test_*.py

# 3. Resolve first hole
python scripts/next_hole.py  # Shows H1 is ready
# Write tests/refactor/test_h1_*.py (fails initially)
# Refactor code until tests pass
python scripts/validate_resolution.py H1
python scripts/propagate.py H1
git commit -m "Resolve H1: ..."

# 4. Repeat for each hole
# ...

# 5. Generate report
python scripts/generate_report.py > REFACTOR_REPORT.md

Troubleshooting

Characterization tests fail

Symptom: Tests that captured baseline behavior now fail

Resolution:

  1. Inspect changes: run git diff to see what changed
  2. Investigate: What behavior changed and why?
  3. Decision tree:
    • Intentional change: Update baselines with documentation
      # Update baseline with reason
      save_baseline("v2_api", new_behavior,
                    reason="Switched to async implementation")
      
    • Unintentional regression: Fix the code, tests must pass

Prevention: Run characterization tests before AND after each hole resolution.

Hole can't be resolved

Symptom: Stuck on a hole for >3 days, unclear how to proceed

Resolution:

  1. Check dependencies: Are they actually resolved?

    python scripts/visualize_graph.py --analyze
    # Look for unresolved dependencies
    
  2. Review constraints: Are they contradictory?

    • Example: C1 "preserve all behavior" + C5 "change API contract" → Contradictory
    • Fix: Renegotiate constraints with stakeholders
  3. Split the hole: If hole is too large

    # Original: R4_consolidate_all (Epic, 10+ days)
    # Split into:
    R4a_consolidate_parsers (Medium, 2 days)
    R4b_consolidate_validators (Small, 1 day)
    R4c_consolidate_handlers (Medium, 2 days)
    
  4. Check for circular dependencies:

    python scripts/visualize_graph.py
    # Look for cycles: R4 → R5 → R6 → R4
    
    • Fix: Break cycle by introducing intermediate hole or redefining dependencies

Escalation: If still stuck after 5 days, consider an alternative refactoring approach.

Contradictory Constraints

Symptom: Cannot satisfy all constraints simultaneously

Example:

  • C1: "Preserve exact current behavior" (backward compatibility)
  • C5: "Reduce response time by 50%" (performance improvement)
  • Current behavior includes slow, synchronous operations

Resolution Framework:

  1. Identify the conflict:

    C1 requires: Keep synchronous operations
    C5 requires: Switch to async operations
    → Contradiction: Can't be both sync and async
    
  2. Negotiate priorities:

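    Option | C1 | C5 | Tradeoff
    A: Keep sync | satisfied | not satisfied | No performance gain
    B: Switch to async | not satisfied | satisfied | Breaking change
    C: Add async, deprecate sync | ⚠️ partially (sync deprecated) | satisfied for async callers | Migration burden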
  3. Choose resolution strategy:

    • Relax constraint: Change C1 to "Preserve behavior where possible"
    • Add migration period: C implemented over 2 releases
    • Split into phases: Phase 1 (C1), Phase 2 (C5)
  4. Document decision:

    ## Constraint Resolution: C1 vs C5
    
    **Decision**: Relax C1 to allow async migration
    **Rationale**: Performance critical for user experience
    **Migration**: 3-month deprecation period for sync API
    **Approved by**: [Stakeholder], [Date]
    

Circular Dependencies

Symptom: visualize_graph.py shows cycles

Example:

R4 (consolidate parsers) → depends on R6 (define interface)
R6 (define interface) → depends on R4 (needs parser examples)

Resolution strategies:

  1. Introduce intermediate hole:

    H0_parser_analysis: Analyze existing parsers (no dependencies)
    R6_interface: Define interface using H0 analysis
    R4_consolidate: Implement using R6 interface
    
  2. Redefine dependencies:

    • Maybe R4 doesn't actually need R6
    • Or R6 only needs partial R4 (split R4)
  3. Accept iterative refinement:

    R6_interface_v1: Initial interface (simple)
    R4_consolidate: Implement with v1 interface
    R6_interface_v2: Refine based on R4 learnings
    

Prevention: Define architecture holes before implementation holes.
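
Cycles such as R4 → R6 → R4 can be found with a depth-first search that tracks the nodes on the current path. The sketch below is one such check, not the implementation of visualize_graph.py; the dictionary layout (hole ID mapped to the set of holes it depends on) is an assumption.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of hole IDs, or None if there is none."""
    IN_PROGRESS, DONE = 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in sorted(dependencies.get(node, ())):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path, so we have closed a loop.
                return path[path.index(dep):] + [dep]
            if dep not in state:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(dependencies):
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle({"R4": {"R6"}, "R6": {"R4"}}))                # ['R4', 'R6', 'R4']
print(find_cycle({"H0": set(), "R6": {"H0"}, "R4": {"R6"}}))   # None
```

The second call mirrors the first resolution strategy: once the intermediate hole H0 is introduced, the graph has no cycle.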

No Progress for 3+ Days

Symptom: Feeling stuck, no commits, uncertain how to proceed

Resolution checklist:

  • Review REFACTOR_IR.md: Are holes well-defined (SMART criteria)?

    • If not: Rewrite holes to be more specific
  • Check hole size: Is current hole >7 days estimate?

    • If yes: Split into smaller holes
  • Run dashboard: python scripts/check_completeness.py

    • Are you working on a blocked hole?
    • Switch to a ready hole instead
  • Visualize dependencies: python scripts/visualize_graph.py --analyze

    • Identify bottlenecks
    • Look for parallel work opportunities
  • Review constraints: Are they achievable?

    • Renegotiate if necessary
  • Seek external review:

    • Share REFACTOR_IR.md with a colleague
    • Get feedback on approach
  • Consider alternative: Maybe this refactor isn't feasible

    • Document why
    • Propose different approach

Reset protocol: If still stuck, revert to the last working state and try a different approach.

Estimation Failures

Symptom: Hole taking 3x longer than estimated

Analysis:

  1. Why did estimate fail?

    • Underestimated complexity
    • Unforeseen dependencies
    • Unclear requirements
    • Technical issues (tool problems, infrastructure)
  2. Immediate actions:

    • Update REFACTOR_IR.md with revised estimate
    • If >7 days, split the hole
    • Update beads: bd update bd-5 --note "Revised estimate: 5 days (was 2)"
  3. Future improvements:

    • Use actual times to calibrate future estimates
    • Add buffer for discovery (20% overhead)
    • Note uncertainty in IR: "Estimate: 2-4 days (high uncertainty)"

Learning: Track actual vs estimated time in REFACTOR_REPORT.md for future reference.
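
Calibrating from actual times can be as simple as scaling new estimates by the historical ratio of actual to estimated effort. The sketch below shows that arithmetic with made-up history; applying the 20% discovery buffer on top of the calibration factor is an assumption, and you may prefer to use only one of the two.

```python
def calibration_factor(history):
    """history is a list of (estimated_days, actual_days); return total actual / total estimated."""
    total_estimated = sum(estimated for estimated, _ in history)
    total_actual = sum(actual for _, actual in history)
    if total_estimated <= 0:
        raise ValueError("history must contain a positive total estimate")
    return total_actual / total_estimated


def calibrated_estimate(raw_days, factor, discovery_buffer=0.20):
    """Scale a raw estimate by the calibration factor, then add the discovery buffer."""
    return raw_days * factor * (1 + discovery_buffer)


# Hypothetical history: one hole took 5 days against a 2-day estimate, two were on target.
history = [(2, 5), (1, 1), (3, 3)]
factor = calibration_factor(history)
print(factor)                                    # 1.5
print(round(calibrated_estimate(2, factor), 2))  # 3.6
```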


Begin with Phase 0: Discovery. Always work in a branch. Test first, refactor second.