Claude Code Plugins

Community-maintained marketplace

Feedback

legacy-code-safety

@TheBushidoCollective/han
38
0

Use when modifying, removing, or refactoring code that lacks test coverage. Emphasizes the danger of untested changes and the RGR workflow to add characterization tests before modifications.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name legacy-code-safety
description Use when modifying, removing, or refactoring code that lacks test coverage. Emphasizes the danger of untested changes and the RGR workflow to add characterization tests before modifications.
allowed-tools Read, Write, Edit, Bash, Grep, Glob

Legacy Code Safety

"Changing untested code is like performing surgery blindfolded."

DANGER: The Risk of Modifying Untested Code

Modifying code without tests is one of the HIGHEST-RISK activities in software development.

When you change untested code:

  • You CANNOT know what behaviors you're breaking
  • You have NO safety net to catch mistakes
  • You're relying on hope instead of evidence
  • You're creating technical debt on top of technical debt

In dynamically-typed languages (Python, JavaScript, Ruby, PHP), this risk is EXPONENTIALLY higher:

  • No compiler to catch type errors
  • No static analysis to verify correctness
  • Tests are your ONLY safety mechanism
  • Runtime is when you discover your mistakes

The Golden Rule

NEVER modify untested code without adding tests first.

If code lacks tests:

  1. STOP
  2. Add characterization tests
  3. Verify tests pass
  4. THEN make your changes

The RGR Workflow for Legacy Code

This is NOT standard TDD. This is characterization testing - capturing current behavior before changing it.

Standard TDD (New Features)

  1. RED: Write failing test for NEW behavior
  2. GREEN: Implement to make test pass
  3. REFACTOR: Clean up implementation

Legacy Code RGR (Existing Code)

  1. RED: Write test that SHOULD pass (captures current behavior)
  2. GREEN: Verify it passes (confirms you captured reality correctly)
  3. REFACTOR: NOW you can safely modify with a safety net

The Critical Difference

TDD: You know what the behavior SHOULD be, write test for ideal

Legacy RGR: You DON'T know all behaviors, capture what EXISTS

Why "RED" for Passing Tests?

The "RED" step means "you're in the red zone" - working with dangerous untested code. Even though you expect the test to pass, you're at risk until you have that passing test as proof.

Characterization Testing Process

Step 1: Identify All Behaviors

Before writing tests, understand what the code currently does:

// Legacy function - no tests
function calculateShipping(order) {
  let cost = 9.99

  if (order.total > 100) {
    cost = 0
  }

  if (order.weight > 50) {
    cost += 25
  }

  if (order.destination === 'HI' || order.destination === 'AK') {
    cost += 15
  }

  return cost
}

Questions to answer:

  • What inputs does it accept?
  • What outputs does it produce?
  • What are the edge cases?
  • Are there side effects?
  • What happens with invalid input?

Step 2: Write Tests for Current Behavior

Document what EXISTS, not what should exist:

describe('calculateShipping (characterization)', () => {
  // Capture base case
  test('standard shipping', () => {
    const order = { total: 50, weight: 10, destination: 'CA' }
    expect(calculateShipping(order)).toBe(9.99)
  })

  // Capture free shipping threshold
  test('free shipping over $100', () => {
    const order = { total: 150, weight: 10, destination: 'CA' }
    expect(calculateShipping(order)).toBe(0)
  })

  // Capture weight surcharge
  test('heavy package surcharge', () => {
    const order = { total: 50, weight: 75, destination: 'CA' }
    expect(calculateShipping(order)).toBe(34.99) // 9.99 + 25
  })

  // Capture regional surcharge
  test('Hawaii surcharge', () => {
    const order = { total: 50, weight: 10, destination: 'HI' }
    expect(calculateShipping(order)).toBe(24.99) // 9.99 + 15
  })

  // Discovered edge case: free shipping + weight
  test('free shipping with heavy package', () => {
    const order = { total: 150, weight: 75, destination: 'CA' }
    expect(calculateShipping(order)).toBe(25) // 0 + 25 (is this right?)
  })

  // Discovered edge case: all conditions
  test('all surcharges with free shipping threshold', () => {
    const order = { total: 150, weight: 75, destination: 'HI' }
    expect(calculateShipping(order)).toBe(40) // 0 + 25 + 15
  })
})

Step 3: Verify Tests Pass

These tests MUST pass immediately:

npm test

If they don't pass:

  • You misunderstood the current behavior
  • Fix the test, not the code
  • The test should document reality

Step 4: NOW You Can Refactor

With tests in place, you have a safety net:

// NOW we can safely refactor
function calculateShipping(order: Order): number {
  const BASE_COST = 9.99
  const FREE_SHIPPING_THRESHOLD = 100
  const HEAVY_PACKAGE_WEIGHT = 50
  const HEAVY_PACKAGE_SURCHARGE = 25
  const REMOTE_DESTINATIONS = ['HI', 'AK']
  const REMOTE_SURCHARGE = 15

  let cost = order.total > FREE_SHIPPING_THRESHOLD ? 0 : BASE_COST

  if (order.weight > HEAVY_PACKAGE_WEIGHT) {
    cost += HEAVY_PACKAGE_SURCHARGE
  }

  if (REMOTE_DESTINATIONS.includes(order.destination)) {
    cost += REMOTE_SURCHARGE
  }

  return cost
}

Run tests again:

npm test  # All tests still pass

You've improved the code WITHOUT breaking existing behavior.

When This Skill Applies

Use this skill when:

  • Deleting code that has no tests
  • Modifying functions without test coverage
  • Refactoring untested modules
  • Changing behavior in legacy systems
  • Test coverage is below 80% for code you're touching
  • You inherit code from previous developers
  • You're working in a codebase with poor test discipline

Even if the change seems "simple", if there are no tests, use this workflow.

Language-Specific Risk Levels

HIGH RISK: Dynamic/Weakly-Typed Languages

Python, JavaScript, Ruby, PHP:

  • NO compile-time type checking
  • NO static analysis of type correctness
  • Tests are your ONLY safety net
  • Runtime is when errors appear
  • User discovers bugs, not compiler

Extra caution required:

  • Write MORE tests
  • Test edge cases thoroughly
  • Check for None/null/undefined
  • Verify types explicitly
  • Add type hints/annotations when possible

MEDIUM RISK: Static/Strongly-Typed Languages

Go, Rust, TypeScript (strict mode), Java:

  • Type system catches some errors
  • Compiler validates correctness
  • Still need tests for BEHAVIOR
  • Types don't test business logic

Type safety helps, but:

  • Types verify structure, not logic
  • Tests still required for correctness
  • Don't rely solely on type system

The Strangler Fig Pattern

For large legacy systems, don't try to test everything at once:

What Is It?

Named after the strangler fig vine that grows around a tree:

  1. Vine grows alongside tree
  2. Eventually replaces tree entirely
  3. Original tree can be removed

Applied to code:

  1. Wrap legacy code with new interface
  2. Add tests to the wrapper
  3. Gradually move logic to new code
  4. Eventually remove legacy code

Implementation

Phase 1: Wrap

// Legacy code (untested, scary)
function legacyCalculatePrice(item) {
  // ... 500 lines of spaghetti ...
}

// New wrapper (tested)
function calculatePrice(item: Item): number {
  // Add validation and tests here
  validateItem(item)

  // Call legacy for now
  return legacyCalculatePrice(item)
}

Phase 2: Test the Boundary

describe('calculatePrice wrapper', () => {
  test('validates input', () => {
    expect(() => calculatePrice(null)).toThrow()
  })

  test('handles normal items', () => {
    const item = { id: 1, price: 10 }
    expect(calculatePrice(item)).toBe(10)
  })

  // Characterization tests for legacy behavior
  test('matches legacy for discounts', () => {
    const item = { id: 1, price: 10, discount: 0.2 }
    expect(calculatePrice(item)).toBe(8)
  })
})

Phase 3: Replace Internals

function calculatePrice(item: Item): number {
  validateItem(item)

  // New implementation (tested, clean)
  const basePrice = item.price
  const discount = item.discount ?? 0
  return basePrice * (1 - discount)

  // Legacy code removed!
}

Phase 4: Expand

Now that one function is safe, repeat for next function.

Benefits

  • Incremental progress
  • Always have working system
  • Tests added gradually
  • Risk minimized
  • Can stop anytime

When to Use

  • Large untested codebase
  • Can't stop for months to add tests
  • Need to deliver features while improving
  • High-risk production system

Pre-Modification Checklist

Before touching ANY untested code:

  • Identified all entry points (who calls this code?)
  • Identified all exit points (what does this code call?)
  • Understood current behavior (what does it do now?)
  • Written characterization tests for current behavior
  • All characterization tests pass
  • Documented any surprising behaviors found
  • Considered using Strangler Fig for large changes

ONLY AFTER ALL CHECKS PASS:

  • Make your intended changes
  • Run tests (verify behavior preserved or intentionally changed)
  • Add tests for NEW behavior if applicable
  • Document breaking changes if any

Red Flags

Stop immediately if:

  • "I'll just make a quick change, no need for tests"
  • "It's a small change, what could go wrong?"
  • "I don't have time to write tests"
  • "The code is too complex to test"
  • "I'll add tests later" (you won't)

These are rationalizations for dangerous behavior.

Common Objections (and Responses)

"But tests will take longer than the change!"

Response: The initial change is fast. Finding and fixing the bugs you introduced takes much longer. Tests save time.

"The code is too complex to test!"

Response: If it's too complex to test, it's too complex to change safely. Simplify first, or use Strangler Fig pattern.

"We need to ship this feature NOW!"

Response: Shipping a broken feature is slower than shipping a tested feature late. Technical debt compounds.

"This code has worked fine without tests for years!"

Response: Until now. The moment you change it, that safety is gone. Past stability doesn't predict future stability.

"I understand the code, I won't break it!"

Response: Confidence is not a substitute for evidence. Even experts make mistakes.

Success Metrics

How to know you've done this right:

Green flags:

  • All characterization tests pass before changes
  • All tests still pass after changes
  • New behavior has new tests
  • Test coverage increased
  • Code is more readable
  • Confidence is high

Red flags:

  • Tests fail after changes (and you don't know why)
  • Skipped edge cases "to save time"
  • Changed behavior without updating tests
  • Test coverage decreased or stagnant
  • More confused than before

Integration with Other Skills

Use these related skills:

  • test-driven-development: For new behavior, use TDD after characterization tests exist
  • refactoring: With tests in place, refactor safely
  • boy-scout-rule: While adding tests, improve code quality
  • proof-of-work: Document test results as evidence
  • debugging: When characterization tests reveal unexpected behavior

Tools and Techniques

Code Coverage Tools

Use coverage tools to identify untested code:

# JavaScript/TypeScript
npm test -- --coverage

# Python
pytest --cov=mypackage

# Ruby
bundle exec rspec --require spec_helper

# Go
go test -cover ./...

Target coverage:

  • Critical paths: 100%
  • Business logic: 90%+
  • Overall codebase: 80%+

Mutation Testing

Verify tests actually catch bugs:

# JavaScript
npm install -D stryker

# Python
pip install mutmut

Mutation testing changes your code slightly and verifies tests fail. If tests still pass after mutation, they're not effective.

Approval Testing

For complex outputs (HTML, JSON, reports):

test('generates invoice', () => {
  const invoice = generateInvoice(order)
  expect(invoice).toMatchSnapshot()
})

First run captures current output. Future runs verify it hasn't changed.

Real-World Example

Scenario

Legacy payment processor, 2000 lines, no tests, needs to add new payment method.

Step 1: Identify Scope

Don't test everything - test what you'll touch:

// Will modify this
function processPayment(payment, method) {
  // ... many lines ...
}

// Will NOT modify these
function logTransaction(transaction) { }
function sendEmail(user, content) { }

Step 2: Add Characterization Tests

describe('processPayment (characterization)', () => {
  test('processes credit card payment', () => {
    const payment = { amount: 100, currency: 'USD' }
    const method = { type: 'credit_card', number: '4111...' }

    const result = processPayment(payment, method)

    expect(result.success).toBe(true)
    expect(result.transactionId).toBeDefined()
    expect(result.amount).toBe(100)
  })

  // More tests for existing payment methods...
})

Step 3: Tests Pass

npm test
# ✅ All tests passing

Step 4: Add New Feature

function processPayment(payment, method) {
  // Existing code preserved
  if (method.type === 'credit_card') {
    return processCreditCard(payment, method)
  }

  // NEW: Add cryptocurrency support
  if (method.type === 'crypto') {
    return processCrypto(payment, method)
  }

  throw new Error(`Unknown payment method: ${method.type}`)
}

Step 5: Add Tests for New Behavior

test('processes cryptocurrency payment', () => {
  const payment = { amount: 100, currency: 'USD' }
  const method = { type: 'crypto', address: '0x123...' }

  const result = processPayment(payment, method)

  expect(result.success).toBe(true)
  expect(result.transactionId).toBeDefined()
})

Step 6: All Tests Pass

npm test
# ✅ All tests passing (old + new)

Result

  • New feature added safely
  • Existing behavior preserved
  • Tests ensure no regression
  • Future changes safer

Remember

  1. Untested code is dangerous code - especially in dynamic languages
  2. Characterize before changing - Capture current behavior first
  3. RGR workflow for legacy - Different from TDD, specific to untested code
  4. Strangler Fig for large systems - Gradual improvement beats big bang
  5. Tests are proof - Confidence without evidence is just hope
  6. No shortcuts - "Quick changes" create slow bugs

Legacy code is not an excuse for reckless changes. It's a reason for EXTRA caution.