| name | test-driven-development |
| description | Test-driven development workflow for writing reliable, maintainable code |
| license | MIT |
| compatibility | Works with Claude Code 1.0+ |
Test-Driven Development
You are an expert in test-driven development (TDD) with deep knowledge of testing frameworks, test design patterns, and sustainable development practices.
Core TDD Principles
The Red-Green-Refactor Cycle
- Red: Write a failing test that specifies the desired behavior
- Green: Write the minimal code needed to make the test pass
- Refactor: Improve the code while keeping all tests green
Key Benefits
- Confidence: Changes don't break existing behavior
- Documentation: Tests serve as executable examples
- Design: Writing tests first improves API design
- Debugging: Tests localize problems quickly
- Refactoring: Safety net for code improvements
TDD Workflow
1. Start with a Test
Before writing implementation:
- Define interface: How will code be used?
- Specify behavior: What should it do in concrete terms?
- Consider edge cases: Empty inputs, nulls, boundaries
- Run test: Verify it fails (red)
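To make the red step concrete, here is a minimal sketch. `slugify` is a hypothetical function used only for illustration; it does not exist yet, which is exactly the point: the test fails first.

```python
# Red: this test is written before any implementation exists,
# so the first run fails (here with a NameError).
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"
```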
2. Make It Pass
Write minimal implementation:
- Hardcode return value: If test expects 42, return 42
- Skip implementation details: Focus on passing test
- Don't worry about quality yet: refactoring comes next
- Run test: Verify it passes (green)
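Continuing the hypothetical `slugify` sketch, the smallest change that turns the test green is often embarrassingly literal:

```python
# Green: hardcoding the expected value is legitimate at this stage;
# later tests will force the implementation to generalize.
def slugify(text):
    return "hello-world"
```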
3. Refactor
Improve the code:
- Remove duplication: DRY principle
- Improve names: Make intent clear
- Extract methods/classes: Better organization
- Optimize: If needed, after correctness
- Run tests: Ensure still green
4. Repeat
Continue cycle:
- Add next test for new behavior
- Or add test for edge case
- Or add test for error condition
- Each cycle adds one small piece of functionality
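Continuing the sketch: a second test exposes the hardcoded return value, and the implementation is generalized while both tests stay green (this stepwise generalization is sometimes called triangulation).

```python
# Red again: the hardcoded version fails this new case.
def test_slugify_handles_other_phrases():
    assert slugify("Test Driven Development") == "test-driven-development"

# Green: generalize just enough to satisfy both tests.
def slugify(text):
    return text.strip().lower().replace(" ", "-")
```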
Test Design Patterns
Arrange-Act-Assert (AAA)
```python
def test_calculate_total():
    # Arrange: Set up test data
    cart = ShoppingCart()
    cart.add_item("widget", price=10.00, quantity=2)

    # Act: Execute the function
    total = cart.calculate_total()

    # Assert: Verify expected outcome
    assert total == 20.00
```
Given-When-Then
```python
def test_user_authentication():
    # Given: A user with valid credentials
    user = User(username="alice", password="secret123")

    # When: Attempting to authenticate
    result = auth_service.authenticate(user.username, user.password)

    # Then: Authentication should succeed
    assert result.is_authenticated
    assert result.token is not None
```
Test Data Builders
```python
class UserBuilder:
    def __init__(self):
        self.username = "testuser"
        self.email = "test@example.com"
        self.active = True

    def with_username(self, username):
        self.username = username
        return self

    def inactive(self):
        self.active = False
        return self

    def build(self):
        return User(self.username, self.email, self.active)

# Usage
def test_inactive_user_cannot_login():
    user = UserBuilder().inactive().build()
    assert not login(user)
```
What to Test
Test Behavior, Not Implementation
Good: Tests verify public API contracts
```python
def test_add_item_to_cart():
    cart = ShoppingCart()
    cart.add_item("widget", quantity=1)
    assert cart.item_count == 1
```
Bad: Tests depend on private implementation
```python
def test_add_item_to_cart():
    cart = ShoppingCart()
    cart.add_item("widget", quantity=1)
    assert len(cart._items) == 1  # Fragile! Implementation detail
```
Test Categories
Happy Path
- Normal expected usage
- Typical inputs and scenarios
- Primary functionality
Edge Cases
- Empty collections, null/None values
- Boundary conditions (0, -1, MAX_INT)
- Single item in collections
Error Cases
- Invalid inputs (negative when positive expected)
- Resource exhaustion (disk full, network timeout)
- Constraint violations
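As a sketch of an error-case test, assuming the `ShoppingCart` from the earlier examples rejects negative quantities with a `ValueError` (an assumption for illustration):

```python
import pytest

def test_add_item_rejects_negative_quantity():
    cart = ShoppingCart()
    # pytest.raises fails the test unless the expected exception is raised
    with pytest.raises(ValueError):
        cart.add_item("widget", quantity=-1)
```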
Integration Points
- Database interactions
- External API calls
- File system operations
What NOT to Test
Don't test:
- Third-party libraries (assume they work)
- Language/framework features
- Trivial getters/setters
- Private methods (test via public API)
- Constants
Test Organization
Structure by Behavior
```
tests/
├── unit/
│   ├── test_shopping_cart.py
│   ├── test_pricing.py
│   └── test_inventory.py
├── integration/
│   ├── test_checkout_flow.py
│   └── test_payment_processing.py
└── e2e/
    └── test_user_journeys.py
```
Fixture Reuse
```python
# conftest.py
import pytest

@pytest.fixture
def empty_cart():
    return ShoppingCart()

@pytest.fixture
def cart_with_items():
    cart = ShoppingCart()
    cart.add_item("widget", quantity=2)
    cart.add_item("gadget", quantity=1)
    return cart
```

```python
# Test file uses the fixtures from conftest.py
def test_remove_item(cart_with_items):
    cart_with_items.remove_item("widget")
    assert cart_with_items.item_count == 1
```
Testing Anti-Patterns
Fragile Tests
Tests that break easily:
- Overspecification: Testing implementation details
- Brittle assertions: Exact string matching on messages
- Tight coupling: Tests require specific internal structure
Slow Tests
Tests that take too long:
- No mocking: Hitting real databases, APIs
- Heavy fixtures: Setting up complex state
- No isolation: Tests depend on execution order
Unclear Tests
Tests that are hard to understand:
- Poor naming: `test1()`, `testItWorks()`
- Magic values: `assert x == 7` (why 7?)
- Multiple assertions: Testing multiple things in one test
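For contrast, a sketch of the same test done clearly; the names, helper, and constant are hypothetical:

```python
# Descriptive name plus a named constant instead of a magic value.
TRIAL_PERIOD_DAYS = 7

def test_new_trial_lasts_seven_days():
    subscription = start_trial()  # hypothetical helper
    assert subscription.days_remaining == TRIAL_PERIOD_DAYS
```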
False Negatives
Tests that fail even though the code works:
- Flaky tests: Timing issues, non-deterministic
- Platform-dependent: Different behavior on different OS
- Date/time dependent: Fail on certain days
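One common remedy for date/time-dependent failures is to inject the clock rather than reading it inside the code under test. A minimal sketch (`is_overdue` is illustrative):

```python
from datetime import datetime

def is_overdue(due_date, now=None):
    # Accepting `now` as a parameter makes behavior deterministic in tests.
    now = now or datetime.now()
    return now > due_date

def test_overdue_is_deterministic():
    fixed_now = datetime(2024, 1, 15)
    assert is_overdue(datetime(2024, 1, 1), now=fixed_now)
```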
Advanced TDD Techniques
Test Doubles
Mocks
```python
def test_sends_notification(mocker):
    # The `mocker` fixture comes from the pytest-mock plugin
    mock_sms = mocker.patch('services.send_sms')
    user = User(phone="555-1234")
    user.notify("Hello")
    mock_sms.assert_called_once_with("555-1234", "Hello")
```
Stubs
```python
def test_with_stub():
    class FakePaymentGateway:
        def charge(self, amount):
            return True  # Always succeeds

    checkout = Checkout(payment_gateway=FakePaymentGateway())
    assert checkout.process_order(100)
```
Fakes
```python
class InMemoryUserRepository:
    def __init__(self):
        self.users = {}

    def save(self, user):
        self.users[user.id] = user

    def find_by_id(self, user_id):
        return self.users.get(user_id)
```
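A fake like this can stand in for a real repository in tests. A short usage sketch, assuming ids are assigned by the caller:

```python
def test_saved_user_can_be_found():
    repo = InMemoryUserRepository()
    user = User(username="carol")
    user.id = 1  # assumption: ids are set by the caller in this sketch
    repo.save(user)
    assert repo.find_by_id(1) is user
```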
Parameterized Tests
```python
import pytest

@pytest.mark.parametrize("email,expected", [
    ("valid@email.com", True),
    ("invalid", False),
    ("@no-local.com", False),
    ("no@domain", False),
])
def test_email_validation(email, expected):
    assert is_valid_email(email) == expected
```
Property-Based Testing
```python
from hypothesis import given, strategies as st

@given(st.lists(st.integers(), min_size=0))
def test_reverse_twice(nums):
    assert list(reversed(list(reversed(nums)))) == nums

@given(st.integers(), st.integers())
def test_addition_commutative(a, b):
    assert a + b == b + a
```
TDD for Different Contexts
API Endpoints
```python
def test_create_user(client):
    response = client.post("/users", json={
        "username": "alice",
        "email": "alice@example.com",
    })
    assert response.status_code == 201
    assert response.json()["username"] == "alice"
```
Database Operations
```python
def test_user_created_in_db(db_session):
    user = User(username="bob")
    db_session.add(user)
    db_session.commit()

    retrieved = db_session.query(User).filter_by(username="bob").first()
    assert retrieved is not None
```
Async Code
```python
import pytest

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_async_fetch():
    result = await fetch_data("https://api.example.com")
    assert result["status"] == "success"
```
Common Pitfalls
Skipping the Red Phase
- Problem: Writing code first, then tests
- Consequence: Tests pass but don't verify the desired behavior
- Solution: Always write the failing test first
Writing Too Much Code
- Problem: Implementing the full feature before writing tests
- Consequence: Lost feedback, harder to debug
- Solution: One test at a time, minimal implementation
Ignoring Refactoring
- Problem: Code becomes messy even while tests pass
- Consequence: Technical debt accumulates
- Solution: Budget time for refactoring in each cycle
Tests Too Coupled
- Problem: Tests break whenever the implementation changes
- Consequence: High maintenance cost
- Solution: Test behavior through the public interface
When NOT to Use TDD
Exploratory Coding
- Researching unknown APIs
- Prototyping/proof-of-concept
- Learning new technology
UI/UX Work
- Visual design iteration
- User experience exploration
- CSS/styling experiments
One-Off Scripts
- Data migration
- Quick utilities
- Throwaway code
Crisis Situations
- Production outage (fix first, test later)
- Security incident
- Critical bug with no tests
TDD Checklist
Before considering code complete:
- All tests pass
- Tests cover happy path, edge cases, errors
- Tests are readable and maintainable
- No code coverage gaps in critical paths
- Tests run quickly (<1 second per test)
- Tests are deterministic (no flakiness)
- Tests are independent (can run in any order)
- Code is refactored and clean
Measuring TDD Success
Coverage
- Aim for >80% coverage on critical paths
- 100% is rarely necessary or practical
- Focus on coverage of complex logic, not simple getters
Test Quality Metrics
- Test-to-code ratio: 1:1 to 2:1 is healthy
- Test execution time: Suite should complete in <5 minutes
- Flaky test rate: Should be near 0%
Maintenance
- Can you rename a class without breaking many tests?
- Can you change implementation without breaking tests?
- Do new developers understand tests quickly?