| name | test-driven-development |
| description | eXtreme Programming Test-Driven Development (XP-style TDD) Skill. This skill provides both reference documentation AND execution capabilities for the full PLAN -> RED -> GREEN -> REFACTOR workflow. |
eXtreme Programming: Test-Driven Development (XP-style TDD) Skill
Capability
This skill enables test-driven development (TDD) using the Red-Green-Refactor cycle. It covers seven languages (Rust, Python, TypeScript, Go, Java, C#, C++), each with language-specific test runners, conventions, and best practices.
- Test Design: Plan test architecture before implementation
- Red-Green-Refactor: Execute the complete TDD cycle
- Multi-language Support: 7+ languages with appropriate frameworks
- Property-based Testing: Enhance with generative testing
When to Use
- Implementing new features with high confidence
- Ensuring code correctness from the start
- Building maintainable, well-tested codebases
- Preventing regressions
- Documenting behavior through tests
- Practicing extreme programming (XP)
Workflow Overview
[<start>Start] -> [Phase 1: PLAN]
[Phase 1: PLAN|
Design test architecture
Identify edge cases
Map requirements to tests
] -> [Phase 2: RED]
[Phase 2: RED|
Write failing test
Verify test fails meaningfully
] -> [Phase 3: GREEN]
[Phase 3: GREEN|
Implement minimal code
Make test pass
] -> [Phase 4: REFACTOR]
[Phase 4: REFACTOR|
Clean up code
Maintain green tests
] -> [More Features?]
[More Features?] yes -> [Phase 2: RED]
[More Features?] no -> [<end>Complete]
Phase 1: PLAN (Test Design)
Process
Understand Requirements
- Parse user's task/requirement
- Identify testable behaviors and edge cases
- Use sequential-thinking to plan Red-Green-Refactor cycles
- Map requirements to test cases (error-first approach)
Artifact Detection (Conditional)
- Check for existing test artifacts:
fd -g '*test*' -g '*spec*' -e ts -e py -e rs -e java -e go $ARGUMENTS
test -f package.json && rg '"vitest"|"jest"|"mocha"' package.json
test -f Cargo.toml && rg 'proptest|quickcheck' Cargo.toml
- If artifacts exist: analyze coverage gaps, plan extensions
- If no artifacts: proceed to design test suite
Design Test Architecture
- Design unit tests for core logic
- Plan integration test boundaries
- Identify property-based test candidates
- Output: Test suite design with case signatures
Prepare Run Phase
- Define target: .outline/tests/ or a language-specific location
- Specify verification: test runner commands
- Create traceability: requirement -> test case -> assertion (see the example below)
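As an illustration of the traceability format, using names from the shopping-cart example developed in the phases below (the requirement wording is illustrative):

// Requirement: "Item quantity multiplies the item price in the cart total"
//   -> test case: test_quantity_multiplies_price
//   -> assertion: assert_eq!(30, cart.total())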
Thinking Tool Integration
Use sequential-thinking for:
- Planning Red-Green-Refactor cycles
- Test case prioritization
- Dependency ordering
Use actor-critic-thinking for:
- Evaluating test coverage
- Challenging test effectiveness
- Edge case identification
Use shannon-thinking for:
- Coverage gap analysis
- Flaky test risk assessment
- Property-based test candidates
Test Design Template
// Target: .outline/tests/{module}_test.{ext}
// ============================================
// From requirement: {requirement text}
// ============================================
// Unit Test: {description}
// Arrange: {setup description}
// Act: {action description}
// Assert: {expected outcome}
test_case_1() {
    // RED: This test should fail initially
    // GREEN: Minimal implementation to pass
    // REFACTOR: Improve without breaking
}

// Property-Based Test: {invariant description}
// For all valid inputs, {property} should hold
property_test_1() {
    // Generator: {input generation strategy}
    // Property: {invariant to verify}
}

// Edge Case: {boundary condition}
edge_case_1() {
    // Boundary: {specific edge case}
    // Expected: {behavior at boundary}
}
Phase 2: RED (Write Failing Test)
Setup
# Create .outline/tests directory structure
mkdir -p .outline/tests/{unit,integration,property}
Run Tests (Expect Failure)
# TypeScript/JavaScript
npx vitest run || echo "RED: Tests failing as expected"
# Python
pytest .outline/tests/ -v || echo "RED: Tests failing as expected"
# Rust
cargo test || echo "RED: Tests failing as expected"
# Go
go test ./... || echo "RED: Tests failing as expected"
Example: Shopping Cart (Rust)
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_empty_cart_has_zero_total() {
        let cart = ShoppingCart::new();
        assert_eq!(0, cart.total());
    }

    #[test]
    fn test_adding_item_increases_total() {
        let mut cart = ShoppingCart::new();
        cart.add_item(Item::new("Widget", 10));
        assert_eq!(10, cart.total());
    }

    #[test]
    fn test_quantity_multiplies_price() {
        let mut cart = ShoppingCart::new();
        cart.add_item(Item::new("Widget", 10).quantity(3));
        assert_eq!(30, cart.total());
    }
}
Run: cargo test -> FAILS (ShoppingCart doesn't exist)
Phase 3: GREEN (Implement Minimum Code)
Implement Just Enough to Pass
pub struct Item {
    name: String,
    price: i32,
    quantity: i32,
}

impl Item {
    pub fn new(name: &str, price: i32) -> Self {
        Self { name: name.to_string(), price, quantity: 1 }
    }

    pub fn quantity(mut self, qty: i32) -> Self {
        self.quantity = qty;
        self
    }
}

pub struct ShoppingCart {
    items: Vec<Item>,
}

impl ShoppingCart {
    pub fn new() -> Self { Self { items: Vec::new() } }
    pub fn add_item(&mut self, item: Item) { self.items.push(item); }
    pub fn total(&self) -> i32 {
        self.items.iter().map(|item| item.price * item.quantity).sum()
    }
}
Verify Tests Pass
# TypeScript/JavaScript
npx vitest run || exit 13
# Python
pytest .outline/tests/ -v || exit 13
# Rust
cargo test || exit 13
# Go
go test ./... || exit 13
Phase 4: REFACTOR (Clean Up)
Improve Code While Maintaining Green Tests
#[derive(Debug, Clone)]
pub struct Item {
    name: String,
    price: i32,
    quantity: i32,
}

impl Item {
    pub fn new(name: impl Into<String>, price: i32) -> Self {
        Self { name: name.into(), price, quantity: 1 }
    }

    // Renamed from quantity(); test call sites are updated in the same refactor.
    pub fn with_quantity(mut self, qty: i32) -> Self {
        self.quantity = qty;
        self
    }

    pub fn subtotal(&self) -> i32 { self.price * self.quantity }
}

#[derive(Debug, Default)]
pub struct ShoppingCart { items: Vec<Item> }

impl ShoppingCart {
    pub fn new() -> Self { Self::default() }
    pub fn add_item(&mut self, item: Item) { self.items.push(item); }
    pub fn total(&self) -> i32 { self.items.iter().map(Item::subtotal).sum() }
    pub fn item_count(&self) -> usize { self.items.len() }
}
Verify Tests Still Pass with Coverage
# TypeScript/JavaScript
npx vitest run --coverage || exit 14
# Python
pytest --cov=$MODULE --cov-report=html || exit 14
# Rust
cargo tarpaulin --out Html || exit 14
# Go
go test -coverprofile=coverage.out ./... || exit 14
Exit Codes
| Code | Meaning | Action |
|---|---|---|
| 0 | All tests pass, coverage met | Continue with next feature |
| 11 | Framework missing | Install test framework |
| 12 | No test files | Run plan phase, create tests |
| 13 | Tests failed | Fix implementation or test |
| 14 | Coverage low | Add more tests |
| 15 | Refactoring broke tests | Revert changes, refactor carefully |
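A minimal sketch, not part of the skill itself, of how a wrapper script could surface these codes, assuming cargo test is the configured runner:

use std::process::{exit, Command};

fn main() {
    // Assumes cargo is on PATH; a spawn failure is treated as a missing framework (11).
    let status = match Command::new("cargo").arg("test").status() {
        Ok(s) => s,
        Err(_) => exit(11),
    };
    if !status.success() {
        exit(13); // Tests failed: fix implementation or test
    }
    // Exit 0: all tests pass; a coverage check would map its failure to 14 the same way.
}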
Language Support Matrix
| Language | Test Framework | Command | Watch Mode |
|---|---|---|---|
| Rust | cargo test | cargo test | cargo watch -x test |
| Python | pytest | pytest | pytest-watch |
| TypeScript | vitest/jest | vitest run | vitest --watch |
| Go | go test | go test ./... | gotestsum --watch |
| Java | JUnit 5 | mvn test | IDE integration |
| C# | xUnit | dotnet test | dotnet watch test |
| C++ | GoogleTest | ctest | fd -e cpp \| entr ctest |
Test Framework Matrix
| Language | Unit Test | Property Test | Mock |
|---|---|---|---|
| Rust | cargo test | proptest | mockall |
| Python | pytest | hypothesis | pytest-mock |
| TypeScript | vitest | fast-check | vi.mock |
| Go | go test | gopter | gomock |
| Java | JUnit 5 | jqwik | Mockito |
| C# | xUnit | FsCheck | Moq |
| C++ | GoogleTest | rapidcheck | GMock |
Commands Reference
Basic Commands
cargo test # Rust
pytest # Python
vitest run # TypeScript
go test ./... # Go
mvn test # Java
dotnet test # C#
ctest # C++
Intermediate Commands
cargo test test_name -- --show-output
pytest --cov=module --cov-report=term
vitest --coverage --watch
go test -v -cover ./...
mvn test -Dtest=TestClass#method
dotnet test --collect:"XPlat Code Coverage"
ctest --verbose --output-on-failure
Advanced Commands
cargo watch -x 'tarpaulin --out Lcov --all-features'
pytest-watch -v -m "not slow" -n auto
vitest --ui --typecheck --browser.enabled=true
gotestsum --watch -- -race -bench=. ./...
mvn verify pitest:mutationCoverage
dotnet watch test --logger "console;verbosity=detailed"
ctest --repeat until-fail:10 --parallel 4
Best Practices
1. Test Naming
- Rust: test_<feature>_<condition>_<expected> (example below)
- Python: test_<feature>_when_<condition>_then_<expected>
- TypeScript: 'should <expected> when <condition>'
- Go: Test<Feature><Condition>
- Java: @DisplayName("Should <expected> when <condition>")
- C#: Should<Expected>When<Condition>
- C++: TEST(Suite, Should<Expected>When<Condition>)
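Applying the Rust convention to the shopping-cart example (names and assertions are illustrative and use the post-refactor API):

#[test]
fn test_total_empty_cart_returns_zero() {
    assert_eq!(0, ShoppingCart::new().total());
}

#[test]
fn test_total_quantity_three_multiplies_price() {
    let mut cart = ShoppingCart::new();
    cart.add_item(Item::new("Widget", 10).with_quantity(3));
    assert_eq!(30, cart.total());
}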
2. AAA Pattern
def test_discount_applies_to_total():
    # ARRANGE
    cart = ShoppingCart()
    cart.add_item(Item("Widget", 100))
    discount = Discount(percent=10)

    # ACT
    total = cart.calculate_total(discount)

    # ASSERT
    assert total == 90
3. One Assertion Per Test
Keep each test focused on a single logical behavior so a failure points to exactly one cause.
4. Test Independence
Tests must not share mutable state or depend on execution order (see the sketch below).
5. Fast Tests
- Avoid file I/O
- Mock external dependencies
- Use in-memory databases
- Parallelize execution
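A minimal Rust sketch of test independence, reusing the ShoppingCart example from above: each test builds its own fixture through a hypothetical helper, so the tests can run in any order or in parallel.

#[cfg(test)]
mod tests {
    use super::*;

    // Hypothetical helper: every call returns a fresh, isolated fixture.
    fn cart_with_widget(price: i32) -> ShoppingCart {
        let mut cart = ShoppingCart::new();
        cart.add_item(Item::new("Widget", price));
        cart
    }

    #[test]
    fn test_total_single_item_equals_price() {
        let cart = cart_with_widget(10); // fresh cart, no shared state
        assert_eq!(10, cart.total());
    }

    #[test]
    fn test_item_count_single_item_is_one() {
        let cart = cart_with_widget(10); // independent of the test above
        assert_eq!(1, cart.item_count());
    }
}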
Troubleshooting Guide
Common Issues
| Symptom | Cause | Resolution |
|---|---|---|
| Exit 11 | Test framework missing | Install: npm i -D vitest, pip install pytest, cargo add --dev proptest |
| Exit 12 | No test files | Run plan phase first |
| Exit 13 | Tests failed | Debug specific test |
| Exit 14 | Coverage too low | Add tests for uncovered paths |
| Flaky test | Non-determinism | Fix time/random/network dependencies (see the clock sketch below) |
| Slow tests | Too much I/O or sleep | Use mocks, reduce wait times |
| Test passes alone, fails in suite | Shared state pollution | Isolate test fixtures |
Flaky Test Detection
# Run multiple times to detect flakiness
for i in {1..10}; do npm test 2>&1 | tail -1; done
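Time is a common source of the non-determinism flagged in the table above. A hedged Rust sketch of one fix, injecting a clock instead of calling the system time directly (the Clock trait and names are illustrative, not part of this skill):

use std::time::{Duration, Instant};

// Illustrative seam: production code depends on this trait, not on Instant::now().
trait Clock {
    fn now(&self) -> Instant;
}

struct FixedClock(Instant);

impl Clock for FixedClock {
    fn now(&self) -> Instant { self.0 }
}

fn is_expired(clock: &dyn Clock, created: Instant, ttl: Duration) -> bool {
    clock.now().duration_since(created) >= ttl
}

#[test]
fn test_is_expired_ttl_elapsed_returns_true() {
    let start = Instant::now();
    let clock = FixedClock(start + Duration::from_secs(60)); // deterministic "now"
    assert!(is_expired(&clock, start, Duration::from_secs(30)));
}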
Debugging Commands
# Run single test
pytest test_module.py::TestClass::test_method -v
npx vitest run -t "specific test name"
cargo test test_name -- --nocapture
# Run with debugging output
pytest -v --tb=long --capture=no
npx vitest run --reporter=verbose
cargo test -- --nocapture
# Find slow tests
pytest --durations=10
Property-Based Testing
Rust (proptest)
use proptest::prelude::*;

proptest! {
    #[test]
    fn test_quantity_always_positive(qty in 1i32..1000) {
        let item = Item::new("Widget", 10).with_quantity(qty);
        assert!(item.subtotal() > 0);
    }
}
Python (hypothesis)
from hypothesis import given
import hypothesis.strategies as st

@given(st.integers(min_value=1, max_value=1000))
def test_quantity_always_positive(qty):
    item = Item("Widget", 10, quantity=qty)
    assert item.subtotal() > 0
TypeScript (fast-check)
import * as fc from 'fast-check';

it('quantity always gives positive subtotal', () => {
  fc.assert(
    fc.property(fc.integer({min: 1, max: 1000}), (qty) => {
      const item = new Item('Widget', 10, qty);
      expect(item.subtotal()).toBeGreaterThan(0);
    })
  );
});
When NOT to Use Test-Driven Development
| Scenario | Better Alternative |
|---|---|
| Proving mathematical invariants | Proof-driven (Lean 4) |
| Ensuring type safety | Type-driven (Idris 2) |
| Verifying state machine correctness | Validation-first (Quint) |
| Exploratory prototyping | Spike first, then write tests |
| UI visual testing | Screenshot/visual regression tools |
| Already have formal specs | Generate tests from specs |
Complementary Approaches
- TDD + Property-based: Unit tests for examples, properties for invariants (see the sketch below)
- TDD + Contract: Tests verify behavior, contracts enforce interfaces
- TDD + Proof-driven: Tests explore, proofs guarantee critical properties
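For instance, the cart unit test and a proptest invariant from the earlier phases can live side by side in one Rust module (a sketch combining the examples above):

#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    // Example-based: documents one concrete, readable case.
    #[test]
    fn test_quantity_multiplies_price() {
        let mut cart = ShoppingCart::new();
        cart.add_item(Item::new("Widget", 10).with_quantity(3));
        assert_eq!(30, cart.total());
    }

    // Property-based: checks the invariant across many generated quantities.
    proptest! {
        #[test]
        fn subtotal_scales_linearly(qty in 1i32..1000) {
            let item = Item::new("Widget", 10).with_quantity(qty);
            prop_assert_eq!(item.subtotal(), 10 * qty);
        }
    }
}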