name	systematic-debugging
description	Use when encountering any bug, test failure, or unexpected behavior - four-phase framework (root cause investigation, pattern analysis, hypothesis testing, implementation) that ensures understanding before attempting solutions

Systematic Debugging

Overview

Random fixes waste time and create new bugs. Quick patches mask underlying issues.

Core principle: ALWAYS find root cause before attempting fixes. Symptom fixes are failure.

The Iron Law

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST

If you haven't completed Phase 1, you cannot propose fixes.

When to Use

Use for ANY technical issue:

Test failures
Bugs in production
Unexpected behavior
Performance problems
Build failures
Integration issues

Use this ESPECIALLY when:

Under time pressure (emergencies make guessing tempting)
"Just one quick fix" seems obvious
You've already tried multiple fixes
Previous fix didn't work

The Four Phases

Complete each phase before proceeding to the next.

Phase 1: Root Cause Investigation

BEFORE attempting ANY fix:

Read Error Messages Carefully
- Don't skip past errors or warnings
- Read stack traces completely
- Note line numbers, file paths, error codes
Reproduce Consistently
- Can you trigger it reliably?
- What are the exact steps?
- If not reproducible → gather more data, don't guess
Check Recent Changes
- Git diff, recent commits
- New dependencies, config changes
- Environmental differences
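
The "Reproduce Consistently" step can be made concrete with a small harness that runs the suspect code many times and tallies the outcomes. This is only a sketch: `flaky_operation` and `measure_failure_rate` are hypothetical names invented for illustration, and the seeded random failure stands in for whatever real bug is being investigated.

```python
import random
from collections import Counter


def flaky_operation(rng: random.Random) -> int:
    """Hypothetical stand-in for the code under investigation."""
    value = rng.randint(0, 9)
    if value == 0:
        raise ValueError("unexpected zero")
    return value


def measure_failure_rate(runs: int, seed: int) -> Counter:
    """Run the operation `runs` times and tally outcomes by exception type."""
    rng = random.Random(seed)  # fixed seed so the experiment itself is repeatable
    outcomes: Counter = Counter()
    for _ in range(runs):
        try:
            flaky_operation(rng)
            outcomes["ok"] += 1
        except Exception as exc:
            outcomes[type(exc).__name__] += 1
    return outcomes


if __name__ == "__main__":
    print(measure_failure_rate(runs=1000, seed=42))
```

A tally like this turns "it fails sometimes" into a number you can compare before and after a change. If the failure never shows up in the harness, that is itself a signal to gather more data rather than guess.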

Gather Evidence in Multi-Component Systems

For EACH component boundary:
  - Log what data enters component
  - Log what data exits component
  - Verify environment/config propagation

Run once to gather evidence showing WHERE it breaks
THEN investigate that specific component
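
One possible way to log what enters and exits each component is a decorator applied at every boundary. The sketch below assumes a Python codebase; `log_boundary`, `parse`, and `validate` are hypothetical names chosen for illustration, not part of any existing skill or library.

```python
import functools
import logging

logger = logging.getLogger("boundary")


def log_boundary(component: str):
    """Log arguments entering and the result leaving a component."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("%s IN  args=%r kwargs=%r", component, args, kwargs)
            try:
                result = func(*args, **kwargs)
            except Exception:
                logger.exception("%s RAISED", component)
                raise
            logger.info("%s OUT result=%r", component, result)
            return result
        return wrapper
    return decorator


@log_boundary("parser")
def parse(raw: str) -> dict:
    # Hypothetical first component: splits "key = value" text.
    key, _, value = raw.partition("=")
    return {"key": key.strip(), "value": value.strip()}


@log_boundary("validator")
def validate(record: dict) -> dict:
    # Hypothetical second component: rejects empty values.
    if not record["value"]:
        raise ValueError(f"empty value for {record['key']!r}")
    return record


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    validate(parse("timeout = 30"))
    try:
        validate(parse("timeout ="))
    except ValueError:
        pass  # the log already shows which boundary received bad data
```

In the second run the log shows the parser emitting an empty value before the validator raises, which points the investigation at the parser's input rather than at the validator where the error surfaced.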

Trace Data Flow
- Where does bad value originate?
- What called this with bad value?
- Keep tracing up until you find the source
- Fix at source, not at symptom
- SUB-SKILL: Use root-cause-tracing for the backward tracing technique
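
To answer "what called this with a bad value?", one option is to capture the call chain at the point where the symptom appears and read it from the outermost caller inward. This is a minimal sketch using the standard `traceback` module; `caller_chain`, `save`, `build_path`, and `load_settings` are hypothetical functions invented for illustration.

```python
import traceback


def caller_chain() -> list[str]:
    """Return function names from the outermost frame down to our caller."""
    stack = traceback.extract_stack()[:-1]  # drop caller_chain's own frame
    return [frame.name for frame in stack]


def save(path: str) -> list[str]:
    # Symptom site: an empty path arrives here, but it was produced upstream.
    if not path:
        return caller_chain()
    return []


def build_path(directory: str) -> list[str]:
    return save(directory)  # passes the bad value through unchanged


def load_settings() -> list[str]:
    directory = ""  # source of the bad value (hypothetical bug)
    return build_path(directory)


if __name__ == "__main__":
    print(" -> ".join(load_settings()))
```

The chain ends at `save`, where the symptom appears, but the value originates in `load_settings`. Under the "fix at source" rule, that is where the change belongs.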

Phase 2: Pattern Analysis

Find Working Examples
- Locate similar working code in same codebase
- What works that's similar to what's broken?
Compare Against References
- Read reference implementation COMPLETELY
- Don't skim - read every line
Identify Differences
- What's different between working and broken?
- List every difference, however small
Understand Dependencies
- What other components does this need?
- What settings, config, environment?
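
"List every difference, however small" can be done mechanically when the working and broken cases are both described by settings. The sketch below assumes each configuration is a flat dictionary with string keys; `diff_configs` is a hypothetical helper, not an existing API.

```python
def diff_configs(working: dict, broken: dict) -> dict:
    """Return every key whose value differs, including keys missing on one side."""
    missing = object()  # sentinel so an explicit None is still compared
    differences = {}
    for key in sorted(working.keys() | broken.keys()):
        left = working.get(key, missing)
        right = broken.get(key, missing)
        if left != right:
            differences[key] = (
                "<absent>" if left is missing else left,
                "<absent>" if right is missing else right,
            )
    return differences


if __name__ == "__main__":
    working = {"timeout": 30, "retries": 3, "tls": True}
    broken = {"timeout": 30, "retries": "3", "region": "eu"}
    for key, (good, bad) in diff_configs(working, broken).items():
        print(f"{key}: working={good!r} broken={bad!r}")
```

The example catches a type difference (`3` versus `"3"`) that is easy to miss when comparing by eye.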

Phase 3: Hypothesis and Testing

Form Single Hypothesis
- State clearly: "I think X is the root cause because Y"
- Be specific, not vague
Test Minimally
- Make the SMALLEST possible change to test hypothesis
- One variable at a time
- Don't fix multiple things at once
Verify Before Continuing
- Did it work? Yes → Phase 4
- Didn't work? Form NEW hypothesis
- DON'T add more fixes on top
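
"One variable at a time" can be sketched as a loop that starts from a known-good configuration and applies each differing setting on its own. The names here (`isolate_single_changes` and the `passes` callback) are assumptions made for illustration. The sketch only tests keys present in the failing configuration, and it will not detect failures that need two settings to change together.

```python
from typing import Callable


def isolate_single_changes(
    known_good: dict,
    failing: dict,
    passes: Callable[[dict], bool],
) -> list[str]:
    """Apply each differing setting to the known-good config on its own.

    Returns the keys whose change, taken alone, makes `passes` return False.
    """
    culprits = []
    for key in sorted(failing):
        if key in known_good and known_good[key] == failing[key]:
            continue  # not a difference, nothing to test
        candidate = dict(known_good)  # fresh copy: one variable per experiment
        candidate[key] = failing[key]
        if not passes(candidate):
            culprits.append(key)
    return culprits


if __name__ == "__main__":
    def passes(config: dict) -> bool:
        # Hypothetical check standing in for "run the reproduction".
        return config["pool_size"] <= 10

    good = {"pool_size": 5, "timeout": 30, "debug": False}
    bad = {"pool_size": 50, "timeout": 60, "debug": True}
    print(isolate_single_changes(good, bad, passes))
```

Each experiment starts from a fresh copy of the known-good state, so a rejected hypothesis leaves nothing behind for the next one to build on.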

Phase 4: Implementation

Create Failing Test Case
- Simplest possible reproduction
- Automated test if possible
- SUB-SKILL: Use test-driven-development
Implement Single Fix
- Address the root cause identified
- ONE change at a time
- No "while I'm here" improvements
Verify Fix
- Test passes now?
- No other tests broken?
If 3+ Fixes Failed: Question Architecture
- Each fix reveals new problem in different place?
- Fixes require "massive refactoring"?
- STOP and discuss with user before attempting more fixes
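
The "Create Failing Test Case" and "Verify Fix" steps might look like the following with the standard `unittest` module. Everything here is hypothetical: `parse_port` and the assumed root cause (whitespace reaching a strict validator) are invented to show the shape of a regression test, not taken from a real bug.

```python
import unittest


def parse_port(text: str) -> int:
    """Hypothetical function under repair.

    Assumed root cause for this sketch: surrounding whitespace was passed
    straight to a strict validator. The fix strips it at the source.
    """
    cleaned = text.strip()
    if not cleaned.isdigit():
        raise ValueError(f"not a port number: {text!r}")
    port = int(cleaned)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortRegressionTest(unittest.TestCase):
    def test_trailing_newline_is_accepted(self):
        # Simplest reproduction of the reported failure; written before the fix.
        self.assertEqual(parse_port("8080\n"), 8080)

    def test_garbage_is_still_rejected(self):
        # Guards against a fix that is too permissive.
        with self.assertRaises(ValueError):
            parse_port("80 80")


if __name__ == "__main__":
    unittest.main()
```

The first test should fail against the unfixed code and pass after the single change. The second covers the "no other tests broken?" check by confirming that invalid input is still rejected.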

Red Flags - STOP and Follow Process

If you catch yourself thinking or doing any of these:

"Quick fix for now, investigate later"
"Just try changing X and see if it works"
"I don't fully understand but this might work"
Proposing solutions before tracing data flow

ALL of these mean: STOP. Return to Phase 1.

Quick Reference

Phase	Key Activities	Success Criteria
1. Root Cause	Read errors, reproduce, gather evidence	Understand WHAT and WHY
2. Pattern	Find working examples, compare	Identify differences
3. Hypothesis	Form theory, test minimally	Confirmed or rejected
4. Implementation	Create test, fix, verify	Bug resolved, tests pass

Integration

Required sub-skills:

root-cause-tracing - When error is deep in call stack
test-driven-development - For creating failing test case

Complementary skills:

defense-in-depth - Add validation at multiple layers
condition-based-waiting - Replace arbitrary timeouts
verification-before-completion - Verify fix before claiming success
