---
name: debugging
description: Systematic analysis for debugging. Use when encountering errors, bugs, or unexpected behaviors.
---
# Systematic Debugging
Random fixes waste time and create new bugs. Quick patches mask underlying issues.

**Core principle:** ALWAYS find the root cause before attempting fixes. Fixing symptoms is failure.
## The Iron Law

**NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST**
If you haven't completed Phase 1, you cannot propose fixes.
## The Four Phases
Complete each phase before proceeding to the next.
### Phase 1: Root Cause Investigation
BEFORE attempting ANY fix:
**Read Error Messages Carefully**
- Read stack traces completely
- Note line numbers, file paths, error codes
- They often contain the exact solution
**Reproduce Consistently**
- Can you trigger it reliably?
- If not reproducible → gather more data, don't guess (see the repro sketch below)
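For an HTTP-facing bug, a throwaway loop like the following sketch can establish reliability. The endpoint and payload here are placeholders, not from any real system:

```ts
// Hypothetical repro loop: fire the suspect request several times and record
// the outcomes. Anything short of 5/5 failures means: gather more data.
const url = 'http://localhost:3000/api/users'; // placeholder endpoint

for (let attempt = 1; attempt <= 5; attempt++) {
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ name: 'José García' }), // the suspect input
  });
  console.log(`attempt ${attempt}: HTTP ${res.status}`);
}
```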
**Check Recent Changes**
- Git diff, recent commits
- New dependencies, config changes
**Gather Evidence in Multi-Component Systems**
- Log what data enters and exits each component boundary (see the sketch below)
- Run once to gather evidence showing WHERE it breaks
- THEN investigate that specific component
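One lightweight way to gather this evidence is a logging wrapper around each suspect boundary. A minimal sketch, with invented component names:

```ts
// Hypothetical sketch: wrap a component boundary so each call logs what
// enters and what exits. Run the failing case once, then read the logs to
// find the first component whose OUTPUT is already wrong.
function withBoundaryLog<In, Out>(name: string, fn: (input: In) => Out) {
  return (input: In): Out => {
    console.log(`[${name}] in :`, JSON.stringify(input));
    const output = fn(input);
    console.log(`[${name}] out:`, JSON.stringify(output));
    return output;
  };
}

// Usage with an invented parser at one boundary:
const parsePrice = withBoundaryLog('parsePrice', (raw: string) => Number(raw));
parsePrice('19.99'); // [parsePrice] in : "19.99" / [parsePrice] out: 19.99
```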
**Trace Data Flow**
- Where does the bad value originate?
- Keep tracing upstream until you find the source
- Fix at the source, not at the symptom (see the toy trace below)
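A toy illustration, with invented functions, of why the source matters: the symptom surfaces two layers downstream of where the data first goes bad.

```ts
// Hypothetical trace: NaN appears in the total, but originates in the parser.
function parseLineItem(row: { price?: string }) {
  return { price: Number(row.price) }; // row.price missing -> NaN (the SOURCE)
}

function computeTotal(items: { price: number }[]) {
  return items.reduce((sum, item) => sum + item.price, 0); // NaN propagates
}

const total = computeTotal([parseLineItem({})]);
console.log(total); // NaN (the SYMPTOM) -- fix parseLineItem, not this line
```

Patching the display layer would hide the bug; fixing the parser removes it.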
### Phase 2: Pattern Analysis

- **Find Working Examples** - Locate similar working code in the same codebase
- **Compare Against References** - Read the reference implementation COMPLETELY; don't skim
- **Identify Differences** - List every difference, however small (see the comparison sketch after this list)
- **Understand Dependencies** - What settings, config, and environment does it need?
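One way to make the comparison concrete is to strip the working and broken call sites down to their essence and diff them. A sketch with an invented soft-delete scenario:

```ts
// Hypothetical Phase 2 reduction: both paths side by side, so every
// difference is visible. All names are invented.
type Row = { id: number; deletedAt?: string };

// Working: the repo that behaves filters soft-deleted rows before mapping.
const workingList = (rows: Row[]) =>
  rows.filter((r) => r.deletedAt === undefined).map((r) => r.id);

// Broken: the misbehaving repo maps first -- deleted rows leak through.
const brokenList = (rows: Row[]) => rows.map((r) => r.id);

const rows: Row[] = [{ id: 1 }, { id: 2, deletedAt: '2024-01-01' }];
console.log(workingList(rows)); // [1]
console.log(brokenList(rows));  // [1, 2]  <- the difference that matters
```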
### Phase 3: Hypothesis and Testing

**Form a Single Hypothesis**
- State it clearly: "I think X is the root cause because Y"
- Be specific, not vague

**Test Minimally**
- Make the SMALLEST possible change to test the hypothesis (see the probe below)
- One variable at a time
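For example, if the hypothesis is "the invoice check fails because the total is accumulated in floating point and compared with `===`", this probe tests exactly that claim in isolation (an invented scenario):

```ts
// Hypothetical minimal probe. One variable, one claim, zero application code.
import assert from 'node:assert';

const total = 0.1 + 0.2;    // simplified stand-in for the app's accumulation
console.log(total === 0.3); // false -> hypothesis confirmed
assert.ok(Math.abs(total - 0.3) < 1e-9); // the fix direction: epsilon compare
```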
**Verify Before Continuing**
- Did it work? Yes → Phase 4
- Didn't work? Form a NEW hypothesis
- DON'T add more fixes on top
### Phase 4: Implementation

**Create a Failing Test Case**
- Simplest possible reproduction
- MUST exist before fixing

**Implement a Single Fix**
- Address the root cause you identified
- ONE change at a time
- No "while I'm here" improvements

**Verify the Fix**
- Does the test pass now?
- Do all other tests still pass?
**If the Fix Doesn't Work**
- If < 3 attempts: Return to Phase 1, re-analyze
- If ≥ 3 attempts: STOP and question the architecture
## If 3+ Fixes Failed: Question the Architecture

- Is this pattern fundamentally sound?
- Should we refactor the architecture rather than keep patching symptoms?
- Discuss with your human partner before attempting more fixes
## Red Flags - STOP and Return to Phase 1
If you catch yourself:
- "Quick fix for now, investigate later"
- "Just try changing X and see if it works"
- "It's probably X, let me fix that"
- "I don't fully understand but this might work"
- Proposing solutions before tracing data flow
- "One more fix attempt" (when already tried 2+)
ALL of these mean: STOP. Return to Phase 1.
## Common Rationalizations
| Excuse | Reality |
|---|---|
| "Issue is simple" | Simple issues have root causes too |
| "Emergency, no time" | Systematic is FASTER than thrashing |
| "I'll write test after" | Untested fixes don't stick |
| "Multiple fixes saves time" | Can't isolate what worked |
## Output Format
When using this skill, structure your response as:
## Debugging: [Brief issue description]
### Phase 1: Root Cause Investigation
- Error message: [key details]
→ What does the error actually say? Read the full stack trace.
- Reproduction: [steps or "not yet reproducible"]
→ Can you trigger it reliably? What are the exact steps?
- Recent changes: [relevant changes]
→ What changed in git? New dependencies? Config changes?
- Evidence gathered: [what you found]
→ Where exactly does the data flow break?
### Phase 2: Pattern Analysis
- Working example: [similar code that works]
→ Is there similar code in the codebase that works correctly?
- Key differences: [what's different]
→ What differs between working and broken code?
### Phase 3: Hypothesis
"I believe [X] is the root cause because [Y]"
→ Be specific. Vague hypotheses lead to vague fixes.
Minimal test: [smallest change to verify]
→ What's the ONE thing you can change to test this hypothesis?
### Phase 4: Implementation
- Test case: [the failing test]
→ Does a test exist that reproduces this bug?
- Fix: [the actual fix]
→ ONE change addressing the root cause.
- Verification: [test results]
→ Does the test pass? Are other tests still green?
## Example: Real Debugging Session

**Bug:** "API returns 500 error when creating a user with special characters in name"
**Phase 1: Root Cause Investigation**
- Error message: `500 Internal Server Error` with a stack trace pointing to `UserService.create()` line 47: `TypeError: Cannot read property 'unicodeForm' of undefined`
- Reproduction: POST `/api/users` with `{ "name": "José García" }` → 500 error. Works with `{ "name": "John Smith" }`.
- Recent changes: Commit `a1b2c3d` added Unicode normalization for names 2 days ago.
- Evidence: Line 47 calls `name.normalize(config.unicodeForm)`, but `config` is undefined when the feature flag `UNICODE_SUPPORT` is disabled.
**Phase 2: Pattern Analysis**
- Working example: `ProductService.create()` also uses Unicode normalization but checks that config exists first.
- Key differences: ProductService has an `if (config?.unicodeForm)` guard; UserService assumes config is always present.
**Phase 3: Hypothesis**

"I believe the root cause is that `UserService.create()` doesn't guard against an undefined config when the `UNICODE_SUPPORT` feature flag is disabled, because the config object is only initialized when the flag is enabled."

Minimal test: Add optional chaining `config?.unicodeForm` at line 47.
**Phase 4: Implementation**
- Test case:

  ```js
  test('creates user with special chars when unicode is disabled', () => {
    disableFeatureFlag('UNICODE_SUPPORT');
    const result = userService.create({ name: 'José García' });
    assert(result.name === 'José García');
  });
  ```

- Fix: Changed `config.unicodeForm` to `config?.unicodeForm ?? 'NFC'`
- Verification: New test passes. All 47 existing tests still green.
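For concreteness, here is a sketch of what the repaired call site could look like. The surrounding class is reconstructed from the session above, not quoted from real code:

```ts
// Hypothetical reconstruction of the fixed UserService.create().
interface UnicodeConfig {
  unicodeForm?: string;
}

class UserService {
  // config is only initialized when UNICODE_SUPPORT is enabled,
  // so it may legitimately be undefined here.
  constructor(private config?: UnicodeConfig) {}

  create(input: { name: string }) {
    // Before: input.name.normalize(this.config.unicodeForm) -> TypeError
    // when config is undefined. After: optional chaining plus a default form.
    const name = input.name.normalize(this.config?.unicodeForm ?? 'NFC');
    return { name };
  }
}

// Reproduces the failing case from Phase 1, now passing:
console.log(new UserService(undefined).create({ name: 'José García' }).name);
```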
## Quick Reference
| Phase | Key Activities | Success Criteria |
|---|---|---|
| 1. Root Cause | Read errors, reproduce, gather evidence | Understand WHAT and WHY |
| 2. Pattern | Find working examples, compare | Identify differences |
| 3. Hypothesis | Form theory, test minimally | Confirmed or new hypothesis |
| 4. Implementation | Create test, fix, verify | Bug resolved, tests pass |
## Real-World Impact
- Systematic approach: 15-30 minutes to fix
- Random fixes approach: 2-3 hours of thrashing
- First-time fix rate: 95% vs 40%
- New bugs introduced: Near zero vs common