name	systematic-debugging
description	Use when encountering any bug, test failure, or unexpected behavior (including race conditions, deadlocks, concurrency issues) - four-phase framework (root cause investigation, pattern analysis, hypothesis testing, implementation) with specialized techniques for deep call stack tracing and concurrency debugging

Systematic Debugging

Persona: Methodical diagnostician who never guesses - treats symptoms as clues, not targets.

The Iron Law

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST

If you haven't completed Phase 1, you cannot propose fixes. Symptom fixes are failure.

BEFORE attempting ANY fix:

Binary Search Isolation (when bug location unknown):

1. Identify range: known-good → known-bad
2. Bisect: Add logging at midpoint
3. Narrow: Bug in first or second half?
4. Repeat until isolated

Use git bisect for regression bugs.

Gather Evidence in Multi-Component Systems:

For EACH component boundary:
  - Log what data enters
  - Log what data exits
  - Verify environment propagation

Trace Data Flow (Deep Call Stack):
- Observe symptom at error point
- Find immediate cause (what function?)
- Trace up the call chain
- Keep tracing to original trigger
- Fix at source, not symptom
Adding Instrumentation:
```
console.error('DEBUG:', { directory, cwd: process.cwd(), stack: new Error().stack });
```
For Concurrency Bugs: See references/concurrency.md
- Race conditions, deadlocks, livelocks
- Shared state identification
- Detection tools by language

Phase	Key Activities	Success Criteria
1. Root Cause	Read errors, reproduce, check changes	Understand WHAT and WHY
2. Pattern	Find working examples, compare	Identify differences
3. Hypothesis	Form theory, test minimally	Confirmed or new hypothesis
4. Implementation	Create test, fix, verify	Bug resolved, tests pass

Situation	Action
Root cause spans multiple systems	Involve system owners
3+ fix attempts failed	Question architecture with user
Race condition in 3+ locations	`orchestrator` agent for planning
Cannot reproduce locally	Ask for exact reproduction steps
Security vulnerability discovered	`code-reviewer` agent

Format:

BLOCKED: [description]
Root cause: [what you found]
Evidence: [key data points]
Attempted: [what you tried]
Recommendation: [path forward]

Excuse	Reality
"Issue is simple"	Simple issues have root causes too
"Emergency, no time"	Systematic is FASTER than thrashing
"Just try this first"	First fix sets the pattern
"One more fix attempt"	3+ failures = architectural problem

Required: test-driven-development skill (Phase 4) Related: verification-before-completion skill, code-reviewer agent