name	intelligent-debugger
description	Comprehensive systematic debugging agent for hunting down complex bugs, root causes, and error propagation across full stack (frontend, backend, APIs, databases). Use when diagnosing errors, troubleshooting issues, analyzing stack traces, investigating bugs, examining logs, debugging performance problems, or tracking down error causes. Also use when users mention debugging, errors, bugs, failures, crashes, exceptions, or unexpected behavior.
license	MIT

Intelligent Debugger

Overview

The Intelligent Debugger is a systematic debugging agent for diagnosing and resolving complex software issues across the entire technology stack. It applies the scientific method, forensic analysis of logs and stack traces, and structured investigation to identify root causes and provide actionable solutions.

Core Debugging Philosophy

Systematic Investigation Over Random Fixes

Never guess or make random changes
Follow the scientific method: observe, hypothesize, test, conclude
Document every finding and decision
Root cause analysis is always the goal

Cross-Stack Awareness

Track error propagation from frontend to backend to database
Understand how failures cascade through system boundaries
Identify the original source, not just symptoms

Collaborative Problem Solving

Ask clarifying questions before diving in
Explain findings in clear, actionable terms
Provide learning opportunities during debugging

7-Step Systematic Debugging Process

Step 1: Reproduce the Issue

Goal: Establish consistent reproducibility

Actions:

Gather reproduction details from user:
- Exact steps to trigger the bug
- Environment details (OS, browser, versions)
- Frequency (always, intermittent, specific conditions)
- When did it start occurring?
- Any recent changes to the system?
Attempt to reproduce:
- Follow exact steps provided
- Try variations to understand scope
- Note any environmental factors

Document reproduction criteria:

BUG REPRODUCTION REPORT:
- Trigger: [exact steps]
- Environment: [OS/browser/version]
- Frequency: [always/intermittent/conditional]
- Prerequisites: [required state/data]

If not reproducible:

Collect more environmental details
Check for Heisenbugs (bugs that disappear when observed)
Look for race conditions or timing issues
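For intermittent bugs, it can help to put a number on the "Frequency" field before investigating further. The sketch below is a minimal illustration, not part of the skill's scripts: `measure_failure_rate` and `flaky_operation` are hypothetical names, and the flaky operation simply simulates a failure on every third call.

```python
from typing import Callable


def measure_failure_rate(action: Callable[[], None], attempts: int = 50) -> float:
    """Run `action` repeatedly and return the fraction of runs that raised."""
    failures = 0
    for _ in range(attempts):
        try:
            action()
        except Exception:
            failures += 1
    return failures / attempts


# Hypothetical stand-in for the real reproduction steps:
# it fails on every third call to simulate an intermittent bug.
_calls = {"count": 0}


def flaky_operation() -> None:
    _calls["count"] += 1
    if _calls["count"] % 3 == 0:
        raise RuntimeError("simulated intermittent failure")


rate = measure_failure_rate(flaky_operation, attempts=30)
print(f"Failure rate: {rate:.0%}")
```

A measured rate also gives a baseline for Step 6: after the fix, the same loop should report zero failures.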

Step 2: Gather Information

Goal: Collect comprehensive diagnostic data

Essential Information:

Error Messages & Stack Traces
- Full error text (no truncation)
- Complete stack trace with line numbers
- Error codes and HTTP status codes
- Console output
System Context
- Application logs (use scripts/log_analyzer.py)
- Server logs
- Database logs
- Browser console (for frontend)
- Network tab (for API issues)
Environment Details
- Framework versions
- Dependencies and package versions
- Configuration files
- Environment variables (redact secrets)
Recent Changes
- Recent commits (git log)
- Recent deployments
- Configuration changes
- Dependency updates

Information Gathering Questions:

1. What error message(s) do you see? (exact text)
2. Where does the error occur? (file/line/function)
3. What were you trying to do when it failed?
4. What happened vs. what did you expect?
5. Does this work anywhere else? (different env/browser)
6. What changed recently? (code/config/deployment)
7. Can you share relevant logs?

Step 3: Understand the System

Goal: Build mental model before debugging

System Mapping:

Trace the Request Flow
- User action → Frontend → API → Backend → Database
- Identify all touch points
- Map data transformations

Component Inventory

Frontend: [framework/libraries/version]
Backend: [framework/language/version]
Database: [type/version]
APIs: [external services]
Infrastructure: [hosting/containers]

Key Questions
- What components are involved?
- How do they communicate?
- What dependencies exist?
- Where is data stored/cached?
- What authentication/authorization is used?

Create System Diagram:

[User] → [Browser] → [Load Balancer] → [API Server]
                                           ↓
                                      [Database]
                                           ↓
                                      [Cache Layer]

Step 4: Form and Test Hypotheses

Goal: Systematically narrow down the cause

Hypothesis Formation:

Based on symptoms and gathered data, list possible causes
Rank by likelihood (most probable first)
Consider multiple categories:
- Logic errors
- Data issues
- Configuration problems
- Integration failures
- Performance bottlenecks
- Race conditions
- Security/permissions

Hypothesis Testing:

FOR EACH HYPOTHESIS:
1. State hypothesis clearly: "Bug is caused by [X]"
2. Predict what evidence would support/refute it
3. Design minimal test to validate
4. Execute test
5. Observe results
6. Update hypothesis based on findings

EXAMPLE:
Hypothesis: Database query timeout causing 500 error
Test: Check database logs for slow queries
Result: Found query taking 30+ seconds
Conclusion: Hypothesis confirmed → optimize query

Isolation Techniques:

Binary Search: Comment out half the code, test, repeat
Rubber Duck Debugging: Explain code line-by-line
Minimal Reproduction: Create smallest example that shows bug
Add Logging: Insert strategic debug statements
Use Debugger: Set breakpoints, inspect variables
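The binary search technique can be sketched as follows. This is a simplified illustration under stated assumptions: the changes are ordered oldest to newest, the last one is known to be bad, and once the bug appears it stays present. `find_first_bad`, `commits`, and `is_bad` are hypothetical names introduced for the example.

```python
from typing import Callable, List, Sequence


def find_first_bad(changes: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest change for which `is_bad` is true.

    Assumes `changes` is ordered oldest to newest, the last change is bad,
    and every change after the first bad one is also bad.
    """
    low, high = 0, len(changes) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(changes[mid]):
            high = mid        # bug already present: look earlier
        else:
            low = mid + 1     # bug not present yet: look later
    return changes[low]


# Hypothetical history in which the bug was introduced in "c6".
commits = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
tested: List[str] = []


def is_bad(commit: str) -> bool:
    tested.append(commit)
    return commits.index(commit) >= commits.index("c6")


culprit = find_first_bad(commits, is_bad)
print(f"First bad change: {culprit} (found after {len(tested)} tests)")
```

Each test halves the remaining search space, so eight candidates need only three tests. `git bisect` applies the same idea to commit history.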

Step 5: Implement Solution

Goal: Fix root cause, not symptoms

Solution Development:

Understand Root Cause
- Why did the bug occur?
- What was the flawed assumption?
- How did it pass initial testing?
Design Fix
- Address root cause directly
- Consider edge cases
- Ensure no side effects
- Check performance impact

Implementation Checklist

[ ] Fix addresses root cause, not symptom
[ ] Code follows project standards
[ ] No new bugs introduced
[ ] Handles edge cases
[ ] Includes error handling
[ ] Maintains performance
[ ] Documented/commented

Step 6: Test and Verify

Goal: Confirm fix works and doesn't break anything

Verification Steps:

Reproduce Original Bug
- Confirm bug still occurs in unfixed code
- Document current behavior
Apply Fix and Test
- Bug should no longer occur
- Original functionality preserved
- Edge cases handled
Regression Testing
- Run existing test suite
- Test related functionality
- Check for side effects
Performance Validation
- Run scripts/performance_check.py
- Compare before/after metrics
- Ensure no degradation

Testing Matrix:

Scenario          | Before Fix | After Fix | Status
------------------|------------|-----------|-------
Original bug      | Fails      | Passes    | ✅
Edge case 1       | ?          | Passes    | ✅
Edge case 2       | ?          | Passes    | ✅
Related feature A | Passes     | Passes    | ✅
Performance       | Baseline   | +5%       | ✅
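The "Original bug" and "Related feature" rows of the matrix can be pinned down as regression tests so the bug cannot silently return. The sketch below assumes a hypothetical `calculate_total` function whose fix treats a missing quantity as 1; the function and test names are illustrative only.

```python
def calculate_total(items):
    """Hypothetical fixed function: a missing 'quantity' is treated as 1."""
    return sum(item["price"] * item.get("quantity", 1) for item in items)


def test_missing_quantity_defaults_to_one():
    # Pins the original bug: an item without 'quantity' used to raise KeyError.
    assert calculate_total([{"price": 5.0}]) == 5.0


def test_existing_behaviour_is_preserved():
    # Guards the path that already worked before the fix.
    assert calculate_total([{"price": 2.0, "quantity": 3}]) == 6.0


def test_empty_order_is_an_edge_case():
    # Edge case: no items at all.
    assert calculate_total([]) == 0
```

Functions named `test_*` like these are collected automatically by common Python test runners, so they can be added to the existing suite and run as part of regression testing.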

Step 7: Document and Learn

Goal: Capture knowledge for future debugging

Debugging Summary Template:

DEBUGGING POSTMORTEM:
Date: [DATE]
Bug: [BRIEF DESCRIPTION]
Severity: [Critical/High/Medium/Low]

SYMPTOMS:
- [Error message or behavior]
- [Where it manifested]

ROOT CAUSE:
[Fundamental issue that caused the bug]

INVESTIGATION PATH:
1. [What we tried first]
2. [What led us to the answer]
3. [Key insight that solved it]

FIX:
[Description of solution]
File: [PATH]
Changes: [SUMMARY]

PREVENTION:
[ ] Added test case
[ ] Updated documentation
[ ] Added monitoring/alerting
[ ] Code review process updated
[ ] Linter rule added

LESSONS LEARNED:
- [What we learned]
- [How to prevent similar bugs]

Debugging Techniques by Category

Frontend Debugging

Browser Developer Tools:

Console Tab
- Check for JavaScript errors
- Look for warnings
- Examine console.log output
Network Tab
- Inspect API calls
- Check request/response headers
- Verify payload data
- Check status codes
- Look for failed requests
Elements Tab
- Inspect DOM structure
- Check CSS styles
- Look for layout issues
- Verify element visibility
Application Tab
- Check localStorage/sessionStorage
- Inspect cookies
- Review service workers
- Check cache

Common Frontend Issues:

// Issue: Variable is undefined
// Debug: Check if element exists before accessing
const element = document.getElementById('myId');
if (element) {
  element.textContent = 'Updated';
}

// Issue: Async timing problem
// Debug: Use proper async/await
async function fetchData() {
  try {
    const response = await fetch('/api/data');
    const data = await response.json();
    return data;
  } catch (error) {
    console.error('Fetch failed:', error);
  }
}

// Issue: State not updating
// Debug: Check immutability in React/Vue
// BAD: state.items.push(newItem)
// GOOD: setState({ items: [...state.items, newItem] })

Backend Debugging

Logging Strategy:

import logging

# Configure proper logging levels
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s [%(levelname)s] %(name)s: %(message)s'
)

def process_request(data):
    logging.info(f"Processing request: {data['id']}")
    try:
        result = perform_operation(data)
        logging.debug(f"Operation result: {result}")
        return result
    except Exception as e:
        logging.error(f"Operation failed: {e}", exc_info=True)
        raise

API Debugging:

Test endpoints with curl/Postman
Check request headers
Verify authentication tokens
Inspect request/response bodies
Check HTTP status codes
Review API logs

Common Backend Issues:

HTTP 400: Bad Request → Validate input data
HTTP 401: Unauthorized → Check auth tokens
HTTP 403: Forbidden → Verify permissions
HTTP 404: Not Found → Check route/resource exists
HTTP 500: Server Error → Check server logs
HTTP 503: Service Unavailable → Check dependencies

Database Debugging

Query Analysis:

-- Enable query logging
SET log_statement = 'all';

-- Explain query performance
EXPLAIN ANALYZE
SELECT * FROM users
WHERE created_at > '2025-01-01'
AND status = 'active';

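-- Check for slow queries (requires the pg_stat_statements extension)
-- PostgreSQL 13+ names these columns total_exec_time/mean_exec_time;
-- older versions use total_time/mean_time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;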

-- Check for locking issues
SELECT * FROM pg_locks
WHERE NOT granted;

Connection Issues:

# Debug database connectivity
import psycopg2

try:
    connection = psycopg2.connect(
        host="localhost",
        database="mydb",
        user="myuser",
        password="mypass",
        connect_timeout=5
    )
    print("✅ Database connected")
except psycopg2.OperationalError as e:
    print(f"❌ Connection failed: {e}")
    # Check: host reachable, credentials, firewall, DB running

Performance Debugging

Profiling:

# Use cProfile for performance analysis
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()

# Code to profile
expensive_function()

profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)  # Top 10 entries by cumulative time

Memory Debugging:

# Track memory usage
import tracemalloc

tracemalloc.start()

# Code to analyze
process_large_dataset()

current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024 / 1024:.2f} MB")
print(f"Peak: {peak / 1024 / 1024:.2f} MB")
tracemalloc.stop()
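Peak numbers show that memory grew, but not where. One possible refinement is to compare two `tracemalloc` snapshots and list the source lines responsible for the growth. In this sketch the list comprehension is only a stand-in for the code under suspicion.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for the code under suspicion: keeps many small objects alive.
retained = [str(i) for i in range(100_000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Differences between the two snapshots, grouped by source line.
top_growth = after.compare_to(before, "lineno")
for stat in top_growth[:3]:
    print(stat)
```

Each printed entry names a file and line number together with the size and allocation-count difference, which points directly at the code that is holding on to memory.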

Integration Debugging

API Integration Issues:

Network Problems
- Use curl to test endpoint
- Check DNS resolution
- Verify SSL certificates
- Test with timeout
Authentication Failures
- Verify API keys/tokens
- Check expiration dates
- Test in API documentation/Postman
- Review authentication headers
Data Format Mismatches
- Compare expected vs actual payload
- Check content-type headers
- Validate JSON schema
- Test with sample data
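Comparing expected and actual payloads can be partly automated. The following is a minimal sketch, assuming a hypothetical contract (`EXPECTED_FIELDS`) for an order payload; a real integration would more likely validate against the provider's published schema.

```python
from typing import Any, Dict, List

# Hypothetical contract for a payload: field name -> expected Python type.
EXPECTED_FIELDS: Dict[str, type] = {"id": int, "email": str, "items": list}


def find_payload_mismatches(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable differences between a payload and the contract."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    # Fields the contract does not know about are reported as well.
    for field in sorted(payload.keys() - EXPECTED_FIELDS.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


mismatches = find_payload_mismatches({"id": "42", "email": "a@example.com", "extra": True})
for problem in mismatches:
    print(problem)
```

Running the check on both the request you send and the response you receive narrows down which side of the integration deviates from the contract.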

Error Pattern Recognition

Stack Trace Analysis

Reading Stack Traces:

Example Stack Trace:
Traceback (most recent call last):
  File "app.py", line 45, in process_order
    result = calculate_total(items)
  File "utils.py", line 23, in calculate_total
    price = item['price'] * item['quantity']
KeyError: 'quantity'

ANALYSIS:
1. Error Type: KeyError - missing dictionary key
2. Root Location: utils.py, line 23
3. Call Chain: app.py (45) → utils.py (23)
4. Problem: item dict missing 'quantity' key
5. Solution: Validate item structure or use item.get('quantity', 1)
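The same analysis can be done programmatically with Python's standard `traceback` module. The sketch below recreates the error from the example in a single file, so the function bodies are simplified assumptions rather than the real `app.py` and `utils.py`.

```python
import traceback


def calculate_total(items):
    total = 0
    for item in items:
        total += item["price"] * item["quantity"]  # raises KeyError if a key is missing
    return total


def process_order(items):
    return calculate_total(items)


try:
    process_order([{"price": 9.99}])
except KeyError as exc:
    frames = traceback.extract_tb(exc.__traceback__)
    call_chain = [frame.name for frame in frames]  # outermost call first
    error_type = type(exc).__name__
    missing_key = exc.args[0]
    print(f"{error_type}: missing key {missing_key!r}")
    print("Call chain:", " -> ".join(call_chain))
```

The last entry of the call chain is where the exception was raised; the earlier entries show how execution got there.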

Common Error Patterns:

Error Type           | Likely Cause             | Investigation          | Solution
---------------------|--------------------------|------------------------|-----------------------------
NullPointerException | Variable not initialized | Check assignment       | Initialize properly
IndexOutOfBounds     | Array access beyond size | Verify array length    | Add bounds check
TimeoutError         | Operation too slow       | Profile performance    | Optimize or increase timeout
MemoryError          | Memory exhausted         | Check object sizes     | Optimize memory usage
ConnectionRefused    | Service not running      | Check service status   | Start service/fix network
SyntaxError          | Code typo                | Review syntax          | Fix typo
TypeError            | Wrong data type          | Check type             | Convert/validate type
ImportError          | Missing module           | Check installation     | Install package
PermissionError      | Access denied            | Check file permissions | Update permissions

Cross-Stack Error Propagation

Tracking Errors Across Boundaries:

SCENARIO: 500 Internal Server Error on frontend

INVESTIGATION PATH:
1. Frontend Console: 500 error on POST /api/orders
2. Backend Logs: Exception in order_handler
3. Database Logs: Deadlock detected
4. Root Cause: Concurrent transactions on same order

PROPAGATION CHAIN:
Database Deadlock → Backend Exception → API 500 → Frontend Error

SOLUTION LEVEL: Database (lock rows in a consistent order, keep transactions short, and retry the transaction on deadlock)
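Following one request across layers is easier when every log line carries the same identifier. The sketch below shows one common way to do this with the standard `logging.LoggerAdapter`; the `request_id` field, the `orders` logger name, and `handle_request` are assumptions made for the example, and the in-memory stream stands in for a real log file.

```python
import io
import logging
import uuid

# In-memory stream so the example is self-contained; a real service would log to a file or stderr.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s [%(request_id)s] %(name)s: %(message)s")
)

base_logger = logging.getLogger("orders")
base_logger.setLevel(logging.INFO)
base_logger.addHandler(handler)
base_logger.propagate = False


def handle_request(request_id: str) -> None:
    # The adapter attaches request_id to every record it emits.
    log = logging.LoggerAdapter(base_logger, {"request_id": request_id})
    log.info("POST /api/orders received")
    log.error("deadlock detected while updating order")


request_id = uuid.uuid4().hex[:8]
handle_request(request_id)
log_lines = stream.getvalue().splitlines()
print("\n".join(log_lines))
```

If the frontend sends the same identifier in a request header and the backend passes it along, a single search for that identifier reconstructs the whole propagation chain.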

Debugging Tools & Commands

Essential Tools

Git Bisect (Find bug-introducing commit):

# Start bisect
git bisect start
git bisect bad HEAD
git bisect good v1.2.0

# Test each commit
# Git automatically checks out commits
# Mark as good/bad until bug commit found
git bisect good  # if bug not present
git bisect bad   # if bug present

# Automate with test script
git bisect run ./test_script.sh

# Reset when done
git bisect reset

Debugger Breakpoints:

# Python debugger
import pdb; pdb.set_trace()  # Breakpoint (Python 3.7+ also offers the built-in breakpoint())

# Commands in debugger:
# n (next) - execute next line
# s (step) - step into function
# c (continue) - continue execution
# l (list) - show code context
# p variable - print variable value
# pp variable - pretty print

Node.js Debugging:

// Use debugger statement
function problematicFunction() {
  debugger;  // Pauses here only when a debugger is attached (e.g. node --inspect or node inspect)
  const result = someComputation();
  return result;
}

// Or use console methods
console.log('Value:', value);
console.error('Error:', error);
console.table(arrayOfObjects);  // Nice table view
console.trace();  // Print stack trace

Log Analysis

Use provided log analyzer:

# Analyze log file for errors and patterns
python scripts/log_analyzer.py /path/to/app.log

# Output shows:
# - Error frequency
# - Common error patterns
# - Timeline of issues
# - Anomaly detection
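To illustrate the kind of analysis involved, here is a minimal sketch that counts error messages by frequency. It is not the implementation of scripts/log_analyzer.py; it assumes log lines in the format used by the logging configuration shown earlier ("timestamp [LEVEL] logger: message"), and `count_errors` is a hypothetical name.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> [LEVEL] <logger>: <message>"
LINE_PATTERN = re.compile(
    r"^\S+ \S+ \[(?P<level>[A-Z]+)\] (?P<name>[^:]+): (?P<message>.*)$"
)


def count_errors(lines):
    """Count ERROR and CRITICAL messages, grouped by message text."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match and match.group("level") in {"ERROR", "CRITICAL"}:
            counts[match.group("message")] += 1
    return counts


sample = [
    "2025-01-01 10:00:00,123 [INFO] app: Processing request: 1",
    "2025-01-01 10:00:01,456 [ERROR] app: Operation failed: timeout",
    "2025-01-01 10:00:02,789 [ERROR] app: Operation failed: timeout",
    "2025-01-01 10:00:03,012 [ERROR] db: Connection refused",
]
error_counts = count_errors(sample)
for message, count in error_counts.most_common():
    print(f"{count}x {message}")
```

For a real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.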

Performance Tools

Run performance checks:

# Check performance metrics
python scripts/performance_check.py

# Outputs:
# - Response time analysis
# - Memory usage patterns
# - CPU utilization
# - Bottleneck identification

Debugging Workflow Examples

Example 1: Frontend Not Displaying Data

Problem: Page loads but no data appears

Investigation:

Open browser DevTools → Network tab
Check if API call is made
Inspect API response
Check Console for errors

Findings:

API returns 200 OK
Response body contains data
Console error: "Cannot read property 'map' of undefined"

Root Cause: Component tries to map data before it's loaded

Solution:

// Before (broken)
return data.map(item => <Item key={item.id} {...item} />);

// After (fixed)
return data ? data.map(item => <Item key={item.id} {...item} />) : <Loading />;
// or
return data?.map(item => <Item key={item.id} {...item} />) || <Loading />;

Example 2: API Timeout

Problem: Endpoint times out after 30 seconds

Investigation:

Check backend logs → Shows slow database query
Run EXPLAIN ANALYZE on query
Check database indexes

Findings:

-- Query takes 45 seconds
SELECT * FROM orders
WHERE customer_id = 123
AND created_at > '2024-01-01'
ORDER BY created_at DESC;

-- EXPLAIN shows sequential scan (bad)
-- Missing index on customer_id

Solution:

-- Add compound index
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, created_at);

-- Query now takes 50ms

Example 3: Intermittent Crashes

Problem: Application crashes randomly, no clear pattern

Investigation:

Enable detailed logging
Monitor for several hours
Analyze crash dumps
Look for common factors

Findings:

Crashes correlate with high traffic
Memory usage spikes before crash
Error logs show "Out of Memory"

Root Cause: Memory leak in cache implementation

Solution:

# Before (leaking)
cache = {}
def get_user(user_id):
    if user_id not in cache:
        cache[user_id] = fetch_from_db(user_id)
    return cache[user_id]
# Cache grows indefinitely!

# After (fixed with LRU cache)
from functools import lru_cache

@lru_cache(maxsize=1000)  # Limits cache size
def get_user(user_id):
    return fetch_from_db(user_id)
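One way to confirm that the cache is actually bounded is to inspect `cache_info()`, which `functools.lru_cache` adds to the decorated function. In this sketch `fetch_from_db` is a stand-in that only records which IDs were loaded, and the cache size is reduced to 3 to keep the example short.

```python
from functools import lru_cache

db_reads = []  # records every simulated database read


def fetch_from_db(user_id):
    # Stand-in for a real database call.
    db_reads.append(user_id)
    return {"id": user_id}


@lru_cache(maxsize=3)  # small limit so eviction is visible
def get_user(user_id):
    return fetch_from_db(user_id)


for user_id in [1, 2, 3, 1, 4, 2]:
    get_user(user_id)

info = get_user.cache_info()
print(info)
```

The second lookup of user 1 is served from the cache, while user 2 is evicted to make room for user 4 and has to be read again. The entry count never exceeds `maxsize`, which is what stops the unbounded growth.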

Special Debugging Scenarios

Race Conditions

Symptoms:

Bug occurs sometimes, not always
Different behavior in development vs production
Issues with concurrent operations

Investigation:

# Add logging with thread IDs to track concurrency
import logging
import threading

# order_id is the identifier the surrounding handler is currently processing
logging.info(f"[Thread {threading.current_thread().ident}] Processing order {order_id}")

Common Solutions:

Use proper locking mechanisms
Implement atomic operations
Use database transactions correctly
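As an illustration of the locking approach, the sketch below protects a shared counter with `threading.Lock`. `OrderCounter` and `worker` are hypothetical names; the point is that the read-modify-write inside `record` is serialized by the lock.

```python
import threading


class OrderCounter:
    """Counts processed orders safely across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.processed = 0

    def record(self):
        # `+= 1` is a read-modify-write and is not atomic on its own;
        # the lock ensures only one thread performs it at a time.
        with self._lock:
            self.processed += 1


counter = OrderCounter()


def worker():
    for _ in range(10_000):
        counter.record()


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Processed: {counter.processed}")
```

With the lock in place the total is always 8 × 10,000. Without it, updates from different threads can overwrite each other and the total may come out lower on some runs, which is exactly the "sometimes, not always" symptom listed above.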

Heisenbug (Disappears When Debugging)

Characteristics:

Bug stops when debugger attached
Different behavior with logging enabled
Timing-dependent issues

Investigation Approach:

Use minimal logging (timestamps only)
Add delays/sleep strategically
Check for timing assumptions in code
Look for uninitialized variables
Review async/await patterns

Production-Only Issues

Investigation Strategy:

Cannot Reproduce Locally
- Check environment differences
- Compare configuration files
- Review production-specific data
- Check scaling/load factors
Limited Access to Production
- Use observability tools
- Analyze aggregated logs
- Check monitoring dashboards
- Review recent deployments
Data-Dependent Bugs
- Request sample production data (sanitized)
- Use production data snapshots
- Test with edge case data

Communication During Debugging

User Communication Template

Initial Response:

Thank you for reporting this issue. I'll help you debug it systematically.

CURRENT UNDERSTANDING:
- Issue: [brief description]
- Impact: [severity/who affected]
- Frequency: [how often]

TO INVESTIGATE:
I need to gather some information:
1. [specific question 1]
2. [specific question 2]
3. [specific question 3]

I'll keep you updated as I investigate.

Progress Updates:

DEBUGGING UPDATE:

INVESTIGATED:
✅ Checked [component A]
✅ Reviewed [logs/traces]
✅ Tested [hypothesis 1]

FINDINGS:
- [Key observation 1]
- [Key observation 2]

NEXT STEPS:
- [What I'm checking next]

ETA: [time estimate]

Resolution Report:

ISSUE RESOLVED

ROOT CAUSE:
[Clear explanation of what caused the bug]

FIX APPLIED:
[Description of the solution]

VERIFICATION:
✅ Bug no longer occurs
✅ Original functionality preserved
✅ Tests passing

PREVENTION:
[What we're doing to prevent recurrence]

DEPLOYMENT:
[When/how fix will be deployed]

Best Practices

Do's

✅ Reproduce First - Never attempt a fix before consistent reproduction
✅ Document Everything - Keep notes on what you tried and found
✅ Ask Questions - Clarify unclear requirements or symptoms
✅ Think Before Acting - Form hypothesis before making changes
✅ Test Thoroughly - Verify fix works and doesn't break anything
✅ Use Version Control - Make debugging changes in branches
✅ Keep It Simple - Start with simplest explanation
✅ Add Logging Strategically - Place logs at key decision points
✅ Review Recent Changes - Check what changed before bug appeared
✅ Consider Edge Cases - Think about unusual inputs/states

Don'ts

❌ Don't Guess - Random changes waste time and may hide real issue
❌ Don't Skip Reproduction - You can't verify a fix without it
❌ Don't Change Multiple Things - Isolate what actually fixed the bug
❌ Don't Ignore Error Messages - They're clues, not obstacles
❌ Don't Debug in Production - Use staging/dev environments
❌ Don't Assume - Verify your assumptions
❌ Don't Rush - Systematic debugging is faster than random fixes
❌ Don't Fix Symptoms - Find and fix the root cause
❌ Don't Delete Debugging Code - Comment it out for future use
❌ Don't Work Alone on Hard Bugs - Get fresh perspective

Debugging Mindset

Cultivate These Attitudes:

Curiosity: "Why does this happen?"
Patience: Complex bugs take time
Skepticism: Question assumptions
Thoroughness: Check every angle
Humility: Ask for help when stuck

Avoid These Traps:

"It worked on my machine" - investigate environment differences
"This should work" - what the code should do matters less than what it actually does; verify the behavior
"I'll try random things" - systematic > random
"It's probably X" - don't commit to hypothesis too early

Quick Reference Commands

Diagnostic Commands

# System information
uname -a                    # OS information
df -h                       # Disk space
free -m                     # Memory usage
top                         # Process monitoring
htop                        # Better process monitoring

# Network debugging
ping example.com            # Test connectivity
curl -v https://api.com     # Test HTTP endpoint
netstat -tuln               # Check listening ports
traceroute example.com      # Trace network path
nslookup domain.com         # DNS lookup

# Log inspection
tail -f /var/log/app.log    # Follow log in real-time
grep "ERROR" app.log        # Find errors
grep -A 5 "error" log       # Show 5 lines after match
awk '{print $1, $5}' log    # Extract specific columns

# Process debugging
ps aux | grep app           # Find process
lsof -p PID                 # List open files for process
strace -p PID               # Trace system calls
kill -SIGTERM PID           # Gracefully stop process

# Git debugging
git log --oneline -10       # Recent commits
git diff HEAD~1 HEAD        # Show last changes
git blame file.py           # See who changed what
git log -S "function_name"  # Find when code was added/removed

Resources

Scripts

Run helper scripts for automated debugging:

scripts/log_analyzer.py - Analyze log files for patterns
scripts/performance_check.py - Check performance metrics
scripts/stack_trace_parser.py - Parse and explain stack traces
scripts/error_frequency.py - Count error occurrences
scripts/dependency_checker.py - Verify dependencies

References

For deep dive debugging knowledge:

references/DEBUGGING_PATTERNS.md - Common bug patterns and solutions
references/ERROR_CODES.md - Complete error code reference
references/PERFORMANCE_GUIDE.md - Performance debugging guide
references/DATABASE_DEBUGGING.md - Database-specific debugging
references/FRONTEND_DEBUGGING.md - Frontend debugging techniques

Debugging Decision Tree

START: Bug Reported
  │
  ├─→ Can you reproduce it?
  │    NO → Gather more info, check environment
  │    YES ↓
  │
  ├─→ Do you have logs/stack trace?
  │    NO → Enable logging, reproduce again
  │    YES ↓
  │
  ├─→ Is error message clear?
  │    YES → Research error, form hypothesis
  │    NO → Add more logging at failure point
  │
  ├─→ Test hypothesis
  │    CONFIRMED → Implement fix
  │    REJECTED → Form new hypothesis
  │
  ├─→ Fix implemented
  │    │
  │    ├→ Does it work?
  │    │   NO → Debug the fix
  │    │   YES ↓
  │    │
  │    └→ Any side effects?
  │        YES → Refine fix
  │        NO → Test & Deploy
  │
  └─→ END: Document findings

Summary

The Intelligent Debugger skill provides a comprehensive, systematic approach to debugging across the full technology stack. By following the 7-step process, using appropriate tools and techniques, and maintaining a methodical mindset, you can efficiently identify root causes and implement reliable solutions.

Remember:

Be Systematic - Follow the process, don't skip steps
Think First - Understand before fixing
Document Everything - Knowledge compounds
Ask Questions - Clarification prevents wasted effort
Test Thoroughly - Verify the fix works

Use this skill whenever you encounter bugs, errors, unexpected behavior, or need to investigate issues across frontend, backend, APIs, or databases. The systematic approach will save time and lead to better, more permanent solutions.
