| name | ms-essentials-debug |
| description | Advanced debugging skill for runtime errors with systematic stack trace analysis, error pattern recognition (NoneType, undefined, async issues, memory leaks, performance bottlenecks), root cause identification using 5-Whys methodology, memory profiling with tracemalloc, N+1 query detection, and test-first debugging workflow that follows TDD RED-GREEN-REFACTOR cycle to provide actionable fix suggestions |
MS Essentials Debug v1.0
Skill Metadata
| Field | Value |
|---|---|
| Skill Name | ms-essentials-debug |
| Version | 1.0.0 |
| Created | 2025-10-26 |
| Last Updated | 2025-10-26 |
| Language Coverage | Python, TypeScript, JavaScript |
| Allowed tools | Read, Write, Edit, Bash, TodoWrite, Grep, Glob |
| Auto-load | On demand during debugging scenarios |
| Trigger cues | Runtime errors, stack traces, "debug this", "why is this failing?" |
What It Does
Comprehensive debugging support for My-Spec projects with:
- Stack trace analysis: Parse and interpret error stack traces
- Error pattern recognition: Identify common error patterns (NoneType, undefined, async issues)
- Debugging steps: Provide systematic debugging workflow
- Root cause identification: Analyze code to find bug origins
- Fix suggestions: Recommend concrete fixes with code examples
- Test-first debugging: Write failing tests to reproduce bugs (TDD RED step)
Key capabilities:
- ✅ Python error analysis (TypeError, AttributeError, ImportError, etc.)
- ✅ TypeScript/JavaScript error analysis (undefined, null reference, promise rejection)
- ✅ Async/await debugging (unhandled promise rejection, race conditions)
- ✅ Test failure analysis (assertion errors, test isolation issues)
- ✅ Integration with TRUST 5 principles (Test-First debugging)
When to Use
Automatic triggers:
- Runtime errors, exceptions, crashes
- Stack trace analysis requests
- Test failures
- "Why is this failing?", "Debug this error", "Fix this bug"
Manual invocation:
- When encountering unexplained behavior
- When tests are failing mysteriously
- When error messages are unclear
- When debugging async/promise issues
- During code reviews with bug reports
Common scenarios:
- NoneType/undefined errors: Variable is None/undefined when it shouldn't be
- Async errors: Promise rejection, race conditions, callback hell
- Import/module errors: Missing dependencies, circular imports
- Test failures: Assertion mismatches, test isolation issues
- Type errors: Type mismatches in TypeScript/Python type hints
How It Works
1. Error Analysis Phase
Collect information:
- Full error message and stack trace
- Code context (the failing function/module)
- Input data or test case that triggered the error
- Recent changes (git diff)
Analyze stack trace:
# Example Python stack trace analysis
Traceback (most recent call last):
  File "src/auth/service.py", line 45, in authenticate
    user = self.user_repo.find_by_email(email)
  File "src/auth/repository.py", line 23, in find_by_email
    return self.db.query(User).filter(User.email == email).first()
AttributeError: 'NoneType' object has no attribute 'query'
Diagnosis:
- Root cause: self.db is None (database connection not initialized)
- Location: src/auth/repository.py:23
- Impact: Authentication flow completely broken
- Severity: CRITICAL (P0)
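The location part of a diagnosis like this can be pulled out of a live exception with the standard traceback module. The following is a minimal sketch, not the Skill's actual implementation; Repository and innermost_frame are hypothetical names introduced only for illustration.

```python
import traceback


class Repository:
    """Hypothetical stand-in for the repository in the trace above."""

    def __init__(self, db=None):
        self.db = db  # left as None to reproduce the failure

    def find_by_email(self, email):
        # Raises AttributeError when self.db is None
        return self.db.query(email)


def innermost_frame(exc: BaseException) -> dict:
    """Describe the deepest frame of the traceback, where the error was raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1]  # entries are ordered from outermost to innermost call
    return {
        "function": last.name,
        "line": last.lineno,
        "error": f"{type(exc).__name__}: {exc}",
    }


try:
    Repository().find_by_email("test@example.com")
except AttributeError as exc:
    report = innermost_frame(exc)
    print(report)
```

Reading the innermost frame first usually points at the symptom (here, the call on a None attribute); the outer frames then show which caller failed to supply the dependency.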
2. Root Cause Investigation
Ask 5 Whys:
- Why is self.db None? → Database wasn't injected
- Why wasn't it injected? → Constructor parameter missing
- Why is constructor parameter missing? → Dependency injection misconfigured
- Why is DI misconfigured? → Missing provider in container
- Why is provider missing? → New developer didn't update DI config
Check common causes:
- ✅ Uninitialized variables
- ✅ Missing null checks
- ✅ Incorrect dependency injection
- ✅ Missing configuration
- ✅ Race conditions in async code
- ✅ Test isolation issues (shared state)
3. Fix Strategy Development
Test-First Debugging (RED → GREEN → REFACTOR):
# Step 1: RED - Write a failing test that reproduces the bug
def test_authenticate_with_valid_credentials():
    # Given: user exists in database
    user_repo = UserRepository(db_connection)
    auth_service = AuthService(user_repo)
    # When: authenticate with valid credentials
    result = auth_service.authenticate("test@example.com", "password123")
    # Then: authentication succeeds
    assert result.is_authenticated
    # ^ This test fails with AttributeError: 'NoneType' object has no attribute 'query'

# Step 2: GREEN - Implement the minimum fix
from typing import Optional

class UserRepository:
    def __init__(self, db_connection):
        if db_connection is None:
            raise ValueError("db_connection is required")
        self.db = db_connection  # Fix: ensure db is set

    def find_by_email(self, email: str) -> Optional[User]:
        if self.db is None:  # Defensive check
            raise RuntimeError("Database connection not initialized")
        return self.db.query(User).filter(User.email == email).first()

# Step 3: REFACTOR - Improve design
class UserRepository:
    def __init__(self, db_connection: DatabaseConnection):
        """Inject database connection

        Args:
            db_connection: Must be a valid DatabaseConnection instance
        Raises:
            ValueError: If db_connection is None
        """
        if db_connection is None:
            raise ValueError("db_connection is required and cannot be None")
        self.db: DatabaseConnection = db_connection
4. Verification Steps
- Run the failing test → Confirm it reproduces the bug (RED)
- Apply the fix → Implement the solution
- Run the test again → Verify it passes (GREEN)
- Run full test suite → Ensure no regressions
- Manual testing → Verify in actual application
- Code review → Check fix quality against TRUST 5
Failure Modes
When debugging fails or is blocked:
Missing context: Stack trace incomplete or error message truncated
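- Solution: Capture full stack traces. Python prints the complete traceback by default, so check that logging is not truncating it; PYTHONFAULTHANDLER=1 additionally dumps tracebacks on hard crashes. For Node.js, set NODE_OPTIONS=--trace-warnings.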
Intermittent bugs: Error doesn't reproduce consistently
- Solution: Add logging, use debugger with breakpoints, check for race conditions
Test isolation issues: Test passes alone but fails in suite
- Solution: Check for shared state, database cleanup between tests
Production-only bugs: Can't reproduce in development
- Solution: Check environment differences, enable production logging, use feature flags
Debugger not available: No debugger tools installed
- Solution: Fall back to print debugging, logging, or install language-specific debugger
Best Practices
✅ DO:
Write a failing test first (RED step)
- Reproduces the bug reliably
- Documents expected behavior
- Prevents regression
Use the debugger before making changes
- Set breakpoints at error location
- Inspect variable values
- Step through execution
Check the obvious first:
- Is the variable initialized?
- Is the function called correctly?
- Are types correct?
- Is configuration missing?
Document the fix in code comments
- Explain WHY the bug occurred
- Reference the issue/ticket number
- Add TODO if partial fix
Follow TRUST 5 principles:
- Test-First: Write test before fix
- Readable: Clear variable names, comments
- Unified: Type hints, consistent patterns
- Secured: Check for security implications
- Trackable: Update TAG chains if needed
❌ DON'T:
Don't guess and check randomly
- Understand the root cause first
- Use systematic debugging approach
Don't fix without a test
- How will you know it's fixed?
- How will you prevent regression?
Don't ignore error messages
- Error messages contain valuable clues
- Stack traces show execution path
Don't commit commented-out code
- Remove debug print statements
- Remove temporary workarounds
- Clean up before committing
Don't skip the REFACTOR step
- Fix works but code is messy? Refactor it
- Technical debt accumulates quickly
Examples
Example 1: Python NoneType Error
Error:
AttributeError: 'NoneType' object has no attribute 'email'
Debugging steps:
- Identify where the None value originated
- Check function return types
- Add null checks or Optional types
- Write test for None case
Fix:
# Before (no null check)
def get_user_email(user):
    return user.email  # Crashes if user is None

# After (defensive check)
from typing import Optional

def get_user_email(user: Optional[User]) -> str:
    if user is None:
        raise ValueError("User cannot be None")
    return user.email
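The last debugging step, writing a test for the None case, might look like the sketch below. The User dataclass is a hypothetical minimal stand-in, since the real model is not shown here, and the tests use bare assertions so the snippet runs without a test framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical minimal model used only for this example."""
    email: str


def get_user_email(user: Optional[User]) -> str:
    if user is None:
        raise ValueError("User cannot be None")
    return user.email


def test_get_user_email_rejects_none() -> None:
    # RED before the fix: the unguarded version raised AttributeError instead
    try:
        get_user_email(None)
    except ValueError as exc:
        assert "cannot be None" in str(exc)
    else:
        raise AssertionError("expected ValueError for a None user")


def test_get_user_email_returns_email() -> None:
    assert get_user_email(User(email="test@example.com")) == "test@example.com"


test_get_user_email_rejects_none()
test_get_user_email_returns_email()
print("both tests passed")
```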
Example 2: JavaScript undefined Reference
Error:
TypeError: Cannot read property 'name' of undefined
Debugging steps:
- Check where the object comes from
- Verify API response structure
- Add optional chaining or null checks
- Write test for undefined case
Fix:
// Before (no null check)
function getUserName(user) {
  return user.name; // Crashes if user is undefined
}

// After (optional chaining)
function getUserName(user?: User): string {
  return user?.name ?? "Anonymous";
}
Example 3: Async/Promise Rejection
Error:
UnhandledPromiseRejectionWarning: Error: Connection timeout
Debugging steps:
- Add .catch() to promise chain
- Use try-catch with async/await
- Check network/timeout configuration
- Write test for error case
Fix:
// Before (unhandled rejection)
async function fetchData(url: string) {
  const response = await fetch(url); // May throw
  return response.json();
}

// After (proper error handling)
async function fetchData(url: string): Promise<Data> {
  try {
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
    return response.json();
  } catch (error) {
    // `error` is typed `unknown` under strict TypeScript, so narrow it before reading .message
    const message = error instanceof Error ? error.message : String(error);
    console.error(`Failed to fetch ${url}:`, error);
    throw new Error(`Data fetch failed: ${message}`);
  }
}
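The same pattern applies on the Python side of a project. The sketch below is an asyncio analogue under stated assumptions: flaky_fetch is a hypothetical coroutine standing in for a network call, and the timeout is enforced with asyncio.wait_for.

```python
import asyncio


async def flaky_fetch(delay: float) -> str:
    """Hypothetical stand-in for a network call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "payload"


async def fetch_with_timeout(delay: float, timeout: float) -> str:
    """Await the call, converting a timeout into a descriptive error."""
    try:
        return await asyncio.wait_for(flaky_fetch(delay), timeout=timeout)
    except asyncio.TimeoutError as exc:
        # Chain the original exception so the stack trace keeps the root cause
        raise RuntimeError(f"Data fetch failed: no response within {timeout}s") from exc


print(asyncio.run(fetch_with_timeout(delay=0.01, timeout=1.0)))

try:
    asyncio.run(fetch_with_timeout(delay=0.5, timeout=0.01))
except RuntimeError as err:
    print(f"{err} (caused by {type(err.__cause__).__name__})")
```

Raising with `from exc` keeps the original timeout in the traceback, which matters later when someone has to debug the wrapped error.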
Example 4: Memory Leak Debugging (Python)
Symptom: Memory usage grows unbounded over time (e.g., 100MB → 2GB in production)
Debugging steps:
- Use tracemalloc to identify the leak source
- Check for circular references
- Profile with the memory_profiler decorator
- Verify cleanup in __del__ or context managers
Investigation with tracemalloc:
import tracemalloc
# Start tracing
tracemalloc.start()
# Run your code
run_application()
# Get top memory consumers
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("[ Top 10 memory consumers ]")
for stat in top_stats[:10]:
    print(stat)
Common memory leak patterns:
# ❌ LEAK: Class variable accumulates instances
class Cache:
    _instances = []  # Never cleared!

    def __init__(self, data):
        self.data = data
        self._instances.append(self)  # Memory leak

# ✅ FIX: Use weak references
import weakref

class Cache:
    _instances = weakref.WeakSet()  # Auto-cleanup when no strong refs

    def __init__(self, data):
        self.data = data
        self._instances.add(self)

# ❌ LEAK: Circular reference
class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self  # Circular reference: parent ↔ child
        self.children.append(child)

# ✅ FIX: Use weak references for back-pointers
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.parent = None  # Holds a weakref.ref once attached; call child.parent() to get the node
        self.children = []

    def add_child(self, child):
        child.parent = weakref.ref(self)  # Weak reference
        self.children.append(child)

# ❌ LEAK: Global cache never expires
_cache = {}

def get_data(key):
    if key not in _cache:
        _cache[key] = expensive_operation(key)  # Grows indefinitely
    return _cache[key]

# ✅ FIX: Use LRU cache with size limit
from functools import lru_cache

@lru_cache(maxsize=128)  # Evicts the least recently used entry once full
def get_data(key):
    return expensive_operation(key)
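To check that the weak-reference fixes actually release memory, one can drop the last strong reference and observe the result. This is a minimal sketch: TrackedCache and TreeNode are hypothetical variants of the classes above, and TreeNode wraps the weak back-pointer in a property so callers do not have to dereference it themselves.

```python
import gc
import weakref


class TrackedCache:
    """Hypothetical class mirroring the WeakSet fix."""
    _instances = weakref.WeakSet()

    def __init__(self, data):
        self.data = data
        self._instances.add(self)


class TreeNode:
    """Hypothetical node whose parent link is a weak reference."""

    def __init__(self, value):
        self.value = value
        self._parent = None  # weakref.ref to the parent, or None
        self.children = []

    @property
    def parent(self):
        # Dereference the weakref; returns None if the parent was collected
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)
        self.children.append(child)


cache = TrackedCache("payload")
tracked_while_alive = len(TrackedCache._instances)
del cache
gc.collect()  # not strictly needed on CPython, but makes the intent explicit
tracked_after_delete = len(TrackedCache._instances)

root = TreeNode("root")
leaf = TreeNode("leaf")
root.add_child(leaf)
parent_while_alive = leaf.parent is root
del root
gc.collect()
parent_after_delete = leaf.parent

print(tracked_while_alive, tracked_after_delete, parent_while_alive, parent_after_delete)
```

The trade-off of a weak back-pointer is that a child can outlive its parent, so code reading the parent has to handle None.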
Memory profiling decorator:
from memory_profiler import profile

@profile
def memory_intensive_function():
    data = [i for i in range(1000000)]  # ~8 MB for the list itself, plus the int objects it references
    return sum(data)

# Run with: python -m memory_profiler script.py
Example 5: N+1 Query Problem (Performance Debugging)
Symptom: API endpoint takes >5 seconds to respond (should be <500ms)
Debugging steps:
- Use console.time() / time.time() to measure sections
- Check database query logs for repeated queries
- Use ORM query debugging (SQLAlchemy echo=True, Django Debug Toolbar)
- Profile with cProfile (Python) or Chrome DevTools (TypeScript)
Identifying N+1 queries (Python/SQLAlchemy):
# ❌ N+1 QUERY PROBLEM (1 + N queries)
def get_users_with_posts():
    users = db.query(User).all()  # 1 query: SELECT * FROM users
    for user in users:
        # N queries: SELECT * FROM posts WHERE user_id = ?
        posts = db.query(Post).filter(Post.user_id == user.id).all()
        user.posts = posts
    return users

# Queries executed:
# SELECT * FROM users; (returns 100 users)
# SELECT * FROM posts WHERE user_id = 1;
# SELECT * FROM posts WHERE user_id = 2;
# ... (98 more queries, one per remaining user)
# Total: 101 queries!

# ✅ FIX: Use JOIN to fetch in single query
from sqlalchemy.orm import joinedload

def get_users_with_posts():
    return (
        db.query(User)
        .options(joinedload(User.posts))  # Eager loading with JOIN
        .all()
    )

# Queries executed:
# SELECT users.*, posts.*
# FROM users
# LEFT JOIN posts ON users.id = posts.user_id;
# Total: 1 query!
TypeScript/Prisma example:
// ❌ N+1 QUERY PROBLEM
async function getUsersWithPosts() {
  const users = await prisma.user.findMany(); // 1 query
  for (const user of users) {
    user.posts = await prisma.post.findMany({ // N queries
      where: { userId: user.id }
    });
  }
  return users;
}

// ✅ FIX: Use include for eager loading
async function getUsersWithPosts() {
  return prisma.user.findMany({
    include: {
      posts: true // Batched by Prisma (a JOIN or a single IN query, depending on the relation load strategy), not one query per user
    }
  });
}
Measuring query performance:
import time
import logging
# Enable SQLAlchemy query logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
# Measure execution time
start = time.time()
users = get_users_with_posts()
duration = time.time() - start
print(f"Query took {duration:.2f}s")
# Before: Query took 5.32s (101 queries)
# After: Query took 0.12s (1 query)
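Timing alone does not say how many queries ran. Counting statements makes an N+1 pattern explicit. The sketch below is a self-contained illustration that uses the standard-library sqlite3 module instead of SQLAlchemy; the schema, the data, and the count_selects helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    """
)
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 11)],
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [(i, f"post by user{i}") for i in range(1, 11)],
)
conn.commit()

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed


def count_selects(fn) -> int:
    """Run fn and return how many SELECT statements it issued."""
    statements.clear()
    fn()
    return sum(1 for sql in statements if sql.lstrip().upper().startswith("SELECT"))


def n_plus_one() -> None:
    users = conn.execute("SELECT id FROM users").fetchall()
    for (user_id,) in users:
        conn.execute("SELECT title FROM posts WHERE user_id = ?", (user_id,)).fetchall()


def joined() -> None:
    conn.execute(
        "SELECT u.id, p.title FROM users u LEFT JOIN posts p ON p.user_id = u.id"
    ).fetchall()


n_plus_one_count = count_selects(n_plus_one)
joined_count = count_selects(joined)
print(f"N+1 version: {n_plus_one_count} SELECTs, JOIN version: {joined_count} SELECT")
```

An assertion on the query count in a regression test is one way to keep an N+1 pattern from creeping back in after the fix.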
General performance debugging checklist:
- Database queries: Check for N+1, missing indexes
- Memory: Profile with tracemalloc, memory_profiler
- CPU: Profile with cProfile, Chrome DevTools
- Network: Check for redundant API calls
- Caching: Add memoization for expensive operations
- Async: Use concurrent execution where possible
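For the CPU item in the checklist, a minimal cProfile session could look like the following. The workload functions slow_sum and handler are hypothetical; only the cProfile and pstats calls come from the standard library.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handler() -> int:
    """Hypothetical request handler that calls the hot spot repeatedly."""
    return sum(slow_sum(2_000) for _ in range(200))


profiler = cProfile.Profile()
profiler.enable()
handler()
profiler.disable()

# Write the report into a string so it can be logged or inspected
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time surfaces the call path that dominates the request; sorting by "tottime" instead highlights the individual function doing the work.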
References
Constitution:
- Section I: Test-First Development (TDD RED → GREEN → REFACTOR)
- Section V: TRUST 5 Principles (Test-First, Readable, Unified, Secured, Trackable)
Skills:
- ms-foundation-trust: TRUST 5 validation
- ms-lang-python: Python debugging tools
- ms-lang-typescript: TypeScript debugging tools
External Resources:
- Python debugging: pdb module, pytest debugging flags
- TypeScript debugging: Chrome DevTools, VS Code debugger
- Async debugging: Promise debugging, async stack traces
Changelog
- v1.0.0 (2025-10-26): Initial release for My-Spec workflow
- Stack trace analysis
- Error pattern recognition
- Test-first debugging workflow
- Python/TypeScript/JavaScript support
Works Well With
- ms-essentials-review: Use after fixing bugs for code quality review
- ms-foundation-trust: Validate TRUST 5 compliance after fixes
- ms-workflow-tag-manager: Update TAG chains after bug fixes
- ms-lang-python: Python-specific debugging techniques
- ms-lang-typescript: TypeScript-specific debugging techniques
Usage: Invoke this Skill when you encounter runtime errors, need to debug failing tests, or analyze stack traces. The Skill provides systematic debugging steps following Test-First principles from the Constitution.