| name | full-stack-debugger |
| description | This skill should be used when debugging full-stack issues that span UI, backend, and database layers. It provides a systematic workflow to detect errors, analyze root causes, apply fixes iteratively, and verify solutions through automated server restarts and browser-based testing. Ideal for scenarios like failing schedulers, import errors, database issues, or API payload problems where issues originate in backend code but manifest in the UI. |
Full Stack Debugger
Overview
The Full Stack Debugger enables systematic debugging of issues across the entire application stack (UI/Frontend, Backend/API, Database/State). It combines browser testing, log analysis, code examination, and automated server restart/verification to iteratively identify and fix issues one at a time until the system is fully operational.
This skill follows a six-phase workflow (Detection → Analysis → Fix → Restart → Verification → Iteration) to resolve issues encountered during development and testing.
When to Use This Skill
Trigger this skill when observing:
- Error states in the UI (dashboard, buttons failing, status showing errors)
- Repeated failures in backend logs (task execution failures, import errors, database errors)
- Unexpected database state (rows showing failed status when they should succeed)
- API endpoints returning errors or unexpected responses
- Services failing to initialize or process tasks
- Cascading failures across multiple components
Debugging Workflow
Phase 1: Detection
Detect errors from multiple sources:
Browser UI Detection:
- Navigate to the affected page/feature in the browser
- Check for error messages, red warning states, or disabled functionality
- Read console error messages using DevTools
- Note the specific UI state and what action triggered the error
Backend Log Detection:
- Query recent error logs using `tail -200 /path/to/logs/errors.log`
- Search for error patterns related to the issue using `grep`
- Note error timestamps, error messages, and stack traces
- Look for repeated errors (indicates systemic issue)
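As a rough illustration of spotting repeated errors, the sketch below counts how often each error message appears in a batch of log lines. The log line layout (timestamp, level, message) and the helper name `count_error_signatures` are assumptions made for the example, not something this skill prescribes.

```python
from collections import Counter
import re

def count_error_signatures(log_lines, pattern=r"ERROR"):
    """Count how often each error message appears among the given log lines."""
    counts = Counter()
    for line in log_lines:
        if re.search(pattern, line):
            # Assumption: the message is whatever follows the matched level token.
            message = re.split(pattern, line, maxsplit=1)[1].strip(" :-")
            counts[message] += 1
    return counts

# Hypothetical sample lines standing in for the tail of errors.log.
sample = [
    "2024-01-01 10:00:00 ERROR name 'Optional' is not defined",
    "2024-01-01 10:00:05 INFO heartbeat ok",
    "2024-01-01 10:01:00 ERROR name 'Optional' is not defined",
    "2024-01-01 10:02:00 ERROR database is locked",
]
for message, n in count_error_signatures(sample).most_common():
    print(n, message)
```

A message that dominates the counts is a likely candidate for the first issue to fix.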
Database State Detection:
- Query the database directly using sqlite3
- Check status of recent tasks, transactions, or records
- Look for failed, incomplete, or error states
- Note which records are affected and what their states are
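A minimal sketch of the database check, using Python's standard `sqlite3` module. The `queue_tasks` table name comes from the scheduler example in this document; the column names (`task_type`, `status`, `error`) and the helper `failed_tasks` are hypothetical and should be adapted to the real schema.

```python
import sqlite3

def failed_tasks(conn, limit=20):
    """Return the most recent failed rows from the (assumed) queue_tasks table."""
    cur = conn.execute(
        "SELECT id, task_type, status, error FROM queue_tasks "
        "WHERE status = ? ORDER BY id DESC LIMIT ?",
        ("failed", limit),
    )
    return cur.fetchall()

# Demo against an in-memory database so the sketch runs without a real file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue_tasks "
    "(id INTEGER PRIMARY KEY, task_type TEXT, status TEXT, error TEXT)"
)
conn.executemany(
    "INSERT INTO queue_tasks (task_type, status, error) VALUES (?, ?, ?)",
    [
        ("AI_ANALYSIS", "failed", "name 'Optional' is not defined"),
        ("PORTFOLIO_SYNC", "completed", None),
    ],
)
rows = failed_tasks(conn)
print(rows)
```

Against a real database, replace `":memory:"` with the path to the application's SQLite file.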
Example: When debugging a scheduler failure:
- Navigate to System Health dashboard
- Observe scheduler showing "0 done" or "X failed"
- Check
/logs/errors.logfor error messages - Query
queue_taskstable to see failed task records
Phase 2: Analysis
Analyze root causes by reading code and logs:
Code Analysis:
- Read the error file/module indicated in error stack traces
- Check imports - look for missing `from X import Y` statements
- Check class names - verify instantiation matches actual class names
- Look for syntax errors - unmatched quotes, unclosed parentheses
- Check function signatures - ensure payloads match expected parameters
- Read reference documentation (`references/common_errors.md`) for error patterns
Log Analysis:
- Extract error messages from logs
- Look for patterns like `'Optional'` (missing import), `unterminated string` (syntax error), `'attribute'` (wrong class name)
- Trace error propagation backward to find the originating issue
- Check timestamps - multiple errors at same time indicate batch failure
API/Payload Analysis:
- Check what payload the API is sending to task handlers
- Read the task handler code to see what fields it expects
- Compare actual payload vs expected payload
- Look for missing required fields
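The payload comparison can be partly automated. The sketch below assumes the task handler receives payload fields as keyword arguments, which may not match how the real handlers are written; `handle_analysis` and `missing_payload_fields` are hypothetical names used only for illustration.

```python
import inspect

def missing_payload_fields(handler, payload):
    """List required parameters of `handler` that `payload` does not supply."""
    missing = []
    for name, param in inspect.signature(handler).parameters.items():
        is_required = param.default is inspect.Parameter.empty and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
        if is_required and name not in payload:
            missing.append(name)
    return missing

# Hypothetical task handler, used only for illustration.
def handle_analysis(symbol, agent_name, depth=1):
    return f"{agent_name} analyzing {symbol} at depth {depth}"

payload = {"symbol": "AAPL"}
print(missing_payload_fields(handle_analysis, payload))
```

If the handler instead reads fields from a single dictionary argument, the same comparison can be done against a hand-written list of expected keys.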
Example: When debugging "name 'Optional' is not defined":
- Find the file mentioned in error (`analysis_executor.py`)
- Read the imports section
- Notice `Optional` is used but not imported
- Check line 14: `from typing import Dict, List, Any` - missing `Optional`
- Fix: Add `Optional` to the import statement
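The manual check in this example can be approximated with Python's `ast` module. This is a sketch under simplifying assumptions: it only looks at `from typing import ...` statements, ignores star imports and `typing.Optional` style access, and checks a small hand-picked set of names (`TYPING_NAMES` is hypothetical).

```python
import ast

# Hypothetical shortlist of typing names worth checking.
TYPING_NAMES = {"Optional", "Dict", "List", "Any", "Union", "Tuple"}

def missing_typing_imports(source):
    """Return typing names that the source uses but never imports."""
    tree = ast.parse(source)
    imported, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == "typing":
            imported.update(alias.asname or alias.name for alias in node.names)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted((used & TYPING_NAMES) - imported)

source = '''
from typing import Dict, List, Any

def run(config: Optional[Dict[str, Any]] = None) -> List[str]:
    return []
'''
print(missing_typing_imports(source))
```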
Phase 3: Fix (One Issue at a Time)
Apply fixes one issue per iteration:
Before Fixing:
- Verify this is the first/next issue to fix
- Read the relevant code section carefully
- Use the fix patterns from `references/fix_templates.md`
Common Fix Patterns:
- Missing imports: Add to import statement (e.g., `from typing import Optional`)
- Wrong class name: Update import and instantiation to match actual class
- Missing docstring quotes: Add opening `"""` to docstring
- Wrong payload fields: Add missing required fields to payload dictionary
- Syntax errors: Fix unmatched quotes, parentheses, brackets
After Fixing:
- Read back the changed code to verify syntax
- Check the edit was correct (line numbers, indentation)
- Only fix ONE issue, even if multiple exist - don't cascade fixes
- Document what was changed in a clear comment
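One way to read back an edit and confirm it still parses, before spending time on a restart, is to run the changed source through `ast.parse`. The helper name `syntax_problem` is hypothetical, and the exact error wording depends on the Python version.

```python
import ast

def syntax_problem(source, filename="<edited>"):
    """Return None if `source` parses, else a short description of the syntax error."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None

# A clean import line parses; a docstring missing its closing quotes does not.
print(syntax_problem("from typing import Dict, List, Any, Optional\n"))
print(syntax_problem('def f():\n    """Missing closing quotes\n    return 1\n'))
```

This only catches syntax errors; a missing import or wrong class name still surfaces at runtime, which is why the restart and verification phases follow.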
Example Fix:
# BEFORE
from typing import Dict, List, Any
# AFTER
from typing import Dict, List, Any, Optional
Phase 4: Restart (Automated)
Restart the backend server after each fix:
# Kill existing processes
lsof -ti:8000 | xargs kill -9 2>/dev/null
# Clear Python bytecode cache
find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null
find . -type f -name "*.pyc" -delete 2>/dev/null
# Restart backend
sleep 3 && python -m src.main --command web > /tmp/backend_restart.log 2>&1 &
sleep 10 # Wait for startup
# Verify health
curl -m 5 http://localhost:8000/api/health
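Instead of a fixed wait followed by a single `curl`, the health endpoint can be polled until it answers. The sketch below assumes the endpoint returns JSON with a `"status"` field equal to `"healthy"`; the function names, attempt count, and delay are illustrative choices rather than part of this skill.

```python
import json
import time
import urllib.request

def fetch_health(url="http://localhost:8000/api/health", timeout=5):
    """Fetch and decode the health endpoint; raises on connection errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def wait_until_healthy(fetch=fetch_health, attempts=10, delay=1.0):
    """Poll `fetch` until it reports status 'healthy' or attempts run out."""
    for _ in range(attempts):
        try:
            if fetch().get("status") == "healthy":
                return True
        except (OSError, ValueError):
            pass  # server not accepting connections yet, or body was not JSON
        time.sleep(delay)
    return False

# Demo with a stand-in fetch that is unhealthy twice before succeeding.
responses = iter([OSError("connection refused"), {"status": "starting"}, {"status": "healthy"}])

def fake_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item

print(wait_until_healthy(fetch=fake_fetch, attempts=5, delay=0))
```

Passing the fetch function as a parameter keeps the polling logic testable without a running server.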
Phase 5: Verification
Verify the fix worked through multiple checks:
Health Check:
- Call `/api/health` endpoint
- Verify `"status": "healthy"`
- If still failing, check logs for new errors
Browser Verification:
- Navigate to the affected UI page
- Trigger the action that previously failed
- Verify the error is gone
- Check for new errors in console
Database Verification:
- Query the affected records/tasks
- Verify status changed from failed/error to success/completed
- Check that metrics updated (e.g., scheduler shows "1 done" instead of "0 done")
Log Verification:
- Check recent logs for the same error
- Verify no new errors appeared
- Look for success messages or "completed" status
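To check that an error stopped recurring after a restart, it helps to look only at log lines written after the restart time. This sketch assumes each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match the real log format; `errors_since` is a hypothetical helper.

```python
from datetime import datetime

def errors_since(log_lines, since, signature):
    """Return lines at or after `since` that contain `signature`."""
    hits = []
    for line in log_lines:
        try:
            # Assumption: each line starts with 'YYYY-MM-DD HH:MM:SS'.
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip continuation lines such as stack-trace frames
        if stamp >= since and signature in line:
            hits.append(line)
    return hits

restart_time = datetime(2024, 1, 1, 10, 5, 0)
log = [
    "2024-01-01 10:01:00 ERROR name 'Optional' is not defined",
    "2024-01-01 10:06:00 INFO task completed",
]
print(errors_since(log, restart_time, "is not defined"))
```

An empty result means the old error has not reappeared since the restart; it does not rule out new, different errors.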
Example:
- Scheduler should show "1 done" instead of "0 done"
- Task record should show status="completed" instead of "failed"
- No error messages in logs
- WebSocket shows healthy status in UI
Phase 6: Iteration
If issues remain, repeat the cycle:
Continue if more issues exist:
- Check logs for remaining errors
- If errors remain, return to Phase 2 (Analysis)
- Fix the next issue (Phase 3)
- Restart (Phase 4)
- Verify (Phase 5)
Stop when all issues fixed:
- All schedulers show completed execution counts
- UI shows no error states
- Logs show no error patterns
- Tasks/records show success status
- Full verification complete
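The overall cycle can be summarized as a small loop. This is a conceptual sketch, not an implementation of the skill: `detect`, `fix`, `restart`, and `verify` are hypothetical callables standing in for the phases described above.

```python
def debug_loop(detect, fix, restart, verify, max_iterations=10):
    """Fix one detected issue per iteration until none remain."""
    fixed = []
    for _ in range(max_iterations):
        issues = detect()
        if not issues:
            return fixed          # stop condition: nothing left to fix
        issue = issues[0]         # one issue at a time
        fix(issue)
        restart()
        if not verify(issue):
            raise RuntimeError(f"fix for {issue!r} did not verify")
        fixed.append(issue)
    raise RuntimeError("iteration limit reached with issues remaining")

# Stand-in callables so the sketch runs end to end.
pending = ["missing Optional import", "wrong class name"]
result = debug_loop(
    detect=lambda: list(pending),
    fix=lambda issue: pending.remove(issue),
    restart=lambda: None,
    verify=lambda issue: issue not in pending,
)
print(result)
```

The iteration cap is a design choice for the sketch: it prevents an endless loop when a fix keeps failing verification in ways the loop cannot detect.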
Common Error Patterns
See references/common_errors.md for patterns to recognize:
- Python syntax errors (unterminated strings, missing quotes)
- Import errors (`name 'X' is not defined`, `cannot import name 'Y'`)
- Class/attribute errors (`'dict' object has no attribute 'symbol'`)
- Type errors (passing wrong data type)
- Payload/configuration errors (missing required fields)
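As an illustration of pattern recognition, the categories above can be expressed as a small regex table. The regexes are assumptions based on common CPython error messages (the payload pattern in particular is a guess) and are not taken from `references/common_errors.md`.

```python
import re

# Assumed regexes; categories mirror the list above, checked in order.
ERROR_PATTERNS = [
    ("syntax error", re.compile(r"unterminated string|SyntaxError")),
    ("import error", re.compile(r"name '\w+' is not defined|cannot import name")),
    ("attribute error", re.compile(r"object has no attribute")),
    ("type error", re.compile(r"TypeError")),
    ("payload error", re.compile(r"KeyError|missing required")),
]

def classify_error(message):
    """Return the first matching category for an error message, else 'unknown'."""
    for category, pattern in ERROR_PATTERNS:
        if pattern.search(message):
            return category
    return "unknown"

for msg in [
    "name 'Optional' is not defined",
    "'dict' object has no attribute 'symbol'",
    "database is locked",
]:
    print(classify_error(msg), "<-", msg)
```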
Fix Templates
See references/fix_templates.md for ready-to-use fix patterns:
- How to add missing imports
- How to fix class name mismatches
- How to fix docstring syntax
- How to add missing payload fields
- How to fix type errors
Tools Used
- Playwright Browser Tools: Navigate UI, verify changes
- Read/Grep Tools: Examine code and logs
- Bash: Server restart, cache clearing, health checks
- Edit Tool: Apply code fixes
- Database Queries: Verify task/record state
MCP Tools Integration
Use robo-trader-dev MCP tools for token-efficient debugging; the savings listed below range from 85% to 98%:
| Task | MCP Tool | Token Savings | Usage |
|---|---|---|---|
| Analyze error logs | `mcp__robo-trader-dev__analyze_logs` | 98% | Pattern detection with time windows |
| System health check | `mcp__robo-trader-dev__check_system_health` | 97% | Database, queues, API, disk status |
| Diagnose DB locks | `mcp__robo-trader-dev__diagnose_database_locks` | 95% | Correlate logs with code patterns |
| Queue monitoring | `mcp__robo-trader-dev__queue_status` | 96% | Real-time queue backlog analysis |
| Coordinator status | `mcp__robo-trader-dev__coordinator_status` | 94% | Init status, error details |
| Error pattern fix | `mcp__robo-trader-dev__suggest_fix` | 90% | Known pattern matching with examples |
| Read code files | `mcp__robo-trader-dev__smart_file_read` | 85% | Progressive context (summary/targeted/full) |
| Find related files | `mcp__robo-trader-dev__find_related_files` | 88% | Import/git/similarity analysis |
Example debugging workflow:
# 1. Detect errors (MCP instead of tail/grep)
mcp__robo-trader-dev__analyze_logs(patterns=["ERROR", "TIMEOUT"], time_window="1h")
# 2. Check system health (MCP instead of curl loops)
mcp__robo-trader-dev__check_system_health(components=["database", "queues", "api_endpoints"])
# 3. Diagnose specific issue (MCP instead of sqlite3 + code reading)
mcp__robo-trader-dev__diagnose_database_locks(time_window="24h", include_code_references=True)
# 4. Get fix suggestions (MCP instead of manual pattern matching)
mcp__robo-trader-dev__suggest_fix(error_message="name 'Optional' is not defined", context_file="src/services/analyzer.py")
Integration with robo-trader architecture:
- Queue operations: Use `queue_status` to monitor PORTFOLIO_SYNC, DATA_FETCHER, AI_ANALYSIS
- Coordinator debugging: Use `coordinator_status` for BroadcastCoordinator, AIChatCoordinator init issues
- Database access: Use `query_portfolio` or `diagnose_database_locks` instead of direct sqlite3 connections
Key Principles
- One issue at a time - Fix one problem per iteration to prevent cascading failures
- Verify immediately - Always restart and verify after each fix
- Multi-layer detection - Check UI, logs, and database for clues
- Iterative refinement - Continue until all issues resolved
- Automated restart - Always use clean restart (kill + cache clear + restart)
- Browser verification - Always test in actual UI, not just logs