| name | bugfixforever |
| description | State-of-the-art procedure for fixing bugs in software projects. Use this skill when a bug has been detected or declared by the user, agent, or another skill (not during early-stage work in progress). Enforces a disciplined test-driven approach - understand, reproduce experimentally, write failing tests, fix the code, and clean up. |
BugFixForever (BFF)
Overview
This skill guides a reliable, test-driven procedure for fixing bugs in software projects. The approach emphasizes experimental reproduction, comprehensive test coverage at appropriate abstraction levels, and clean commits.
When to Use This Skill
Activate this skill when:
- A user reports a bug or issue in the system
- An agent detects unexpected behavior or failures
- Another skill identifies a problem that needs fixing
Do NOT activate this skill for minor issues during early-stage work in progress.
The BFF Workflow
Follow these phases in order. Do not skip ahead to fixing code before establishing reproducibility and test coverage.
Phase 1: Understand the Bug
Gather context about the bug's behavior:
- What is the expected behavior vs. actual behavior?
- When does it occur? What triggers it?
- What is the impact? (User-facing, data corruption, performance, etc.)
- What has changed recently that might be related?
- Is it consistently reproducible, or intermittent?
Review relevant code, recent commits, and existing tests to build a mental model of where the bug likely exists.
Phase 2: Experimentally Reproduce the Bug
Prove the bug exists through experimentation. All experimental code must live on the filesystem - do NOT pipe one-off scripts directly into the stdin of a language interpreter.
Acceptable reproduction methods:
- Run the software normally (web server, CLI, etc.)
- Run existing tests to confirm failures
- Write temporary test files to isolate the issue
- Use a web browser and developer tools to reproduce user-facing issues
- Tail logs and add temporary logging statements
- Run the same test multiple times for intermittent issues
- Manipulate mocks/stubs to isolate components
Code inspection & debugging techniques:
- Add temporary logging/print statements to trace execution flow
- Use debuggers with breakpoints to step through code
- Add strategic assertions to catch bad state earlier (see the sketch after this list)
- Add temporary console.log/print statements to production code (to be removed in the cleanup phase)
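To illustrate the assertion technique, here is a minimal sketch; `loadProfile` and its parameter are hypothetical names, not from any particular codebase:

```typescript
// Hypothetical example of a temporary assertion guard.
// Remove it (or promote it to real validation) in the cleanup phase.
function loadProfile(userId: string | undefined) {
  // Temporary: surfaces the bad state at the call site instead of deeper in the stack
  console.assert(userId !== undefined, 'loadProfile called without a userId');
  // ... existing logic continues
}
```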
System-level tools (when appropriate):
- System call tracing (strace, dtrace) for low-level issues
- Performance profiling tools for performance bugs
- Container/process inspection (docker logs, ps, top)
- Network analysis tools
Important: Create reusable files for all experimental code. Temporary test files, debugging scripts, and reproduction cases should be saved to the filesystem, not executed as one-liners.
Continue experimentation until the bug's behavior is proven and understood mechanistically.
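As a concrete illustration of the filesystem-first rule, a saved reproduction case might look like the sketch below; the file path, import, and function are hypothetical:

```typescript
// File: debug/repro-profile-save.ts (temporary; deleted in Phase 5)
// Rerunnable reproduction of the suspected failure.
import { saveProfile } from '../src/services/ProfileService'; // hypothetical module

async function reproduce() {
  // Calling save before a user is loaded should trigger the reported error
  await saveProfile(undefined);
}

reproduce().catch((err) => {
  console.error('Reproduced:', err);
});
```

Because the script lives in a file, it can be rerun after every hypothesis or code change instead of being retyped.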
Phase 3: Write Failing Tests
Once reproduction is confirmed, write one or more declarative tests into the project's test suites that fail because the bug exists.
The Drill-Down Approach:
For user-facing bugs or integration issues:
- Start at the top: Write an E2E or integration test that captures the user-visible symptom
- Drill down: Write a unit test that captures the root cause at the component/function level
For internal bugs discovered during development:
- A focused unit test may be sufficient
Test quality criteria:
- Tests should be declarative and clearly document what behavior is expected
- Tests should fail reliably when the bug exists
- Tests should be placed in appropriate test suites (not temporary files)
- Choose an abstraction level that matches the bug's nature
Run the new tests to confirm they fail with clear, expected error messages.
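As a hedged illustration of the declarative style (the test runner import and the function under test are assumptions, not prescriptions):

```typescript
import { describe, expect, test } from 'vitest'; // or the project's existing runner
import { normalizeEmail } from '../src/utils/email'; // hypothetical function under test

describe('normalizeEmail', () => {
  test('lowercases mixed-case addresses', () => {
    // Fails while the bug exists and documents the expected behavior
    expect(normalizeEmail('User@Example.COM')).toBe('user@example.com');
  });
});
```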
Phase 4: Fix the Bug
Only after failing tests are in place, modify the production code to fix the bug.
Fix principles:
- Make the minimal change necessary to fix the root cause
- Preserve existing functionality (don't introduce regressions)
- Follow existing code patterns and conventions
- Consider edge cases and similar bugs that might exist
Run the new tests to confirm they now pass. Run the full test suite to ensure no regressions.
Phase 5: Cleanup and Commit
Remove all temporary artifacts and prepare a clean commit:
Cleanup tasks:
- Delete temporary test files, debugging scripts, and experimental code
- Remove temporary logging/print/console.log statements added during reproduction
- Remove extraneous comments added during debugging
- Update or remove outdated documentation affected by the fix
Commit preparation:
Use the no-more-commitment-issues skill to ensure commit hygiene:
- Clear, concise commit message explaining what was fixed and why (see the example after this list)
- Only include files relevant to the fix and its tests
- No debug statements, temporal language, or cruft
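For example, a commit message in this spirit might read (contents illustrative):

```text
fix(profile): handle missing user ID in ProfileService.save

Saving a profile before the user record loaded threw a TypeError.
Guard against the undefined user and cover the path with an E2E test
plus a focused unit test.
```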
Examples
Example 1: User-facing bug in web application
User report: "When I click 'Save' on the profile page, nothing happens."
Phase 1: Understand - profile save button should update user data but appears non-functional
Phase 2: Reproduce experimentally:
```bash
# Run dev server
pnpm run dev
# Open browser, navigate to profile page, reproduce issue
# Check browser console - see "TypeError: Cannot read property 'id' of undefined"
# Add temporary console.log in src/components/Profile.tsx to trace data flow
```
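The temporary trace might look like this sketch (the component's state and service are hypothetical, declared here only so the snippet stands alone):

```typescript
// src/components/Profile.tsx - temporary trace, removed in Phase 5
declare const user: { id: string } | undefined; // hypothetical component state
declare const formData: unknown;
declare const profileService: { save(user: unknown, data: unknown): Promise<void> };

function handleSave() {
  console.log('Profile.handleSave user =', user); // temporary: shows user is undefined here
  void profileService.save(user, formData);
}
```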
Phase 3: Write failing tests (drill-down approach):
```typescript
// E2E test capturing user-facing symptom
test('user can save profile changes', async ({ page }) => {
  // ... test fails
});

// Unit test capturing root cause
test('ProfileService.save handles missing user ID', () => {
  // ... test fails
});
```
Phase 4: Fix - add null check in ProfileService.save()
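A minimal sketch of such a fix, assuming a ProfileService shaped roughly like the example (the API client type is an assumption):

```typescript
// src/services/ProfileService.ts (sketch)
type Api = { put(url: string, body: unknown): Promise<unknown> };

class ProfileService {
  constructor(private api: Api) {}

  async save(user: { id: string } | undefined, data: unknown): Promise<unknown> {
    if (!user) {
      // Root-cause fix: fail loudly instead of dereferencing undefined
      throw new Error('Cannot save profile: no user is loaded');
    }
    return this.api.put(`/users/${user.id}/profile`, data);
  }
}
```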
Phase 5: Remove temporary console.log, commit with no-more-commitment-issues skill
Example 2: Intermittent test failure
Agent observation: "Tests are flaky - UserAuthTest sometimes fails."
Phase 1: Understand - race condition suspected in authentication flow
Phase 2: Reproduce experimentally:
```bash
#!/usr/bin/env bash
# File: debug/test_flakiness.sh (temporary script to run the test 100 times)
for i in {1..100}; do
  npm test -- UserAuthTest || echo "FAILED on run $i"
done
# Also add temporary logging to trace async timing issues
```
Phase 3: Write failing test that reliably exposes race condition
Phase 4: Fix - add proper async/await handling
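A sketch of what that fix could look like (the helper and its signature are hypothetical): a promise that was fired and forgotten is now awaited before dependent state is read:

```typescript
// src/auth/login.ts (sketch) - the race: refreshSession() was previously
// fired without await, so callers could read the session before it settled
declare function refreshSession(userId: string): Promise<void>; // hypothetical helper

export async function login(userId: string): Promise<void> {
  await refreshSession(userId); // fix: await the async step instead of fire-and-forget
}
```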
Phase 5: Delete debug/test_flakiness.sh and temporary logging, commit
Key Principles
- Filesystem-first: All experimental code lives in files, not stdin one-liners
- Drill-down testing: Start with high-level tests for user impact, drill to unit tests for root cause
- Prove before fixing: Never fix code until the bug is reproducible and tests are written
- Clean commits: Remove all temporary artifacts before committing