| name | test-diff-analyzer |
| description | Analyze test differences between runs to identify flaky tests and consistency issues. Use to find tests that fail intermittently. |
| category | testing |
| mcp_fallback | none |
Analyze Test Differences Between Runs
Compare test results across multiple runs to identify flaky tests.
When to Use
- Test passes locally but fails in CI
- Test sometimes passes, sometimes fails (flaky test)
- Need to understand test consistency issues
- Comparing test results before/after code changes
- Debugging intermittent test failures
Quick Reference
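# Run tests several times and capture each run's output (stdout and stderr)
for i in 1 2 3 4 5; do pixi run mojo test -I . tests/ > /tmp/test_run_$i.log 2>&1; done
# Compare two test runs
diff -u /tmp/test_run_1.log /tmp/test_run_2.log
# Count how often each failure line appears across runs (-h drops the filename prefix)
grep -h "FAILED" /tmp/test_run_*.log | sort | uniq -c
# Show tests that sometimes pass, sometimes fail
# (assumes result lines look like "<test name> PASSED" or "<test name> FAILED")
grep -hE "FAILED|PASSED" /tmp/test_run_*.log | sort -u | awk '{print $1}' | sort | uniq -d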
Analysis Workflow
- Collect baseline: Run tests locally N times
- Collect CI data: Get CI test results from recent runs
- Compare outputs: Diff between test runs
- Identify flaky tests: Tests with inconsistent results
- Find patterns: Note the conditions under which each test fails versus passes
- Root cause: Attribute the instability to timing, randomness, or resource issues
- Remediation: Fix the flaky test or isolate it
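The "Identify flaky tests" step can be sketched as a small shell function. This is only an illustration: `find_flaky` is a hypothetical helper, and it assumes every result line has the form `<test_name> PASSED` or `<test_name> FAILED`, which may not match the real test runner output.

```shell
# Sketch: list tests whose status differs across run logs.
# ASSUMPTION: result lines look like "<test_name> PASSED" or
# "<test_name> FAILED"; adjust the awk fields for the real log format.
find_flaky() {
    # Reduce every log to distinct "<test_name> <status>" pairs, then keep
    # the names that appear with more than one status.
    awk '$2 == "PASSED" || $2 == "FAILED" { print $1, $2 }' "$@" \
        | sort -u \
        | awk '{ print $1 }' \
        | sort \
        | uniq -d
}
```

A call such as `find_flaky /tmp/test_run_*.log` prints one flaky test name per line. Tests that always pass or always fail are omitted.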
Flaky Test Indicators
Timing Issues:
- Test passes when run in isolation
- Test fails when run with other tests
- Timeout values too aggressive
- Race conditions in setup/teardown
Randomness Issues:
- Random seed not fixed
- Hash ordering varies between runs
- Test depends on dictionary/set iteration order
- Floating-point results compared exactly, without a tolerance
Resource Issues:
- Test passes locally but fails in CI
- Fails under resource constraints
- Out of memory errors intermittently
- Depends on available disk space
Output Format
Report analysis with:
- Flaky Tests - Tests with inconsistent results
- Consistency Score - Pass rate across runs (e.g., 80% pass rate)
- Failure Patterns - When/how tests fail
- Impact - How many test runs are affected
- Root Cause Hypothesis - What likely causes instability
- Recommendations - How to fix or isolate the flaky test
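The Consistency Score can be computed directly from the captured logs. The sketch below is one possible approach, not a prescribed one: `pass_rate` is a hypothetical helper, and it again assumes result lines of the form `<test_name> PASSED` or `<test_name> FAILED`.

```shell
# Sketch: per-test pass rate across run logs.
# ASSUMPTION: result lines look like "<test_name> PASSED" or
# "<test_name> FAILED"; adjust the awk fields for the real log format.
pass_rate() {
    awk '
        $2 == "PASSED" { pass[$1]++; total[$1]++ }
        $2 == "FAILED" { total[$1]++ }
        END {
            # Print "<test_name> <passes>/<runs> <percent>%" for every test seen.
            for (t in total)
                printf "%s %d/%d %.0f%%\n", t, pass[t], total[t], 100 * pass[t] / total[t]
        }
    ' "$@" | sort
}
```

Running `pass_rate /tmp/test_run_*.log` gives one line per test. Any test with a rate strictly between 0% and 100% is a flaky candidate.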
Error Handling
| Problem | Solution |
|---|---|
| Different environment | Run in a controlled environment (Docker) |
| Insufficient data | Run more iterations until a pattern emerges |
| No failure info | Enable debug output, increase verbosity |
| External dependencies | Mock or isolate external services |
| Timing-dependent | Add explicit waits or retry logic |
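For the "Insufficient data" case, more iterations can be collected with a small loop. This is a hedged sketch: `repeat_runs` is a hypothetical helper, and it assumes the test command exits with a non-zero status when any test fails.

```shell
# Sketch: run a command N times, keep each run's log, and count failed runs.
# Usage: repeat_runs <count> <log_dir> <command> [args...]
# ASSUMPTION: the command exits non-zero when a test fails.
repeat_runs() (
    # The parenthesised body runs in a subshell, so these variables stay local.
    count=$1
    log_dir=$2
    shift 2
    failures=0
    i=1
    mkdir -p "$log_dir"
    while [ "$i" -le "$count" ]; do
        # Capture stdout and stderr so failure details are preserved.
        if ! "$@" > "$log_dir/test_run_$i.log" 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$count runs failed"
)
```

For example, `repeat_runs 20 /tmp/flaky_logs pixi run mojo test -I . tests/` writes `test_run_1.log` through `test_run_20.log` into `/tmp/flaky_logs` and prints a one-line summary.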
References
- See mojo-test-runner for test execution options
- See extract-test-failures for failure analysis
- See CLAUDE.md for test standards and TDD workflow