| name | test.result.analyzer |
| description | Parse ctest and sanitizer output, summarize failures, identify root causes, and track test coverage for Orpheus SDK builds. |
Test Result Analyzer
Purpose
The Test Result Analyzer skill parses test output from ctest, Google Test, and sanitizers (ASan, TSan, UBSan) to provide actionable summaries of test failures and quality issues. This skill helps maintain Orpheus SDK's 98%+ test coverage standard and ensures sanitizer-clean builds.
Core Capabilities:
- Parse ctest and Google Test output
- Extract sanitizer error reports (AddressSanitizer, ThreadSanitizer, UBSan)
- Identify failing tests by module
- Suggest likely root causes
- Track coverage trends
- Generate concise failure summaries
When to Use
Trigger Patterns:
- After running
ctestor test suites - CI/CD test failures
- Debugging test regressions
- Coverage analysis requests
- Keywords: "test", "ctest", "sanitizer", "ASan", "TSan", "UBSan", "coverage", "failures"
- File patterns:
Testing/Temporary/LastTest.log,build/test_output.txt
Do NOT Use When:
- Analyzing production logs (not test output)
- Performance profiling (use profilers, not test output)
- Real-time safety auditing (use
rt.safety.auditor)
Use Alternative Skills:
- For real-time violations →
rt.safety.auditor - For documentation →
orpheus.doc.gen - For CI troubleshooting →
ci.troubleshooter
Allowed Tools
read_file- Read test output logs and reportsbash- Run commands to extract test results (e.g.,ctest --output-on-failure)python- Parse structured test output (JSON, XML, text logs)
Access Level: 1 (Local Execution - read + command execution)
Rationale:
- Read test logs from build directory
- Execute ctest/test commands to generate output
- Parse output with Python for structured analysis
- No file modification needed (read-only analysis)
Explicitly Denied:
write_file- Analysis only, don't modify test files- Network tools - Offline analysis only
Expected I/O
Input:
- Type: Test output files or ctest command output
- Format: Text logs (ctest, Google Test), sanitizer reports
- Constraints:
- Valid test output from ctest or test runner
- Sanitizer output from ASan/TSan/UBSan-enabled builds
Output:
Type: Test Analysis Report (Markdown)
Format:
# Test Analysis Report - Build #X ## Summary - Total Tests: N - Passed: X (Y%) - Failed: Z (W%) - Coverage: X.X% (target: 98%) ## Failures ### Module X: [Module Name] **Test:** [test name] **Error:** [error message] **Analysis:** [root cause] **Location:** [file:line] **Suggested Fix:** [actionable fix] ## Sanitizer Reports ### AddressSanitizer: [CLEAN/ERRORS] ### ThreadSanitizer: [CLEAN/ERRORS] ### UBSan: [CLEAN/WARNINGS] ## Coverage Analysis - Overall: X.X% - By Module: [breakdown] ## Recommendations 1. [Actionable items]Validation:
- All failures include test name, error, and location
- Root cause analysis provided for each failure
- Sanitizer errors clearly extracted
- Coverage metrics accurate
Dependencies
Required:
- ctest (CMake test runner)
- Test output files (from build/Testing/Temporary/)
- Google Test framework (for Orpheus tests)
Optional:
lcovorgcov(for coverage analysis)python3with standard library (for parsing scripts)
Version Requirements:
- ctest 3.20+
- Python 3.8+ (if using parser scripts)
Examples
Example 1: Parse ctest Output
User: "Analyze the latest test run output"
Claude Process:
- Read ctest output:
cat build/Testing/Temporary/LastTest.log - Parse test results (total, passed, failed)
- Extract failing test details
- Identify patterns and root causes
- Generate report
Output:
# Test Analysis Report - Build 2025-10-18
## Summary
- Total Tests: 47
- Passed: 46 (97.9%)
- Failed: 1 (2.1%)
- Duration: 3.2s
- Coverage: 98.1% (target: 98% - OK)
## Failures
### Module 3: AudioMixer
**Test:** AudioMixerTests.MixStereo
**Error:** Expected 0.5, got 0.4999847
**Analysis:** Floating-point precision issue in stereo mixing calculation
**Location:** src/modules/m3/audio_mixer.cpp:156
**Suggested Fix:**
- Use epsilon comparison for floating-point equality
- Or use fixed-point arithmetic for deterministic results
```cpp
EXPECT_NEAR(result, 0.5, 1e-6); // Use NEAR instead of EQ
```
Sanitizer Reports
AddressSanitizer: CLEAN ✓
ThreadSanitizer: CLEAN ✓
UndefinedBehaviorSanitizer: CLEAN ✓
Coverage Analysis
- Overall: 98.1% (↑0.2% from last build)
- Module 1 (Foundation): 100%
- Module 2 (Core): 99.1%
- Module 3 (Mixer): 97.2% (↓0.5% - investigate)
- Module 4 (Effects): 98.8%
Recommendations
- Fix AudioMixerTests.MixStereo floating-point comparison
- Investigate Module 3 coverage decrease (was 97.7%, now 97.2%)
- Consider adding integration test for multi-channel mixing
### Example 2: Parse Sanitizer Error
**User:** "Check for AddressSanitizer errors in test output"
**Output:**
```markdown
# Test Analysis Report - Sanitizer Check
## AddressSanitizer: 1 ERROR
### Error: Heap-use-after-free
**Location:** src/modules/m2/sample_buffer.cpp:89
**Details:**
==12345==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010 READ of size 4 at 0x602000000010 thread T0 #0 SampleBuffer::read() src/modules/m2/sample_buffer.cpp:89 #1 TransportController::process() src/modules/m1/transport_controller.cpp:134
**Analysis:** Buffer is being accessed after deallocation.
Likely cause: Buffer lifetime management issue - buffer deleted while still referenced.
**Suggested Fix:**
1. Use shared_ptr or unique_ptr for ownership
2. Ensure buffer outlives all references
3. Add RAII wrapper for buffer lifecycle
4. Check destructor order in containing class
**Priority:** CRITICAL - Must fix before merge
Example 3: Coverage Trend Analysis
User: "Show test coverage trends for the last 5 builds"
Output:
# Test Coverage Trends
## Overall Coverage
Build #120: 97.9% Build #121: 98.0% Build #122: 98.1% ← Current Build #123: 98.1% Build #124: 98.2%
**Trend:** ↑ Increasing (good)
## Module Breakdown
- Module 1: Stable at 100%
- Module 2: 99.1% (↓ from 99.3%)
- Module 3: 97.2% (↓ from 97.7%) ⚠
- Module 4: 98.8% (stable)
## Recommendations
1. Module 3 coverage declining - investigate new code without tests
2. Overall trend positive, maintain 98%+ target
3. Module 2 slight decrease - review recent changes
Limitations
Known Edge Cases:
- Cannot parse non-standard test output formats
- Requires Google Test format for detailed analysis
- Sanitizer output parsing assumes standard format (may vary by version)
- Coverage analysis requires lcov/gcov integration
Performance Constraints:
- Large test logs (>10MB) may be slow to parse
- Full coverage analysis requires compilation with coverage flags
Security Boundaries:
- Read-only access to test output files
- No modification of test code or results
Scope Limitations:
- Analyzes test output, doesn't run tests (use
ctestdirectly for that) - Cannot fix tests, only suggest fixes
- Coverage metrics require build with coverage instrumentation
Validation Criteria
Success Metrics:
- Accuracy: Correctly parse all test failures and sanitizer errors
- Completeness: Extract all relevant information (test name, error, location)
- Actionability: Provide root cause analysis and suggested fixes
- Performance: Parse 1000 test results in <5 seconds
Failure Modes:
- Parse errors: Unrecognized test output format (update parser)
- Missing data: Incomplete error messages (request verbose test output)
- Incorrect analysis: Wrong root cause (improve pattern matching)
Recovery:
- For parse errors: Fall back to raw output display
- For missing data: Request re-run with
--output-on-failureflag - For incorrect analysis: Provide raw error for manual review
Related Skills
Dependencies:
- None (standalone skill)
Composes With:
rt.safety.auditor- Check if test failures are real-time violationsci.troubleshooter- Integrate test analysis into CI failure reportsorpheus.doc.gen- Document test fixes and improvements
Alternative Skills:
- Manual test review - Human expertise for complex failures
- External CI tools - GitHub Actions, Jenkins (for automation)
Maintenance
Owner: Orpheus Team Review Cycle: Monthly (update parsers as test framework evolves) Last Updated: 2025-10-18 Version: 1.0
Revision Triggers:
- Test framework changes (Google Test updates)
- New sanitizer versions
- Coverage tool updates
- New test output formats