| name | safety-pattern-auditor |
| description | Systematic audit of safety patterns to identify gaps, regressions, and improvement opportunities |
Safety Pattern Auditor Skill
Purpose: Conduct comprehensive audits of safety pattern coverage to find vulnerabilities before attackers do.
When to Use:
- Before releases (QA gate)
- After adding multiple patterns (sanity check)
- Monthly security reviews
- When vulnerabilities are reported
Duration: 1-3 hours depending on audit scope
Prerequisites
- Gap analyzer installed:
./scripts/analyze-pattern-gaps.py - Baseline captured:
./scripts/capture-safety-baseline.sh - Caro built:
cargo build --release - Test suite available:
.claude/beta-testing/test-cases.yaml
The 5-Phase Audit Workflow
Phase 1: Baseline Capture (10 min)
↓
Phase 2: Gap Analysis (20 min)
↓
Phase 3: Regression Testing (15 min)
↓
Phase 4: Priority Assessment (30 min)
↓
Phase 5: Report & Recommendations (30 min)
Phase 1: Baseline Capture (10 minutes)
Goal: Establish current state of pattern coverage.
Actions:
Capture pattern metrics:
./scripts/capture-safety-baseline.sh .claude/audits/baseline-$(date +%Y%m%d).jsonReview baseline:
cat .claude/audits/baseline-$(date +%Y%m%d).json | python3 -m json.toolKey metrics:
- Total pattern count
- Patterns by risk level (Critical/High/Medium)
- Test pass rates
Document starting point:
## Audit Baseline - 2026-01-08 - Total Patterns: 52 - Critical: 23 - High: 16 - Medium: 0 - Test Pass Rate: 4/4 (100%)
Output:
- Baseline JSON file created
- Metrics documented
- Starting point established
Phase 2: Gap Analysis (20 minutes)
Goal: Use automated tools to find missing coverage.
Actions:
Run gap analyzer:
./scripts/analyze-pattern-gaps.py src/safety/patterns.rs \ -o .claude/audits/gaps-$(date +%Y%m%d).mdReview gap report:
cat .claude/audits/gaps-$(date +%Y%m%d).mdCategorize gaps by detector type:
# Count gaps by type grep "**Type**: argument_order" .claude/audits/gaps-*.md | wc -l grep "**Type**: path_variant" .claude/audits/gaps-*.md | wc -l grep "**Type**: wildcard" .claude/audits/gaps-*.md | wc -l grep "**Type**: platform" .claude/audits/gaps-*.md | wc -lExtract critical gaps:
./scripts/analyze-pattern-gaps.py src/safety/patterns.rs \ --min-severity critical
Gap Categories:
Argument Order Gaps:
- Flags in different positions
- Combined vs separated flags
- Missing permutations
Path Variant Gaps:
- Trailing slashes
- Relative vs absolute paths
- Recursive parent references
Wildcard Gaps:
- Missing glob patterns
- Recursive wildcards (**)
- Hidden file patterns (.*)
Platform Gaps:
- PowerShell equivalents missing
- CMD command variants
- Cross-platform coverage
Output:
- Gap report generated
- Gaps categorized by type
- Critical gaps identified
- Total gap count documented
Phase 3: Regression Testing (15 minutes)
Goal: Ensure no regressions introduced recently.
Actions:
Run regression checker:
./scripts/check-safety-regressions.sh .claude/audits/baseline-*.jsonRun full test suite:
./target/release/caro test --backend static \ --suite .claude/beta-testing/test-cases.yamlCheck pattern compilation:
cargo test --lib safety::patterns --quietTest false positives (sample safe commands):
echo "cd .." | ./target/release/caro --backend static echo "ls -la" | ./target/release/caro --backend static echo "cp file.txt backup/" | ./target/release/caro --backend staticAll should be ALLOWED.
Red Flags:
- ❌ Pattern count decreased
- ❌ Risk level downgrades (Critical → High)
- ❌ Test pass rate dropped
- ❌ False positives (safe commands blocked)
- ❌ Compilation errors
Output:
- No regressions detected
- Test pass rate maintained
- False positive check passed
- Compilation successful
Phase 4: Priority Assessment (30 minutes)
Goal: Triage gaps and create action plan.
Actions:
Count gaps by severity:
./scripts/analyze-pattern-gaps.py src/safety/patterns.rs \ --format json | jq '.gaps | group_by(.severity) | map({severity: .[0].severity, count: length})'Assess each critical gap: For each critical gap, determine:
- Real-world likelihood (High/Medium/Low)
- Ease of exploitation (Easy/Medium/Hard)
- Existing coverage (None/Partial/Full)
- Fix complexity (Simple/Medium/Complex)
Create priority matrix:
Gap ID Severity Likelihood Ease Coverage Fix Priority GAP-001 Critical High Easy None Simple P0 GAP-002 Critical Medium Medium Partial Medium P1 GAP-015 High High Easy None Simple P1 Priority Levels:
- P0: Fix immediately (Critical + High likelihood + Easy exploit)
- P1: Fix this release (Critical or High + likely)
- P2: Fix next release (Medium/Low risk or complex fix)
- P3: Backlog (Low priority, document only)
Estimate effort:
P0 gaps: 3 × 1h = 3 hours P1 gaps: 8 × 1h = 8 hours P2 gaps: 15 × 1h = 15 hours Total: 26 hours to close all high-priority gaps
Output:
- Gaps prioritized (P0/P1/P2/P3)
- Priority matrix created
- Effort estimated
- Action plan drafted
Phase 5: Report & Recommendations (30 minutes)
Goal: Produce actionable audit report with clear next steps.
Actions:
Generate executive summary:
# Safety Pattern Audit Report - 2026-01-08 ## Executive Summary **Status**: 🟡 NEEDS ATTENTION - Total Patterns: 52 - Gaps Found: 202 (21 critical, 90 high, 87 medium, 4 low) - Regressions: 0 - False Positives: 0 - Test Coverage: 100% **Key Findings**: - 21 critical gaps require immediate attention - Argument order variations are the most common gap type - Platform coverage (PowerShell/CMD) is incomplete **Recommendation**: Address 3 P0 gaps before next release.Create detailed findings: Use the gap analyzer report + your assessments.
List actionable recommendations:
## Recommendations ### Immediate (P0) - Complete before release 1. **GAP-003**: Add parent directory deletion pattern - Command: `rm -rf ../` - Effort: 1 hour - Owner: Security team 2. **GAP-012**: Add dd argument order pattern - Command: `dd of=/dev/sda if=/dev/zero` - Effort: 1 hour - Owner: Security team 3. **GAP-018**: Add recursive wildcard pattern - Command: `rm -rf **` - Effort: 1 hour - Owner: Security team ### Short-term (P1) - Next 2 weeks [List P1 gaps with estimates] ### Long-term (P2) - Next quarter [List P2 gaps] ### Process Improvements - Run gap analyzer in CI/CD before merge - Add pre-commit hook to block pattern regressions - Monthly security audit scheduleTrack audit metrics over time:
## Trend Analysis | Date | Patterns | Critical Gaps | High Gaps | Total Gaps | |------|----------|---------------|-----------|------------| | 2025-12-01 | 45 | 28 | 102 | 215 | | 2026-01-01 | 48 | 24 | 95 | 205 | | 2026-01-08 | 52 | 21 | 90 | 202 | **Trend**: ✅ Improving (gaps decreasing)Save audit artifacts:
mkdir -p .claude/audits/2026-01-08/ mv .claude/audits/baseline-*.json .claude/audits/2026-01-08/ mv .claude/audits/gaps-*.md .claude/audits/2026-01-08/ cp priority-matrix.md .claude/audits/2026-01-08/
Output:
- Executive summary written
- Findings documented
- Recommendations prioritized
- Audit artifacts saved
- Report shared with team
Quick Reference: Audit Checklist
┌─────────────────────────────────────────────────────────┐
│ SAFETY PATTERN AUDIT CHECKLIST │
├─────────────────────────────────────────────────────────┤
│ │
│ ☐ Phase 1: Baseline captured │
│ ☐ Phase 2: Gap analyzer run │
│ ☐ Phase 3: Regression tests passed │
│ ☐ Phase 4: Gaps prioritized (P0/P1/P2/P3) │
│ ☐ Phase 5: Report written & shared │
│ │
│ ☐ No regressions detected │
│ ☐ No false positives found │
│ ☐ P0 gaps identified (<3 acceptable) │
│ ☐ Action plan with owners assigned │
│ ☐ Audit artifacts saved │
└─────────────────────────────────────────────────────────┘
Audit Frequency Recommendations
Pre-Release Audit (Mandatory)
- Run before every release
- Focus: Regressions + critical gaps
- Duration: 1 hour
- Gate: Must pass to release
Monthly Security Audit (Recommended)
- Full 5-phase audit
- Focus: Comprehensive gap analysis
- Duration: 2-3 hours
- Output: Trend analysis + roadmap
Post-Incident Audit (As needed)
- Triggered by vulnerability reports
- Focus: Similar patterns + root cause
- Duration: 1-2 hours
- Output: Immediate fixes + prevention
Audit Report Template
See examples/audit-report-template.md for a complete template.
Key sections:
- Executive Summary
- Baseline Metrics
- Gap Analysis Findings
- Regression Test Results
- Priority Matrix
- Recommendations
- Trend Analysis
- Next Steps
Success Criteria
Audit is complete when:
- ✅ All 5 phases executed
- ✅ Baseline captured and documented
- ✅ Gap report generated
- ✅ No regressions found
- ✅ Gaps prioritized (P0/P1/P2/P3)
- ✅ P0 gaps ≤ 3 (or documented exceptions)
- ✅ Report written with recommendations
- ✅ Artifacts saved for historical tracking
- ✅ Action items assigned with owners
Integration with Gap Analyzer
The auditor skill wraps the gap analyzer tool with:
- Baseline comparison over time
- Priority assessment framework
- Reporting and tracking
- Action planning
Gap Analyzer (automated):
- Finds technical gaps
- Categorizes by type
- Rates severity
Auditor Skill (human-guided):
- Assesses business impact
- Prioritizes fixes
- Tracks trends
- Creates action plans
Common Findings & Solutions
Finding: High critical gap count
Typical: 20+ critical gaps Solution: Focus on P0 subset (3-5 gaps), tackle incrementally
Finding: Argument order gaps dominate
Typical: 60%+ are argument order Solution: Use flexible regex patterns with optional groups
Finding: Platform coverage incomplete
Typical: 30+ PowerShell gaps Solution: Create platform-specific patterns with shell_specific field
Finding: Test coverage low
Typical: <50% pass rate Solution: Expand test suite before adding patterns
Getting Help
Resources:
- Gap Analyzer:
./scripts/analyze-pattern-gaps.py --help - Baseline Tool:
./scripts/capture-safety-baseline.sh - Regression Checker:
./scripts/check-safety-regressions.sh - Pattern Developer Skill:
.claude/skills/safety-pattern-developer/
If stuck:
- Review
examples/audit-report-template.md - Check previous audit artifacts in
.claude/audits/ - Consult gap analyzer design doc
- Ask in #safety-patterns channel
This skill provides systematic auditing to maintain high safety pattern quality and catch vulnerabilities early.