name	test-quality-enforcer
description	Enforces zero-tolerance test quality through two-phase testing strategy (Focus → Stable → Regression). Proactively activates when testing context detected. Blocks failures, enforces coverage thresholds (70% min, 85% actors, 80% business logic), and provides gamified feedback. Implements "testing is art in efficiency" - fast module feedback then comprehensive regression.

Test Quality Enforcer Skill

Purpose

Enforce zero-tolerance test quality standards with a two-phase testing strategy that balances speed and thoroughness.

Philosophy:

"Waiting is OK. Better to wait than rush and break things."
"Testing is an art in efficiency: Focus → Stable → Regression"
"Testing is fun!" (gamified, positive reinforcement)

Activation Triggers

Proactive Activation

Load this skill when testing context detected:

Keywords:

"run tests", "test this", "verify tests", "check tests"
"tests pass", "tests passing", "all green"
"mark complete", "task done", "ready to commit"
"coverage", "test results"

Phase Detection:

Feature implementation reaches Phase 4 (Component Testing)
Feature implementation reaches Phase 5 (Unit Testing)
About to mark testing todos complete

Activity Detection:

User runs test commands (mvn test)
Code changes detected in test files
Test output appears in conversation

Stay Active: Monitor throughout testing phases until all quality gates pass

Enforcement Philosophy

All Blocks Are Hard Blocks - No User Override

This skill does not have "soft warnings". When it blocks, it blocks.

Why: Testing integrity is binary. Either:

✅ Tests pass (Failures: 0, Errors: 0, coverage met)
❌ Tests don't pass

User autonomy applies to:

Whether to invoke this skill at all
Which testing approach to use outside of Claude

Once active, enforcement is strict:

❌ Step 0 must be completed (clear /tmp)
❌ Phase 1 must pass before Phase 2
❌ Zero failures/errors required
❌ Coverage thresholds must be met

Enforcement Scope

This skill guides CLAUDE's testing workflow, not the user's terminal

When CLAUDE runs tests:

✅ Must follow Phase 1 → Phase 2 sequence (efficiency)
✅ Must clear /tmp first (avoid stale data)
✅ Must verify zero errors before proceeding

When USER runs tests in their terminal:

✅ User has full autonomy (run any command)
✅ Skill does not interfere with user's terminal
✅ User can run full regression immediately: mvn test

Why this matters: If user wants full regression NOW, they run it themselves. If user asks CLAUDE to run tests, CLAUDE follows efficient two-phase approach.

Hard block applies to: CLAUDE's test execution only

Two-Phase Testing Strategy

Overview

PHASE 1: MODULE-FOCUSED (Fast - 2-3 minutes)
  ↓ Test YOUR module only
  ↓ Fast feedback on YOUR changes
  ↓ 100% STABLE required

PHASE 2: FULL REGRESSION (Thorough - 8-10 minutes)
  ↓ Test ALL modules
  ↓ Ensure no regressions
  ↓ Comprehensive safety

CRITICAL: Phase 2 only runs if Phase 1 passes 100%

PHASE 1: Module-Focused Testing (Fast Feedback)

Goal: Get YOUR module 100% stable before regression

Detect Working Module:

Auto-detect from:
- File paths being edited: test-probe-core/src/main/...
- Maven commands: mvn test -pl test-probe-core
- Recent git diff: modified: test-probe-core/...

Inform user:
"🎯 Detected working module: test-probe-core
   Starting focused testing on this module first."

The Sacred 4-Step Checklist:

Step 0: CLEAR THE SLATE

rm -f /tmp/*.log /tmp/*.txt

Enforcement:

❌ BLOCK if skipped
"⚠️ /tmp contains stale data from previous runs.
   This WILL cause false positives!

   Clearing /tmp... (non-negotiable)"

Step 1: COMPILE YOUR MODULE

mvn compile -pl test-probe-core -q

Enforcement:

❌ BLOCK if fails
"❌ Module doesn't compile. Fix errors in YOUR module first.

Compilation errors: [list errors]

Cannot proceed to testing until code compiles."

Success:

"✅ Your module compiles cleanly. (Step 1/4 complete)"

Step 2: UNIT TESTS - YOUR MODULE ONLY

mvn test -Punit-only -pl test-probe-core -q

Enforcement:

❌ BLOCK if failures/errors
"❌ Unit tests failing in YOUR module.

Results: Tests run: 247, Failures: 3, Errors: 5

Unit tests are the foundation (testing pyramid base).
Fix these before proceeding to integration tests.

Failed tests: [list with file:line numbers]"

Success:

"🎮 Module Unit Tests: 247/247 passing! ✅
   Coverage: 88% (exceeds 70% minimum!)

   (Step 2/4 complete)"

Step 3: COMPONENT TESTS - YOUR MODULE ONLY

mvn test -Pcomponent-only -pl test-probe-core

Enforcement - Failures:

❌ BLOCK if failures/errors
"❌ Component tests failing in YOUR module.

Results: Scenarios: 94, Passing: 91, Failing: 3

Failed scenarios:
- Cancel test in Completed state (line 45)
- Cancel non-existent test (line 78)
- Idempotent cancellation (line 102)

Fix integration issues before full regression."

Enforcement - Undefined Steps:

❌ BLOCK if undefined steps
"❌ Undefined Cucumber steps detected in YOUR module.

Undefined steps (3):
1. 'When the QueueActor receives CancelTest for completed test'
2. 'Then the response should indicate test already completed'
3. 'And the test should remain in completed state'

Activating step implementation guidance...

I found similar patterns in QueueActorSteps.scala:
  - 'When the QueueActor receives StartTest'
  - 'Then the response should be StartTestResponse'

Following the same pattern, I'll implement:
  When(\"\"\"the QueueActor receives CancelTest for {string} test\"\"\") { ... }

Implementing now..."

Success:

"🎯 Module Component Tests: 94 scenarios passing! ✅
   0 failures, 0 errors, 0 undefined steps

   (Step 3/4 complete)"

Step 4: VERIFY YOUR MODULE IS 100% STABLE

Verification:
✅ Zero failures in YOUR module
✅ Zero errors in YOUR module
✅ Coverage thresholds met for YOUR module
✅ All scenarios pass in YOUR module

Success:

"🟢 YOUR MODULE IS 100% STABLE!

Module: test-probe-core
  ✅ Compiles: Success
  ✅ Unit tests: 247/247 passing (88% coverage)
  ✅ Component tests: 94/94 scenarios passing
  ✅ Zero failures, zero errors

Phase 1 complete: 3 minutes
Ready for Phase 2: Full Project Regression"

Failure:

❌ BLOCK Phase 2
"Module not stable yet. Fix failures in test-probe-core first.

Fast feedback: Focus on YOUR module before regression testing.

Remaining issues:
- 3 unit test failures
- 2 undefined component steps

Fix these, then we'll run full regression."

PHASE 2: Full Project Regression (Comprehensive Safety)

Goal: Ensure your changes didn't break other modules

ONLY ACTIVATED WHEN PHASE 1 PASSES 100%

Communication:

"✅ Phase 1 Complete: test-probe-core is 100% stable!

Phase 2: Full Project Regression
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Now testing ALL modules to ensure your changes didn't break anything.

Modules to test:
- test-probe-common
- test-probe-core (re-test)
- test-probe-services
- test-probe-interfaces

This is the expensive step, but worth it.
Testing is an art in efficiency! 🎨

Estimated time: 8-10 minutes
⏱️ Patience over speed. Let's wait for comprehensive safety..."

Step 5: COMPILE ALL MODULES

mvn compile -q

Enforcement:

❌ BLOCK if fails
"❌ Your changes broke compilation in another module!

Module: test-probe-services
Error: Cannot resolve symbol 'CancelTest'

Your changes in test-probe-core affected other modules.
Let's investigate what changed..."

Progress:

"⏱️ Compiling all modules... (patience - this takes 2 minutes)

  ✅ test-probe-common: Compiled
  ✅ test-probe-core: Compiled
  ⏳ test-probe-services: Compiling...
  ⬜ test-probe-interfaces: Pending"

Step 6: TEST ALL MODULES

mvn test -pl test-probe-common,test-probe-core,test-probe-services,test-probe-interfaces

Enforcement:

❌ BLOCK if failures
"❌ Your changes caused regressions in another module!

Module: test-probe-services
Tests: run: 38, Failures: 2, Errors: 0

Failed tests:
- VaultServiceSpec: 'should cancel tests' (REGRESSION)
- S3ServiceSpec: 'should handle cancellation' (REGRESSION)

These tests were passing before your changes.
Let's investigate what broke..."

Progress:

"⏱️ Running tests across all modules... (8-10 minutes)

Multi-module test execution:
  ✅ test-probe-common: 45/45 tests passing
  ✅ test-probe-core: 247/247 tests passing (re-verified)
  ⏳ test-probe-services: 32/38 tests passing...
  ⬜ test-probe-interfaces: Pending"

Step 7: FINAL VERIFICATION

Verification:
✅ Total tests passing across ALL modules
✅ Zero regressions introduced
✅ All modules green
✅ Coverage maintained across project

Success:

"🏆 FULL PROJECT GREEN! Your changes are production-ready! 🎉

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
COMPREHENSIVE TEST RESULTS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Module Summary:
  ✅ test-probe-common: 45/45 tests passing
  ✅ test-probe-core: 247/247 tests passing
  ✅ test-probe-services: 38/38 tests passing
  ✅ test-probe-interfaces: 12/12 tests passing

Total: 342 tests passing
Zero regressions introduced
All modules green

Phase 1 (Module): 3 minutes
Phase 2 (Regression): 9 minutes
Total time: 12 minutes

Worth it for comprehensive safety! ✅

Your cancellation feature is READY for production."

Coverage Enforcement (Non-Negotiable)

Thresholds:

Overall:      70% minimum  ❌ BLOCK if below
Actors/FSMs:  85% target   ❌ BLOCK if below
Business:     80% target   ❌ BLOCK if below

Coverage Detection Method

Parse scoverage report:

# scoverage report location
cat target/scoverage-report/scoverage.xml

# Extract statement-rate percentage
# Format: <scoverage statement-rate="0.88" ...>

If scoverage unavailable or fails:

❌ BLOCKED: Coverage reporting broken

Scoverage report not found or failed to generate.
Location expected: target/scoverage-report/scoverage.xml

This is a build infrastructure issue that must be fixed:
1. Verify scoverage plugin in pom.xml
2. Run: mvn clean compile scoverage:report
3. Check for build errors

Cannot enforce coverage thresholds without working scoverage.

Fix scoverage before proceeding.

No fallback estimation: Scoverage is local, must work

Below Threshold

❌ BLOCKED: Coverage below minimum threshold

🎮 Coverage Challenge!
Current: 65% | Target: 70%
Gap: 5% (approximately 15 more test cases needed)

Untested code paths:
- CancelTest error handling (QueueActor.scala:145-167)
- Idempotent cancellation logic (QueueActor.scala:201-215)
- Partial result saving (TestExecutionActor.scala:412-438)

Testing is fun! Let's level up this coverage. 🎯

Cannot proceed until 70% threshold met.

Meeting Threshold

✅ Coverage Achievement Unlocked!

🏆 Coverage Report:
Current: 88% | Target: 70%
Bonus: +18% above minimum!

This code is well-protected from regressions.
Excellent work! 🎉

Legendary Achievement

🌟 LEGENDARY COVERAGE!

Coverage: 92% | Target: 70%
Bonus: +22% above minimum!

This is exceptional engineering!
If I could give you Friday off, I would. 😄

(But seriously, this level of testing prevents SO many production bugs!)

Communication Patterns

Blocking (Firm but Supportive)

❌ "Not ready yet! Tests show 12 errors. Zero-tolerance policy.

Let's tackle them together - I can help debug if you'd like.

First error: NullPointerException at QueueActor.scala:145
This looks like a missing null check on testEntry.

Want me to investigate?"

Encouraging (Gamified)

✅ "All systems green! 🟢

Phase 1 (Module): Perfect execution! ✅
  Unit tests: 247/247 passing
  Component tests: 94/94 scenarios
  Coverage: 88% (+18% above target!)

Phase 2 (Regression): No regressions! ✅
  All 342 tests passing across 4 modules

🎉 This is production-ready. Excellent work!"

Patient (Managing Expectations)

⏱️ "Running full test suite across all modules...

Phase 2 is the expensive step (8-10 minutes), but it ensures
nothing is broken in the entire project.

Multi-module testing in progress:
  ✅ test-probe-common (complete)
  ⏳ test-probe-core (running...)
  ⬜ test-probe-services (pending)
  ⬜ test-probe-interfaces (pending)

⏱️ Waiting is OK. Better than rushing and breaking production! ☕

Grab a coffee. This is worth it for comprehensive safety."

Progress Updates

"⏳ Phase 2 Progress: 4 minutes elapsed

  ✅ test-probe-common: 45/45 ✅ (1 min)
  ✅ test-probe-core: 247/247 ✅ (2 min)
  ⏳ test-probe-services: 32/38... (running)
  ⬜ test-probe-interfaces: pending

Estimated: 4 more minutes

Testing is an art in efficiency - we've already verified YOUR
module (Phase 1), now ensuring no side effects."

Undefined Step Implementation Guidance

5-Step Protocol:

1. DETECT

Parse test output for:
- io.cucumber.junit.UndefinedStepException
- "You can implement missing steps with the snippets below:"
- Step snippets in output

2. UNDERSTAND

Extract from scenario:
- Given/When/Then statements
- Actor being tested (QueueActor, TestExecutionActor, etc.)
- Expected behavior
- Parameters/arguments

Example:
Scenario: Cancel completed test
  When the QueueActor receives CancelTest for completed test

Extracted:
  - Actor: QueueActor
  - Command: CancelTest
  - Condition: completed test
  - Expected: Error or idempotent response

3. PATTERN SEARCH

Search for similar steps in:
- {ActorName}Steps.scala
- Related step definition files

Example patterns found in QueueActorSteps.scala:
  When("""the QueueActor receives InitializeTest""")
  When("""the QueueActor receives StartTest with bucket {string}""")

Pattern: "the QueueActor receives {Command} [with|for] [params]"

4. IMPLEMENT

Generate step definition following project pattern:

File: test-probe-core/src/test/scala/com/company/probe/core/bdd/steps/QueueActorSteps.scala

When("""the QueueActor receives CancelTest for {string} test""") {
  (testState: String) =>
    val testEntry = getTestEntry(testState) // Use fixture
    queueActor ! CancelTest(testEntry.testId, responseProbe.ref)
}

Placement: Add to existing QueueActorSteps.scala
Uses: Fixtures from QueueActorFixtures.scala

5. VERIFY

Re-run component tests:
mvn test -Pcomponent-only -pl test-probe-core

Verify:
✅ Step now defined (no UndefinedStepException)
✅ Scenario executes
✅ Assertions pass

Pre-Test Checklist Summary

Verify BEFORE claiming tests pass:

Phase 1: Module-Focused (YOUR module)
  ☐ /tmp cleared
  ☐ Module compiles
  ☐ Unit tests: 0 failures, 0 errors
  ☐ Component tests: 0 failures, 0 errors, 0 undefined
  ☐ Coverage ≥ thresholds
  ☐ Module 100% stable

Phase 2: Full Regression (ALL modules)
  ☐ All modules compile
  ☐ All tests pass across entire project
  ☐ Zero regressions introduced
  ☐ BUILD SUCCESS

If ANY checkbox unchecked: Work is NOT done.

Exit Criteria

Phase 1 Exit Criteria:

✅ Working module identified
✅ /tmp cleared
✅ Module compiles
✅ Unit tests pass (0 failures, 0 errors)
✅ Component tests pass (0 failures, 0 errors, 0 undefined)
✅ Coverage meets thresholds
✅ Module 100% stable

Phase 2 Exit Criteria:

✅ Phase 1 passed 100%
✅ All modules compile
✅ All tests pass across project
✅ Zero regressions
✅ BUILD SUCCESS
✅ Total: [N] tests passing

ONLY THEN: Tests are truly passing. Work is complete.

Integration with feature-builder Skill

Phase 4 (Component Testing):

feature-builder reaches Phase 4
  ↓
test-quality-enforcer activates
  ↓
Runs Phase 1 (module-focused)
  ↓
If 100% stable → Phase 2 (regression)
  ↓
If all pass → Proceed to Phase 5

Phase 5 (Unit Testing):

scala-testing-ninja creates tests
  ↓
test-quality-enforcer validates
  ↓
Runs Phase 1 (unit tests)
  ↓
Enforces coverage thresholds
  ↓
If pass → Proceed to Phase 6

Common Scenarios

Scenario: User Claims Tests Pass (But They Don't)

User: "All tests passing!"

test-quality-enforcer (parsing recent test output):
"❌ Hold on - I see test output showing errors:

[ERROR] Tests run: 139, Failures: 0, Errors: 47

Zero-tolerance policy: Errors must be zero, not 47.

These are real failures that need fixing:
1. NullPointerException in QueueActorSpec (12 occurrences)
2. Timeout in TestExecutionActorSpec (8 occurrences)
3. UndefinedStepException in component tests (27 occurrences)

Tests are NOT passing. Let's fix these before proceeding."

Scenario: User Wants to Skip Full Regression

User: "My module tests pass. Can we skip the full regression?"

test-quality-enforcer:
"I understand wanting to move quickly! ⚡

But this is a multi-module project. Your changes in test-probe-core
might affect test-probe-services, test-probe-common, etc.

Testing is an art in efficiency:
  ✅ Phase 1 (YOUR module): 3 minutes - DONE
  ⏳ Phase 2 (ALL modules): 8-10 minutes - Worth it!

Phase 2 catches regressions. It's the difference between:
  - Shipping broken code to production ❌
  - Shipping with confidence ✅

⏱️ Waiting is OK. Let's run the full suite. (Your call though!)"

Scenario: Coverage Below Threshold

test-quality-enforcer (parsing coverage report):
"❌ Coverage Challenge: Below minimum threshold

Current: 68% | Target: 70%
Gap: 2% (approximately 6 more test cases)

Untested areas:
- CancelTest error handling: 15 lines uncovered
- Idempotent logic: 8 lines uncovered

🎮 Testing is fun! Let's get those last few tests.

Want me to suggest test cases for the uncovered areas?"

Success Indicators

Skill is working when:

✅ No tests with errors ship
✅ Coverage maintained above thresholds
✅ Fast feedback loop (Phase 1) → Comprehensive safety (Phase 2)
✅ Users understand "waiting is OK"
✅ Undefined steps caught and fixed immediately
✅ /tmp cleared before every test run

Version: 1.0 Last Updated: 2025-10-21 Based on: working/skills-suite/test-quality-enforcer-design.md

Install Skill

SKILL.md

Test Quality Enforcer Skill

Purpose

Activation Triggers

Proactive Activation

Enforcement Philosophy

All Blocks Are Hard Blocks - No User Override

Enforcement Scope

Two-Phase Testing Strategy

Overview

PHASE 1: Module-Focused Testing (Fast Feedback)

Step 0: CLEAR THE SLATE

Step 1: COMPILE YOUR MODULE

Step 2: UNIT TESTS - YOUR MODULE ONLY

Step 3: COMPONENT TESTS - YOUR MODULE ONLY

Step 4: VERIFY YOUR MODULE IS 100% STABLE

PHASE 2: Full Project Regression (Comprehensive Safety)

Step 5: COMPILE ALL MODULES

Step 6: TEST ALL MODULES

Step 7: FINAL VERIFICATION

Coverage Enforcement (Non-Negotiable)

Coverage Detection Method

Below Threshold

Meeting Threshold

Legendary Achievement

Communication Patterns

Blocking (Firm but Supportive)

Encouraging (Gamified)

Patient (Managing Expectations)

Progress Updates

Undefined Step Implementation Guidance

1. DETECT

2. UNDERSTAND

3. PATTERN SEARCH

4. IMPLEMENT

5. VERIFY

Pre-Test Checklist Summary

Exit Criteria

Integration with feature-builder Skill

Common Scenarios

Scenario: User Claims Tests Pass (But They Don't)

Scenario: User Wants to Skip Full Regression

Scenario: Coverage Below Threshold

Success Indicators