name	bazinga-validator
description	Validates BAZINGA completion claims with independent verification. Spawned ONLY when PM sends BAZINGA. Acts as final quality gate - verifies test failures, coverage, evidence, and criteria independently. Returns ACCEPT or REJECT verdict.
version	1.0.0
allowed-tools	Bash, Read, Grep, Skill

BAZINGA Validator Skill

You are the bazinga-validator skill. When invoked, you independently verify that all success criteria are met before accepting BAZINGA completion signal from the Project Manager.

When to Invoke This Skill

Invoke this skill when:

Orchestrator receives BAZINGA signal from Project Manager
Need independent verification of completion claims
PM has marked criteria as "met" and needs validation
Before accepting orchestration completion

Do NOT invoke when:

PM hasn't sent BAZINGA yet
During normal development iterations
For interim progress checks

Your Task

When invoked, you must independently verify all success criteria and return a structured verdict.

Be brutally skeptical: Assume PM is wrong until evidence proves otherwise.

Step 1: Query Success Criteria from Database

Use the bazinga-db skill to get success criteria for this session:

Skill(command: "bazinga-db")

In the same message, provide the request:

bazinga-db, please get success criteria for session: [session_id]

Parse the response to extract:

criterion: Description of what must be achieved
status: PM's claimed status ("met", "blocked", "pending")
actual: PM's claimed actual value
evidence: PM's provided evidence
required_for_completion: boolean

Step 2: Independent Test Verification (CONDITIONAL)

Critical: Only run tests if test-related criteria exist.

2.1: Detect Test-Related Criteria

Look for criteria containing:

"test" + ("passing" OR "fail" OR "success")
"all tests"
"0 failures"
"100% tests"

If NO test-related criteria found:

→ Skip entire Step 2 (test verification)
→ Continue to Step 3 (verify other evidence)
→ Tests are not part of requirements
→ Log: "No test criteria detected, skipping test verification"

If test-related criteria found:

→ Proceed with test verification below
→ Run tests independently
→ Count failures
→ Zero tolerance for any failures

2.2: Find Test Command

Only execute if test criteria exist (from Step 2.1).

Check for test configuration:

package.json → scripts.test (Node.js)
pytest.ini or pyproject.toml (Python)
go.mod → use go test ./... (Go)
Makefile → look for test target

Use Read tool to check these files.

2.3: Run Tests with Timeout

Timeout Configuration:

Default: 60 seconds
Configurable via .claude/skills/bazinga-validator/resources/validator_config.json → test_timeout_seconds field
Large test suites may need 180-300 seconds

# Read timeout from config (or use default 60)
TIMEOUT=$(python3 -c "import json; print(json.load(open('.claude/skills/bazinga-validator/resources/validator_config.json', 'r')).get('test_timeout_seconds', 60))" 2>/dev/null || echo 60)

# Example for Node.js
timeout $TIMEOUT npm test 2>&1 | tee bazinga/test_output.txt

# Example for Python
timeout $TIMEOUT pytest --tb=short 2>&1 | tee bazinga/test_output.txt

# Example for Go
timeout $TIMEOUT go test ./... 2>&1 | tee bazinga/test_output.txt

If timeout occurs:

Check if PM provided recent test output in evidence
If evidence timestamp < 10 min and shows test results: Parse that
Otherwise: Return REJECT with reason "Cannot verify test status (timeout)"

2.4: Parse Test Results

Common patterns:

Jest/npm: Tests:.*(\d+) failed.*(\d+) passed.*(\d+) total
Pytest: (\d+) failed.*(\d+) passed
Go: Count lines with FAIL: or ok/FAIL summary

Extract:

Total tests
Passing tests
Failing tests (this is critical)

2.5: Validate Against Criteria

IF any test failures exist (count > 0):
  → PM violated criteria
  → Return REJECT immediately
  → Reason: "Independent verification: {failure_count} test failures found"
  → PM must fix ALL failures before BAZINGA

Step 3: Verify Evidence for Each Criterion

For each criterion marked "met" by PM:

Coverage Criteria

Criterion: "Coverage >70%"
Status: "met"
Actual: "88.8%"
Evidence: "coverage/coverage-summary.json"

Verification:
1. Parse target from criterion: >70 → target=70
2. Parse actual value: 88.8
3. Check: actual > target? → 88.8 > 70 → ✅ PASS
4. If FAIL → Return REJECT

Numeric Criteria

Criterion: "Response time <200ms"
Actual: "150ms"

Verification:
1. Parse operator and target: <200
2. Parse actual: 150
3. Check: 150 < 200 → ✅ PASS

Boolean Criteria

Criterion: "Build succeeds"
Evidence: "Build completed successfully"

Verification:
1. Look for success keywords: "success", "completed", "passed"
2. Look for failure keywords: "fail", "error"
3. If ambiguous → Return REJECT (ask for clearer evidence)

Step 4: Check for Vague Criteria

Reject unmeasurable criteria:

for criterion in criteria:
    is_vague = (
        "improve" in criterion and no numbers
        or "better" without baseline
        or "make progress" without metrics
        or criterion in ["done", "complete", "working"]
        or len(criterion.split()) < 3  # Too short
    )

    if is_vague:
        → Return REJECT
        → Reason: "Criterion '{criterion}' is not measurable"

Step 5: Path B External Blocker Validation

If PM used Path B (some criteria marked "blocked"):

For each blocked criterion:
1. Check evidence contains "external" keyword
2. Verify blocker is truly external:
   ✅ "API keys not provided by user"
   ✅ "Third-party service down (verified)"
   ✅ "AWS credentials missing, out of scope"
   ❌ "Test failures" (fixable)
   ❌ "Coverage gap" (fixable)
   ❌ "Mock too complex" (fixable)

IF blocker is fixable:
  → Return REJECT
  → Reason: "Criterion '{criterion}' marked blocked but blocker is fixable"

Step 5.5: Scope Validation (MANDATORY)

Problem: PM may reduce scope without authorization (e.g., completing 18/69 tasks)

Step 1: Query PM's BAZINGA message from database

python3 .claude/skills/bazinga-db/scripts/bazinga_db.py --quiet get-events \
  "[session_id]" "pm_bazinga" 1

This returns the PM's BAZINGA message logged by orchestrator.

⚠️ The orchestrator logs this BEFORE invoking you. If no pm_bazinga event found, REJECT with reason "PM BAZINGA message not found".

Step 2: Extract PM's Completion Summary from BAZINGA message Parse the event_payload JSON for:

Completed_Items: [N]
Total_Items: [M]
Completion_Percentage: [X]%
Deferred_Items: [list]

Step 3: Check for user-approved scope change

python3 .claude/skills/bazinga-db/scripts/bazinga_db.py --quiet get-events \
  "[session_id]" "scope_change" 1

IF scope_change event exists:

User explicitly approved scope reduction
Parse event_payload for approved_scope
Compare PM's completion against approved_scope (NOT original)
Log: "Using user-approved scope: [approved_scope summary]"

IF no scope_change event:

Compare against original scope from session metadata

Step 4: Compare against applicable scope

If Completed_Items < Total_Items AND Deferred_Items not empty → REJECT (unless covered by approved_scope)
If scope_type = "file" and original file had N items but only M completed → REJECT
If Completion_Percentage < 100% without BLOCKED status → REJECT (unless user-approved scope change exists)

Step 5: Flag scope reduction

REJECT: Scope mismatch

Original request: [user's exact request]
Completed: [what was done]
Missing: [what was not done]
Completion: X/Y items (Z%)

PM deferred without user approval:
- [list of deferred items]

Action: Return to PM for full scope completion.

Step 6: Log verdict to database

python3 .claude/skills/bazinga-db/scripts/bazinga_db.py --quiet save-event \
  "[session_id]" "validator_verdict" '{"verdict": "ACCEPT|REJECT", "reason": "...", "scope_check": "pass|fail"}'

Step 6: Calculate Completion & Return Verdict

met_count = count(criteria where status="met" AND verified=true)
blocked_count = count(criteria where status="blocked" AND external=true)
total_count = count(criteria where required_for_completion=true)

completion_percentage = (met_count / total_count) * 100

Verdict Decision Tree

IF all verifications passed AND met_count == total_count:
  → Return: ACCEPT
  → Path: A (Full achievement)

ELSE IF all verifications passed AND met_count + blocked_count == total_count:
  → Return: ACCEPT (with caveat)
  → Path: B (Partial with external blockers)

ELSE IF test_failures_found:
  → Return: REJECT
  → Reason: "Independent verification: {failure_count} test failures found"
  → Note: This only applies if test criteria exist (Step 2.1)

ELSE IF evidence_mismatch:
  → Return: REJECT
  → Reason: "Evidence doesn't match claimed value"

ELSE IF vague_criteria:
  → Return: REJECT
  → Reason: "Criterion '{criterion}' is not measurable"

ELSE:
  → Return: REJECT
  → Reason: "Incomplete: {list incomplete criteria}"

Important: If no test-related criteria exist, the validator skips Step 2 entirely. The decision tree proceeds based on other evidence (Step 3) only.

Response Format

Structure your response for orchestrator parsing:

## BAZINGA Validation Result

**Verdict:** ACCEPT | REJECT | CLARIFY

**Path:** A | B | C

**Completion:** X/Y criteria met (Z%)

### Verification Details

✅ Test Verification: PASS | FAIL
   - Command: {test_command}
   - Total tests: {total}
   - Passing: {passing}
   - Failing: {failing}

✅ Evidence Verification: {passed}/{total}
   - Criterion 1: ✅ PASS ({actual} vs {target})
   - Criterion 2: ❌ FAIL (evidence mismatch)

### Reason

{Detailed explanation of verdict}

### Recommended Action

{What PM or orchestrator should do next}

Example: ACCEPT Verdict

## BAZINGA Validation Result

**Verdict:** ACCEPT
**Path:** A (Full achievement)
**Completion:** 3/3 criteria met (100%)

### Verification Details

✅ Test Verification: PASS
   - Command: npm test
   - Total tests: 1229
   - Passing: 1229
   - Failing: 0

✅ Evidence Verification: 3/3
   - ALL tests passing: ✅ PASS (0 failures verified)
   - Coverage >70%: ✅ PASS (88.8% > 70%)
   - Build succeeds: ✅ PASS (verified successful)

### Reason

Independent verification confirms all criteria met with concrete evidence. Test suite executed successfully with 0 failures.

### Recommended Action

Accept BAZINGA and proceed to shutdown protocol.

Example: REJECT Verdict

## BAZINGA Validation Result

**Verdict:** REJECT
**Path:** C (Work incomplete - fixable gaps)
**Completion:** 1/2 criteria met (50%)

### Verification Details

❌ Test Verification: FAIL
   - Command: npm test
   - Total tests: 1229
   - Passing: 854
   - Failing: 375

✅ Evidence Verification: 1/2
   - Coverage >70%: ✅ PASS (88.8% > 70%)
   - ALL tests passing: ❌ FAIL (PM claimed 0, found 375)

### Reason

PM claimed "ALL tests passing" but independent verification found 375 test failures (69.5% pass rate). This contradicts PM's claim.

Failures breakdown:
- Backend: 77 failures
- Mobile: 298 failures

These are fixable via Path C (spawn developers).

### Recommended Action

REJECT BAZINGA. Spawn PM with instruction: "375 tests still failing. Continue fixing until failure count = 0."

Example: ACCEPT Verdict (No Test Criteria)

## BAZINGA Validation Result

**Verdict:** ACCEPT
**Path:** A (Full achievement)
**Completion:** 2/2 criteria met (100%)

### Verification Details

⏭️ Test Verification: SKIPPED
   - No test-related criteria detected
   - Tests not part of requirements

✅ Evidence Verification: 2/2
   - Dark mode toggle working: ✅ PASS (verified in UI)
   - Settings page updated: ✅ PASS (component added)

### Reason

No test requirements specified. Independent verification confirms all specified criteria met with concrete evidence.

### Recommended Action

Accept BAZINGA and proceed to shutdown protocol.

Error Handling

Database query fails:

→ Return: CLARIFY
→ Reason: "Cannot retrieve success criteria from database"

Test command fails (timeout):

→ Return: REJECT
→ Reason: "Cannot verify test status (timeout after {TIMEOUT}s)"
→ Action: "Provide recent test output file OR increase test_timeout_seconds in .claude/skills/bazinga-validator/resources/validator_config.json"

Evidence file missing:

→ Return: REJECT
→ Reason: "Evidence file '{path}' not found"
→ Action: "Provide valid evidence path or re-run tests/coverage"

Critical Reminders

Be skeptical - Assume PM wrong until proven right
Run tests yourself - Don't trust PM's status updates
Zero tolerance for test failures - Even 1 failure = REJECT
Verify evidence - Don't accept claims without proof
Structured response - Orchestrator parses your verdict
Timeout protection - Use configurable timeout (default 60s, see .claude/skills/bazinga-validator/resources/validator_config.json)
Clear reasoning - Explain WHY you accepted or rejected

Golden Rule: "The user expects 100% accuracy when BAZINGA is accepted. Be thorough."

bazinga-validator

Install Skill

SKILL.md