quality-check

@peterkc/acf

SKILL.md

name: quality-check
description: Automated quality evaluation for code and documentation outputs. Use when reviewing code quality, running self-critique loops, evaluating substantial outputs, or when user mentions quality, review, critique, evaluate, check, validate, or score.

Quality Check Skill

Automated quality evaluation for code and documentation outputs

Overview

Provides structured quality evaluation workflows for code reviews and self-critique, ensuring outputs meet quality standards before delivery.

Core capabilities:

  • Code review with systematic evaluation flow
  • Self-critique loop for substantial outputs
  • Automated scoring against quality dimensions
  • Integration with PRAGMATIC quality framework
  • Template-driven review processes

When to Use This Skill

Auto-invoked when:

  • Before commit (for code quality gate)
  • After substantial output produced (>200 lines code/docs)
  • User mentions "review", "check quality", "validate"
  • Before PR creation

Manual invocation:

  • Mid-development quality checks
  • Documentation review
  • Architecture decision evaluation
  • Output refinement loops

Capabilities

1. Code Review Flow

Systematic evaluation of code changes before commit:

python scripts/code_review_flow.py --files <changed-files>

Evaluation steps:

  1. Context - Understand requirements and scope
  2. Critical Path - Inspect main execution path logic
  3. Edge Cases - Check error handling, null paths, config changes
  4. Tests - Verify test coverage for new behavior
  5. Quality - Check clarity, naming, architecture alignment

Output format:

Findings:
1. [Severity] file:line – issue summary + impact + fix suggestion

Questions:
- Clarifications needed

Next Steps:
- Actions before approval
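
For illustration, a finding could be represented and rendered like this; the Finding class and render_findings helper are hypothetical sketches, not part of the shipped scripts:

from dataclasses import dataclass

# Hypothetical sketch of the findings format above; not part of the shipped scripts.
@dataclass
class Finding:
    severity: str   # "HIGH", "MEDIUM", or "LOW"
    location: str   # "file:line"
    summary: str
    impact: str
    fix: str

def render_findings(findings: list[Finding]) -> str:
    lines = ["Findings:"]
    for i, f in enumerate(findings, start=1):
        lines.append(f"{i}. [{f.severity}] {f.location} – {f.summary}")
        lines.append(f"   Impact: {f.impact}")
        lines.append(f"   Fix: {f.fix}")
    return "\n".join(lines)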

2. Self-Critique Loop

Iterative quality improvement for substantial outputs:

python scripts/self_critique.py --input <output-file> --criteria clarity,completeness,actionability

Workflow:

  1. Set criteria (Clarity, Completeness, Actionability, Accuracy, Relevance)
  2. Score 1-10 (anything <8 triggers improvement)
  3. List concrete fixes
  4. Revise and rescore
  5. Stop when all scores ≥8
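
A minimal sketch of that loop, where score and revise stand in for the skill's internal scoring and revision steps (both are assumptions, passed in as callables):

# Sketch of the self-critique loop. `score` returns {criterion: 1-10};
# `revise` rewrites the text to address the listed weak criteria.
def critique_loop(text, criteria, score, revise, threshold=8, max_iterations=3):
    scores = {}
    for _ in range(max_iterations):
        scores = score(text, criteria)
        if all(s >= threshold for s in scores.values()):
            break                                   # all dimensions pass
        weak = [c for c, s in scores.items() if s < threshold]
        text = revise(text, weak)                   # apply concrete fixes, then rescore
    return text, scores                             # 2-3 cycles typical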

Benefits:

  • Catches issues before user sees them
  • Systematic improvement process
  • Prevents endless polishing (2-3 cycles typical)

3. Quality Scoring

Automated scoring against quality dimensions:

python scripts/score_output.py --input <file> --dimensions clarity,completeness,actionability,accuracy,relevance

Scoring dimensions:

  • Clarity (1-10): Easy to understand, well-structured, jargon-free
  • Completeness (1-10): Addresses all requirements, no gaps
  • Actionability (1-10): Clear next steps, specific guidance
  • Accuracy (1-10): Factually correct, properly researched
  • Relevance (1-10): Focused on user needs, no tangents

Threshold: ≥8/10 on all dimensions = ready for delivery

Workflows

Workflow 1: Pre-Commit Code Review

Pattern: Systematic quality gate before commits

trigger: before-commit
workflow:
  1. Detect uncommitted changes
  2. Run code_review_flow.py
  3. Evaluate:
       - Critical path logic
       - Edge case handling
       - Test coverage
       - Code quality
  4. If blockers found:
       - Present to user
       - Block commit
  5. If approved:
       - Proceed to commit
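
One way to wire this gate into git is a pre-commit hook that shells out to the review script. The sketch below assumes code_review_flow.py exits non-zero when blocking findings remain, which the skill does not explicitly guarantee:

#!/usr/bin/env python3
# .git/hooks/pre-commit (sketch) – assumes a non-zero exit code signals blockers.
import subprocess
import sys

result = subprocess.run(
    ["python", "scripts/code_review_flow.py", "--auto"],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print("Commit blocked: resolve HIGH findings first.", file=sys.stderr)
    sys.exit(1)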

Integration with /commit:

Quality gates delegation pattern:

  1. /commit (orchestration):

    • Classifies changes (docs/config/code/mixed)
    • Determines if quality gates needed
    • Delegates to language skill
  2. Language skill (python/resources/quality-gates.md):

    • Executes domain-specific checks
    • ruff → mypy → pytest (Python)
    • Captures evidence
    • Returns pass/fail
  3. This skill (quality-check):

    • Provides evidence format standards
    • Scoring rubrics for quality dimensions
    • Self-critique workflows for substantial outputs

Evidence format (consistent across all quality gates):

✓ Linting: All checks passed (ruff)
✓ Type checking: No issues found (mypy)
✓ Tests: 10 passed in 2.5s (pytest)
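
As a rough sketch, a language skill could capture that evidence by running each tool and formatting one line per check; the commands and wording here are illustrative assumptions, not the actual quality-gates implementation:

import subprocess

# Run one check and return an evidence line in the "✓ Check: Result (tool)" pattern.
def run_gate(label: str, tool: str, cmd: list[str]) -> str:
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return f"✓ {label}: All checks passed ({tool})"
    return f"✗ {label}: Failed ({tool})"  # failure symbol is an assumption

evidence = [
    run_gate("Linting", "ruff", ["ruff", "check", "."]),
    run_gate("Type checking", "mypy", ["mypy", "."]),
    run_gate("Tests", "pytest", ["pytest", "-q"]),
]
print("\n".join(evidence))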

Separation of concerns:

  • commit.md: Workflow orchestration
  • python skill: Python-specific quality checks
  • quality-check skill: Evidence standards + scoring frameworks

Workflow 2: Self-Critique After Output

Pattern: Iterative improvement of substantial outputs

trigger: substantial-output-produced
workflow:
  1. Detect output size (>200 lines)
  2. Run self_critique.py with standard criteria
  3. Score output (1-10 per dimension)
  4. If any score <8:
       - List specific fixes
       - Apply improvements
       - Rescore
  5. Repeat until all scores ≥8
  6. Deliver refined output

Example criteria:

  • Documentation: Clarity, Completeness, Examples, Accuracy
  • Code: Correctness, Maintainability, Test Coverage, Performance
  • Architecture: Scalability, Reversibility, Orthogonality

Workflow 3: Documentation Quality Check

Pattern: Validate documentation before publishing

trigger: documentation-complete
workflow:
  1. Run score_output.py with doc criteria
  2. Check:
       - Clarity (well-structured, readable)
       - Completeness (all sections present)
       - Accuracy (verified facts, current info)
       - Examples (code samples work)
       - Links (no broken references)
  3. If score <8:
       - Fix identified issues
       - Re-run scoring
  4. If score ≥8:
       - Approve for publishing

Evidence Format Standards

Display format (consistent across commits/PRs):

✓ Linting: All checks passed (ruff/eslint)
✓ Type checking: No issues found (mypy)
✓ Tests: 10 passed in 2.5s (pytest/jest)
✓ Build: Success (0 errors)

Symbol: Use ✓ (text checkmark U+2713), NOT ✅ (emoji checkmark)

Pattern: ✓ Check: Result (tool)

Why:

  • Consistent with NO emoji rule (pragmatic/agent-output-standards.md)
  • Plain text (accessible, parseable by scripts)
  • Professional tone
  • Aligns with conventional commit standards

Include in:

  • Commit messages (after running quality checks)
  • PR descriptions (show validation passed)
  • Validation reports (consistent format)

Why this matters:

  • Verifiable outcomes (not "looks good")
  • Builds confidence through evidence
  • Consistent across all quality gates
  • Enables automation (parseable format)
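
Because the pattern is fixed, the lines are easy to consume from scripts; this sketch assumes only the ✓ Check: Result (tool) shape shown above:

import re

# Matches lines of the form "✓ Check: Result (tool)".
EVIDENCE_RE = re.compile(r"^✓ (?P<check>[^:]+): (?P<result>.+) \((?P<tool>[^)]+)\)$")

def parse_evidence(report: str) -> list[dict]:
    entries = []
    for line in report.splitlines():
        match = EVIDENCE_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries

print(parse_evidence("✓ Tests: 10 passed in 2.5s (pytest)"))
# -> [{'check': 'Tests', 'result': '10 passed in 2.5s', 'tool': 'pytest'}]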

Scripts Reference

scripts/code_review_flow.py

Purpose: Systematic code review with structured output

Usage:

# Review uncommitted files
python scripts/code_review_flow.py --auto

# Review specific files
python scripts/code_review_flow.py --files src/auth/*.ts

# With custom review template
python scripts/code_review_flow.py --template resources/review-template.yaml

Options:

  • --auto: Auto-detect changed files
  • --files: Specific file patterns
  • --template: Custom review template
  • --output: Save findings to file

Review checklist:

  • Requirements understood
  • Critical path logic correct
  • Edge cases handled
  • Tests cover new behavior
  • Code quality (naming, clarity, architecture)

scripts/self_critique.py

Purpose: Iterative quality improvement loop

Usage:

# Auto-critique last output
python scripts/self_critique.py --auto

# Critique specific file
python scripts/self_critique.py --input output.md

# Custom criteria
python scripts/self_critique.py --input doc.md --criteria clarity,examples,accuracy

# Max iterations
python scripts/self_critique.py --input doc.md --max-iterations 3

Options:

  • --auto: Critique last substantial output
  • --input: File to critique
  • --criteria: Quality dimensions (comma-separated)
  • --max-iterations: Prevent endless polishing (default: 3)
  • --threshold: Minimum score to pass (default: 8)

Output:

Self-Evaluation (Iteration 1):
- Clarity: 7/10 (needs better structure)
- Completeness: 9/10 (good)
- Actionability: 6/10 (missing next steps)
- Accuracy: 10/10 (verified)
- Relevance: 8/10 (focused)

Improvements needed:
1. Add section headings for better structure
2. Include explicit "Next Steps" section
3. Add code examples for clarity

[Apply fixes...]

Self-Evaluation (Iteration 2):
- Clarity: 9/10 (improved)
- Completeness: 9/10 (good)
- Actionability: 9/10 (fixed)
- Accuracy: 10/10 (verified)
- Relevance: 8/10 (focused)

✓ All scores ≥8. Ready for delivery.

scripts/score_output.py

Purpose: Score output against quality dimensions

Usage:

# Score with default dimensions
python scripts/score_output.py --input output.md

# Custom dimensions
python scripts/score_output.py --input code.py --dimensions correctness,maintainability,performance

# Output as JSON
python scripts/score_output.py --input doc.md --format json

# Compare before/after
python scripts/score_output.py --input doc-v1.md --compare doc-v2.md

Options:

  • --input: File to score
  • --dimensions: Quality dimensions to evaluate
  • --format: Output format (text, json, yaml)
  • --compare: Compare scores with another file
  • --rubric: Custom scoring rubric

Output:

{
  "file": "output.md",
  "overall_score": 8.4,
  "dimensions": {
    "clarity": {
      "score": 9,
      "notes": "Well-structured with clear headings"
    },
    "completeness": {
      "score": 8,
      "notes": "Covers all requirements"
    },
    "actionability": {
      "score": 8,
      "notes": "Clear next steps provided"
    },
    "accuracy": {
      "score": 9,
      "notes": "Facts verified"
    },
    "relevance": {
      "score": 8,
      "notes": "Focused on user needs"
    }
  },
  "pass": true,
  "threshold": 8
}

Resources

resources/review-template.yaml

Code review template with evaluation criteria:

review_template:
  context:
    - "Understand requirements and scope"
    - "Review PR description or ticket"
    - "Identify linked issues or dependencies"

  critical_path:
    - "Inspect main execution path first"
    - "Verify logic correctness"
    - "Check for obvious bugs"

  edge_cases:
    - "Error handling for failures"
    - "Null/undefined checks"
    - "Configuration changes impact"
    - "Boundary conditions"

  tests:
    - "Existing tests cover new behavior"
    - "Edge cases have tests"
    - "Tests are maintainable"

  quality:
    - "Clear variable/function names"
    - "Follows project conventions"
    - "Architecture alignment (PRAGMATIC principles)"
    - "No duplication or magic values"

resources/scoring-rubric.yaml

Quality scoring rubric for different output types:

# Code Quality Rubric
code:
  correctness:
    10: "Logic correct, all edge cases handled"
    8: "Core logic correct, minor edge cases missed"
    6: "Logic correct, major edge cases missed"
    4: "Logic flawed but fixable"
    2: "Fundamental logic errors"

  maintainability:
    10: "Clear, well-structured, follows conventions"
    8: "Generally clear, minor inconsistencies"
    6: "Understandable but needs improvement"
    4: "Confusing structure or naming"
    2: "Unmaintainable"

  test_coverage:
    10: "Comprehensive tests including edge cases"
    8: "Good coverage, minor gaps"
    6: "Basic coverage, significant gaps"
    4: "Minimal testing"
    2: "No tests"

# Documentation Quality Rubric
documentation:
  clarity:
    10: "Crystal clear, well-structured, scannable"
    8: "Clear with minor structure improvements possible"
    6: "Understandable but dense or poorly organized"
    4: "Confusing structure, hard to follow"
    2: "Incomprehensible"

  completeness:
    10: "All sections present, nothing missing"
    8: "Core content complete, minor gaps"
    6: "Significant sections missing"
    4: "Major gaps"
    2: "Skeleton only"

  examples:
    10: "Comprehensive, working examples for all use cases"
    8: "Good examples, minor gaps"
    6: "Basic examples, significant gaps"
    4: "Minimal examples"
    2: "No examples"

  accuracy:
    10: "All facts verified, up-to-date"
    8: "Generally accurate, minor errors"
    6: "Some inaccuracies"
    4: "Multiple errors"
    2: "Fundamentally incorrect"

# Architecture Decision Rubric
architecture:
  scalability:
    10: "Scales seamlessly, proven patterns"
    8: "Scales well, minor concerns"
    6: "Scalability concerns exist"
    4: "Poor scalability"
    2: "Will not scale"

  reversibility:
    10: "Fully reversible, clear exit path"
    8: "Mostly reversible"
    6: "Some irreversible aspects"
    4: "Difficult to reverse"
    2: "Irreversible"

  orthogonality:
    10: "Fully decoupled, independent concerns"
    8: "Well-separated with minor coupling"
    6: "Some coupling concerns"
    4: "Significant coupling"
    2: "Tightly coupled"

Integration with Optimal Workflow

This Skill integrates with the optimal workflow pattern:

Quality Gate Pattern (from optimal-workflow.md):

Session → Scope → Execute → **quality-check** → commit → Notes
                                    ↑
                              (Auto-invoked here)

Code Review Before Commit (MANDATORY):

Before /commit:
  1. quality-check Skill auto-invokes
  2. Run code_review_flow.py
  3. Evaluate quality dimensions
  4. Block if critical issues found
  5. Approve if standards met

Auto-Detection Triggers

This Skill loads automatically when:

  1. Before commit:

    • User runs /commit
    • Uncommitted changes detected
  2. After substantial output:

    • Output produced >200 lines
    • User completes major deliverable
  3. Quality keywords:

    • "review quality"
    • "check this"
    • "how does this look"
    • "critique my work"
    • "validate output"
  4. Documentation finalization:

    • User indicates docs are "done"
    • PR description written

Best Practices

Do's:

  • Always run code review before committing
  • Set meaningful criteria for self-critique
  • Stop after 2-3 iterations (avoid endless polishing)
  • Focus improvements on low-scoring dimensions
  • Document quality standards in project rubric

Don'ts:

  • Skip review for "quick fixes" (that's when bugs sneak in)
  • Polish endlessly (diminishing returns after 2-3 cycles)
  • Use generic criteria (customize for context)
  • Ignore scores <8 without addressing them
  • Apply fixes without understanding root cause

Examples

Example 1: Pre-Commit Code Review (Auto-Invoke)

Scenario: User runs /commit, quality-check auto-invokes

# Auto-invoked by /commit
python scripts/code_review_flow.py --auto

# Output:
Code Review Results:

Findings:
1. [HIGH] src/auth/login.ts:42 – Missing error handling for async token refresh
   Impact: Users see unhandled promise rejection
   Fix: Add try-catch around token refresh call

2. [MEDIUM] src/utils/cache.ts:78 – Magic number 3600 (cache TTL)
   Impact: Hard to configure, buried in code
   Fix: Extract to config constant CACHE_TTL_SECONDS

Questions:
- Should token refresh retry on failure?

Next Steps:
- Fix HIGH finding before commit
- Consider config extraction for MEDIUM finding
- Clarify retry behavior

BLOCKED: Resolve HIGH findings before committing.

Example 2: Self-Critique Loop

Scenario: Documentation written, needs quality check

python scripts/self_critique.py --input ARCHITECTURE.md --criteria clarity,completeness,examples,accuracy

# Iteration 1:
Self-Evaluation:
- Clarity: 7/10 (needs better structure)
- Completeness: 8/10 (good)
- Examples: 5/10 (missing code samples)
- Accuracy: 9/10 (facts verified)

Improvements:
1. Add section headings for better scan-ability
2. Include code examples for each API
3. Add diagram for architecture overview

[Apply fixes...]

# Iteration 2:
Self-Evaluation:
- Clarity: 9/10 (much improved)
- Completeness: 8/10 (good)
- Examples: 8/10 (added samples)
- Accuracy: 9/10 (verified)

✓ All scores ≥8. Ready for publishing.

Example 3: Quality Scoring Comparison

Scenario: Compare quality before/after refactoring

python scripts/score_output.py --input src/api/v1.ts --compare src/api/v2.ts

# Output:
Quality Score Comparison:

src/api/v1.ts:
  Correctness: 6/10
  Maintainability: 5/10
  Test Coverage: 7/10
  Performance: 6/10
  Overall: 6.0/10

src/api/v2.ts:
  Correctness: 9/10 (+3)
  Maintainability: 9/10 (+4)
  Test Coverage: 8/10 (+1)
  Performance: 8/10 (+2)
  Overall: 8.5/10 (+2.5)

✓ Refactoring improved quality significantly.

Troubleshooting

Issue: Skill not auto-invoking before commit

  • Check: /commit command includes quality-check integration
  • Verify: Quality-check listed in SKILL.md trigger-keywords
  • Debug: Manually run python scripts/code_review_flow.py --auto

Issue: Self-critique stuck in endless loop

  • Check: Using --max-iterations flag
  • Adjust: Lower the threshold or relax the criteria
  • Solution: Accept "good enough" after 2-3 iterations

Issue: Scores seem arbitrary

  • Check: Review resources/scoring-rubric.yaml for definitions
  • Customize: Adjust rubric for project-specific standards
  • Verify: Scoring aligns with team quality expectations

Related

Skills:

  • coderabbit - Expert code review automation
  • code-reviewer (agent) - Comprehensive review with project context
  • git - Commit message validation

Commands:

  • /commit - Auto-invokes this Skill for quality gate
  • /pr-review - PR review workflow

Framework:

  • quality-check/SKILL.md § Quality Gate Pattern - Quality gate patterns (Step 4)
  • PRAGMATIC.md - Quality framework principles

Related Skills

introspection: For reasoning validation and cognitive error detection. Quality-check validates output artifacts; introspection validates reasoning processes.

performance: For runtime performance profiling. Quality-check gates code quality before commit; performance debugs execution bottlenecks.


The Skill auto-loads via progressive disclosure - it only appears when quality checks are needed.