
moai-workflow-testing

@modu-ai/cc-plugins

Comprehensive development workflow specialist combining TDD, debugging, performance optimization, code review, PR review, and quality assurance into unified development workflows

Install Skill

1. Download skill
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reading through its instructions before using it.

SKILL.md

name moai-workflow-testing
description Comprehensive development workflow specialist combining TDD, debugging, performance optimization, code review, PR review, and quality assurance into unified development workflows
version 2.1.0
category workflow
modularized true
tags workflow, testing, debugging, performance, quality, tdd, review, pr-review
updated 2026-01-06
status active
author MoAI-ADK Team
allowed-tools Read, Write, Edit, Bash, Grep, Glob, mcp__context7__resolve-library-id, mcp__context7__get-library-docs

Development Workflow Specialist

Quick Reference

Unified Development Workflow provides comprehensive development lifecycle management combining TDD, AI-powered debugging, performance optimization, automated code review, and quality assurance into integrated workflows.

Core Capabilities:

  • Test-Driven Development: RED-GREEN-REFACTOR cycle with best practice patterns
  • AI-Powered Debugging: Intelligent error analysis and solution recommendations
  • Performance Optimization: Profiling and bottleneck detection guidance
  • Automated Code Review: TRUST 5 validation framework for quality analysis
  • PR Code Review: Multi-agent pattern with Haiku eligibility check and Sonnet parallel review
  • Quality Assurance: Comprehensive testing and CI/CD integration patterns
  • Workflow Orchestration: End-to-end development process guidance

Workflow Progression: Debug → Refactor → Optimize → Review → Test → Profile. Each stage benefits from AI-powered analysis and recommendations.

When to Use:

  • Complete development lifecycle management
  • Enterprise-grade quality assurance implementation
  • Multi-language development projects
  • Performance-critical applications
  • Technical debt reduction initiatives
  • Automated testing and CI/CD integration
  • Pull request code review automation

Implementation Guide

Core Concepts

Unified Development Philosophy:

  • Integrates all aspects of development into cohesive workflow
  • AI-powered assistance for complex decision-making
  • Industry best practices integration for optimal patterns
  • Continuous feedback loops between workflow stages
  • Automated quality gates and validation

Workflow Components:

Component 1 - AI-Powered Debugging: The debugging component provides intelligent error classification and solution recommendations. When an error occurs, the system analyzes the error type, stack trace, and surrounding context to identify root causes and suggest appropriate fixes. The debugger references current best practices and common error resolution patterns.
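
As an illustration of this classification step, the sketch below maps exception types to the workflow's error categories and bundles the context a fix recommendation would need; the mapping and field names are illustrative assumptions, not part of the skill.

```python
import traceback

# Illustrative mapping from exception types to the workflow's error categories.
ERROR_CATEGORIES = {
    SyntaxError: "syntax",
    TypeError: "runtime",
    ValueError: "runtime",
    KeyError: "logic",
    ConnectionError: "integration",
    TimeoutError: "performance",
}

def classify_error(exc: BaseException) -> dict:
    """Bundle the category, message, and stack trace for downstream analysis."""
    return {
        "category": ERROR_CATEGORIES.get(type(exc), "unknown"),
        "type": type(exc).__name__,
        "message": str(exc),
        # The single-argument form of format_exception requires Python 3.10+.
        "stack_trace": "".join(traceback.format_exception(exc)),
    }

try:
    int("not a number")
except ValueError as exc:
    print(classify_error(exc)["category"])  # runtime
```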

Component 2 - Smart Refactoring: The refactoring component performs technical debt analysis and identifies safe automated transformation opportunities. It evaluates code complexity, duplication, and maintainability metrics to recommend specific refactoring actions with risk assessments.
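
A rough sketch of the complexity side of this analysis, using only the Python standard library to count branch points per function; dedicated tools compute the metric more rigorously.

```python
import ast

# Node types that open an additional execution path.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> dict[str, int]:
    """Approximate cyclomatic complexity: one base path plus one per branch."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

sample = """
def risky(x):
    if x > 0:
        for i in range(x):
            if i % 2:
                x -= 1
    return x
"""
print(cyclomatic_complexity(sample))  # {'risky': 4}
```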

Component 3 - Performance Optimization: The performance component provides real-time monitoring guidance and bottleneck detection. It helps identify CPU-intensive operations, memory leaks, and I/O bottlenecks, then recommends specific optimization strategies based on the identified issues.
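
A minimal profiling sketch using the standard-library cProfile and pstats modules listed in the Technology Stack Reference below; sorting by cumulative time surfaces the most expensive call paths first.

```python
import cProfile
import pstats

def hot_path() -> int:
    return sum(i * i for i in range(200_000))

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

# Print the five most expensive entries by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```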

Component 4 - TDD Cycle Management: The TDD component guides the RED-GREEN-REFACTOR cycle with enhanced test generation. It helps write failing tests first, implement minimal code to pass, then refactor while maintaining test coverage.

Component 5 - Automated Code Review: The code review component applies TRUST 5 framework validation with AI-powered quality analysis. It evaluates code against five trust dimensions and provides actionable improvement recommendations.

TRUST 5 Framework

The TRUST 5 framework is a conceptual quality assessment model with five dimensions. It provides guidance for evaluating code quality; it is not an implemented module.

Dimension 1 - Testability: Evaluate whether the code can be effectively tested. Consider: Are functions pure and deterministic? Are dependencies injectable? Is the code modular enough for unit testing? Scoring ranges from low testability requiring significant refactoring to high testability with excellent test coverage support.

Dimension 2 - Readability: Assess how easily the code can be understood by others. Consider: Are variable and function names descriptive? Is the code structure logical? Are complex operations documented? Scoring evaluates naming conventions, code organization, and documentation quality.

Dimension 3 - Understandability: Evaluate the conceptual clarity of the implementation. Consider: Is the business logic clearly expressed? Are abstractions appropriate? Can a new developer understand the code quickly? This goes beyond surface readability to assess architectural clarity.

Dimension 4 - Security: Assess security posture and vulnerability exposure. Consider: Are inputs validated? Are secrets properly managed? Are common vulnerability patterns avoided (injection, XSS, CSRF)? Scoring evaluates adherence to security best practices.

Dimension 5 - Transparency: Evaluate operational visibility and debuggability. Consider: Is error handling comprehensive? Are logs meaningful and structured? Can issues be traced through the system? Scoring assesses observability and troubleshooting capabilities.

Overall TRUST Score Calculation: The overall TRUST score combines all five dimensions using weighted averaging. Critical issues in any dimension can override the average, ensuring security or testability problems are not masked by high scores elsewhere. A passing score typically requires minimum thresholds in each dimension plus an acceptable weighted average.
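
A minimal sketch of this calculation, assuming illustrative weights, per-dimension floors, and a critical-issue override; the skill does not prescribe these exact numbers.

```python
# Illustrative dimension weights; they sum to 1.0.
DIMENSION_WEIGHTS = {
    "testability": 0.25,
    "readability": 0.15,
    "understandability": 0.15,
    "security": 0.30,
    "transparency": 0.15,
}
MIN_PER_DIMENSION = 0.60  # floor each dimension must clear
PASSING_AVERAGE = 0.80    # bar for the weighted average

def trust_score(scores: dict[str, float], critical_issues: int = 0) -> tuple[float, bool]:
    """Return (weighted average, passed); any critical issue overrides the average."""
    average = sum(scores[d] * w for d, w in DIMENSION_WEIGHTS.items())
    floors_ok = all(scores[d] >= MIN_PER_DIMENSION for d in DIMENSION_WEIGHTS)
    return round(average, 3), critical_issues == 0 and floors_ok and average >= PASSING_AVERAGE

print(trust_score({
    "testability": 0.9, "readability": 0.85, "understandability": 0.8,
    "security": 0.95, "transparency": 0.75,
}))  # (0.87, True)
```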

Basic Workflow Implementation

Debugging Workflow Process (a context-capture sketch follows the list):

  • Step 1: Capture the error with full context including stack trace, environment, and recent code changes
  • Step 2: Classify the error type (syntax, runtime, logic, integration, or performance)
  • Step 3: Analyze the error pattern against known issue databases and best practices
  • Step 4: Generate solution candidates ranked by likelihood of success
  • Step 5: Apply the recommended fix and verify resolution
  • Step 6: Document the issue and solution for future reference
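
A sketch of the context capture in Step 1, bundling the error type, stack trace, and environment details the later steps consume; the field names are illustrative.

```python
import platform
import traceback
from dataclasses import dataclass, field

@dataclass
class ErrorContext:
    error_type: str
    message: str
    stack_trace: str
    python_version: str = field(default_factory=platform.python_version)
    os_name: str = field(default_factory=platform.system)

def capture(exc: BaseException) -> ErrorContext:
    return ErrorContext(
        error_type=type(exc).__name__,
        message=str(exc),
        stack_trace="".join(traceback.format_exception(exc)),  # Python 3.10+
    )

try:
    {}["missing"]
except KeyError as exc:
    ctx = capture(exc)
    print(ctx.error_type, ctx.python_version)
```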

Refactoring Workflow Process (a verification sketch follows the list):

  • Step 1: Analyze the target codebase for code smells and technical debt indicators
  • Step 2: Calculate complexity metrics including cyclomatic complexity and coupling
  • Step 3: Identify refactoring opportunities with associated risk levels
  • Step 4: Generate a refactoring plan with prioritized actions
  • Step 5: Apply refactoring transformations in safe increments
  • Step 6: Verify behavior preservation through test execution
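
A sketch of Steps 5 and 6: apply each increment, re-run the test suite, and stop at the first regression. It assumes pytest is installed and on the PATH; the increment structure is hypothetical.

```python
import subprocess

def tests_pass() -> bool:
    """Run the test suite quietly; a zero exit code means behavior is preserved."""
    return subprocess.run(["pytest", "-q"], capture_output=True).returncode == 0

def apply_increments(increments) -> None:
    """Apply (name, callable) refactoring increments one at a time."""
    for name, apply_change in increments:
        apply_change()
        if not tests_pass():
            raise RuntimeError(f"Refactoring step {name!r} broke the tests; revert it")
        print(f"Step {name!r} preserved behavior")
```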

Performance Optimization Process (a baseline-comparison sketch follows the list):

  • Step 1: Configure profiling for target metrics (CPU, memory, I/O, network)
  • Step 2: Execute profiling runs under representative load conditions
  • Step 3: Analyze profiling results to identify bottlenecks
  • Step 4: Generate optimization recommendations with expected impact estimates
  • Step 5: Apply optimizations in isolation to measure individual effects
  • Step 6: Validate overall performance improvement
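
A sketch of the validation in Step 6, comparing measured metrics against a recorded baseline; the metric name and the 10 percent tolerance are illustrative.

```python
TOLERANCE = 0.10  # allow up to 10% regression before failing

def find_regressions(baseline: dict[str, float], current: dict[str, float]) -> list[str]:
    """Return regressions beyond the tolerance; an empty list means the run passes."""
    regressions = []
    for metric, base in baseline.items():
        change = (current[metric] - base) / base  # higher is worse for time/memory
        if change > TOLERANCE:
            regressions.append(f"{metric}: {change:+.1%} vs baseline")
    return regressions

print(find_regressions({"p95_latency_ms": 120.0}, {"p95_latency_ms": 150.0}))
# ['p95_latency_ms: +25.0% vs baseline']
```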

TDD Cycle Process (a pytest sketch follows the list):

  • RED Phase: Write a failing test that defines the desired behavior. The test should clearly express what the code should do, not how it should do it. Run the test to confirm it fails for the expected reason.
  • GREEN Phase: Write the minimum code necessary to make the test pass. Avoid over-engineering or premature optimization. Focus solely on satisfying the test requirements.
  • REFACTOR Phase: Improve the code structure while keeping all tests passing. Apply design patterns, extract common functionality, and improve naming. Run tests after each refactoring step.
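
The pytest sketch below walks the cycle for a hypothetical slugify function; it illustrates the phases and is not code from the skill.

```python
# RED: write the failing test first; running pytest now fails because
# slugify does not exist yet.
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# GREEN: the minimum implementation that makes the test pass.
def slugify(text: str) -> str:
    return text.strip().lower().replace(" ", "-")

# REFACTOR: improve structure (e.g. punctuation handling) while re-running
# pytest after every change to keep the test green.
```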

Code Review Process:

  • Step 1: Scan the codebase to identify files requiring review
  • Step 2: Apply TRUST 5 framework analysis to each file
  • Step 3: Identify critical issues requiring immediate attention
  • Step 4: Calculate per-file and aggregate quality scores
  • Step 5: Generate actionable recommendations with priority rankings
  • Step 6: Create a summary report with improvement roadmap

PR Code Review Process:

  • Step 1: Eligibility Check using Haiku agent to filter PRs (skip closed, draft, already reviewed, trivial changes)
  • Step 2: Gather Context by finding CLAUDE.md files in modified directories and summarizing PR changes
  • Step 3: Parallel Review Agents using five Sonnet agents for independent analysis (CLAUDE.md compliance, obvious bugs, git blame context, previous comments, code comment compliance)
  • Step 4: Confidence Scoring from 0-100 for each detected issue (0: false positive, 25: somewhat confident, 50: moderately confident, 75: highly confident, 100: absolutely certain)
  • Step 5: Filter and Report by removing issues below the 80-point confidence threshold and posting via the gh CLI

Common Use Cases

Enterprise Development Workflow: For enterprise applications, the workflow integrates quality gates at each stage. Before deployment, the code must pass minimum TRUST score thresholds, have zero critical issues identified, and meet required test coverage percentages. The quality gates configuration specifies minimum trust scores (typically 0.85), maximum allowed critical issues (typically zero), and required coverage levels (typically 80 percent).
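
A sketch of that gate check using the typical thresholds mentioned above; the key names are illustrative.

```python
QUALITY_GATES = {
    "min_trust_score": 0.85,
    "max_critical_issues": 0,
    "min_coverage_percent": 80,
}

def gate_passes(trust_score: float, critical_issues: int, coverage_percent: float) -> bool:
    """Check a candidate build against every deployment gate at once."""
    return (
        trust_score >= QUALITY_GATES["min_trust_score"]
        and critical_issues <= QUALITY_GATES["max_critical_issues"]
        and coverage_percent >= QUALITY_GATES["min_coverage_percent"]
    )

print(gate_passes(trust_score=0.91, critical_issues=0, coverage_percent=84.0))  # True
```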

Performance-Critical Applications: For performance-sensitive systems, the workflow emphasizes profiling and optimization stages. Performance thresholds define maximum acceptable response times, memory usage limits, and minimum throughput requirements. The workflow provides percentage improvement tracking and specific optimization recommendations.


Advanced Features

Workflow Integration Patterns

Continuous Integration: The workflow integrates with CI/CD pipelines through a multi-stage validation process. The CI pipeline executes the following stages sequentially:

Stage 1 - Code Quality Validation: Run the code review component and verify results meet quality standards. If the quality check fails, the pipeline terminates with a quality failure report.

Stage 2 - Testing Validation: Execute the full test suite including unit, integration, and end-to-end tests. If any tests fail, the pipeline terminates with a test failure report.

Stage 3 - Performance Validation: Run performance tests and compare results against defined thresholds. If performance standards are not met, the pipeline terminates with a performance failure report.

Stage 4 - Security Validation: Execute security analysis including static analysis and dependency scanning. If critical vulnerabilities are found, the pipeline terminates with a security failure report.

Upon passing all stages, the pipeline generates a success report and proceeds to deployment.
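
A sketch of this sequential gating logic; the stage functions are placeholders standing in for the validations described above.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]

def run_pipeline(stages: list[Stage]) -> bool:
    """Run stages in order, terminating with a stage-specific report on failure."""
    for name, run_stage in stages:
        if not run_stage():
            print(f"Pipeline terminated: {name} failure report")
            return False
    print("All stages passed; generating success report and proceeding to deployment")
    return True

run_pipeline([
    ("quality", lambda: True),
    ("testing", lambda: True),
    ("performance", lambda: True),
    ("security", lambda: True),
])
```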

Quality Gate Configuration

Quality gates define the criteria that must be met at each workflow stage. Gates can be configured with different strictness levels:

Strict Mode: All quality dimensions must meet or exceed thresholds. Any critical issue blocks progression. Full test coverage requirements apply.

Standard Mode: Average quality score must meet threshold. Critical issues block progression, but warnings are allowed. Standard coverage requirements apply.

Lenient Mode: Only critical blocking issues prevent progression. Quality scores generate warnings but do not block. Reduced coverage requirements apply.

Gate configuration includes threshold values for each TRUST dimension, maximum allowed issues by severity, required test coverage levels, and performance benchmark targets.
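
One way to express those presets, shown below with illustrative numbers; the skill defines the three modes, not these exact values.

```python
GATE_MODES = {
    # Every mode blocks on critical issues; they differ in score and
    # coverage strictness and whether warnings block progression.
    "strict":   {"min_dimension_score": 0.85, "warnings_block": True,  "min_coverage": 95},
    "standard": {"min_dimension_score": 0.75, "warnings_block": False, "min_coverage": 80},
    "lenient":  {"min_dimension_score": None, "warnings_block": False, "min_coverage": 60},
}
```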

Multi-Language Support

The workflow supports development across multiple programming languages. Language-specific adaptations include:

Python Projects: Integration with pytest for testing, pylint and flake8 for static analysis, bandit for security scanning, and cProfile or memory_profiler for performance analysis.

JavaScript/TypeScript Projects: Integration with Jest or Vitest for testing, ESLint for static analysis, npm audit for security scanning, and Chrome DevTools or lighthouse for performance analysis.

Go Projects: Integration with go test for testing, golint and staticcheck for static analysis, gosec for security scanning, and pprof for performance analysis.

Rust Projects: Integration with cargo test for testing, clippy for static analysis, cargo audit for security scanning, and flamegraph for performance analysis.

PR Code Review Multi-Agent Pattern

The PR Code Review process uses a multi-agent architecture following the official Claude Code plugin pattern:

Eligibility Check Agent (Haiku): The Haiku agent performs lightweight filtering to avoid unnecessary reviews. It checks the PR state and metadata to determine whether a review is warranted. Skip conditions include closed PRs, draft PRs, PRs already reviewed by the bot, trivial changes such as typo fixes, and automated dependency updates.

Context Gathering: Before launching review agents, the system gathers relevant context by finding CLAUDE.md files in directories containing modified code to understand project-specific coding standards. It also generates a concise summary of PR changes including files modified, lines added or removed, and overall impact assessment.

Parallel Review Agents (Sonnet x5): Five Sonnet agents run in parallel, each focusing on a specific review dimension:

  • Agent 1 audits CLAUDE.md compliance, checking for violations of documented coding standards and conventions
  • Agent 2 scans for obvious bugs, including logic errors, null reference risks, and resource leaks
  • Agent 3 provides git blame and history context to identify recent changes and potential patterns
  • Agent 4 checks previous PR comments for recurring issues and unresolved feedback
  • Agent 5 validates code comment compliance, ensuring comments are accurate and helpful

Confidence Scoring System: Each detected issue receives a confidence score from 0 to 100. A score of 0 indicates a false positive with no confidence. A score of 25 means somewhat confident but might be real. A score of 50 indicates moderately confident that the issue is real but minor. A score of 75 means highly confident that the issue is very likely real. A score of 100 indicates absolutely certain that the issue is definitely real.

Filter and Report Stage: Issues below the 80 confidence threshold are filtered out to reduce noise. Remaining issues are formatted and posted to the PR using the GitHub CLI. The output format follows a standardized markdown structure with issue count, numbered list of issues, and direct links to code with specific commit SHA and line range.
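
A sketch of this stage: keep only issues at or above the threshold, build the markdown body, and post it with the GitHub CLI's gh pr comment command. The issue dictionary shape and PR number are illustrative.

```python
import subprocess

CONFIDENCE_THRESHOLD = 80

def post_review(pr_number: int, issues: list[dict]) -> None:
    """Filter low-confidence issues and post the survivors as a PR comment."""
    kept = [i for i in issues if i["confidence"] >= CONFIDENCE_THRESHOLD]
    if not kept:
        return  # nothing confident enough to report
    lines = ["### Code review", "", f"Found {len(kept)} issues:", ""]
    for n, issue in enumerate(kept, start=1):
        lines.append(f"{n}. {issue['summary']}")
        lines.append(f"   [{issue['location']}]({issue['url']})")
    subprocess.run(
        ["gh", "pr", "comment", str(pr_number), "--body", "\n".join(lines)],
        check=True,
    )
```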

Example PR Review Output:

### Code review

Found 3 issues:

1. Missing error handling for database connection (CLAUDE.md says "All database operations must include error handling")
   [path/to/file.py:42-45](https://github.com/repo/pull/123/files/abc123#diff-123)

2. Potential SQL injection vulnerability (Security rule: "Use parameterized queries")
   [path/to/query.py:88](https://github.com/repo/pull/123/files/abc123#diff-456)

3. Inconsistent naming convention (CLAUDE.md says "Use snake_case for function names")
   [path/to/utils.py:102](https://github.com/repo/pull/123/files/abc123#diff-789)

Works Well With

  • moai-domain-backend: Backend development workflows and API testing patterns
  • moai-domain-frontend: Frontend development workflows and UI testing strategies
  • moai-foundation-core: Core SPEC system and workflow management integration
  • moai-platform-supabase: Supabase-specific testing patterns and database testing
  • moai-platform-vercel: Vercel deployment testing and edge function validation
  • moai-platform-firebase-auth: Firebase authentication testing patterns
  • moai-workflow-project: Project management and documentation workflows

Technology Stack Reference

The workflow leverages industry-standard tools for each capability area:

Analysis Libraries:

  • cProfile provides Python profiling and performance analysis
  • memory_profiler enables memory usage analysis and optimization
  • psutil supports system resource monitoring
  • line_profiler offers line-by-line performance profiling

Static Analysis Tools:

  • pylint performs comprehensive code analysis and quality checks
  • flake8 enforces style guide compliance and error detection
  • bandit scans for security vulnerabilities
  • mypy validates static types

Testing Frameworks:

  • pytest provides advanced testing with fixtures and plugins
  • unittest offers standard library testing capabilities
  • coverage measures code coverage and identifies untested paths

Integration Patterns

GitHub Actions Integration

The workflow integrates with GitHub Actions through a multi-step job configuration:

Job Configuration Steps:

  • Step 1: Check out the repository using actions/checkout
  • Step 2: Set up the Python environment using actions/setup-python with the target Python version
  • Step 3: Install project dependencies including testing and analysis tools
  • Step 4: Execute the quality validation workflow with strict quality gates
  • Step 5: Run the test suite with coverage reporting
  • Step 6: Perform performance benchmarking against baseline metrics
  • Step 7: Execute security scanning and vulnerability detection
  • Step 8: Upload workflow results as job artifacts for review

The job can be configured to run on push and pull request events, with matrix testing across multiple Python versions if needed.

Docker Integration

For containerized environments, the workflow executes within Docker containers:

Container Configuration:

  • Base the image on a Python slim variant for minimal size
  • Install project dependencies from requirements file
  • Copy project source code into the container
  • Configure entrypoint to execute the complete workflow sequence
  • Mount volumes for result output if persistent storage is needed

The containerized workflow ensures consistent execution environments across development, testing, and production systems.


Status: Production Ready
Last Updated: 2026-01-06
Maintained by: MoAI-ADK Development Workflow Team
Version: 2.1.0 (PR Code Review Multi-Agent Pattern)