name	testing
description	Comprehensive testing specialization covering test strategy, automation, TDD methodology, test writing, and web app testing. Use when setting up test infrastructure, writing tests, implementing TDD workflows, analyzing coverage, integrating tests into CI/CD, or testing web applications with Playwright. Framework-agnostic approach with framework-specific guidance via reference files.
author	Joseph OBrien
status	unpublished
updated	2025-12-23
version	1.0.1
tag	skill
type	skill

Testing

This skill provides comprehensive testing capabilities including test strategy, automation setup, Test-Driven Development (TDD), test writing best practices, coverage analysis, CI/CD integration, and web application testing with Playwright.

When to Use This Skill

When setting up test infrastructure for a project
When creating test strategies and test plans
When writing unit, integration, or E2E tests
When implementing TDD/test-first development
When analyzing test coverage and quality
When integrating tests into CI/CD pipelines
When testing web applications with Playwright
When debugging test failures or improving test reliability
When writing test fixtures, mock data, or factory functions
When mocking external dependencies (APIs, databases, file systems)
When organizing test file structure and test suites
When testing async code, Promises, or event-driven behavior
When implementing snapshot tests for UI components
When configuring test coverage thresholds

What This Skill Does

Test Strategy: Designs comprehensive testing strategies (unit, integration, E2E)
Test Automation: Sets up test frameworks and automation tools
TDD Methodology: Implements Test-Driven Development workflows (Red-Green-Refactor)
Test Writing: Writes focused, maintainable tests with proper patterns
Coverage Analysis: Analyzes and improves test coverage
CI/CD Integration: Integrates tests into continuous integration pipelines
Web App Testing: Tests web applications using Playwright
Test Quality: Improves test reliability and maintainability

Test Strategy

Test Pyramid

Recommended Distribution:

Unit Tests: 70% - Fast, isolated, test individual functions
Integration Tests: 20% - Test component interactions
E2E Tests: 10% - Test complete user workflows

Test Types:

Functional tests (happy path, edge cases, error handling)
Non-functional tests (performance, security, accessibility)
Regression tests (prevent breaking changes)
Smoke tests (critical path verification)

Framework Selection

JavaScript/TypeScript:

Jest, Vitest, Mocha for unit/integration
Playwright, Cypress for E2E
React Testing Library for component testing

Python:

pytest for unit/integration
Selenium, Playwright for E2E
unittest, the standard-library test framework

Java:

JUnit for unit tests
TestNG for integration
Selenium for E2E

Go:

Built-in testing package
Testify for assertions

Rust:

Built-in test framework
cargo test for running tests

Test-Driven Development (TDD)

TDD is a design technique, not just a testing technique. It produces better-designed, more maintainable code through small, disciplined steps.

Core Principle

Write tests before code. Always. TDD forces you to think about:

What behavior do I need?
How will I know it works?
What's the simplest implementation?

The Three Laws (Never Violate)

Write NO production code without a failing test first
Write only enough test to demonstrate one failure
Write only enough code to pass that test

Red-Green-Refactor Cycle

Phase 1: RED - Write Failing Test

Write ONE test that defines desired behavior
Run test - verify it FAILS
Verify it fails for the RIGHT reason (not syntax error)
DO NOT write implementation yet

Phase 2: GREEN - Minimal Implementation

Write MINIMAL code to make test pass
Resist urge to add extra features
Run test - verify it PASSES
If test still fails, fix implementation (not test)

Phase 3: REFACTOR - Clean Code

Remove code duplication (DRY)
Improve naming for clarity
Extract complex logic into functions
Run ALL tests - must stay green throughout
Check test coverage on changed lines

After REFACTOR, start new RED phase for next behavior.
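
The cycle above can be sketched as follows. This is a minimal illustration using Python's unittest, not a prescribed workflow; slugify and its behavior are hypothetical and exist only for the example. The comments record the order in which each piece would have been written.

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical function under test, shown in its state after two cycles."""
    # GREEN (cycle 1): lower() and replace() were the minimum needed to pass
    # test_replaces_spaces_with_hyphens.
    # GREEN (cycle 2): strip() was added only after
    # test_ignores_surrounding_whitespace had been seen to fail.
    return title.strip().lower().replace(" ", "-")


class SlugifyTest(unittest.TestCase):
    # RED (cycle 1): written before slugify existed; one behavior only.
    def test_replaces_spaces_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    # RED (cycle 2): failed against the cycle-1 implementation, which
    # returned "--hello-world--" for this input.
    def test_ignores_surrounding_whitespace(self):
        self.assertEqual(slugify("  Hello World  "), "hello-world")
```

Saved as a test module, this would typically be run with python -m unittest. Each new behavior starts with another failing test, not with a change to slugify.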

Test Writing Patterns

Arrange-Act-Assert (AAA)

Structure:

Arrange: Set up test data and conditions
Act: Execute the code being tested
Assert: Verify the expected outcome

Example:

// userService is assumed to be created in test setup (e.g., a beforeEach hook)
describe('UserService', () => {
  it('should create user with valid data', async () => {
    // Arrange
    const userData = { email: 'test@example.com', name: 'Test User' };

    // Act
    const result = await userService.createUser(userData);

    // Assert
    expect(result).toHaveProperty('id');
    expect(result.email).toBe(userData.email);
  });
});

Given-When-Then (BDD Style)

Structure:

Given: Initial context/preconditions
When: Action/event that triggers behavior
Then: Expected outcome
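
Example (a sketch in Python; ShoppingCart and its methods are hypothetical and defined inline only so the test runs on its own). The Given/When/Then comments map directly onto the three parts of the structure above:

```python
class ShoppingCart:
    """Hypothetical class under test; prices are integer cents."""

    def __init__(self) -> None:
        self._lines: list[tuple[int, int]] = []  # (price_cents, quantity)

    def add(self, price_cents: int, quantity: int = 1) -> None:
        self._lines.append((price_cents, quantity))

    def total_cents(self) -> int:
        return sum(price * quantity for price, quantity in self._lines)


def test_total_reflects_added_items():
    # Given an empty cart
    cart = ShoppingCart()

    # When two units at 50 cents and one unit at 120 cents are added
    cart.add(price_cents=50, quantity=2)
    cart.add(price_cents=120)

    # Then the total is the sum of price times quantity
    assert cart.total_cents() == 220
```

The structure is the same as Arrange-Act-Assert; the difference is that the comments are phrased as behavior, which can make the test easier to read for non-developers.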

Test Organization

File Structure:

project/
├── src/
│   └── components/
│       └── User.jsx
├── tests/
│   ├── unit/
│   │   └── User.test.jsx
│   ├── integration/
│   │   └── UserAPI.test.js
│   └── e2e/
│       └── user-flow.spec.js
├── jest.config.js
└── playwright.config.js

Coverage Analysis

Coverage Goals

Recommended Thresholds:

Lines: 80%+
Functions: 80%+
Branches: 80%+
Statements: 80%+

Critical Paths:

Always aim for 100% coverage on critical business logic, such as:

Authentication and authorization
Payment processing
Data validation

Coverage Gaps

Common Gaps:

Error handling paths
Edge cases
Boundary conditions
Integration points

Improvement Strategies:

Identify untested code paths
Add tests for error scenarios
Test edge cases and boundaries
Increase integration test coverage

CI/CD Integration

Test Pipeline

Stages:

Unit Tests: Fast feedback, run on every commit
Integration Tests: Run on pull requests
E2E Tests: Run before merging to main
Performance Tests: Run on main branch

Quality Gates:

All tests must pass
Coverage must meet threshold
No critical security issues
Performance benchmarks met
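
One way to express these gates in code is sketched below. PipelineReport, its field names, and failed_gates are hypothetical; a real pipeline would fill the report from its test runner, coverage tool, and scanners. The thresholds reuse the 80% values recommended under Coverage Goals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineReport:
    """Hypothetical summary of one CI run; field names are illustrative."""

    tests_failed: int
    coverage: dict[str, float]  # metric name -> percent covered
    critical_security_issues: int
    benchmarks_met: bool


THRESHOLDS = {"lines": 80.0, "functions": 80.0, "branches": 80.0, "statements": 80.0}


def failed_gates(report: PipelineReport, thresholds: dict[str, float] = THRESHOLDS) -> list[str]:
    """Return a description of each gate that failed; empty means all passed."""
    failures = []
    if report.tests_failed > 0:
        failures.append("tests")
    # A metric missing from the report counts as 0% so it cannot pass silently.
    low = sorted(
        metric
        for metric, minimum in thresholds.items()
        if report.coverage.get(metric, 0.0) < minimum
    )
    if low:
        failures.append("coverage: " + ", ".join(low))
    if report.critical_security_issues > 0:
        failures.append("security")
    if not report.benchmarks_met:
        failures.append("performance")
    return failures
```

A CI step could call failed_gates and exit non-zero when the returned list is not empty, so the pipeline blocks the merge and reports which gate failed.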

Web Application Testing with Playwright

Helper Scripts

This skill includes Python helper scripts in scripts/:

with_server.py - Manages server lifecycle (supports multiple servers). Always run with --help first to see usage.

# Single server
python scripts/with_server.py --server "npm run dev" --port 5173 -- python your_automation.py

# Multiple servers (e.g., backend + frontend)
python scripts/with_server.py \
  --server "cd backend && python server.py" --port 3000 \
  --server "cd frontend && npm run dev" --port 5173 \
  -- python your_automation.py

Decision Tree: Choosing Your Approach

User task → Is it static HTML?
    ├─ Yes → Read HTML file directly to identify selectors
    │         ├─ Success → Write Playwright script using selectors
    │         └─ Fails/Incomplete → Treat as dynamic (below)
    │
    └─ No (dynamic webapp) → Is the server already running?
        ├─ No → Run: python scripts/with_server.py --help
        │        Then use the helper + write simplified Playwright script
        │
        └─ Yes → Reconnaissance-then-action:
            1. Navigate and wait for networkidle
            2. Take screenshot or inspect DOM
            3. Identify selectors from rendered state
            4. Execute actions with discovered selectors
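
The four reconnaissance-then-action steps could look roughly like this with Playwright's sync API. It is a sketch under assumptions: the function name, the default screenshot path, and the choice to discover only button elements are illustrative, and it needs Playwright installed and a server already running at the given URL.

```python
def click_button_after_recon(url: str, label: str, screenshot_path: str = "recon.png") -> list[str]:
    """Inspect the rendered page, then click the button whose text is `label`.

    Returns the visible button labels that were discovered.
    """
    # Imported here so this module can be imported without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # 1. Navigate and wait for network activity to settle.
            page.goto(url)
            page.wait_for_load_state("networkidle")
            # 2. Capture the rendered state for inspection.
            page.screenshot(path=screenshot_path, full_page=True)
            # 3. Identify selectors from the rendered DOM, not the source HTML.
            labels = [
                button.inner_text().strip()
                for button in page.locator("button").all()
                if button.is_visible()
            ]
            # 4. Act only on a selector that was actually discovered.
            if label in labels:
                page.get_by_role("button", name=label, exact=True).first.click()
        finally:
            browser.close()  # always close the browser, even on failure
    return labels
```

For instance, click_button_after_recon("http://localhost:5173", "Submit") would return the labels it found, where "Submit" is a placeholder for whatever the page actually renders. If the target is missing, the returned list shows what was available instead.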

Playwright Best Practices

Use bundled scripts as black boxes - Use --help to see usage, then invoke directly
Use sync_playwright() for synchronous scripts
Always close the browser when done
Use descriptive selectors: text=, role=, CSS selectors, or IDs
Add appropriate waits: prefer page.wait_for_selector(); use page.wait_for_timeout() only as a last resort, since fixed delays make scripts slow and flaky
CRITICAL: Call page.wait_for_load_state('networkidle') before inspecting the DOM on dynamic apps

Example: Basic Playwright Script

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto('http://localhost:5173')
    page.wait_for_load_state('networkidle')  # CRITICAL: Wait for JS to execute
    # ... your automation logic
    browser.close()

Examples

See examples/ directory for:

element_discovery.py - Discovering buttons, links, and inputs on a page
static_html_automation.py - Using file:// URLs for local HTML
console_logging.py - Capturing console logs during automation

Reference Files

For detailed testing patterns and workflows, load reference files as needed:

references/framework_workflows.md - Framework-specific TDD workflows and examples for Python (pytest), JavaScript (Jest, Vitest), Java (JUnit), Go, Rust
references/test_patterns.md - Common test patterns, test organization, naming conventions, test doubles (mocks, stubs, spies), parametrization, and anti-patterns
references/webapp_testing.md - Web application testing patterns, Playwright best practices, and E2E testing strategies
references/TESTING_REPORT.template.md - Test quality report template with coverage metrics, audit findings, and recommendations

When working with a specific framework or when you need detailed patterns, load the appropriate reference file.

Best Practices

Test Quality

Isolation: Tests should be independent and runnable in any order
Deterministic: Tests should produce consistent results
Fast: Unit tests should run quickly (< 100ms each)
Clear: Test names should describe what they test
Maintainable: Tests should be easy to update when code changes
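
A common way to get deterministic, isolated tests is to pass time, randomness, and other ambient state in as arguments rather than reading them inside the code under test. The sketch below assumes a hypothetical greeting function; the technique is the point, not the function.

```python
from datetime import datetime, timezone


def greeting(now: datetime) -> str:
    """Hypothetical function under test; the current time is a parameter."""
    return "Good morning" if now.hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    fixed = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
    assert greeting(fixed) == "Good morning"


def test_greeting_at_noon():
    fixed = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    assert greeting(fixed) == "Good afternoon"
```

Because neither test reads the real clock or shares state with the other, both give the same result at any time of day and in any order.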

TDD Best Practices

One Behavior Per Test: Each test verifies ONE behavior
Descriptive Names: Test names describe the behavior being tested
Independent Tests: Tests don't depend on each other
Fast Tests: Mock external dependencies to keep tests fast
Clear Assertions: Assertions clearly show what's being verified
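
A short sketch of mocking an external dependency with Python's unittest.mock follows. SignupService and the shape of its email client are hypothetical; the test checks one behavior and never touches a real mail server.

```python
from unittest.mock import Mock


class SignupService:
    """Hypothetical service that depends on an external email client."""

    def __init__(self, email_client) -> None:
        self._email_client = email_client

    def register(self, email: str) -> dict:
        user = {"email": email, "active": False}
        self._email_client.send(to=email, subject="Welcome")
        return user


def test_register_sends_one_welcome_email():
    # Arrange: replace the real client with a mock that records calls
    email_client = Mock()
    service = SignupService(email_client)

    # Act
    service.register("test@example.com")

    # Assert: exactly one call, with exactly these arguments
    email_client.send.assert_called_once_with(to="test@example.com", subject="Welcome")
```

Injecting the client through the constructor is what makes the substitution possible without patching module globals.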

Common Mistakes to Avoid

❌ Writing multiple tests at once (write one test at a time)
❌ Skipping refactor phase (always refactor after green)
❌ Implementation before test (delete code and start with test)
❌ Over-engineering in GREEN (simplest thing that passes)
❌ Writing test that passes immediately (must fail first)

Test Maintenance

Review and update tests when requirements change
Remove obsolete tests
Refactor tests to reduce duplication
Keep test data factories up to date
Monitor test execution time
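
For the test data factories mentioned above, one minimal shape is a function that returns a valid object by default and accepts overrides for the fields a test cares about. make_user and its fields are hypothetical.

```python
import itertools

_ids = itertools.count(1)  # gives each generated user a unique id and email


def make_user(**overrides) -> dict:
    """Hypothetical factory: a valid user dict, with optional field overrides."""
    user_id = next(_ids)
    user = {
        "id": user_id,
        "email": f"user{user_id}@example.com",
        "name": "Test User",
        "is_admin": False,
    }
    user.update(overrides)
    return user


def test_admin_flag_can_be_overridden():
    admin = make_user(is_admin=True)
    assert admin["is_admin"] is True
    assert admin["name"] == "Test User"  # untouched fields keep their defaults
```

When the user schema gains a required field, only the factory needs updating; tests that do not care about that field stay unchanged.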

Integration with Other Skills

debugging: Use when tests fail unexpectedly
code-review: TDD produces code that's easier to review
dead-code-removal: Tests help identify unused code
performance: Use for performance testing strategies

Meta-Principle

TDD is a DESIGN technique, not just a testing technique.

The cycle never changes: RED → GREEN → REFACTOR → Repeat

Writing tests first forces you to think about:
- What behavior do I need?
- How will I know it works?
- What's the simplest implementation?

This produces better-designed, more maintainable code.
