Claude Code Plugins

Community-maintained marketplace

Install Skill

  1. Download skill
  2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
  3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reading through its instructions before using it.

SKILL.md

name: harshjudge
description: AI-native E2E testing orchestration for Claude Code. Use when creating, running, or managing end-to-end test scenarios with visual evidence capture. Activates for tasks involving E2E tests, browser automation testing, test scenario creation, test execution with screenshots, or checking test status.

HarshJudge E2E Testing

AI-native E2E testing with MCP tools and visual evidence capture.

Core Principles

  1. Evidence First: Screenshot before and after every action
  2. Fail Fast: Stop on error, report with context
  3. Complete Runs: Always call completeRun, even on failure
  4. Step Isolation: Each step executes in its own spawned agent for token efficiency
  5. Knowledge Accumulation: Learnings go to prd.md, not scenarios

Step-Based Execution

HarshJudge uses a step-based agent pattern for token-efficient test execution:

Main Agent                    Step Agents (spawned per step)
    │
    ├─ startRun(scenarioSlug)
    │      ↓
    │  Returns: runId, steps[]
    │
    ├─► Spawn Agent: Step 01 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "01", status)
    │      │
    ├─► Spawn Agent: Step 02 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "02", status)
    │      │
    │   ... (repeat for each step)
    │
    └─ completeRun(runId, finalStatus)

Benefits:

  • Each step agent has isolated context (no token accumulation)
  • Large outputs (screenshots, logs) saved to files, not returned
  • Main agent only receives concise summaries
  • Automatic token optimization without manual management
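
The flow in the diagram can be sketched as a small driver loop. This is only an illustration, not HarshJudge's own code: `tools` is a hypothetical client object whose methods are named after the MCP tools, `spawn_step_agent` is a hypothetical callable that runs one step in a fresh agent, and the `runId`, `steps`, and `id` keys are assumed shapes for the `startRun` result.

```python
from typing import Callable

# Hypothetical: takes (run_id, step) and returns the step agent's JSON reply
# as a dict, e.g. {"status": "pass", "evidencePaths": [...], "error": None}.
SpawnStepAgent = Callable[[str, dict], dict]


def run_scenario(tools, scenario_slug: str, spawn_step_agent: SpawnStepAgent) -> str:
    """Drive one run: one spawned agent per step, stopping at the first failure."""
    run = tools.startRun(scenario_slug)  # assumed to return {"runId": ..., "steps": [...]}
    run_id = run["runId"]
    final_status = "fail"  # pessimistic default until every step has passed
    try:
        for step in run["steps"]:
            # Each step runs in its own agent, so only a short summary comes back.
            result = spawn_step_agent(run_id, step)
            tools.completeStep(run_id, step["id"], result["status"])
            if result["status"] != "pass":
                break  # Fail Fast: do not run the remaining steps
        else:
            final_status = "pass"  # loop finished without a break
    finally:
        # Complete Runs: finalize even if a step failed or an agent crashed.
        tools.completeRun(run_id, final_status)
    return final_status
```

The `try`/`finally` is one way to honor the "always call completeRun" principle: the run is closed with status `fail` even when a step agent raises.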

Workflows

Intent              Reference               Key Tools
Initialize project  references/setup.md     initProject
Create scenario     references/create.md    createScenario
Run scenario        references/run.md       startRun, completeStep, completeRun
Fix failed test     references/iterate.md   getStatus, createScenario
Check status        references/status.md    getStatus

Project Structure

.harshJudge/
  config.yaml              # Project configuration
  prd.md                   # Product requirements (from assets/prd.md template)
  scenarios/{slug}/
    meta.yaml              # Scenario definition + run statistics
    steps/                 # Individual step files
      01-step-slug.md      # Step 01 details
      02-step-slug.md      # Step 02 details
      ...
    runs/{runId}/          # Run history
      result.json          # Run result with per-step data
      step-01/evidence/    # Step 01 evidence
      step-02/evidence/    # Step 02 evidence
      ...
  snapshots/               # Inspection tool outputs (token-saving pattern)
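
As a rough illustration of the layout above, the paths for a step file and its evidence directory can be derived from the scenario slug, run ID, and step ID. The helper names below are hypothetical and exist only for this sketch; the directory names come from the tree above.

```python
from pathlib import Path


def scenario_dir(project_root: Path, scenario_slug: str) -> Path:
    """Directory holding one scenario's meta.yaml, steps/, and runs/."""
    return project_root / ".harshJudge" / "scenarios" / scenario_slug


def step_file(project_root: Path, scenario_slug: str, step_id: str, step_slug: str) -> Path:
    """Step definition file, e.g. steps/01-step-slug.md."""
    return scenario_dir(project_root, scenario_slug) / "steps" / f"{step_id}-{step_slug}.md"


def evidence_dir(project_root: Path, scenario_slug: str, run_id: str, step_id: str) -> Path:
    """Evidence directory for one step of one run, e.g. runs/{runId}/step-01/evidence."""
    return (
        scenario_dir(project_root, scenario_slug)
        / "runs" / run_id / f"step-{step_id}" / "evidence"
    )
```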

Quick Reference

HarshJudge MCP Tools

Tool                            Purpose
initProject                     Initialize project (spawns dashboard)
createScenario                  Create/update scenario with step files
toggleStar                      Toggle/set scenario starred status
startRun                        Start test run, returns step list
recordEvidence                  Capture evidence for a step
completeStep                    Complete a step, get next step ID
completeRun                     Finalize run with status
getStatus                       Check project or scenario status
openDashboard / closeDashboard  Manage dashboard server

Playwright MCP Tools

Tool                      Purpose
browser_navigate          Navigate to URL
browser_snapshot          Get accessibility tree (use before click/type)
browser_click             Click element using ref
browser_type              Type into input using ref
browser_take_screenshot   Capture screenshot for evidence
browser_console_messages  Get console logs
browser_network_requests  Get network activity
browser_wait_for          Wait for text/condition
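
The ordering these tools imply for a single click (screenshot, snapshot to obtain a ref, click, screenshot) might look like the sketch below. In practice the step agent issues these calls itself; here `call_tool` and `find_ref` are hypothetical callables standing in for the MCP client and for the agent's choice of element, and the argument names (`filename`, `element`, `ref`) and file names are assumptions rather than a confirmed schema.

```python
from typing import Any, Callable


def click_with_evidence(
    call_tool: Callable[[str, dict], Any],   # hypothetical: invokes one MCP tool
    find_ref: Callable[[Any, str], str],     # hypothetical: picks a ref from a snapshot
    element: str,
    step_id: str,
) -> list[str]:
    """Click one element with a screenshot before and after; return the file names."""
    before = f"step-{step_id}-before.png"    # assumed naming scheme
    after = f"step-{step_id}-after.png"

    call_tool("browser_take_screenshot", {"filename": before})  # Evidence First
    snapshot = call_tool("browser_snapshot", {})                # refs come from the snapshot
    ref = find_ref(snapshot, element)
    call_tool("browser_click", {"element": element, "ref": ref})
    call_tool("browser_take_screenshot", {"filename": after})
    return [before, after]
```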

Step Agent Prompt Template

When spawning an agent for each step:

Execute step {stepId} of scenario {scenarioSlug}:

## Step Content
{content from steps/{stepId}-{slug}.md}

## Project Context
Base URL: {from config.yaml}
Auth: {from prd.md if needed}

## Previous Step
Status: {pass|fail|first step}

## Your Task
1. Execute the actions using Playwright MCP tools
2. Use browser_snapshot before clicking to get element refs
3. Capture before/after screenshots using browser_take_screenshot
4. Record evidence using recordEvidence with step={stepNumber}

Return ONLY a JSON object:
{
  "status": "pass" | "fail",
  "evidencePaths": ["path1.png", "path2.png"],
  "error": null | "error message"
}

DO NOT return full evidence content. DO NOT explain your work.
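
On the main-agent side, the reply can be checked against the JSON shape in the template before it is passed to completeStep. The following is a minimal sketch under that assumption; `parse_step_result` is a hypothetical helper, not part of HarshJudge.

```python
import json


def parse_step_result(raw: str) -> dict:
    """Validate a step agent's reply against the template's JSON shape.

    Raises ValueError if the reply is not valid JSON or a field is malformed.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("step result must be a JSON object")

    status = data.get("status")
    if status not in ("pass", "fail"):
        raise ValueError(f"status must be 'pass' or 'fail', got {status!r}")

    paths = data.get("evidencePaths")
    if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
        raise ValueError("evidencePaths must be a list of strings")

    error = data.get("error")
    if error is not None and not isinstance(error, str):
        raise ValueError("error must be null or a string")

    return {"status": status, "evidencePaths": paths, "error": error}
```

Rejecting malformed replies early keeps the main agent's context small: it only ever handles the three fields the template asks for.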

Error Handling

On ANY error:

  1. STOP - Do not proceed
  2. Report - Tool, params, error, resolution
  3. Check prd.md - Is this a known pattern?
  4. Do NOT retry - Unless user instructs
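
The four report fields in step 2 could be carried in a small record like the one below. The `ErrorReport` class and its text layout are illustrative assumptions; the skill does not prescribe a report format beyond those four fields.

```python
import json
from dataclasses import dataclass


@dataclass
class ErrorReport:
    """One failed tool call: what was called, with what, and what to do next."""
    tool: str
    params: dict
    error: str
    resolution: str

    def render(self) -> str:
        """Format the report as four labeled lines for the user."""
        return "\n".join([
            f"Tool: {self.tool}",
            f"Params: {json.dumps(self.params, sort_keys=True)}",
            f"Error: {self.error}",
            f"Resolution: {self.resolution}",
        ])
```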