---
name: run-tests
description: Execute tests and record results in tests-in-progress.json. Use this skill during the run_tests stage of test execution to add trajectory, verified, passes, and blocked fields.
---
# Run Tests
## Reference Documentation

For the test definition format and guidance on writing good steps, see `references/test-definitions.md`.
Execute tests for the given ticket and record results in `tests-in-progress.json`.

## Context

This skill is part of the test execution pipeline:
```
tests-definition.json (READ-ONLY)
        |
        | load-tests.sh (filter by mode)
        v
tests-in-progress.json   <-- YOU ARE HERE (run_tests stage)
        |
        | verification-setup.sh
        v
tests-in-verification.json (verify stage)
        |
        | merge-results.sh
        v
tests-results.json (final)
```
## Input File

Location: `docs/3-tests/tickets/{ticket_id}/tests-in-progress.json`

This file is created by `load-tests.sh` from `tests-definition.json`. It contains the filtered tests to execute.
## Input Format

```json
{
  "ticket": "T00015",
  "title": "Feature Implementation Tests",
  "tests": [
    {
      "id": "T00015-01",
      "name": "Feature works end-to-end",
      "description": "Verify feature implementation",
      "ticket-completion": true,
      "steps": [
        { "step": "Action with verifiable expectation", "trajectory": [], "verified": false }
      ],
      "passes": false
    }
  ]
}
```
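A quick structural check of this file before execution can catch malformed definitions early. A minimal sketch, assuming the format above (`load_progress` is an illustrative helper, not part of the pipeline scripts):

```python
import json
from pathlib import Path

def load_progress(path: Path) -> dict:
    """Load tests-in-progress.json and sanity-check the required fields."""
    data = json.loads(path.read_text())
    for test in data["tests"]:
        assert {"id", "steps", "passes"} <= test.keys(), f"malformed test: {test.get('id')}"
        for step in test["steps"]:
            assert {"step", "trajectory", "verified"} <= step.keys()
    return data
```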
## Your Task

For each test in `tests-in-progress.json`:

- **Execute each step** - Run commands, scripts, or manual verification
- **Record trajectory** - Document exactly what you did and observed
- **Set verified** - `true` only if the expectation is exactly met
- **Set passes** - `true` only if ALL steps have `verified: true` (this follows mechanically from the steps; see the sketch after this list)
- **Set blocked** - `true` if a blocking issue is encountered
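The test-level fields follow mechanically once every step has been executed. A minimal sketch in Python of that derivation, assuming the file layout shown above (the `finalize_*` helpers are illustrative, not part of the pipeline scripts):

```python
import json
from pathlib import Path

def finalize_test(test: dict) -> None:
    """A test passes only when every one of its steps is verified.

    `blocked` is assumed to have been set on the test while executing its steps.
    """
    test["passes"] = all(step.get("verified") for step in test["steps"])

def finalize_file(path: Path) -> None:
    """Derive `passes` for each test and write the file back."""
    data = json.loads(path.read_text())
    for test in data["tests"]:
        finalize_test(test)
    path.write_text(json.dumps(data, indent=2))

# Hypothetical ticket id from the examples in this document:
# finalize_file(Path("docs/3-tests/tickets/T00015/tests-in-progress.json"))
```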
## Execution Methods

### Method 1: Test Scripts

If a test script exists in `docs/3-tests/tickets/{ticket_id}/test_*.py`:
```json
{
  "step": "POST /api/search returns status 200",
  "trajectory": [
    "Ran: python docs/3-tests/tickets/T00015/test_search.py",
    "Request sent to http://localhost:8000/api/search",
    "Response: 200 OK - MATCHES"
  ],
  "verified": true
}
```
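A script like the one referenced in that trajectory might look roughly as follows. The endpoint and payload are assumptions taken from the example; only the `MATCHES` / `DOES NOT MATCH` output convention comes from this skill:

```python
# docs/3-tests/tickets/T00015/test_search.py (hypothetical)
import json
import urllib.request
from urllib.error import HTTPError

URL = "http://localhost:8000/api/search"  # endpoint assumed from the example above

def main() -> int:
    req = urllib.request.Request(
        URL,
        data=json.dumps({"query": "test"}).encode(),  # illustrative payload
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            status = resp.status
    except HTTPError as err:  # non-2xx responses raise; capture the code
        status = err.code
    verdict = "MATCHES" if status == 200 else "DOES NOT MATCH expected 200"
    print(f"Response: {status} - {verdict}")
    return 0 if status == 200 else 1

if __name__ == "__main__":
    raise SystemExit(main())
```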
### Method 2: Manual Execution

If no script exists, execute the step manually:
```json
{
  "step": "Config file exists at ~/.config/app/settings.json",
  "trajectory": [
    "Checked: ~/.config/app/settings.json exists",
    "File found with 245 bytes - MATCHES"
  ],
  "verified": true
}
```
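Even a manual check is easier to defend when it is reproducible as a one-off snippet; a minimal sketch of the check above (the path comes from the example step):

```python
from pathlib import Path

path = Path.home() / ".config" / "app" / "settings.json"  # path from the example step
if path.is_file():
    print(f"Checked: {path} exists")
    print(f"File found with {path.stat().st_size} bytes - MATCHES")
else:
    print(f"Checked: {path} - DOES NOT MATCH (file not found)")
```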
### Method 3: Create Script During Execution

If a step needs automation and no script exists:

- Create a script in `docs/3-tests/tickets/{ticket_id}/test_*.py`
- Run the script
- Record the results in the trajectory
## Trajectory Recording

Record what you did and what you observed:
| Action Type | Trajectory Format |
|---|---|
| Command | "Ran: {command}", "Output: {output}", "Exit code: {code}" |
| File check | "Checked: {path} exists", "File contains: {content}" |
| API call | "Request: {method} {url}", "Response: {status}" |
| UI action | "Clicked '{element}'", "Screenshot shows: {observation}" |
| Assertion | "{field}: {actual} - MATCHES" or "- DOES NOT MATCH expected {expected}" |
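For command-type steps, the trajectory lines in this table can be produced directly from a subprocess run. A sketch of such a helper (the function name is invented for illustration):

```python
import subprocess

def run_and_record(command: list[str]) -> tuple[list[str], int]:
    """Run a command and return trajectory lines plus the exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    trajectory = [
        f"Ran: {' '.join(command)}",
        f"Output: {result.stdout.strip()}",
        f"Exit code: {result.returncode}",
    ]
    return trajectory, result.returncode

# Example: trajectory, code = run_and_record(["uv", "sync"])
```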
### Steps Covered by Script

When a script covers multiple steps:
```json
{
  "step": "Response contains results array",
  "trajectory": ["Covered by script: results array has 5 items - MATCHES"],
  "verified": true
}
```
## Strict Verification Rules

Set `verified: true` ONLY when:
- Expectation is exactly met (not approximate)
- You have concrete evidence (output, screenshot, API response)
- There is no ambiguity about pass/fail
Keep `verified: false` when:
- Cannot confirm exact expectation
- Partial match or approximate result
- Unable to test (missing endpoint, app not running)
- Uncertain or ambiguous outcome
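These rules amount to exact equality: anything weaker stays `false`. A minimal sketch of an assertion helper that also emits the trajectory line (the helper itself is hypothetical):

```python
def check(field: str, actual: object, expected: object) -> tuple[str, bool]:
    """Exact-match assertion in the trajectory format used above."""
    if actual == expected:
        return f"{field}: {actual} - MATCHES", True
    return f"{field}: {actual} - DOES NOT MATCH expected {expected}", False

# check("status", 201, 200)
# -> ("status: 201 - DOES NOT MATCH expected 200", False)
```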
### Examples

| Step | Observation | verified |
|---|---|---|
| "Returns status 200" | Got status 200 | true |
| "Returns status 200" | Got status 201 | false |
| "Button color is #4CAF50" | Color is #4CAF50 | true |
| "Button color is #4CAF50" | Color looks green | false |
| "File exists at /tmp/out.json" | File exists | true |
| "File exists at /tmp/out.json" | Directory exists | false |
## Blocking Issues

When encountering blocking issues (missing credentials, external service unavailable, ambiguous requirement):

- Set `blocked: true` on the test
- Document the details in the trajectory
```json
{
  "id": "T00015-03",
  "description": "Verify external API integration",
  "ticket-completion": true,
  "steps": [
    {
      "step": "POST /api/payment with valid card returns 200",
      "trajectory": [
        "Attempted: POST http://localhost:8000/api/payment",
        "BLOCKING ISSUE: Payment API requires STRIPE_API_KEY environment variable",
        "Checked .env - STRIPE_API_KEY not set",
        "Cannot proceed without valid API credentials"
      ],
      "verified": false
    }
  ],
  "passes": false,
  "blocked": true
}
```
Trajectory must include:
- What was attempted
- Why it's blocked (specific error or missing requirement)
- What's needed to unblock
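Missing credentials like the example above can be caught before attempting the request. A sketch of such a precondition check (`STRIPE_API_KEY` comes from the example; the helper itself is hypothetical):

```python
import os

def require_env(var: str, trajectory: list[str]) -> bool:
    """Append blocking-issue trajectory lines if an env variable is unset."""
    if os.environ.get(var):
        return True
    trajectory.extend([
        f"BLOCKING ISSUE: requires {var} environment variable",
        f"Checked environment - {var} not set",
        f"Cannot proceed without {var}",
    ])
    return False

# If require_env("STRIPE_API_KEY", step["trajectory"]) returns False,
# set blocked: true on the test and leave the step unverified.
```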
## Output Format

After execution, update `tests-in-progress.json` with the results:
```json
{
  "ticket": "T00015",
  "title": "Feature Implementation Tests",
  "tests": [
    {
      "id": "T00015-01",
      "name": "Feature works end-to-end",
      "description": "Verify feature implementation",
      "ticket-completion": true,
      "steps": [
        {
          "step": "Run 'uv sync' exits with code 0",
          "trajectory": [
            "Ran: uv sync",
            "Output: Resolved 15 packages...",
            "Exit code: 0 - MATCHES"
          ],
          "verified": true
        }
      ],
      "passes": true
    }
  ]
}
```
## Workflow

- Read `tests-in-progress.json` for the given ticket
- Execute each test in order (`ticket-completion: true` first)
- Record a trajectory for every step
- Update the `verified`, `passes`, and `blocked` fields
- Save the updated `tests-in-progress.json`
- Return a summary
## Return Summary

After completion, return:

```
Run Tests Complete: T00015
- Executed: 5 tests
- Passed: 3
- Failed: 1
- Blocked: 1
- Scripts created: 2
```
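All counts except the script count fall directly out of the updated file (scripts created must be tracked during execution). A sketch, assuming the output format above (`summarize` is an illustrative helper):

```python
import json
from pathlib import Path

def summarize(path: Path) -> str:
    """Build the return summary from the updated tests-in-progress.json."""
    data = json.loads(path.read_text())
    tests = data["tests"]
    blocked = sum(1 for t in tests if t.get("blocked"))
    passed = sum(1 for t in tests if t.get("passes"))
    failed = len(tests) - passed - blocked
    return (
        f"Run Tests Complete: {data['ticket']}\n"
        f"- Executed: {len(tests)} tests\n"
        f"- Passed: {passed}\n"
        f"- Failed: {failed}\n"
        f"- Blocked: {blocked}"
    )

# print(summarize(Path("docs/3-tests/tickets/T00015/tests-in-progress.json")))
```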
## Visual Verification

For UI tests, capture screenshots as evidence:

- Desktop: Use the `screenshot-capture` skill or debug endpoint
- Web: Use `mcp chrome-devtools take_screenshot`

Record screenshot observations in the trajectory.