---
name: run-tests
description: Execute tests and record results in tests-in-progress.json. Use this skill during the run_tests stage of test execution to add trajectory, verified, passes, and blocked fields.
---
# Run Tests
## Reference Documentation

For the test definition format and guidance on writing good steps, see `references/test-definitions.md`.
Execute tests for the given ticket and record results in `tests-in-progress.json`.

## Context

This skill is part of the test execution pipeline:
```
tests-definition.json (READ-ONLY)
        |
        | load-tests.sh (filter by mode)
        v
tests-in-progress.json   <-- YOU ARE HERE (run_tests stage)
        |
        | verification-setup.sh
        v
tests-in-verification.json (verify stage)
        |
        | merge-results.sh
        v
tests-results.json (final)
```
## Input File

Location: `docs/3-tests/tickets/{ticket_id}/tests-in-progress.json`

This file is created by `load-tests.sh` from `tests-definition.json`. It contains the filtered tests to execute.
## Input Format

```json
{
  "ticket": "T00015",
  "title": "Feature Implementation Tests",
  "tests": [
    {
      "id": "T00015-01",
      "name": "Feature works end-to-end",
      "description": "Verify feature implementation",
      "ticket-completion": true,
      "steps": [
        { "step": "Action with verifiable expectation", "trajectory": [], "verified": false }
      ],
      "passes": false
    }
  ]
}
```
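A quick structural check of this file before execution can catch malformed definitions early. A minimal sketch, assuming the format above (`load_progress` is an illustrative helper, not part of the pipeline scripts):

```python
import json
from pathlib import Path

def load_progress(path: Path) -> dict:
    """Load tests-in-progress.json and sanity-check the required fields."""
    data = json.loads(path.read_text())
    for test in data["tests"]:
        assert {"id", "steps", "passes"} <= test.keys(), f"malformed test: {test.get('id')}"
        for step in test["steps"]:
            assert {"step", "trajectory", "verified"} <= step.keys()
    return data
```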
## Your Task

For each test in `tests-in-progress.json`:

- **Execute each step** - Run commands, scripts, or manual verification
- **Record trajectory** - Document exactly what you did and observed
- **Set verified** - `true` only if the expectation is exactly met
- **Set passes** - `true` only if ALL steps have `verified: true` (this follows mechanically from the steps; see the sketch after this list)
- **Set blocked** - `true` if a blocking issue is encountered
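The test-level fields follow mechanically once every step has been executed. A minimal sketch in Python of that derivation, assuming the file layout shown above (the `finalize_*` helpers are illustrative, not part of the pipeline scripts):

```python
import json
from pathlib import Path

def finalize_test(test: dict) -> None:
    """A test passes only when every one of its steps is verified.

    `blocked` is assumed to have been set on the test while executing its steps.
    """
    test["passes"] = all(step.get("verified") for step in test["steps"])

def finalize_file(path: Path) -> None:
    """Derive `passes` for each test and write the file back."""
    data = json.loads(path.read_text())
    for test in data["tests"]:
        finalize_test(test)
    path.write_text(json.dumps(data, indent=2))

# Hypothetical ticket id from the examples in this document:
# finalize_file(Path("docs/3-tests/tickets/T00015/tests-in-progress.json"))
```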
## Execution Methods

### Method 1: Test Scripts

If a test script exists in `docs/3-tests/tickets/{ticket_id}/test_*.py`:
```json
{
  "step": "POST /api/search returns status 200",
  "trajectory": [
    "Ran: python docs/3-tests/tickets/T00015/test_search.py",
    "Request sent to http://localhost:8000/api/search",
    "Response: 200 OK - MATCHES"
  ],
  "verified": true
}
```
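A script like the one referenced in that trajectory might look roughly as follows. The endpoint and payload are assumptions taken from the example; only the `MATCHES` / `DOES NOT MATCH` output convention comes from this skill:

```python
# docs/3-tests/tickets/T00015/test_search.py (hypothetical)
import json
import urllib.request
from urllib.error import HTTPError

URL = "http://localhost:8000/api/search"  # endpoint assumed from the example above

def main() -> int:
    req = urllib.request.Request(
        URL,
        data=json.dumps({"query": "test"}).encode(),  # illustrative payload
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            status = resp.status
    except HTTPError as err:  # non-2xx responses raise; capture the code
        status = err.code
    verdict = "MATCHES" if status == 200 else "DOES NOT MATCH expected 200"
    print(f"Response: {status} - {verdict}")
    return 0 if status == 200 else 1

if __name__ == "__main__":
    raise SystemExit(main())
```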
### Method 2: Manual Execution

If no script exists, execute the step manually:
```json
{
  "step": "Config file exists at ~/.config/app/settings.json",
  "trajectory": [
    "Checked: ~/.config/app/settings.json exists",
    "File found with 245 bytes - MATCHES"
  ],
  "verified": true
}
```
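Even a manual check is easier to defend when it is reproducible as a one-off snippet; a minimal sketch of the check above (the path comes from the example step):

```python
from pathlib import Path

path = Path.home() / ".config" / "app" / "settings.json"  # path from the example step
if path.is_file():
    print(f"Checked: {path} exists")
    print(f"File found with {path.stat().st_size} bytes - MATCHES")
else:
    print(f"Checked: {path} - DOES NOT MATCH (file not found)")
```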
### Method 3: Create Script During Execution

If a step needs automation and no script exists:

- Create a script in `docs/3-tests/tickets/{ticket_id}/test_*.py`
- Run the script
- Record the results in the trajectory
## Trajectory Recording

Record what you did and what you observed:
| Action Type | Trajectory Format |
|---|---|
| Command | "Ran: {command}", "Output: {output}", "Exit code: {code}" |
| File check | "Checked: {path} exists", "File contains: {content}" |
| API call | "Request: {method} {url}", "Response: {status}" |
| UI action | "Clicked '{element}'", "Screenshot shows: {observation}" |
| Assertion | "{field}: {actual} - MATCHES" or "- DOES NOT MATCH expected {expected}" |
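For command-type steps, the trajectory lines in this table can be produced directly from a subprocess run. A sketch of such a helper (the function name is invented for illustration):

```python
import subprocess

def run_and_record(command: list[str]) -> tuple[list[str], int]:
    """Run a command and return trajectory lines plus the exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    trajectory = [
        f"Ran: {' '.join(command)}",
        f"Output: {result.stdout.strip()}",
        f"Exit code: {result.returncode}",
    ]
    return trajectory, result.returncode

# Example: trajectory, code = run_and_record(["uv", "sync"])
```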
### Steps Covered by Script

When a script covers multiple steps:
```json
{
  "step": "Response contains results array",
  "trajectory": ["Covered by script: results array has 5 items - MATCHES"],
  "verified": true
}
```
## Strict Verification Rules

Set `verified: true` ONLY when:
- Expectation is exactly met (not approximate)
- You have concrete evidence (output, screenshot, API response)
- There is no ambiguity about pass/fail
Keep `verified: false` when:
- Cannot confirm exact expectation
- Partial match or approximate result
- Unable to test (missing endpoint, app not running)
- Uncertain or ambiguous outcome
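These rules amount to exact equality: anything weaker stays `false`. A minimal sketch of an assertion helper that also emits the trajectory line (the helper itself is hypothetical):

```python
def check(field: str, actual: object, expected: object) -> tuple[str, bool]:
    """Exact-match assertion in the trajectory format used above."""
    if actual == expected:
        return f"{field}: {actual} - MATCHES", True
    return f"{field}: {actual} - DOES NOT MATCH expected {expected}", False

# check("status", 201, 200)
# -> ("status: 201 - DOES NOT MATCH expected 200", False)
```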
### Examples

| Step | Observation | verified |
|---|---|---|
| "Returns status 200" | Got status 200 | true |
| "Returns status 200" | Got status 201 | false |
| "Button color is #4CAF50" | Color is #4CAF50 | true |
| "Button color is #4CAF50" | Color looks green | false |
| "File exists at /tmp/out.json" | File exists | true |
| "File exists at /tmp/out.json" | Directory exists | false |
## Blocking Issues

When encountering blocking issues (missing credentials, external service unavailable, ambiguous requirement):

- Set `blocked: true` on the test
- Document the details in the trajectory
```json
{
  "id": "T00015-03",
  "description": "Verify external API integration",
  "ticket-completion": true,
  "steps": [
    {
      "step": "POST /api/payment with valid card returns 200",
      "trajectory": [
        "Attempted: POST http://localhost:8000/api/payment",
        "BLOCKING ISSUE: Payment API requires STRIPE_API_KEY environment variable",
        "Checked .env - STRIPE_API_KEY not set",
        "Cannot proceed without valid API credentials"
      ],
      "verified": false
    }
  ],
  "passes": false,
  "blocked": true
}
```
Trajectory must include:
- What was attempted
- Why it's blocked (specific error or missing requirement)
- What's needed to unblock
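Missing credentials like the example above can be caught before attempting the request. A sketch of such a precondition check (`STRIPE_API_KEY` comes from the example; the helper itself is hypothetical):

```python
import os

def require_env(var: str, trajectory: list[str]) -> bool:
    """Append blocking-issue trajectory lines if an env variable is unset."""
    if os.environ.get(var):
        return True
    trajectory.extend([
        f"BLOCKING ISSUE: requires {var} environment variable",
        f"Checked environment - {var} not set",
        f"Cannot proceed without {var}",
    ])
    return False

# If require_env("STRIPE_API_KEY", step["trajectory"]) returns False,
# set blocked: true on the test and leave the step unverified.
```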
## Output Format

After execution, update `tests-in-progress.json` with the results:
```json
{
  "ticket": "T00015",
  "title": "Feature Implementation Tests",
  "tests": [
    {
      "id": "T00015-01",
      "name": "Feature works end-to-end",
      "description": "Verify feature implementation",
      "ticket-completion": true,
      "steps": [
        {
          "step": "Run 'uv sync' exits with code 0",
          "trajectory": [
            "Ran: uv sync",
            "Output: Resolved 15 packages...",
            "Exit code: 0 - MATCHES"
          ],
          "verified": true
        }
      ],
      "passes": true
    }
  ]
}
```
## Workflow

- Read `tests-in-progress.json` for the given ticket
- Execute each test in order (`ticket-completion: true` first)
- Record a trajectory for every step
- Update the `verified`, `passes`, and `blocked` fields
- Save the updated `tests-in-progress.json`
- Return a summary
## Return Summary

After completion, return:

```
Run Tests Complete: T00015
- Executed: 5 tests
- Passed: 3
- Failed: 1
- Blocked: 1
- Scripts created: 2
```
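All counts except the script count fall directly out of the updated file (scripts created must be tracked during execution). A sketch, assuming the output format above (`summarize` is an illustrative helper):

```python
import json
from pathlib import Path

def summarize(path: Path) -> str:
    """Build the return summary from the updated tests-in-progress.json."""
    data = json.loads(path.read_text())
    tests = data["tests"]
    blocked = sum(1 for t in tests if t.get("blocked"))
    passed = sum(1 for t in tests if t.get("passes"))
    failed = len(tests) - passed - blocked
    return (
        f"Run Tests Complete: {data['ticket']}\n"
        f"- Executed: {len(tests)} tests\n"
        f"- Passed: {passed}\n"
        f"- Failed: {failed}\n"
        f"- Blocked: {blocked}"
    )

# print(summarize(Path("docs/3-tests/tickets/T00015/tests-in-progress.json")))
```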
## Visual Verification

For UI tests, capture screenshots as evidence:

- Desktop: Use the `screenshot-capture` skill or debug endpoint
- Web: Use `mcp chrome-devtools take_screenshot`

Record screenshot observations in the trajectory.