---
name: literate-tests
description: This skill should be used when the user asks to "create literate tests", "generate markdown tests", "specification-as-tests", "TDD with markdown", "agent-driven testing", or mentions test suites where markdown IS the test format. NOT for pytest/jest/unittest. Creates .md test files with inline assertions and uses a bundled custom test runner.
license: MIT
---
# Literate Test Suite Generator
Create a test suite for [DOMAIN] that an agent can run autonomously.
## What This Pattern Produces

**CRITICAL:** This is NOT pytest/jest/unittest/etc. This creates a CUSTOM test format.
Generate two artifacts:
### 1. Markdown Test Files (`tests/<feature>.md`)
````markdown
# Feature Name

Prose explaining what this tests and why.

## Test Group

### Specific Behavior

```py
result = do_thing("input")
result.value # expect: 42
```

### Error Case

```py
do_thing(None) # error: [null-input]
```
````
The markdown file IS the test. Code blocks contain executable code. Comments are assertions.
### 2. Test Runner (language-specific)
Copy the appropriate bundled runner for your language:
| Language | Runner | Copy Command |
|---|---|---|
| Python | `run_tests.py` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.py" tests/` |
| JavaScript | `run_tests.js` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.js" tests/` |
| TypeScript | `run_tests.ts` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ts" tests/` |
| Bash/Shell | `run_tests.sh` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.sh" tests/` |
| PowerShell | `run_tests.ps1` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ps1" tests/` |
| Rust | `run_tests.rs` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.rs" tests/` |
| C# | `RunTests.cs` | `cp "${CLAUDE_PLUGIN_ROOT}/scripts/RunTests.cs" tests/` |
All runners:

- Parse markdown files for TOML frontmatter
- Extract language-specific code blocks
- Validate `# expect:` / `// expect:` and `# error:` / `// error:` assertions
- Support matchers: `approx()`, `contains()`
- Report pass/fail with colored output
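For intuition, the core loop of such a runner can be sketched in a few lines of Python. This is a simplified illustration, not the bundled implementation; it handles only `# expect:` assertions on `py` blocks, while the real runners also handle frontmatter, error assertions, isolation, and colored output:

```py
import re
from pathlib import Path

CODE_BLOCK = re.compile(r"```(?:py|python)\n(.*?)```", re.DOTALL)
ASSERT = re.compile(r"^(?P<expr>.+?)\s*#\s*expect:\s*(?P<want>.+)$")

def run_file(path: str, namespace: dict) -> None:
    """Extract py code blocks and check each `# expect:` assertion."""
    for block in CODE_BLOCK.findall(Path(path).read_text()):
        pending = []                                # setup lines not yet run
        for line in block.splitlines():
            m = ASSERT.match(line)
            if m is None:
                pending.append(line)
                continue
            exec("\n".join(pending), namespace)     # setup runs first
            pending = []
            got = eval(m["expr"], namespace)        # evaluable expression
            want = eval(m["want"], namespace)       # literal or matcher
            mark = "PASS" if want == got else f"FAIL (got {got!r})"
            print(f"{line.strip()}  -> {mark}")
        exec("\n".join(pending), namespace)         # trailing setup lines
```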
Do NOT use pytest, jest, xunit, or any standard test framework.
## Why This Format
Tests are the oracle for agent-driven development:
- Agent runs `python run_tests.py`
- Sees which assertions fail
- Fixes the code
- Repeats until green
The test file is both:
- Documentation humans can read (it's markdown!)
- Executable specification agents verify against
This pattern enabled Simon Willison to port an entire HTML5 parser in 4.5 hours—the agent ran 9,200 tests autonomously.
## File Structure
```
project/
├── src/
│   └── <module>.<ext>       # Code under test
└── tests/
    ├── <feature>.md         # Markdown test files
    └── run_tests.<ext>      # Language-specific runner (copied from skill)
```
## Markdown Test File Format
### Frontmatter (Required)
Must be the FIRST thing in the file:
```toml
# Test Configuration
module = "mypackage.validator"
import = ["validate", "normalize", "ValidationError"]
isolation = "per-block"
```
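The runner presumably uses these fields to build the execution namespace: import the named module, expose the listed names, and, with `isolation = "per-block"`, give each code block its own copy so state cannot leak between blocks. A hedged sketch, with details assumed rather than taken from the bundled runner:

```py
import importlib

def build_namespace(config: dict) -> dict:
    """Import the configured module and expose only the listed names."""
    module = importlib.import_module(config["module"])
    return {name: getattr(module, name) for name in config["import"]}

# With isolation = "per-block", each block would run against a fresh
# dict(build_namespace(config)) rather than one shared namespace.
```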
### Error Code Index (Required)
List ALL error codes at the top, before any tests:
```markdown
**Error codes:**

- `[empty-input]` — User provided empty string
- `[null-input]` — User provided null/None
- `[invalid-format]` — String doesn't match expected pattern
```
### Test Sections
Headers create test groups. Prose explains intent. Code blocks are tests:
````markdown
## Input Validation

Users paste data from spreadsheets which may contain invisible whitespace.
The validator normalizes input before checking length.

### Accepts Valid Input

```py
validate("hello") # expect: True
```

### Rejects Empty String

Empty means "user submitted nothing" — distinct from whitespace-only:

```py
validate("") # error: [empty-input]
```
````
## Assertion Syntax
### Value Assertions
```
expression # expect: <value>
```
The code before `# expect:` must be an evaluable expression, not a statement.
```py
# Setup lines run first
result = calculate(10)

# This line is evaluated and compared
result.value # expect: 42
```
### Error Assertions
```
statement # error: [code]
statement # error: "message substring"
statement # error: [code] "message substring"
```
### Matchers
For non-exact comparisons:
```
pi_value() # expect: approx(3.14159, tol=0.0001)
output # expect: contains("success")
text # expect: matches(/^Error: \d+/)
```
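One way to implement such matchers is as small classes whose `__eq__` encodes the comparison, so the runner's ordinary equality check picks them up with no special-casing. An illustrative Python sketch, not the bundled implementation:

```py
class approx:
    """Matches any number within an absolute tolerance of the expected value."""
    def __init__(self, expected: float, tol: float = 1e-6):
        self.expected, self.tol = expected, tol
    def __eq__(self, other) -> bool:
        return abs(other - self.expected) <= self.tol

class contains:
    """Matches any string (or container) holding the given item."""
    def __init__(self, item):
        self.item = item
    def __eq__(self, other) -> bool:
        return self.item in other
```

Because `approx(3.14159, tol=0.0001) == pi_value()` is just an equality test, the assertion loop can compare expected and actual values the same way for literals and matchers.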
### CLI/Shell Tests
```sh
mycommand --bad-flag
# exit: 1
# stderr: contains("[invalid-flag]")
```
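Under the hood, this kind of check amounts to capturing the process result and comparing it against the directives. A minimal Python sketch, assuming only the three directives shown above:

```py
import subprocess

def check_cli(command: str, want_exit: int, stderr_needle: str) -> bool:
    """Run a command, then verify its exit code and a stderr substring."""
    proc = subprocess.run(command, shell=True, capture_output=True, text=True)
    return proc.returncode == want_exit and stderr_needle in proc.stderr

# e.g. check_cli("mycommand --bad-flag", 1, "[invalid-flag]")
```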
## Using the Bundled Runners
### Python Runner

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.py" tests/
python tests/run_tests.py
```

- Code blocks: `` ```py `` or `` ```python ``
- Assertions: `# expect:`, `# error:`
### JavaScript Runner

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.js" tests/
node tests/run_tests.js
```

- Code blocks: `` ```js `` or `` ```javascript ``
- Assertions: `// expect:`, `// error:`, `// throws:`
### TypeScript Runner

Requires: `npm install -D tsx`

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ts" tests/
npx tsx tests/run_tests.ts
```

- Code blocks: `` ```ts `` or `` ```typescript ``
- Assertions: `// expect:`, `// error:`, `// throws:`
### Bash Runner

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.sh" tests/
bash tests/run_tests.sh
```

- Code blocks: `` ```sh `` or `` ```bash ``
- Assertions: `# exit:`, `# stdout:`, `# stderr:`
### PowerShell Runner

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ps1" tests/
pwsh tests/run_tests.ps1
```

- Code blocks: `` ```ps1 `` or `` ```powershell ``
- Assertions: `# exit:`, `# stdout:`, `# stderr:`, `# throws:`
### Rust Runner

Requires: `cargo install rust-script`

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.rs" tests/
rust-script tests/run_tests.rs
```

- Code blocks: `` ```rs `` or `` ```rust ``
- Assertions: `// expect:`, `// error:`, `// compiles`, `// compile_fails:`
### C# Runner

Requires: `dotnet tool install -g dotnet-script`

```sh
cp "${CLAUDE_PLUGIN_ROOT}/scripts/RunTests.cs" tests/
dotnet script tests/RunTests.cs
```

- Code blocks: `` ```cs `` or `` ```csharp ``
- Assertions: `// expect:`, `// error:`
## Error Identity (Critical)
Every error must have a stable code the runner can match:
| Language | How to Expose Code |
|---|---|
| Python | Exception with `.code` property |
| Rust | Error enum variant |
| CLI | `[code]` in stderr + exit code |
| C# | Exception type name |
Example Python exception:
```py
class ValidationError(Exception):
    def __init__(self, code: str, message: str):
        self.code = code
        super().__init__(message)

# Usage
raise ValidationError("empty-input", "Input cannot be empty")
```
The runner matches `# error: [empty-input]` against `exception.code`.
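That matching step reduces to catching the exception and comparing codes. A minimal sketch (the fallback to the exception type name mirrors the C# row above; the details are assumed, not taken from the bundled runner):

```py
def check_error(statement: str, expected_code: str, namespace: dict) -> bool:
    """Execute a statement and confirm it raises with the expected code."""
    try:
        exec(statement, namespace)
    except Exception as exc:
        actual = getattr(exc, "code", type(exc).__name__)
        return actual == expected_code
    return False  # nothing raised: the assertion fails
```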
## Example: Complete Test File
````markdown
```toml
# Test Configuration
module = "temperature"
import = ["celsius_to_fahrenheit", "TemperatureError"]
isolation = "per-block"
```

# Temperature Conversion

Converts between Celsius and Fahrenheit with validation.

**Problem:** Users need temperature conversion, but invalid inputs (below absolute zero) must fail clearly—not return nonsense values.

**Error codes:**

- `[below-absolute-zero]` — Temperature violates laws of physics

## Celsius to Fahrenheit

Standard formula: F = (C × 9/5) + 32

### Converts Freezing Point

```py
celsius_to_fahrenheit(0) # expect: 32.0
```

### Converts Boiling Point

```py
celsius_to_fahrenheit(100) # expect: 212.0
```

### Rejects Below Absolute Zero

-273.15°C is the physical limit. Below that is impossible:

```py
celsius_to_fahrenheit(-300) # error: [below-absolute-zero]
```
````
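For reference, an implementation satisfying this test file could look like the following. The function and error names come from the example above; everything else is an illustrative sketch:

```py
# temperature.py
ABSOLUTE_ZERO_C = -273.15

class TemperatureError(Exception):
    def __init__(self, code: str, message: str):
        self.code = code  # stable code matched by `# error: [...]`
        super().__init__(message)

def celsius_to_fahrenheit(celsius: float) -> float:
    if celsius < ABSOLUTE_ZERO_C:
        raise TemperatureError("below-absolute-zero",
                               f"{celsius}°C is below absolute zero")
    return celsius * 9 / 5 + 32
```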
## Checklist Before Generating
When asked to create literate tests:
- Identify target language and copy the appropriate runner from `${CLAUDE_PLUGIN_ROOT}/scripts/`
- Create `.md` files in `tests/`, NOT language-specific test files
- Include TOML frontmatter at the top of each test file
- Include the error code index before any tests
- Use the correct assertion syntax for the language (`#` for Python/Bash, `//` for Rust/C#)
- Prose explains WHY, not just WHAT
- One behavior per code block
- Test names are behavior descriptions
## Domain Spec Template
Fill in when generating:
- What it does: [DESCRIPTION]
- What problem it solves: [USER PAIN]
- Language: [LANGUAGE]
- Code block tag: [py/rs/sh/etc]
- Comment prefix: [`#` / `//` / `--`]
- Error codes: [EXHAUSTIVE LIST]
- Public API: [FUNCTIONS TO TEST]
- Happy paths: [LIST]
- Error cases: [LIST + WHY]
- Edge cases: [LIST + WHY]