name	loop-test
description	Use to regression test the full OODA loop. Validates phase transitions, artifact handoffs, and resumption. Maintainer tooling - invoke explicitly with "use loop-test to verify the system".

Loop-Test

Regression testing for the full OODA loop. Validates that phases transition correctly, artifacts are readable between phases, and the system can resume from any point.

When to Use

After modifying any OODA skill (observe, orient, decide, act)
After adding new skills that integrate with the loop
Before releasing changes to antmachine
When debugging loop behavior

What It Tests

Layer	Tests	Purpose
1. Transition Validators	Observe→Orient, Orient→Decide, Orient→Act, Decide→Act, Any→Observe	Artifacts readable between phases
2. Canonical Scenarios	Simple task, complex task, re-observation	Full loop works end-to-end
3. Ad-Hoc Generation	Fresh scenario from codebase	Catches edge cases
4. Resumption	Crash at any phase, resume	Idempotency works
5. Reporting	PASS/FAIL with details	Clear feedback

Running Tests

Full Test Suite

Use loop-test to run all layers

Specific Layer

Use loop-test to run layer 1 (transitions)
Use loop-test to run layer 2 (scenarios)

Ad-Hoc Scenario

Use loop-test with ad-hoc scenario

Layer 1: Transition Validators

Each transition has a contract:

Observe → Orient

Observe writes:

{
  "scout": "codebase",
  "timestamp": "2026-01-03T15:30:00Z",
  "status": "success",
  "findings": { ... },
  "gaps": []
}

Orient expects: Artifacts in .antmachine/observations/ with scout, findings, status.
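The contract above can be checked mechanically. The following is a minimal sketch, assuming each artifact is a single JSON object per file; `validate_observation` and `validate_observations_dir` are hypothetical helper names, not part of antmachine.

```python
import json
from pathlib import Path

# Fields Orient reads, per the contract above.
REQUIRED_FIELDS = ("scout", "findings", "status")


def validate_observation(artifact) -> list[str]:
    """Return contract violations; an empty list means Orient can read the artifact."""
    if not isinstance(artifact, dict):
        return ["artifact is not a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in artifact]


def validate_observations_dir(directory: Path) -> dict[str, list[str]]:
    """Check every JSON file in a directory and map file name to its violations."""
    results = {}
    for path in sorted(directory.glob("*.json")):
        try:
            artifact = json.loads(path.read_text())
        except json.JSONDecodeError as exc:
            results[path.name] = [f"invalid JSON: {exc.msg}"]
            continue
        results[path.name] = validate_observation(artifact)
    return results
```

Calling `validate_observations_dir(Path(".antmachine/observations"))` would report one entry per artifact, with an empty list for each artifact that satisfies the contract.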

Orient → Decide

Orient writes:

{
  "timestamp": "...",
  "goal": "...",
  "plans": [
    {"id": "plan-a", "name": "...", "steps": [...]},
    {"id": "plan-b", "name": "...", "steps": [...]}
  ],
  "recommendation": {"pick": "plan-a"}
}

Decide expects: Plan in .antmachine/orient/artifacts/plans/ with 2+ options.
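A sketch of the Orient → Decide check, assuming the plan artifact has the shape shown above. The function name `validate_plan_artifact` is hypothetical, and the check that `recommendation.pick` names an existing plan is an extra assumption that goes beyond the stated contract.

```python
def validate_plan_artifact(artifact: dict, min_options: int = 2) -> list[str]:
    """Return contract violations for a plan artifact; empty means Decide can use it."""
    problems = []
    plans = artifact.get("plans", [])
    if len(plans) < min_options:
        problems.append(f"expected {min_options}+ options, found {len(plans)}")

    # Assumption: a recommendation, when present, should point at a listed plan.
    plan_ids = {plan.get("id") for plan in plans}
    pick = artifact.get("recommendation", {}).get("pick")
    if pick is not None and pick not in plan_ids:
        problems.append(f"recommendation picks unknown plan: {pick}")
    return problems
```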

Orient → Act (Direct)

When Orient has extremely high confidence and the action is clear, it skips Decide and routes directly to Act.

State indicates:

{
  "routing": {
    "from": "orient",
    "to": "act",
    "reason": "high_confidence",
    "skipped": ["decide"]
  }
}
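A direct route could be validated along these lines. This is a sketch that assumes the `routing` block lives in the state object exactly as shown; `validate_direct_route` is a hypothetical name.

```python
def validate_direct_route(state: dict) -> list[str]:
    """Return problems with an Orient → Act fast-path routing record."""
    routing = state.get("routing")
    if not isinstance(routing, dict):
        return ["missing routing block"]

    problems = []
    if routing.get("from") != "orient" or routing.get("to") != "act":
        problems.append("direct route must go from orient to act")
    if "decide" not in routing.get("skipped", []):
        problems.append("skipped phases must include decide")
    if not routing.get("reason"):
        problems.append("missing reason")
    return problems
```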

Decide → Act

Plan includes:

{
  "selected": {
    "plan_id": "plan-a",
    "decided_by": "human"
  }
}

Act expects: Selected plan with executable steps.
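The Decide → Act contract can be sketched as a lookup from `selected.plan_id` to a plan with at least one step. This assumes the `selected` block is stored in the same artifact as the `plans` list, which the source does not state explicitly; `validate_selection` is a hypothetical name.

```python
def validate_selection(artifact: dict) -> list[str]:
    """Return problems that would stop Act from executing the selected plan."""
    selected = artifact.get("selected") or {}
    plan_id = selected.get("plan_id")
    if not plan_id:
        return ["no plan selected"]

    for plan in artifact.get("plans", []):
        if plan.get("id") == plan_id:
            # An empty or missing step list leaves Act with nothing to execute.
            if not plan.get("steps"):
                return [f"plan {plan_id} has no steps"]
            return []
    return [f"selected plan {plan_id} not found"]
```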

Any Phase → Observe (Re-observation)

When any phase needs more information, it requests a re-observation.

State indicates:

{
  "current_phase": "observe",
  "observation_request": {
    "from": "orient",
    "focus": "need codebase structure",
    "return_to": "orient"
  }
}

The flow is always: needs_observation → Observe → Orient → (Orient routes to the next phase)

Orient is the routing hub after every observation.
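The routing rule can be expressed as a check over a recorded phase trace. This sketch assumes the test harness records completed phases as a list of lowercase phase names, which is an illustrative format and not one defined by antmachine.

```python
def observe_always_routes_to_orient(trace: list[str]) -> bool:
    """True if every 'observe' in a completed trace is immediately followed by 'orient'."""
    for index, phase in enumerate(trace):
        if phase != "observe":
            continue
        is_last = index + 1 >= len(trace)
        if is_last or trace[index + 1] != "orient":
            return False
    return True
```

A trace in which Act requests an observation and resumes through Orient passes, while one that jumps from Observe straight back to Act fails.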

Layer 2: Canonical Scenarios

Simple Task (Fast Path)

Scenario: "Add a console.log to file X"

Expected flow: Observe → Orient → Act (skips Decide)

Validates: Orient can route directly to Act for high-confidence tasks.

Complex Task (Full Loop)

Scenario: "Add retry logic with tests"

Expected flow: Intent → Observe → Orient → Decide → Act → Verify

Validates: Full loop with human confirmation gates.

Re-observation from Act

Scenario: Act needs file info mid-execution

Expected flow: Act signals needs_observation → Observe → Orient → Act

Validates: A re-observation requested from Act routes back through Orient before Act resumes.

Re-observation from Orient

Scenario: Orient unclear on codebase structure

Expected flow: Orient signals needs_observation → Observe → Orient

Validates: Orient can request an observation and resume once it completes.

Layer 3: Ad-Hoc Generation

Given the current codebase state, generate a fresh test scenario:

Scan for recently modified files
Identify a realistic task (add test, fix type error, add feature)
Run task through loop
Validate all transitions
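The first step, scanning for recently modified files, might look like the sketch below. The seven-day window, the result limit, and the skipped directory names are all assumptions chosen for illustration.

```python
import time
from pathlib import Path

# Directories skipped during the scan (an assumption; adjust as needed).
SKIP_DIRS = {".git", ".antmachine", "node_modules"}


def recently_modified(root: Path, max_age_seconds: float = 7 * 24 * 3600, limit: int = 10) -> list[Path]:
    """Return up to `limit` files under `root` changed within the window, newest first."""
    now = time.time()
    candidates = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        mtime = path.stat().st_mtime
        if now - mtime <= max_age_seconds:
            candidates.append((mtime, path))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in candidates[:limit]]
```

The returned files are candidates for the task-identification step; choosing a realistic task from them is left to the agent.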

Layer 4: Resumption Testing

Simulate crash at each phase:

Run scenario to phase X
Clear agent context (simulate crash)
New agent reads .antmachine/state.json
Verify continuation from correct phase
Verify artifacts are not duplicated or corrupted

Critical State Fields

{
  "current_phase": "act",
  "current_task": "...",
  "phase_status": {
    "observe": "completed",
    "orient": "completed",
    "decide": "completed",
    "act": "in_progress"
  }
}
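A fresh agent could derive its resume point from these fields roughly as follows. This is a sketch that assumes a phase is finished only when its status is "completed"; a fast-path run that skips Decide would need its own status value, which the source does not define.

```python
PHASE_ORDER = ("observe", "orient", "decide", "act")


def resume_phase(state: dict) -> str:
    """Pick the phase a new agent should continue from after a crash."""
    status = state.get("phase_status", {})
    current = state.get("current_phase")

    # Trust current_phase when it names a known phase that has not finished.
    if current in PHASE_ORDER and status.get(current) != "completed":
        return current

    # Otherwise fall back to the first phase that is not marked completed.
    for phase in PHASE_ORDER:
        if status.get(phase) != "completed":
            return phase
    return "complete"
```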

Layer 5: Reporting

PASS Report

Loop-Test Results: PASS
-----------------------
Transitions validated: 4/4
Scenarios run: 3/3
Resumption tests: 2/2

Phases executed:
  ✓ Observe → Orient
  ✓ Orient → Decide
  ✓ Decide → Act
  ✓ Act → Complete

FAIL Report

Loop-Test Results: FAIL
-----------------------
Breakage: Orient → Decide

Expected: Plan artifact with 2+ options
Found: Plan artifact with 1 option

File: .antmachine/orient/artifacts/plans/2026-01-03-test.json
Issue: Only plan-a present, missing alternatives
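Reports in this layout can be produced from structured failure records. The `Breakage` dataclass and `format_report` function below are hypothetical names used for illustration, and the sketch omits the count lines shown in the PASS report.

```python
from dataclasses import dataclass


@dataclass
class Breakage:
    transition: str  # e.g. "Orient → Decide"
    expected: str
    found: str
    file: str
    issue: str


def format_report(breakages: list[Breakage]) -> str:
    """Render a PASS report when there are no breakages, otherwise a FAIL report."""
    verdict = "FAIL" if breakages else "PASS"
    title = f"Loop-Test Results: {verdict}"
    lines = [title, "-" * len(title)]
    for item in breakages:
        lines += [
            f"Breakage: {item.transition}",
            "",
            f"Expected: {item.expected}",
            f"Found: {item.found}",
            "",
            f"File: {item.file}",
            f"Issue: {item.issue}",
        ]
    return "\n".join(lines)
```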

Red Flags - STOP If You Think These

Thought	Reality
"Tests passed, loop works"	Transition validators only check contracts. Run the scenarios too.
"Just test the skill I changed"	Skills interact. Test the full loop.
"Resumption is an edge case"	Crashes happen. Test resumption.
"Ad-hoc is overkill"	Canonical scenarios miss edge cases. Run ad-hoc.

Test Artifacts

Tests create artifacts in:

.antmachine/observations/*-test-*.json
.antmachine/orient/artifacts/plans/*-test-*.json
.antmachine/test-fixtures/*.json

Teardown cleans these after tests complete.
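A teardown step over these patterns could be sketched as below. The glob patterns are copied from the list above; `cleanup_test_artifacts` is a hypothetical name, and the real teardown may work differently.

```python
from pathlib import Path

# Glob patterns copied from the list above, relative to the project root.
TEST_ARTIFACT_GLOBS = (
    ".antmachine/observations/*-test-*.json",
    ".antmachine/orient/artifacts/plans/*-test-*.json",
    ".antmachine/test-fixtures/*.json",
)


def cleanup_test_artifacts(root: Path) -> list[Path]:
    """Delete files matching the test artifact patterns and return what was removed."""
    removed = []
    for pattern in TEST_ARTIFACT_GLOBS:
        for path in sorted(root.glob(pattern)):
            if path.is_file():
                path.unlink()
                removed.append(path)
    return removed
```

Because the first two patterns require `-test-` in the file name, observations and plans from real runs are left untouched.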

Integration with CI

./tests/skills/loop-test.test.sh

Returns exit code 0 on pass, 1 on fail.
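A CI job written in Python could wrap the script and normalize its result to the same 0/1 convention. This is a sketch; `run_loop_test` is a hypothetical wrapper, and only the script path comes from this document.

```python
import subprocess
import sys


def run_loop_test(command: list[str]) -> int:
    """Run a test command and return 0 on pass, 1 on any non-zero exit."""
    completed = subprocess.run(command, check=False)
    if completed.returncode != 0:
        print(f"loop-test failed with exit code {completed.returncode}", file=sys.stderr)
        return 1
    return 0
```

From the repository root, `sys.exit(run_loop_test(["./tests/skills/loop-test.test.sh"]))` would propagate the result to the CI runner.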
