| name | loop-test |
| description | Use to regression test the full OODA loop. Validates phase transitions, artifact handoffs, and resumption. Maintainer tooling - invoke explicitly with "use loop-test to verify the system". |
Loop-Test
Regression testing for the full OODA loop. Validates that phases transition correctly, artifacts are readable between phases, and the system can resume from any point.
When to Use
- After modifying any OODA skill (observe, orient, decide, act)
- After adding new skills that integrate with the loop
- Before releasing changes to antmachine
- When debugging loop behavior
What It Tests
| Layer | Tests | Purpose |
|---|---|---|
| 1. Transition Validators | Observe→Orient, Orient→Decide, Decide→Act | Artifacts readable between phases |
| 2. Canonical Scenarios | Simple task, complex task, re-observation | Full loop works end-to-end |
| 3. Ad-Hoc Generation | Fresh scenario from codebase | Catches edge cases |
| 4. Resumption | Crash at any phase, resume | Idempotency works |
| 5. Reporting | PASS/FAIL with details | Clear feedback |
Running Tests
Full Test Suite
Use loop-test to run all layers
Specific Layer
Use loop-test to run layer 1 (transitions)
Use loop-test to run layer 2 (scenarios)
Ad-Hoc Scenario
Use loop-test with ad-hoc scenario
Layer 1: Transition Validators
Each transition has a contract:
Observe → Orient
Observe writes:
{
"scout": "codebase",
"timestamp": "2026-01-03T15:30:00Z",
"status": "success",
"findings": { ... },
"gaps": []
}
Orient expects: Artifacts in .antmachine/observations/ with scout, findings, status.
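The Observe → Orient contract can be checked with a small validator. The following is a minimal sketch, assuming the required keys are exactly the three named above (`scout`, `findings`, `status`). The function name `validate_observation` is hypothetical and not part of antmachine.

```python
import json

# Keys Orient expects in every observation artifact, per the contract above.
REQUIRED_KEYS = ("scout", "findings", "status")


def validate_observation(raw: str) -> list[str]:
    """Return contract violations; an empty list means the artifact is readable."""
    try:
        artifact = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(artifact, dict):
        return ["artifact must be a JSON object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in artifact]
```

Returning a list of violations rather than raising on the first one lets a report show every problem in a single run.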
Orient → Decide
Orient writes:
{
"timestamp": "...",
"goal": "...",
"plans": [
{"id": "plan-a", "name": "...", "steps": [...]},
{"id": "plan-b", "name": "...", "steps": [...]}
],
"recommendation": {"pick": "plan-a"}
}
Decide expects: Plan in .antmachine/orient/artifacts/plans/ with 2+ options.
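A check for the Orient → Decide contract might look like the sketch below. It enforces the "2+ options" rule stated above. The additional check that `recommendation.pick` names one of the listed plans is an assumption, and `validate_plan` is a hypothetical helper name.

```python
def validate_plan(plan: dict) -> list[str]:
    """Check the Orient -> Decide contract on a parsed plan artifact."""
    errors = []
    plans = plan.get("plans", [])
    if len(plans) < 2:
        errors.append(f"expected 2+ options, found {len(plans)}")
    plan_ids = {option.get("id") for option in plans}
    pick = plan.get("recommendation", {}).get("pick")
    # Assumption: the recommended pick must refer to one of the listed plans.
    if pick not in plan_ids:
        errors.append(f"recommendation {pick!r} does not match any plan id")
    return errors
```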
Orient → Act (Direct)
When Orient has extremely high confidence and the action is clear:
State indicates:
{
"routing": {
"from": "orient",
"to": "act",
"reason": "high_confidence",
"skipped": ["decide"]
}
}
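A transition validator could recognize the direct route with a predicate like the one below. This is a sketch that only inspects the `routing` fields shown above. The function name `is_direct_act_route` is hypothetical.

```python
def is_direct_act_route(state: dict) -> bool:
    """True when the state records an Orient -> Act hop that skipped Decide."""
    routing = state.get("routing", {})
    return (
        routing.get("from") == "orient"
        and routing.get("to") == "act"
        and "decide" in routing.get("skipped", [])
    )
```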
Decide → Act
Plan includes:
{
"selected": {
"plan_id": "plan-a",
"decided_by": "human"
}
}
Act expects: Selected plan with executable steps.
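The Decide → Act contract can be sketched as a lookup from the selection to its steps. This example assumes that `selected` lives in the same artifact as the `plans` list and that "executable" means at least a non-empty `steps` list. Both are assumptions, and `resolve_selected_steps` is a hypothetical name.

```python
def resolve_selected_steps(plan: dict) -> list:
    """Return the steps of the selected plan, or raise ValueError if the contract is broken."""
    selected_id = plan.get("selected", {}).get("plan_id")
    if selected_id is None:
        raise ValueError("no plan selected")
    for option in plan.get("plans", []):
        if option.get("id") == selected_id:
            steps = option.get("steps", [])
            # Assumption: a plan with no steps is not executable.
            if not steps:
                raise ValueError(f"plan {selected_id!r} has no steps")
            return steps
    raise ValueError(f"selected plan {selected_id!r} not found")
```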
Any Phase → Observe (Re-observation)
When any phase needs more information:
State indicates:
{
"current_phase": "observe",
"observation_request": {
"from": "orient",
"focus": "need codebase structure",
"return_to": "orient"
}
}
Flow is always: needs_observation → Observe → Orient → (routes appropriately)
Orient is the routing hub after every observation.
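The expected phase sequence for a re-observation can be derived from the `observation_request` fields. The sketch below assumes Orient hands control back to `return_to` after the observation. In practice Orient may route elsewhere, so treat this as the default expectation only. `reobservation_flow` is a hypothetical helper.

```python
def reobservation_flow(request: dict) -> list[str]:
    """Expected phase sequence for a re-observation request."""
    # Every re-observation passes through Observe, then Orient.
    flow = [request["from"], "observe", "orient"]
    # Assumption: Orient routes back to `return_to` unless that is Orient itself.
    if request["return_to"] != "orient":
        flow.append(request["return_to"])
    return flow
```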
Layer 2: Canonical Scenarios
Simple Task (Fast Path)
Scenario: "Add a console.log to file X"
Expected flow: Observe → Orient → Act (skips Decide)
Validates: Orient can route directly to Act for high-confidence tasks.
Complex Task (Full Loop)
Scenario: "Add retry logic with tests"
Expected flow: Intent → Observe → Orient → Decide → Act → Verify
Validates: Full loop with human confirmation gates.
Re-observation from Act
Scenario: Act needs file info mid-execution
Expected flow: Act signals needs_observation → Observe → Orient → Act
Validates: Observation loop works correctly.
Re-observation from Orient
Scenario: Orient unclear on codebase structure
Expected flow: Orient signals needs_observation → Observe → Orient
Validates: Any phase can request observations.
Layer 3: Ad-Hoc Generation
Given current codebase state, generate a fresh test scenario:
- Scan for recently modified files
- Identify a realistic task (add test, fix type error, add feature)
- Run task through loop
- Validate all transitions
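The first step, scanning for recently modified files, could be sketched as follows. Skipping the `.git` and `.antmachine` directories and defaulting to five files are assumptions made for illustration. `recently_modified` is a hypothetical helper.

```python
from pathlib import Path

# Assumption: tool and VCS directories are not useful scenario candidates.
SKIP_DIRS = {".git", ".antmachine"}


def recently_modified(root: str, limit: int = 5) -> list[Path]:
    """Return up to `limit` files under `root`, newest modification time first."""
    files = [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and not SKIP_DIRS.intersection(path.relative_to(root).parts)
    ]
    return sorted(files, key=lambda path: path.stat().st_mtime, reverse=True)[:limit]
```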
Layer 4: Resumption Testing
Simulate crash at each phase:
- Run scenario to phase X
- Clear agent context (simulate crash)
- New agent reads .antmachine/state.json
- Verify continuation from correct phase
- Verify artifacts are not duplicated or corrupted
Critical State Fields
{
"current_phase": "act",
"current_task": "...",
"phase_status": {
"observe": "completed",
"orient": "completed",
"decide": "completed",
"act": "in_progress"
}
}
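A resuming agent could pick its starting phase from these fields. The sketch below trusts `current_phase` and rejects a state whose `phase_status` already marks that phase as completed. That consistency rule is an assumption, not documented behavior, and `resume_phase` is a hypothetical name.

```python
def resume_phase(state: dict) -> str:
    """Return the phase to resume from, given the parsed contents of state.json."""
    phase = state["current_phase"]
    status = state.get("phase_status", {}).get(phase)
    # Assumption: a phase marked completed cannot also be the current phase.
    if status == "completed":
        raise ValueError(f"current_phase {phase!r} is already marked completed")
    return phase
```

A caller would load the state with `json.load` from `.antmachine/state.json` and pass the resulting dictionary in.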
Layer 5: Reporting
PASS Report
Loop-Test Results: PASS
-----------------------
Transitions validated: 4/4
Scenarios run: 3/3
Resumption tests: 2/2
Phases executed:
✓ Observe → Orient
✓ Orient → Decide
✓ Decide → Act
✓ Act → Complete
FAIL Report
Loop-Test Results: FAIL
-----------------------
Breakage: Orient → Decide
Expected: Plan artifact with 2+ options
Found: Plan artifact with 1 option
File: .antmachine/orient/artifacts/plans/2026-01-03-test.json
Issue: Only plan-a present, missing alternatives
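A report in this shape can be produced by a small formatter. The sketch below mirrors the field labels of the FAIL report above. The function and parameter names are hypothetical.

```python
def format_fail_report(breakage: str, expected: str, found: str,
                       artifact_path: str, issue: str) -> str:
    """Render a FAIL report using the same labels as the example above."""
    lines = [
        "Loop-Test Results: FAIL",
        "-----------------------",
        f"Breakage: {breakage}",
        f"Expected: {expected}",
        f"Found: {found}",
        f"File: {artifact_path}",
        f"Issue: {issue}",
    ]
    return "\n".join(lines)
```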
Red Flags - STOP If You Think These
| Thought | Reality |
|---|---|
| "Tests passed, loop works" | Tests check contracts. Run scenarios too. |
| "Just test the skill I changed" | Skills interact. Test the full loop. |
| "Resumption is an edge case" | Crashes happen. Test resumption. |
| "Ad-hoc is overkill" | Canonical misses edge cases. Run ad-hoc. |
Test Artifacts
Tests create artifacts in:
- .antmachine/observations/*-test-*.json
- .antmachine/orient/artifacts/plans/*-test-*.json
- .antmachine/test-fixtures/*.json
Teardown cleans these after tests complete.
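Teardown over the three artifact locations listed above could be sketched as below. The glob patterns come from this section. The function name `teardown` and its return value (a count of removed files) are assumptions.

```python
from pathlib import Path

# Patterns for test artifacts, taken from the list above.
TEST_ARTIFACT_GLOBS = (
    ".antmachine/observations/*-test-*.json",
    ".antmachine/orient/artifacts/plans/*-test-*.json",
    ".antmachine/test-fixtures/*.json",
)


def teardown(root: str = ".") -> int:
    """Delete test artifacts under `root` and return how many files were removed."""
    removed = 0
    for pattern in TEST_ARTIFACT_GLOBS:
        for path in Path(root).glob(pattern):
            path.unlink()
            removed += 1
    return removed
```

Because the first two patterns require `-test-` in the file name, artifacts from real runs in the same directories are left in place.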
Integration with CI
./tests/skills/loop-test.test.sh
Returns exit code 0 on pass, 1 on fail.