Install Skill

  1. Download the skill
  2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
  3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reading through its instructions before using it.

SKILL.md

name: loop-test
description: Use to regression test the full OODA loop. Validates phase transitions, artifact handoffs, and resumption. Maintainer tooling - invoke explicitly with "use loop-test to verify the system".

Loop-Test

Regression testing for the full OODA loop. Validates that phases transition correctly, artifacts are readable between phases, and the system can resume from any point.

When to Use

  • After modifying any OODA skill (observe, orient, decide, act)
  • After adding new skills that integrate with the loop
  • Before releasing changes to antmachine
  • When debugging loop behavior

What It Tests

| Layer | Tests | Purpose |
|-------|-------|---------|
| 1. Transition Validators | Observe→Orient, Orient→Decide, Decide→Act | Artifacts readable between phases |
| 2. Canonical Scenarios | Simple task, complex task, re-observation | Full loop works end-to-end |
| 3. Ad-Hoc Generation | Fresh scenario from codebase | Catches edge cases |
| 4. Resumption | Crash at any phase, resume | Idempotency works |
| 5. Reporting | PASS/FAIL with details | Clear feedback |

Running Tests

Full Test Suite

Use loop-test to run all layers

Specific Layer

Use loop-test to run layer 1 (transitions)
Use loop-test to run layer 2 (scenarios)

Ad-Hoc Scenario

Use loop-test with ad-hoc scenario

Layer 1: Transition Validators

Each transition has a contract:

Observe → Orient

Observe writes:

{
  "scout": "codebase",
  "timestamp": "2026-01-03T15:30:00Z",
  "status": "success",
  "findings": { ... },
  "gaps": []
}

Orient expects: Artifacts in .antmachine/observations/ with scout, findings, status.
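
A minimal sketch of a Layer 1 check for this contract, assuming Python; the function name validate_observation_artifacts and the error messages are illustrative, and only the directory and required keys come from the contract above.

import json
from pathlib import Path

REQUIRED_OBSERVATION_KEYS = {"scout", "findings", "status"}

def validate_observation_artifacts(obs_dir=".antmachine/observations"):
    """Check that every observation artifact carries the keys Orient expects."""
    errors = []
    paths = sorted(Path(obs_dir).glob("*.json"))
    if not paths:
        errors.append(f"no observation artifacts found in {obs_dir}")
    for path in paths:
        artifact = json.loads(path.read_text())
        missing = REQUIRED_OBSERVATION_KEYS - set(artifact)
        if missing:
            errors.append(f"{path.name}: missing keys {sorted(missing)}")
    return errors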

Orient → Decide

Orient writes:

{
  "timestamp": "...",
  "goal": "...",
  "plans": [
    {"id": "plan-a", "name": "...", "steps": [...]},
    {"id": "plan-b", "name": "...", "steps": [...]}
  ],
  "recommendation": {"pick": "plan-a"}
}

Decide expects: Plan in .antmachine/orient/artifacts/plans/ with 2+ options.
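
A companion sketch for the Orient → Decide contract, again in Python; validate_plan_artifact is a hypothetical name, and the rule that the recommendation must point at one of the listed plans is inferred from the example above.

import json
from pathlib import Path

def validate_plan_artifact(plan_path):
    """Check the plan artifact: 2+ options, each with steps, and a recommendation
    that picks one of them."""
    plan = json.loads(Path(plan_path).read_text())
    errors = []
    plans = plan.get("plans", [])
    if len(plans) < 2:
        errors.append(f"expected 2+ plan options, found {len(plans)}")
    for option in plans:
        if not option.get("steps"):
            errors.append(f"plan {option.get('id', '?')} has no steps")
    pick = plan.get("recommendation", {}).get("pick")
    if pick not in {option.get("id") for option in plans}:
        errors.append(f"recommendation picks an unknown plan: {pick}")
    return errors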

Orient → Act (Direct)

When Orient has extremely high confidence and the action is clear:

State indicates:

{
  "routing": {
    "from": "orient",
    "to": "act",
    "reason": "high_confidence",
    "skipped": ["decide"]
  }
}
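
A small sketch checking this routing block, assuming the state has already been parsed into a dict; requiring the high_confidence reason is taken from the example above and may be stricter than the real contract.

def validate_direct_routing(state):
    """Check the Orient -> Act shortcut: the routing block must record why Decide was skipped."""
    routing = state.get("routing", {})
    errors = []
    if routing.get("from") == "orient" and routing.get("to") == "act":
        if routing.get("reason") != "high_confidence":
            errors.append("Orient routed directly to Act without a high_confidence reason")
        if "decide" not in routing.get("skipped", []):
            errors.append("Decide was bypassed but is not listed in routing.skipped")
    return errors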

Decide → Act

Plan includes:

{
  "selected": {
    "plan_id": "plan-a",
    "decided_by": "human"
  }
}

Act expects: Selected plan with executable steps.
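
A sketch of the matching check, assuming the selected block lives in the same plan artifact as the options (as in the examples above); validate_selected_plan is an illustrative name.

def validate_selected_plan(plan):
    """Check the Decide -> Act contract: a selection exists and resolves to a plan with steps."""
    errors = []
    plan_id = plan.get("selected", {}).get("plan_id")
    if not plan_id:
        errors.append("no selected.plan_id recorded")
        return errors
    chosen = next((p for p in plan.get("plans", []) if p.get("id") == plan_id), None)
    if chosen is None:
        errors.append(f"selected plan {plan_id} is not among the plan options")
    elif not chosen.get("steps"):
        errors.append(f"selected plan {plan_id} has no executable steps")
    return errors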

Any Phase → Observe (Re-observation)

When any phase needs more information:

State indicates:

{
  "current_phase": "observe",
  "observation_request": {
    "from": "orient",
    "focus": "need codebase structure",
    "return_to": "orient"
  }
}

Flow is always: needs_observation → Observe → Orient → (routes appropriately)

Orient is the routing hub after every observation.
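
A sketch of the re-observation check, operating on the parsed state; the required fields come from the example above, while the rule that current_phase must be observe whenever a request is in flight is an assumption.

def validate_observation_request(state):
    """Check a pending re-observation: it names who asked, what to look at, and where to return."""
    errors = []
    request = state.get("observation_request")
    if request is None:
        return errors  # no re-observation in flight
    if state.get("current_phase") != "observe":
        errors.append("observation_request present but current_phase is not observe")
    for key in ("from", "focus", "return_to"):
        if not request.get(key):
            errors.append(f"observation_request is missing {key}")
    return errors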

Layer 2: Canonical Scenarios

Simple Task (Fast Path)

Scenario: "Add a console.log to file X"

Expected flow: Observe → Orient → Act (skips Decide)

Validates: Orient can route directly to Act for high-confidence tasks.

Complex Task (Full Loop)

Scenario: "Add retry logic with tests"

Expected flow: Intent → Observe → Orient → Decide → Act → Verify

Validates: Full loop with human confirmation gates.

Re-observation from Act

Scenario: Act needs file info mid-execution

Expected flow: Act signals needs_observation → Observe → Orient → Act

Validates: Observation loop works correctly.

Re-observation from Orient

Scenario: Orient unclear on codebase structure

Expected flow: Orient signals needs_observation → Observe → Orient

Validates: Any phase can request observations.
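
One way Layer 2 might encode these scenarios, sketched in Python; the prompts and expected flows are lifted from the descriptions above, and the phase names for the re-observation case are illustrative since the loop decides them at run time.

# Canonical scenarios: a prompt plus the phase sequence the loop should record.
CANONICAL_SCENARIOS = [
    {"name": "simple-task",
     "prompt": "Add a console.log to file X",
     "expected_flow": ["observe", "orient", "act"]},
    {"name": "complex-task",
     "prompt": "Add retry logic with tests",
     "expected_flow": ["intent", "observe", "orient", "decide", "act", "verify"]},
    {"name": "reobserve-from-act",
     "prompt": "Task where Act needs file info mid-execution",
     "expected_flow": ["observe", "orient", "act", "observe", "orient", "act"]},
]

def check_flow(recorded_phases, scenario):
    """Compare the phases recorded in state history against the scenario's expectation."""
    return recorded_phases == scenario["expected_flow"]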

Layer 3: Ad-Hoc Generation

Given the current codebase state, generate a fresh test scenario (a generation sketch follows the steps below):

  1. Scan for recently modified files
  2. Identify a realistic task (add test, fix type error, add feature)
  3. Run task through loop
  4. Validate all transitions
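
A sketch of steps 1 and 2, assuming Python and a git checkout; the recency signal (git log) and the task template are illustrative choices, not part of the skill.

import subprocess

def recently_modified_files(limit=10):
    """Step 1: list recently touched files as candidate scenario targets (via git here)."""
    out = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:", "-n", "20"],
        capture_output=True, text=True, check=True,
    ).stdout
    seen = []
    for line in out.splitlines():
        if line and line not in seen:
            seen.append(line)
    return seen[:limit]

def propose_adhoc_scenario(files):
    """Step 2: turn the first candidate file into a realistic task prompt."""
    if not files:
        return None
    return {"prompt": f"Add a small test covering recent changes in {files[0]}"}

Steps 3 and 4 then reuse the Layer 1 validators on whatever artifacts the run produces.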

Layer 4: Resumption Testing

Simulate a crash at each phase:

  1. Run scenario to phase X
  2. Clear agent context (simulate crash)
  3. New agent reads .antmachine/state.json
  4. Verify continuation from correct phase
  5. Verify artifacts are not duplicated or corrupted

Critical State Fields

{
  "current_phase": "act",
  "current_task": "...",
  "phase_status": {
    "observe": "completed",
    "orient": "completed",
    "decide": "completed",
    "act": "in_progress"
  }
}
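
A minimal consistency check over these fields, assuming Python; the rule that at most one phase may be in_progress and must match current_phase is inferred from the example rather than stated by the skill.

import json
from pathlib import Path

PHASE_ORDER = ["observe", "orient", "decide", "act"]

def check_resumption_state(state_path=".antmachine/state.json"):
    """After a simulated crash, cross-check current_phase against phase_status."""
    state = json.loads(Path(state_path).read_text())
    status = state.get("phase_status", {})
    in_progress = [p for p in PHASE_ORDER if status.get(p) == "in_progress"]
    errors = []
    if len(in_progress) > 1:
        errors.append(f"multiple phases in_progress: {in_progress}")
    if in_progress and state.get("current_phase") != in_progress[0]:
        errors.append("current_phase does not match the in_progress phase")
    return errors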

Layer 5: Reporting

PASS Report

Loop-Test Results: PASS
-----------------------
Transitions validated: 4/4
Scenarios run: 3/3
Resumption tests: 2/2

Phases executed:
  ✓ Observe → Orient
  ✓ Orient → Decide
  ✓ Decide → Act
  ✓ Act → Complete

FAIL Report

Loop-Test Results: FAIL
-----------------------
Breakage: Orient → Decide

Expected: Plan artifact with 2+ options
Found: Plan artifact with 1 option

File: .antmachine/orient/artifacts/plans/2026-01-03-test.json
Issue: Only plan-a present, missing alternatives
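
A small rendering sketch for these summaries, assuming Python; the shape of the results dict (per-layer pass/total counts plus an optional breakage message) is an assumption made for illustration.

def render_report(results):
    """Render the summary header and counts in the format shown above."""
    breakage = results.get("breakage")
    lines = [f"Loop-Test Results: {'FAIL' if breakage else 'PASS'}",
             "-----------------------"]
    for label, key in [("Transitions validated", "transitions"),
                       ("Scenarios run", "scenarios"),
                       ("Resumption tests", "resumption")]:
        passed, total = results.get(key, (0, 0))
        lines.append(f"{label}: {passed}/{total}")
    if breakage:
        lines.append(f"Breakage: {breakage}")
    return "\n".join(lines)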

Red Flags - STOP If You Think These

| Thought | Reality |
|---------|---------|
| "Tests passed, loop works" | Tests check contracts. Run scenarios too. |
| "Just test the skill I changed" | Skills interact. Test the full loop. |
| "Resumption is an edge case" | Crashes happen. Test resumption. |
| "Ad-hoc is overkill" | Canonical scenarios miss edge cases. Run ad-hoc. |

Test Artifacts

Tests create artifacts in:

  • .antmachine/observations/*-test-*.json
  • .antmachine/orient/artifacts/plans/*-test-*.json
  • .antmachine/test-fixtures/*.json

Teardown cleans these after tests complete.
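
A teardown sketch in Python; the glob patterns are exactly the ones listed above, so only test-marked artifacts and fixtures are removed.

from pathlib import Path

TEST_ARTIFACT_GLOBS = [
    ".antmachine/observations/*-test-*.json",
    ".antmachine/orient/artifacts/plans/*-test-*.json",
    ".antmachine/test-fixtures/*.json",
]

def teardown():
    """Remove only test-generated artifacts, leaving real loop artifacts untouched."""
    for pattern in TEST_ARTIFACT_GLOBS:
        for path in Path(".").glob(pattern):
            path.unlink()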

Integration with CI

./tests/skills/loop-test.test.sh

Returns exit code 0 on pass, 1 on fail.