Execute tasks autonomously from a task queue with multi-context window support. Use when user requests autonomous mode, batch task execution, or needs to complete multiple tasks systematically. Handles task loading, execution, verification, and state persistence across context windows.

SKILL.md

name: autonomous-tasks
description: Execute tasks autonomously from a task queue with multi-context window support. Use when user requests autonomous mode, batch task execution, or needs to complete multiple tasks systematically. Handles task loading, execution, verification, and state persistence across context windows.
allowed-tools: Read, Write, Bash, Task

Autonomous Tasks Skill

Purpose

This skill enables fully autonomous task execution where Claude:

  1. Loads tasks from .claude/tasks/ folder
  2. Executes them one by one
  3. Physically verifies each implementation
  4. Saves state to memory before context refresh
  5. Continues across multiple context windows
  6. Only signs off when tasks are truly complete

Core Autonomous Behavior

Default to action, not suggestions. When in autonomous mode:

  • Implement changes rather than only suggesting them
  • Infer the most useful likely action and proceed
  • Use tools to discover missing details instead of guessing
  • Be persistent and complete tasks fully
  • Save progress to memory frequently

Task File Format

Tasks are stored in .claude/tasks/ as markdown files:

# Task: [Brief title]

## Description
[What needs to be done]

## Acceptance Criteria
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3

## Verification Required
- [ ] Physical verification with screenshots
- [ ] UI interaction testing
- [ ] Build succeeds
- [ ] No regressions

## Context
[Any additional context, file paths, or requirements]

## Priority
high | normal | low
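
Before a file enters the queue, it can help to check that it follows the template. The following is a minimal sketch, not part of the skill itself: `validate_task`, `REQUIRED_SECTIONS`, and `VALID_PRIORITIES` are hypothetical names, and the section list is simply copied from the template above.

```python
import re

# Section names copied from the task template; adjust if the template changes.
REQUIRED_SECTIONS = (
    "Description",
    "Acceptance Criteria",
    "Verification Required",
    "Context",
    "Priority",
)
VALID_PRIORITIES = {"high", "normal", "low"}


def validate_task(content):
    """Return a list of problems found in a task file; an empty list means it looks valid."""
    problems = []
    if not re.search(r"^# Task:\s*\S", content, re.MULTILINE):
        problems.append("missing '# Task: <title>' line")
    # Collect every level-2 heading that appears in the file.
    headings = set(re.findall(r"^## (.+?)\s*$", content, re.MULTILINE))
    for name in REQUIRED_SECTIONS:
        if name not in headings:
            problems.append(f"missing '## {name}' section")
    # Only the first word after the Priority heading is checked.
    match = re.search(r"^## Priority[ \t]*\n\s*(\S+)", content, re.MULTILINE)
    if match and match.group(1).lower() not in VALID_PRIORITIES:
        problems.append(f"unknown priority: {match.group(1)}")
    return problems
```

Files that return a non-empty list could be skipped or reported instead of being executed with missing criteria.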

Example Task File

.claude/tasks/001-cost-dashboard-minimize.md:

# Task: Make Cost Dashboard Minimizable

## Description
The cost dashboard button in bottom-right corner should be minimizable.
Currently blocks text and cannot be minimized through pressing.

## Acceptance Criteria
- [ ] Button shows $ icon
- [ ] Button is in bottom-right corner above connectivity button
- [ ] Clicking button toggles dashboard visibility
- [ ] Dashboard hidden by default
- [ ] Button looks professional and HD quality

## Verification Required
- [ ] Screenshot shows button exists
- [ ] Screenshot before/after click shows toggle works
- [ ] UI audit confirms professional appearance
- [ ] Build succeeds with no errors

## Context
Files: src/components/CostDashboard.tsx, src/components/buttons.css
Style: Professional, clean, smooth, high-definition

## Priority
high

Autonomous Execution Workflow

Copy this checklist and track progress:

Autonomous Task Execution:
- [ ] Step 1: Load task queue from .claude/tasks/
- [ ] Step 2: Sort by priority (high > normal > low)
- [ ] Step 3: For each task:
  - [ ] 3a. Read task file
  - [ ] 3b. Understand requirements
  - [ ] 3c. Implement solution
  - [ ] 3d. Run build verification
  - [ ] 3e. Physical verification with screenshots
  - [ ] 3f. Mark acceptance criteria complete
  - [ ] 3g. Save state to memory
- [ ] Step 4: Generate completion report
- [ ] Step 5: Final comprehensive audit of all changes
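
The checklist above can be read as a simple loop. The sketch below is one possible shape for it, under the assumption that implementation, build, verification, and state saving are passed in as callables; `run_queue` and its parameter names are hypothetical and do not come from the skill. It stops at the first task that fails rather than skipping ahead, and leaves any fix-and-retry behaviour to the callables.

```python
def run_queue(tasks, implement, build, verify, save_state):
    """Run steps 3a-3g for each task in order.

    A task is marked 'verified' only when both the build and the
    verification callables return True. On the first failure the loop
    stops, so later tasks are never started on top of a broken state.
    """
    completed = []
    for task in tasks:
        implement(task)                      # 3c: make the change
        if not build():                      # 3d: build must pass
            task["status"] = "build_failed"
            break
        if not verify(task):                 # 3e: verification must pass
            task["status"] = "verification_failed"
            break
        task["status"] = "verified"          # 3f: mark complete
        save_state(task)                     # 3g: persist progress
        completed.append(task)
    return completed
```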

Step 1: Load Task Queue

import os
import glob
from pathlib import Path

tasks_dir = Path(".claude/tasks")
task_files = sorted(glob.glob(str(tasks_dir / "*.md")))

tasks = []
for task_file in task_files:
    with open(task_file) as f:
        content = f.read()
        # Parse task metadata
        tasks.append({
            "file": task_file,
            "content": content,
            "status": "pending"
        })

print(f"Loaded {len(tasks)} tasks")

Step 2: Sort by Priority

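import re

PRIORITY_ORDER = {"high": 0, "normal": 1, "low": 2}

def get_priority(task_content):
    """Extract the priority rank from the '## Priority' section only.

    Searching the whole file would misfire on words such as
    "high-definition" in the Context section.
    """
    match = re.search(r"^## Priority[ \t]*\n\s*(\w+)", task_content, re.MULTILINE)
    if match:
        return PRIORITY_ORDER.get(match.group(1).lower(), 1)
    return 1  # default to normal

# sort() is stable, so tasks with equal priority keep their filename order
tasks.sort(key=lambda t: get_priority(t["content"]))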

Step 3: Execute Each Task

For each task in the queue:

3a. Read and Parse Task

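import re

task = tasks[0]  # Current task
content = task["content"]

def section(name):
    """Return the body of the '## <name>' section, or '' if absent."""
    m = re.search(rf"^## {re.escape(name)}[ \t]*\n(.*?)(?=^## |\Z)", content, re.M | re.S)
    return m.group(1).strip() if m else ""

# Extract key sections
title_match = re.search(r"^# Task:\s*(.+)$", content, re.M)  # "# Task: ..." line
title = title_match.group(1).strip() if title_match else ""
description = section("Description")
# Checkbox items, with or without a tick
criteria = re.findall(r"^- \[[ xX]\] (.+)$", section("Acceptance Criteria"), re.M)
verification = re.findall(r"^- \[[ xX]\] (.+)$", section("Verification Required"), re.M)
context = section("Context")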

3b. Understand Requirements

Use extended thinking to fully understand:

  • What needs to be built
  • Why it's needed
  • How to implement it
  • What physical verification is required
  • How to verify it actually works

3c. Implement Solution

Default to implementation:

  • Read relevant files
  • Make necessary changes
  • Follow existing patterns
  • Write clean, production-ready code
  • Add comments where helpful

Use tools proactively:

  • Read files to understand structure
  • Grep for patterns to maintain consistency
  • Edit files with exact replacements
  • Write new files when needed
  • Run bash commands for verification

3d. Build Verification

After implementation, verify build succeeds:

cd /home/runner/workspace && npm run build

If build fails:

  • Read error messages carefully
  • Fix the issues
  • Rebuild
  • Repeat until build succeeds
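
The rebuild loop above needs an upper bound so that a build that cannot be fixed does not run forever. A minimal sketch using the standard `subprocess` module follows; `run_build`, `max_attempts`, and the `on_failure` hook are hypothetical names introduced for illustration, and the hook stands in for whatever fixing step happens between attempts.

```python
import subprocess
import sys


def run_build(command, cwd=None, max_attempts=3, on_failure=None):
    """Run the build command until it exits with code 0 or attempts run out.

    on_failure(output) is called after each failed attempt, which is where
    the error output would be read and fixes applied before rebuilding.
    Returns (succeeded, output_of_last_attempt).
    """
    output = ""
    for _ in range(max_attempts):
        proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
        output = proc.stdout + proc.stderr
        if proc.returncode == 0:
            return True, output
        if on_failure is not None:
            on_failure(output)
    return False, output


# Example call for the build command shown above (not executed here):
# ok, log = run_build(["npm", "run", "build"], cwd="/home/runner/workspace")
```

Passing the command as a list avoids shell quoting issues, and returning the captured output lets the caller inspect the error messages instead of only seeing a pass/fail flag.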

3e. Physical Verification

CRITICAL: Use the physical-verification skill:

from services.chatkit_backend.app.vision import register_and_verify_ui_change

# task_id, files_changed and expected are assumed to be set in steps 3a-3c
result = await register_and_verify_ui_change(
    change_id=task_id,
    description=title,
    files_modified=files_changed,
    verification_criteria=[
        {"element": criterion, "expected_state": expected}
        for criterion in criteria
    ],
    priority="high",
    wait_for_verification=True
)

if not result["success"]:
    # Fix issues and retry
    # Do NOT mark task complete until verification passes
    pass

3f. Mark Acceptance Criteria

Update task file with completion status:

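# Update checkboxes from [ ] to [x], but only for criteria that passed verification.
# verified_criteria is a hypothetical list of criterion texts confirmed in step 3e.
updated_content = content
for criterion in verified_criteria:
    updated_content = updated_content.replace(f"- [ ] {criterion}", f"- [x] {criterion}")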

with open(task["file"], "w") as f:
    f.write(updated_content)

3g. Save State to Memory

IMPORTANT: Before moving to next task, save state:

from services.chatkit_backend.app.llm.memory_tool import upsert_memory

await upsert_memory(
    content=f"""
    Autonomous Task Progress:

    Completed: {task["file"]}
    Status: VERIFIED
    Files Modified: {files_changed}
    Verification: {result}

    Next Task: {tasks[1]["file"] if len(tasks) > 1 else "None"}
    Remaining: {len(tasks) - 1} tasks
    """,
    context="autonomous_execution"
)

This ensures progress is not lost if the context window refreshes.
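
Where the memory tool is not available, the same progress record could be kept in a local file. The sketch below is an assumption rather than part of the skill: `save_checkpoint` and `load_checkpoint` are hypothetical helpers, and the JSON layout is only one possible choice. The file is written to a temporary name and then renamed, so an interrupted write does not leave a half-written checkpoint behind.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path, completed, remaining, files_modified):
    """Write progress to a JSON file atomically (temp file, then rename)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    state = {
        "completed": list(completed),
        "remaining": list(remaining),
        "files_modified": list(files_modified),
    }
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp_name, path)


def load_checkpoint(path):
    """Return the saved state dict, or None if no checkpoint exists yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text())
```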

Step 4: Generate Completion Report

After all tasks complete, create report:

# Autonomous Execution Report

## Summary
Completed X tasks with full physical verification

## Tasks Completed
1. Task 1 - VERIFIED
   - Files: [list]
   - Verification: [screenshot paths]

2. Task 2 - VERIFIED
   - Files: [list]
   - Verification: [screenshot paths]

## Build Status
✓ All builds successful

## Verification Status
✓ All features physically verified with computer vision

## Evidence
[Screenshot paths for all verifications]

## Next Steps
[Any follow-up tasks or recommendations]
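
The report can be generated from the per-task results rather than written by hand. This is a rough sketch that assumes each result is a dict with `title`, `files`, and `screenshots` keys; that structure and the name `render_report` are illustrative choices, not something the skill defines. Only the summary, task list, and evidence sections are produced.

```python
def render_report(results):
    """Build the completion report text from a list of per-task result dicts.

    Each dict is assumed to carry 'title', 'files' and 'screenshots' keys.
    """
    lines = [
        "# Autonomous Execution Report",
        "",
        "## Summary",
        f"Completed {len(results)} tasks with full physical verification",
        "",
        "## Tasks Completed",
    ]
    for number, result in enumerate(results, start=1):
        lines.append(f"{number}. {result['title']} - VERIFIED")
        lines.append(f"   - Files: {', '.join(result['files'])}")
        lines.append(f"   - Verification: {', '.join(result['screenshots'])}")
        lines.append("")
    lines.append("## Evidence")
    # One screenshot path per line, in task order.
    lines.extend(path for result in results for path in result["screenshots"])
    return "\n".join(lines)
```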

Step 5: Final Comprehensive Audit

Run comprehensive audit of all changes:

import os

from services.chatkit_backend.app.vision import AuditorAgent

auditor = AuditorAgent(api_key=os.getenv("ANTHROPIC_API_KEY"))

all_test_cases = []
for task in completed_tasks:
    all_test_cases.extend(task["verification_criteria"])

result = await auditor.comprehensive_ui_audit(
    feature_name="Autonomous Task Batch",
    test_cases=all_test_cases,
    max_iterations=30
)

Multi-Context Window Support

When approaching context limit, save state and prepare for refresh:

Before Context Refresh

# Save detailed state to memory
await upsert_memory(
    content=f"""
    # Autonomous Task State Checkpoint

    ## Current Task
    {current_task_file}

    ## Progress
    Completed: {len(completed_tasks)}
    Remaining: {len(remaining_tasks)}

    ## Files Modified
    {all_modified_files}

    ## Verification Results
    {verification_summary}

    ## Next Actions
    1. Continue with task: {next_task_file}
    2. Load state from this memory
    3. Resume execution
    """,
    context="autonomous_checkpoint"
)

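# Move completed tasks to archive (uses os and Path, imported in Step 1).
# Build the target from the file name so a "tasks" substring elsewhere in the path is untouched.
archive_dir = Path(".claude/tasks/completed")
archive_dir.mkdir(parents=True, exist_ok=True)
for task in completed_tasks:
    os.rename(task["file"], archive_dir / Path(task["file"]).name)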

After Context Refresh

# Load state from memory
from services.chatkit_backend.app.llm.memory_tool import search_memory

results = await search_memory(
    query="Autonomous task state checkpoint",
    context="autonomous_checkpoint"
)

# Parse state and resume
# Load remaining tasks
# Continue from where we left off
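
Because completed tasks are moved into a `completed` subfolder before the refresh, the remaining queue can also be rebuilt from the directory itself, independent of what the memory search returns. A small sketch, with `remaining_task_files` as a hypothetical helper name:

```python
from pathlib import Path


def remaining_task_files(tasks_dir=".claude/tasks"):
    """List task files still waiting in the queue, in filename order.

    Path.glob("*.md") does not descend into subfolders, so files that were
    already moved to the completed/ subfolder are left out.
    """
    return sorted(str(p) for p in Path(tasks_dir).glob("*.md"))
```

Comparing this list with the checkpoint is a cheap consistency check: a file named in the checkpoint as the next task should still be present in the queue.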

User Interface Integration

The autonomous mode should be controllable from UI:

Controls needed:

  • Toggle autonomous mode ON/OFF
  • Select task folder (.claude/tasks/)
  • Configure verification strictness (always | optional | never)
  • View task queue and progress
  • Pause/resume execution
  • Manual intervention button

Status display:

  • Current task being executed
  • Progress bar (X of Y tasks)
  • Last verification result
  • Build status
  • Error messages if any

See implementation in src/components/AutonomousPanel.tsx

System Prompt for Autonomous Mode

When autonomous mode is active, the meta agent should have this system prompt:

<default_to_action>
By default, implement changes rather than only suggesting them. If the user's intent is unclear, infer the most useful likely action and proceed, using tools to discover any missing details instead of guessing.
</default_to_action>

<persistence>
Your context window will be automatically compacted as it approaches its limit, allowing you to continue working indefinitely. Save your current progress and state to memory before the context window refreshes. Always be as persistent and autonomous as possible and complete tasks fully.
</persistence>

<physical_verification>
NEVER claim a feature works without physical verification. Use computer vision to take screenshots and verify UI changes actually exist and function as expected. Code inspection and test passing are not sufficient.
</physical_verification>

<task_execution>
Execute tasks from .claude/tasks/ folder systematically:
1. Load all tasks
2. Sort by priority
3. For each task: implement, verify build, physically verify, mark complete
4. Save state to memory after each task
5. Generate completion report with evidence
</task_execution>

Error Handling

Build Failures

  • Read error messages carefully
  • Fix the specific issues mentioned
  • Rebuild and verify
  • Do NOT skip to next task until build passes

Verification Failures

  • Take screenshot to see actual state
  • Compare to expected state
  • Identify root cause
  • Fix the implementation
  • Re-verify with computer vision
  • Do NOT claim feature works without visual proof

Context Window Approaching Limit

  • Save detailed state to memory immediately
  • Mark current position in task queue
  • Note which files were modified
  • Allow context to refresh
  • Load state from memory
  • Resume from exact position

Anti-Patterns to Avoid

❌ Suggesting Instead of Implementing

"You could implement this by adding a button..."  # NO

✓ Actually Implementing

[Reads files, makes changes, verifies build]
"Button implemented and verified"  # YES

❌ Claiming Done Without Verification

"I've added the feature, it should work now"  # NO

✓ Verification Before Claiming Done

[Takes screenshots, verifies with computer vision]
"Feature verified with visual evidence: [screenshot paths]"  # YES

❌ Stopping at First Error

"Build failed, need help"  # NO

✓ Fixing Errors Autonomously

[Reads error, identifies issue, fixes it, rebuilds]
"Build issue resolved, continuing"  # YES

Integration with Meta Agent

The meta agent should:

  1. Detect when user requests autonomous mode
  2. Load this skill automatically
  3. Execute task queue systematically
  4. Use physical-verification skill for all UI changes
  5. Save state to memory throughout execution
  6. Handle context refreshes seamlessly
  7. Generate final report with evidence

Success Criteria

Autonomous execution is successful when:

  • All tasks in queue are completed
  • All builds pass
  • All features physically verified with screenshots
  • Evidence (screenshots) available for every claim
  • State saved to memory at checkpoints
  • Completion report generated
  • User can see progress and results in UI

Next Steps

After completing autonomous task execution:

  1. Review completion report
  2. Check all screenshot evidence
  3. Run final comprehensive UI audit
  4. Update documentation
  5. Commit changes to git (if requested)
  6. Archive completed tasks
  7. Ready for next batch of tasks