Claude Code Plugins

Community-maintained marketplace

agent-performance

@rysweet/azlin

Track and report agent invocation metrics including usage counts, success/failure rates, and completion times. Use for understanding which agents are utilized, identifying underused agents, and optimizing agent delegation patterns.

SKILL.md

name: agent-performance
description: Track and report agent invocation metrics including usage counts, success/failure rates, and completion times. Use for understanding which agents are utilized, identifying underused agents, and optimizing agent delegation patterns.
source_urls: https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices

Agent Performance Dashboard

Purpose

Provides visibility into agent usage patterns to optimize delegation and identify improvement opportunities.

When I Activate

I automatically load when you mention:

  • "agent performance" or "agent metrics"
  • "agent dashboard" or "agent usage"
  • "which agents are used" or "underutilized agents"
  • "agent success rate" or "agent statistics"

What I Do

  1. Track Invocations: Record agent usage via workflow tracker
  2. Measure Success: Track completion rates per agent
  3. Analyze Patterns: Identify usage trends and gaps
  4. Generate Reports: Create actionable dashboards

Quick Start

User: "Show me agent performance metrics"
Skill: *activates automatically*
       "Generating agent performance report..."

Core Capabilities

1. Report Generation

Generate a performance report by reading workflow logs and aggregating agent metrics:

User: "Generate agent performance report"

Report includes:

  • Invocation counts per agent
  • Success/failure rates
  • Average completion times (when tracked)
  • Underutilized agents list
  • Recommendations for optimization

2. Live Tracking

Track agent invocations during workflow execution using the existing workflow_tracker:

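# Already available in .claude/tools/amplihack/hooks/workflow_tracker.py
import sys

# Assumes the current directory is the project root; adjust the path if not.
sys.path.insert(0, ".claude/tools/amplihack/hooks")
from workflow_tracker import log_agent_invocation

# Record one invocation, linked to the workflow step that triggered it
log_agent_invocation(
    agent_name="architect",
    purpose="Design authentication module",
    step_number=2,
)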
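
To make the logged data concrete, here is a minimal sketch of how one invocation could be serialized as a JSONL line. It is not the workflow_tracker implementation: format_agent_event and append_agent_event are hypothetical helpers, and the field names are assumed from the agent_invoked sample event in the Implementation Guide.

```python
import json
from pathlib import Path
from typing import Optional


def format_agent_event(agent_name: str, purpose: str,
                       step_number: Optional[int] = None) -> str:
    """Serialize one invocation as a single JSON line.

    Hypothetical helper: field names mirror the sample agent_invoked event.
    """
    record = {"event": "agent_invoked", "agent": agent_name, "purpose": purpose}
    if step_number is not None:
        record["step"] = step_number  # step is optional, so omit it when unknown
    return json.dumps(record)


def append_agent_event(log_path: str, line: str) -> None:
    """Append one serialized event to a JSONL file, creating parent directories."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Keeping serialization separate from file I/O makes the event shape easy to check without touching the real log file.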

3. Metrics Storage

Metrics are stored in:

  • Raw logs: .claude/runtime/logs/workflow_adherence/workflow_execution.jsonl
  • Aggregated: .claude/runtime/metrics/agent_performance.yaml

Report Format

Summary Dashboard

# Agent Performance Summary
# Generated: 2025-11-25

total_invocations: 142

agents:
  architect:
    invocations: 45
    success_rate: 95.6%
    avg_duration_ms: 2340
    trend: increasing

  builder:
    invocations: 38
    success_rate: 89.5%
    avg_duration_ms: 4520
    trend: stable

  reviewer:
    invocations: 25
    success_rate: 100%
    avg_duration_ms: 1890
    trend: increasing

underutilized:
  - database (0 invocations in last 30 days)
  - integration (2 invocations in last 30 days)
  - patterns (3 invocations in last 30 days)

recommendations:
  - Consider using database agent for schema work
  - Integration agent available for external service connections
  - Patterns agent can identify reusable solutions
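
The success_rate and avg_duration_ms fields above are simple ratios and means. The sketch below shows one way to compute and format them; the function names are hypothetical, and the success counts in the usage note are assumed values chosen only because they reproduce the rates shown.

```python
from typing import Sequence


def format_success_rate(successes: int, invocations: int) -> str:
    """Return a percentage string with at most one decimal, e.g. '95.6%'."""
    if invocations == 0:
        return "n/a"  # avoid dividing by zero for agents with no invocations
    text = f"{100 * successes / invocations:.1f}"
    # Drop a trailing ".0" so a perfect score renders as "100%"
    return text.rstrip("0").rstrip(".") + "%"


def average_duration_ms(durations_ms: Sequence[float]) -> int:
    """Return the mean duration rounded to whole milliseconds (0 if empty)."""
    if not durations_ms:
        return 0
    return round(sum(durations_ms) / len(durations_ms))
```

For example, 43 successes out of 45 invocations would format as 95.6%, and 25 out of 25 as 100%.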

Implementation Guide

To Generate a Report

  1. Read workflow execution logs:

    Read: .claude/runtime/logs/workflow_adherence/workflow_execution.jsonl
    
  2. Filter for agent_invoked events:

    { "event": "agent_invoked", "agent": "architect", "purpose": "...", "step": 2 }
    
  3. Aggregate by agent name:

    • Count invocations
    • Calculate success rates from workflow_end events
    • Compute average durations
  4. Identify underutilized agents:

    • List all available agents from .claude/agents/amplihack/
    • Compare against invocation counts
    • Flag agents with <5 invocations in analysis period
  5. Write report to:

    .claude/runtime/metrics/agent_performance.yaml
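
Steps 1 to 3 can be sketched as a small aggregation over the JSONL lines. This is a minimal illustration, assuming each line is a JSON object shaped like the agent_invoked sample above; count_invocations is a hypothetical name, and success rates are left out because the workflow_end schema is not shown here.

```python
import json
from collections import Counter
from typing import Iterable


def count_invocations(lines: Iterable[str]) -> Counter:
    """Count agent_invoked events per agent, skipping blank or malformed lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partially written or corrupted log lines
        if isinstance(record, dict) and record.get("event") == "agent_invoked":
            agent = record.get("agent")
            if agent:
                counts[agent] += 1
    return counts
```

Because the function accepts any iterable of strings, an open file handle on workflow_execution.jsonl can be passed directly.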
    

Available Agents Inventory

Core Agents (6):

  • architect, builder, reviewer, tester, optimizer, api-designer

Specialized Agents (25):

  • ambiguity, amplifier-cli-architect, analyzer, azure-kubernetes-expert
  • ci-diagnostic-workflow, cleanup, database, documentation-writer
  • fallback-cascade, fix-agent, integration, knowledge-archaeologist
  • memory-manager, multi-agent-debate, n-version-validator, patterns
  • philosophy-guardian, pre-commit-diagnostic, preference-reviewer
  • prompt-writer, rust-programming-expert, security, visualization-architect
  • worktree-manager, xpia-defense

Note: Agent count may change as specialized agents are added or removed. Run ls .claude/agents/amplihack/specialized/ to list the current set.
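
The inventory can be compared against invocation counts to flag underutilized agents. The sketch below is illustrative only: list_agent_names assumes one Markdown file per agent under the agents directory (an assumption, not confirmed by this document), and both function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Mapping, Tuple


def list_agent_names(agents_dir: str) -> List[str]:
    """Collect agent names from file stems; assumes one .md file per agent."""
    root = Path(agents_dir)
    if not root.is_dir():
        return []
    return sorted({path.stem for path in root.rglob("*.md")})


def find_underutilized(available: Iterable[str], counts: Mapping[str, int],
                       threshold: int = 5) -> List[Tuple[str, int]]:
    """Return (agent, invocations) pairs below the threshold, least used first."""
    flagged = [
        (name, counts.get(name, 0))
        for name in available
        if counts.get(name, 0) < threshold
    ]
    return sorted(flagged, key=lambda item: (item[1], item[0]))
```

Agents that never appear in the logs are treated as having zero invocations, which is what makes them show up at the top of the list.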

Tracking Best Practices

When Invoking Agents

Always log invocations for accurate tracking:

# Before invoking an agent via the Task tool
log_agent_invocation(
    agent_name="security",
    purpose="Audit authentication implementation",
    step_number=7,  # Optional: link to workflow step
)

# Then invoke the agent (Task is a tool call, shown here as pseudocode)
Task(subagent_type="security", prompt="...")

Workflow Integration

The DEFAULT_WORKFLOW.md specifies agent delegation at each step. This skill helps verify adherence:

  • Step 1: prompt-writer
  • Step 2: architect
  • Step 3: builder
  • Step 4: tester
  • Step 5: reviewer
  • etc.
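
An adherence check along these lines could compare logged events against the expected agent per step. This is a sketch under assumptions: only the five steps listed above are encoded, events are assumed to carry the agent and step fields from the sample event, and find_deviations is a hypothetical name.

```python
from typing import Dict, Iterable, List, Mapping, Set, Tuple

# Expected agent per workflow step, limited to the steps listed above
EXPECTED_AGENTS: Dict[int, str] = {
    1: "prompt-writer",
    2: "architect",
    3: "builder",
    4: "tester",
    5: "reviewer",
}


def find_deviations(
    events: Iterable[Mapping[str, object]],
    expected: Mapping[int, str] = EXPECTED_AGENTS,
) -> List[Tuple[int, str, List[str]]]:
    """Return (step, expected_agent, agents_actually_logged) for each unmet step."""
    seen: Dict[int, Set[str]] = {}
    for event in events:
        step = event.get("step")
        agent = event.get("agent")
        if isinstance(step, int) and isinstance(agent, str) and agent:
            seen.setdefault(step, set()).add(agent)
    deviations = []
    for step, agent in sorted(expected.items()):
        logged = seen.get(step, set())
        if agent not in logged:
            deviations.append((step, agent, sorted(logged)))
    return deviations
```

A step counts as satisfied if the expected agent was logged at least once for it, even when other agents were also invoked at that step.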

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| ANALYSIS_DAYS | 30 | Days of history to analyze |
| UNDERUTILIZED_THRESHOLD | 5 | Invocations below this = underutilized |
| METRICS_FILE | agent_performance.yaml | Output file name |
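
One way to carry these settings in code is a small immutable config object. The document does not say how the settings are supplied, so the environment-variable override below is purely an assumption, and ReportConfig and load_config are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReportConfig:
    """Defaults taken from the Configuration table."""
    analysis_days: int = 30
    underutilized_threshold: int = 5
    metrics_file: str = "agent_performance.yaml"


def load_config(env: Optional[Mapping[str, str]] = None) -> ReportConfig:
    """Build a config, letting same-named variables override the defaults.

    Assumption: overrides come from environment variables.
    """
    source = os.environ if env is None else env
    return ReportConfig(
        analysis_days=int(source.get("ANALYSIS_DAYS", 30)),
        underutilized_threshold=int(source.get("UNDERUTILIZED_THRESHOLD", 5)),
        metrics_file=source.get("METRICS_FILE", "agent_performance.yaml"),
    )
```

Passing a plain dictionary instead of the real environment keeps the function easy to exercise in isolation.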

Philosophy Alignment

This skill follows:

  • Ruthless Simplicity: Uses existing infrastructure (workflow_tracker)
  • Zero-BS: No placeholders, working aggregation logic
  • Modular Design: Self-contained skill, clear boundaries
  • Emergence: Insights emerge from simple tracking patterns

Interpreting Metrics

Success Rate Guidelines

| Rate | Assessment | Action |
| --- | --- | --- |
| 95-100% | Excellent | Maintain current patterns |
| 85-94% | Good | Review occasional failures for patterns |
| 70-84% | Needs Attention | Investigate failure causes, adjust prompts |
| Below 70% | Critical | Agent may need redesign or prompt overhaul |
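
These guidelines map directly onto a threshold lookup. In the sketch below, assess_success_rate is a hypothetical helper, and fractional rates between bands (for example 94.7%) are assumed to fall into the lower band, since the guidelines do not say how to treat them.

```python
from typing import Tuple


def assess_success_rate(rate_percent: float) -> Tuple[str, str]:
    """Map a success rate in percent to (assessment, suggested action)."""
    if rate_percent >= 95:
        return ("Excellent", "Maintain current patterns")
    if rate_percent >= 85:
        return ("Good", "Review occasional failures for patterns")
    if rate_percent >= 70:
        return ("Needs Attention", "Investigate failure causes, adjust prompts")
    return ("Critical", "Agent may need redesign or prompt overhaul")
```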

Invocation Volume Interpretation

  • High volume (30+ in 30 days): Core workflow agent, ensure reliability
  • Medium volume (10-29): Regular use, monitor for optimization opportunities
  • Low volume (5-9): Specialized use case, verify still needed
  • Very low (<5): Consider if agent is discoverable or relevant
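
The volume tiers can be expressed the same way. This is a minimal sketch with a hypothetical function name; the counts are assumed to cover a 30-day window, matching the tiers above.

```python
def classify_volume(invocations_last_30_days: int) -> str:
    """Return the volume tier for an agent's 30-day invocation count."""
    if invocations_last_30_days >= 30:
        return "high"
    if invocations_last_30_days >= 10:
        return "medium"
    if invocations_last_30_days >= 5:
        return "low"
    return "very low"
```

With the sample dashboard values, architect (45) and builder (38) land in the high tier, reviewer (25) in medium, and database (0) in very low.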

Duration Benchmarks

  • < 2 seconds: Fast execution, typical for simple analysis
  • 2-10 seconds: Normal for moderate complexity
  • 10-60 seconds: Expected for deep analysis or multi-step tasks
  • > 60 seconds: May indicate inefficiency, consider optimization
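
Since the report stores avg_duration_ms, a classifier needs to convert milliseconds to seconds before applying these benchmarks. The sketch below uses a hypothetical function name and assumes that a duration of exactly 10 seconds belongs to the deeper band and exactly 60 seconds is still within the expected range.

```python
def classify_duration(avg_duration_ms: float) -> str:
    """Map an average duration in milliseconds to a benchmark band."""
    seconds = avg_duration_ms / 1000
    if seconds < 2:
        return "fast"
    if seconds < 10:
        return "normal"
    if seconds <= 60:
        return "deep"
    return "slow"  # over 60 seconds: may indicate inefficiency
```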

Empty State Handling

When no log data exists (new project or logs cleared):

# Agent Performance Report
# Period: Last 30 days
# Status: No data available

summary:
  total_invocations: 0
  message: "No agent invocations logged yet"

getting_started:
  - "Agent tracking begins when workflow_tracker logs invocations"
  - "Ensure agents are invoked via Task tool with proper logging"
  - "First report available after initial workflow execution"

next_steps:
  - "Run a workflow task to generate initial data"
  - "Verify workflow_tracker is properly configured"
  - "Check .claude/runtime/logs/ directory exists"
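
A report builder can fall back to this empty state whenever the total count is zero. The following is a sketch only: build_report is a hypothetical name, the returned dictionary loosely mirrors the YAML above, and serializing it to YAML is left out to avoid assuming a particular library.

```python
from typing import Any, Dict, Mapping


def build_report(counts: Mapping[str, int], period_days: int = 30) -> Dict[str, Any]:
    """Return the empty-state report when nothing was logged, else a basic summary."""
    total = sum(counts.values())
    if total == 0:
        return {
            "period": f"Last {period_days} days",
            "status": "No data available",
            "summary": {
                "total_invocations": 0,
                "message": "No agent invocations logged yet",
            },
            "next_steps": [
                "Run a workflow task to generate initial data",
                "Verify workflow_tracker is properly configured",
                "Check .claude/runtime/logs/ directory exists",
            ],
        }
    return {
        "period": f"Last {period_days} days",
        "total_invocations": total,
        "agents": {name: {"invocations": n} for name, n in sorted(counts.items())},
    }
```

Handling the empty case in one place means a missing or empty log file produces a helpful report rather than an error.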

Limitations

This skill has the following constraints:

  1. Depends on workflow_tracker: Only tracks agents invoked through the logging system
  2. No real-time metrics: Reports are generated on-demand, not streamed
  3. Historical data only: Cannot predict future usage patterns
  4. Manual log analysis: Does not auto-detect anomalies or alert on issues
  5. Single-project scope: Metrics are per-project, no cross-project aggregation
  6. Usage metrics only: Counts, success rates, and durations are not correlated with code quality or PR outcomes