when-profiling-performance-use-performance-profiler

@DNYoussef/ai-chrome-extension

Comprehensive performance profiling, bottleneck detection, and optimization system

Install Skill

  1. Download skill
  2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
  3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reviewing its instructions before using it.

SKILL.md

name: when-profiling-performance-use-performance-profiler
version: 1.0.0
description: Comprehensive performance profiling, bottleneck detection, and optimization system
author: Claude Code
category: performance
complexity: HIGH
tags: performance, profiling, optimization, benchmarking, mece
agents: performance-analyzer, performance-benchmarker, coder, optimizer
components: subagent, slash-command, mcp-tool
dependencies: claude-flow@alpha, perf (Linux), instruments (macOS), clinic.js (Node.js)

Performance Profiler Skill

Overview

When profiling performance, use performance-profiler to measure, analyze, and optimize application performance across CPU, memory, I/O, and network dimensions.

MECE Breakdown

Mutually Exclusive Components:

  1. Baseline Phase: Establish current performance metrics
  2. Detection Phase: Identify bottlenecks and hot paths
  3. Analysis Phase: Root cause analysis and impact assessment
  4. Optimization Phase: Generate and prioritize recommendations
  5. Implementation Phase: Apply optimizations with agent assistance
  6. Validation Phase: Benchmark improvements and verify gains

Collectively Exhaustive Coverage:

  • CPU Profiling: Function execution time, hot paths, call graphs
  • Memory Profiling: Heap usage, allocations, leaks, garbage collection
  • I/O Profiling: File system, database, network latency
  • Network Profiling: Request timing, bandwidth, connection pooling
  • Concurrency: Thread utilization, lock contention, async operations
  • Algorithm Analysis: Time complexity, space complexity
  • Cache Analysis: Hit rates, cache misses, invalidation patterns
  • Database: Query performance, N+1 problems, index usage

Features

Core Capabilities:

  • Multi-dimensional performance profiling (CPU, memory, I/O, network)
  • Automated bottleneck detection with prioritization
  • Real-time profiling and historical analysis
  • Flame graph generation for visual analysis
  • Memory leak detection and heap snapshots
  • Database query optimization
  • Algorithmic complexity analysis
  • A/B comparison of before/after optimizations
  • Production-safe profiling with minimal overhead
  • Integration with APM tools (New Relic, Datadog, etc.)

Profiling Modes:

  • Quick Scan: 30-second lightweight profiling
  • Standard: 5-minute comprehensive analysis
  • Deep: 30-minute detailed investigation
  • Continuous: Long-running production monitoring
  • Stress Test: Load-based profiling under high traffic

Usage

Slash Command:

/profile [path] [--mode quick|standard|deep] [--target cpu|memory|io|network|all]

Subagent Invocation:

Task("Performance Profiler", "Profile ./app with deep CPU and memory analysis", "performance-analyzer")

MCP Tool:

mcp__performance-profiler__analyze({
  project_path: "./app",
  profiling_mode: "standard",
  targets: ["cpu", "memory", "io"],
  generate_optimizations: true
})

Architecture

Phase 1: Baseline Measurement

  1. Establish current performance metrics
  2. Define performance budgets
  3. Set up monitoring infrastructure
  4. Capture baseline snapshots
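
A minimal sketch of what step 4 could look like in Node.js, using only built-in process APIs (illustrative only, not the skill's implementation):

// baseline.js - capture a point-in-time snapshot of process-level metrics
function startBaseline() {
  const cpuStart = process.cpuUsage();
  const t0 = process.hrtime.bigint();
  return function snapshot() {
    const elapsedMs = Number(process.hrtime.bigint() - t0) / 1e6;
    const cpu = process.cpuUsage(cpuStart);   // user/system CPU time since start, in microseconds
    const mem = process.memoryUsage();        // heap and RSS, in bytes
    return {
      elapsed_ms: Math.round(elapsedMs),
      cpu_user_ms: cpu.user / 1000,
      cpu_system_ms: cpu.system / 1000,
      heap_used_mb: +(mem.heapUsed / 1048576).toFixed(1),
      rss_mb: +(mem.rss / 1048576).toFixed(1),
    };
  };
}

// Usage: const snapshot = startBaseline(); /* run workload */ console.log(snapshot());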

Phase 2: Bottleneck Detection

  1. CPU profiling (sampling or instrumentation)
  2. Memory profiling (heap analysis)
  3. I/O profiling (syscall tracing)
  4. Network profiling (packet analysis)
  5. Database profiling (query logs)

Phase 3: Root Cause Analysis

  1. Correlate metrics across dimensions
  2. Identify causal relationships
  3. Calculate performance impact
  4. Prioritize issues by severity

Phase 4: Optimization Generation

  1. Algorithmic improvements
  2. Caching strategies
  3. Parallelization opportunities
  4. Database query optimization
  5. Memory optimization
  6. Network optimization

Phase 5: Implementation

  1. Generate optimized code with coder agent
  2. Apply database optimizations
  3. Configure caching layers
  4. Implement parallelization

Phase 6: Validation

  1. Run benchmark suite
  2. Compare before/after metrics
  3. Verify no regressions
  4. Generate performance report
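
As a rough illustration of steps 2 and 3, the standalone sketch below compares two benchmark summaries and flags a regression; the field names mirror the report format in the next section and are assumptions, not the skill's schema:

// compare-benchmarks.js - naive before/after comparison (illustrative only)
function compareRuns(before, after, regressionTolerance = 0.05) {
  const throughputChange = after.requests_per_second / before.requests_per_second;
  const p95Change = after.p95_response_time_ms / before.p95_response_time_ms;
  return {
    throughput_increase: `${throughputChange.toFixed(2)}x`,
    p95_latency_reduction: `${((1 - p95Change) * 100).toFixed(1)}%`,
    // A regression is flagged if throughput drops or p95 latency grows beyond the tolerance
    regression: throughputChange < 1 - regressionTolerance || p95Change > 1 + regressionTolerance,
  };
}

console.log(compareRuns(
  { requests_per_second: 1247, p95_response_time_ms: 456 },
  { requests_per_second: 3990, p95_response_time_ms: 146 }
));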

Output Formats

Performance Report:

{
  "project": "my-app",
  "profiling_mode": "standard",
  "duration_seconds": 300,
  "baseline": {
    "requests_per_second": 1247,
    "avg_response_time_ms": 123,
    "p95_response_time_ms": 456,
    "p99_response_time_ms": 789,
    "cpu_usage_percent": 67,
    "memory_usage_mb": 512,
    "error_rate_percent": 0.1
  },
  "bottlenecks": [
    {
      "type": "cpu",
      "severity": "high",
      "function": "processData",
      "time_percent": 34.5,
      "calls": 123456,
      "avg_time_ms": 2.3,
      "recommendation": "Optimize algorithm complexity from O(n²) to O(n log n)"
    }
  ],
  "optimizations": [...],
  "estimated_improvement": {
    "throughput_increase": "3.2x",
    "latency_reduction": "68%",
    "memory_reduction": "45%"
  }
}

Flame Graph:

Interactive SVG flame graph showing call stacks and their relative time shares

Heap Snapshot:

Memory allocation breakdown with retention paths

Optimization Report:

Prioritized list of actionable improvements with code examples

Examples

Example 1: Quick CPU Profiling

/profile ./my-app --mode quick --target cpu

Example 2: Deep Memory Analysis

/profile ./my-app --mode deep --target memory --detect-leaks

Example 3: Full Stack Optimization

/profile ./my-app --mode standard --target all --optimize --benchmark

Example 4: Database Query Optimization

/profile ./my-app --mode standard --target io --database --explain-queries

Integration with Claude-Flow

Coordination Pattern:

// Step 1: Initialize profiling swarm
mcp__claude-flow__swarm_init({ topology: "star", maxAgents: 5 })

// Step 2: Spawn specialized agents
[Parallel Execution]:
  Task("CPU Profiler", "Profile CPU usage and identify hot paths in ./app", "performance-analyzer")
  Task("Memory Profiler", "Analyze heap usage and detect memory leaks", "performance-analyzer")
  Task("I/O Profiler", "Profile file system and database operations", "performance-analyzer")
  Task("Network Profiler", "Analyze network requests and identify slow endpoints", "performance-analyzer")
  Task("Optimizer", "Generate optimization recommendations based on profiling data", "optimizer")

// Step 3: Implementation agent applies optimizations
[Sequential Execution]:
  Task("Coder", "Implement recommended optimizations from profiling analysis", "coder")
  Task("Benchmarker", "Run benchmark suite and validate improvements", "performance-benchmarker")

Configuration

Default Settings:

{
  "profiling": {
    "sampling_rate_hz": 99,
    "stack_depth": 128,
    "include_native_code": false,
    "track_allocations": true
  },
  "thresholds": {
    "cpu_hot_path_percent": 10,
    "memory_leak_growth_mb": 10,
    "slow_query_ms": 100,
    "slow_request_ms": 1000
  },
  "optimization": {
    "auto_apply": false,
    "require_approval": true,
    "run_tests_before": true,
    "run_benchmarks_after": true
  },
  "output": {
    "flame_graph": true,
    "heap_snapshot": true,
    "call_tree": true,
    "recommendations": true
  }
}

Profiling Techniques

CPU Profiling:

  • Sampling: Periodic stack sampling (low overhead)
  • Instrumentation: Function entry/exit hooks (accurate but higher overhead)
  • Tracing: Event-based profiling
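
For sampling specifically, one low-overhead option in Node.js is the built-in inspector protocol; the sketch below records a .cpuprofile that flame-graph viewers can load (illustrative only, not the skill's own tooling):

// cpu-sample.js - sampling CPU profile via Node's built-in inspector protocol
const inspector = require('node:inspector');
const fs = require('node:fs');

function runWorkload() {
  // Placeholder for the code under test
  let acc = 0;
  for (let i = 0; i < 1e7; i++) acc += Math.sqrt(i);
  return acc;
}

const session = new inspector.Session();
session.connect();
session.post('Profiler.enable', () => {
  // ~10101 us between samples, roughly the 99 Hz sampling_rate_hz default above
  session.post('Profiler.setSamplingInterval', { interval: 10101 }, () => {
    session.post('Profiler.start', () => {
      runWorkload();
      session.post('Profiler.stop', (err, result) => {
        // Open the .cpuprofile in Chrome DevTools or convert it to a flame graph
        if (!err) fs.writeFileSync('profile.cpuprofile', JSON.stringify(result.profile));
        session.disconnect();
      });
    });
  });
});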

Memory Profiling:

  • Heap Snapshots: Point-in-time memory state
  • Allocation Tracking: Record all allocations
  • Leak Detection: Compare snapshots over time
  • GC Analysis: Garbage collection patterns
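
A minimal leak-detection sketch along these lines, using only Node.js built-ins; the 10 MB growth threshold echoes the memory_leak_growth_mb default above and is otherwise an arbitrary choice:

// leak-watch.js - flag sustained heap growth and dump snapshots for comparison
const v8 = require('node:v8');

let lastHeapUsedMb = process.memoryUsage().heapUsed / 1048576;

setInterval(() => {
  const heapUsedMb = process.memoryUsage().heapUsed / 1048576;
  const growthMb = heapUsedMb - lastHeapUsedMb;
  if (growthMb > 10) {
    // Load successive .heapsnapshot files into Chrome DevTools to diff retention paths
    const file = v8.writeHeapSnapshot();
    console.warn(`heap grew ${growthMb.toFixed(1)} MB since last check, snapshot written to ${file}`);
  }
  lastHeapUsedMb = heapUsedMb;
}, 60_000).unref();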

I/O Profiling:

  • Syscall Tracing: Track system calls (strace, dtrace)
  • File System: Monitor read/write operations
  • Database: Query logging and EXPLAIN ANALYZE
  • Network: Packet capture and request timing
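
For database timing, a common low-tech complement to query logs is wrapping the driver call and logging anything over the slow-query threshold, roughly as below (db.query is a hypothetical driver method, not a specific library's API):

// slow-query-log.js - wrap a query function and log calls above a latency threshold
const SLOW_QUERY_MS = 100;  // matches the slow_query_ms default above

function withQueryTiming(queryFn) {
  return async function timedQuery(sql, params) {
    const start = process.hrtime.bigint();
    try {
      return await queryFn(sql, params);
    } finally {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      if (elapsedMs > SLOW_QUERY_MS) {
        console.warn(`slow query (${elapsedMs.toFixed(1)} ms): ${sql}`);
      }
    }
  };
}

// Usage (hypothetical driver): db.query = withQueryTiming(db.query.bind(db));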

Concurrency Profiling:

  • Thread Analysis: CPU utilization per thread
  • Lock Contention: Identify blocking operations
  • Async Operations: Promise/callback timing
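
Event-loop delay is often the first signal of async or blocking trouble in Node.js; perf_hooks exposes it directly, as in this sketch:

// loop-lag.js - sample event-loop delay to spot blocking synchronous work
const { monitorEventLoopDelay } = require('node:perf_hooks');

const histogram = monitorEventLoopDelay({ resolution: 20 });  // sample every 20 ms
histogram.enable();

setInterval(() => {
  // Histogram values are in nanoseconds; convert to milliseconds for readability
  console.log({
    mean_ms: (histogram.mean / 1e6).toFixed(2),
    p99_ms: (histogram.percentile(99) / 1e6).toFixed(2),
    max_ms: (histogram.max / 1e6).toFixed(2),
  });
  histogram.reset();
}, 10_000).unref();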

Performance Optimization Strategies

Algorithmic:

  • Reduce time complexity (O(n²) → O(n log n))
  • Use appropriate data structures
  • Eliminate unnecessary work
  • Memoization and dynamic programming
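
Memoization is the simplest of these to show in code; a generic sketch for pure, single-argument functions (keying by the raw argument is an assumption and needs care for object arguments):

// memoize.js - cache results of a pure function keyed by its argument
function memoize(fn) {
  const cache = new Map();
  return function memoized(arg) {
    if (cache.has(arg)) return cache.get(arg);
    const result = fn(arg);
    cache.set(arg, result);
    return result;
  };
}

// Repeated calls with the same input now cost a Map lookup instead of recomputation
const slowSquare = (n) => { for (let i = 0; i < 1e6; i++); return n * n; };
const fastSquare = memoize(slowSquare);
fastSquare(12);  // computed
fastSquare(12);  // served from cache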

Caching:

  • In-memory caching (Redis, Memcached)
  • CDN for static assets
  • HTTP caching headers
  • Query result caching
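
For query-result caching, an in-process TTL cache is often a reasonable first step before introducing Redis or Memcached; a minimal sketch (the loader callback and TTL are illustrative):

// ttl-cache.js - tiny in-memory cache with per-entry expiry
function createTtlCache(ttlMs) {
  const entries = new Map();
  return {
    async get(key, loader) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > Date.now()) return hit.value;
      const value = await loader();   // e.g. a database query or upstream HTTP call
      entries.set(key, { value, expiresAt: Date.now() + ttlMs });
      return value;
    },
    invalidate(key) { entries.delete(key); },
  };
}

// Usage: const cache = createTtlCache(30_000); await cache.get('user:42', () => loadUser(42));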

Parallelization:

  • Multi-threading
  • Worker pools
  • Async I/O
  • Batching operations
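
Bounding the concurrency of async I/O is frequently enough before reaching for threads; a small helper sketch (the limit of 10 in the usage line is arbitrary):

// bounded-parallel.js - run async tasks with a fixed concurrency limit
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  // Start `limit` runners that pull items off the shared index
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, runner));
  return results;
}

// Usage: await mapWithConcurrency(urls, 10, (url) => fetch(url).then((r) => r.json()));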

Database:

  • Add missing indexes
  • Optimize queries
  • Reduce N+1 queries
  • Connection pooling
  • Read replicas
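
The N+1 pattern usually disappears by collapsing per-row lookups into one set-based query, roughly as below (db.query, the SQL placeholders, and the schema are hypothetical):

// n-plus-one.js - replace per-row lookups with one batched query
async function attachCustomers(db, orders) {
  // Before (N+1): one round trip per order
  //   for (const order of orders) {
  //     order.customer = await db.query('SELECT * FROM customers WHERE id = $1', [order.customer_id]);
  //   }

  // After: a single set-based query, joined in memory
  const ids = [...new Set(orders.map((o) => o.customer_id))];
  const customers = await db.query('SELECT * FROM customers WHERE id = ANY($1)', [ids]);
  const byId = new Map(customers.map((c) => [c.id, c]));
  for (const order of orders) order.customer = byId.get(order.customer_id);
  return orders;
}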

Memory:

  • Object pooling
  • Reduce allocations
  • Stream processing
  • Compression
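
Stream processing is the most broadly applicable of these; the sketch below scans a large file with constant memory using only Node.js built-ins:

// stream-count.js - process a large file incrementally instead of reading it all at once
const fs = require('node:fs');
const readline = require('node:readline');

async function countLines(path) {
  const rl = readline.createInterface({
    input: fs.createReadStream(path),   // backpressure-aware, constant memory
    crlfDelay: Infinity,
  });
  let lines = 0;
  for await (const _line of rl) lines++;
  return lines;
}

// Usage: countLines('./big.log').then(console.log);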

Network:

  • Connection keep-alive
  • HTTP/2 or HTTP/3
  • Compression
  • Request batching
  • Rate limiting
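
Connection keep-alive, for example, is a one-line change in Node's HTTP client when an explicit agent is used; the socket limit below is an illustrative value:

// keep-alive.js - reuse TCP connections across requests instead of reconnecting per call
const https = require('node:https');

const agent = new https.Agent({
  keepAlive: true,   // keep sockets open between requests
  maxSockets: 50,    // cap concurrent connections per host (illustrative value)
});

https.get({ host: 'example.com', path: '/health', agent }, (res) => {
  res.resume();      // drain the body so the socket can be reused
  console.log('status', res.statusCode);
});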

Performance Budgets

Frontend:

  • Time to First Byte (TTFB): < 200ms
  • First Contentful Paint (FCP): < 1.8s
  • Largest Contentful Paint (LCP): < 2.5s
  • Time to Interactive (TTI): < 3.8s
  • Total Blocking Time (TBT): < 200ms
  • Cumulative Layout Shift (CLS): < 0.1

Backend:

  • API Response Time (p50): < 100ms
  • API Response Time (p95): < 500ms
  • API Response Time (p99): < 1000ms
  • Throughput: > 1000 req/s
  • Error Rate: < 0.1%
  • CPU Usage: < 70%
  • Memory Usage: < 80%

Database:

  • Query Time (p50): < 10ms
  • Query Time (p95): < 50ms
  • Query Time (p99): < 100ms
  • Connection Pool Utilization: < 80%

Best Practices

  1. Profile production workloads when possible
  2. Use production-like data volumes
  3. Profile under realistic load
  4. Measure multiple times for consistency
  5. Focus on p95/p99, not just averages
  6. Optimize bottlenecks in order of impact
  7. Always benchmark before and after
  8. Monitor for regressions in CI/CD
  9. Set up continuous profiling
  10. Track performance over time

Troubleshooting

Issue: High CPU usage but no obvious hot path

Solution: Check for excessive small function calls, increase sampling rate, or use instrumentation

Issue: Memory grows continuously

Solution: Run heap snapshot comparison to identify leak sources

Issue: Slow database queries

Solution: Use EXPLAIN ANALYZE, check for missing indexes, analyze query plans

Issue: High latency but low CPU

Solution: Profile I/O operations, check for blocking synchronous calls

See Also

  • PROCESS.md - Detailed step-by-step profiling workflow
  • README.md - Quick start guide
  • subagent-performance-profiler.md - Agent implementation details
  • slash-command-profile.sh - Command-line interface
  • mcp-performance-profiler.json - MCP tool schema