SKILL.md

name: load-testing-patterns
description: Use when designing load tests, choosing tools (k6, JMeter, Gatling), calculating concurrent users from DAU, interpreting latency degradation, identifying bottlenecks, or running spike/soak/stress tests - provides test patterns, anti-patterns, and load calculation frameworks

Load Testing Patterns

Overview

Core principle: Test realistic load patterns, not constant artificial load. Find limits before users do.

Rule: Load testing reveals system behavior under stress. Without it, production is your load test.

Tool Selection Decision Tree

| Your Need | Protocol | Team Skills | Use | Why |
|---|---|---|---|---|
| Modern API testing | HTTP/REST/GraphQL | JavaScript | k6 | Best dev experience, CI/CD friendly |
| Enterprise/complex protocols | HTTP/SOAP/JMS/JDBC | Java/GUI comfort | JMeter | Mature, comprehensive protocols |
| Python team | HTTP/WebSocket | Python | Locust | Pythonic, easy scripting |
| High performance/complex scenarios | HTTP/gRPC | Scala/Java | Gatling | Best reports, high throughput |
| Cloud-native at scale | HTTP/WebSocket | Any (SaaS) | Artillery, Flood.io | Managed, distributed |

First choice: k6 (modern, scriptable, excellent CI/CD integration)

Why not ApacheBench/wrk: Too simple for realistic scenarios, no complex user flows

Test Pattern Library

| Pattern | Purpose | Duration | When to Use |
|---|---|---|---|
| Smoke Test | Verify test works | 1-2 min | Before every test run |
| Load Test | Normal/peak capacity | 10-30 min | Regular capacity validation |
| Stress Test | Find breaking point | 20-60 min | Understand limits |
| Spike Test | Sudden traffic surge | 5-15 min | Black Friday, launch events |
| Soak Test | Memory leaks, stability | 1-8 hours | Pre-release validation |
| Capacity Test | Max sustainable load | Variable | Capacity planning |

Smoke Test

Goal: Verify test script works with minimal load

// k6 smoke test
export let options = {
  vus: 1,
  duration: '1m',
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% < 500ms
    http_req_failed: ['rate<0.01'],     // <1% errors
  }
}

Purpose: Catch test script bugs before running expensive full tests
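
The options blocks in this section only define the load shape. To actually run, each k6 script also needs a default function that issues the requests; a minimal sketch (the endpoint URL is illustrative):

import http from 'k6/http'
import { check, sleep } from 'k6'

export default function () {
  // Each VU repeatedly runs this function for the configured VUs/stages
  const res = http.get('https://api.example.com/health')  // illustrative endpoint
  check(res, { 'status is 200': (r) => r.status === 200 })
  sleep(1)
}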

Load Test (Ramp-Up Pattern)

Goal: Test normal and peak expected load

// k6 load test with ramp-up
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Ramp to normal load
    { duration: '10m', target: 100 },  // Hold at normal
    { duration: '5m', target: 200 },   // Ramp to peak
    { duration: '10m', target: 200 },  // Hold at peak
    { duration: '5m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05'],
  }
}

Pattern: Gradual ramp-up → sustain → ramp down. Never start at peak.

Stress Test (Breaking Point)

Goal: Find system limits

// k6 stress test
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Normal
    { duration: '5m', target: 300 },   // Above peak
    { duration: '5m', target: 600 },   // 2x peak
    { duration: '5m', target: 900 },   // 3x peak (expect failure)
    { duration: '10m', target: 0 },    // Recovery
  ]
}

Success: Identify the load at which the system starts to degrade (it doesn't have to break outright)

Spike Test (Sudden Surge)

Goal: Test sudden traffic bursts (viral post, email campaign)

// k6 spike test
export let options = {
  stages: [
    { duration: '1m', target: 100 },   // Normal
    { duration: '30s', target: 1000 }, // SPIKE to 10x
    { duration: '5m', target: 1000 },  // Hold spike
    { duration: '2m', target: 100 },   // Back to normal
    { duration: '5m', target: 100 },   // Recovery check
  ]
}

Tests: Auto-scaling, circuit breakers, rate limiting

Soak Test (Endurance)

Goal: Find memory leaks, resource exhaustion over time

// k6 soak test
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Ramp
    { duration: '4h', target: 100 },   // Soak (sustained load)
    { duration: '5m', target: 0 },     // Ramp down
  ]
}

Monitor: Memory growth, connection leaks, disk space, file descriptors

Duration: Minimum 1 hour, ideally 4-8 hours

Load Calculation Framework

Problem: Convert "10,000 daily active users" to concurrent load

Step 1: DAU to Concurrent Users

Concurrent Users = DAU × Concurrency Ratio × Peak Multiplier

Concurrency Ratios by App Type:
- Web apps: 5-10%
- Social media: 10-20%
- Business apps: 20-30% (work hours)
- Gaming: 15-25%

Peak Multiplier: 1.5-2x for safety margin

Example:

DAU = 10,000
Concurrency = 10% (web app)
Peak Multiplier = 1.5

Concurrent Users = 10,000 × 0.10 × 1.5 = 1,500 concurrent users

Step 2: Concurrent Users to Requests/Second

RPS = (Concurrent Users × Requests per Session) / (Session Duration × Think Time Ratio)

Think Time Ratio:
- Active browsing: 0.3-0.5 (30-50% time clicking/typing)
- Reading-heavy: 0.1-0.2 (10-20% active)
- API clients: 0.8-1.0 (80-100% active)

Example:

Concurrent Users = 1,500
Requests per Session = 20
Session Duration = 10 minutes = 600 seconds
Think Time Ratio = 0.3 (web browsing)

RPS = (1,500 × 20) / (600 × 0.3) = 30,000 / 180 = 167 RPS
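
If you prefer to drive load by request rate rather than VU count, the computed RPS can be plugged into a k6 constant-arrival-rate scenario; a sketch using the illustrative numbers above:

// Open-model load: ~167 requests/s regardless of response times
export let options = {
  scenarios: {
    target_rps: {
      executor: 'constant-arrival-rate',
      rate: 167,              // RPS from the calculation above
      timeUnit: '1s',
      duration: '30m',
      preAllocatedVUs: 200,   // enough VUs to sustain the rate
      maxVUs: 500,
    }
  }
}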

Step 3: Model Realistic Patterns

Don't use constant load. Use realistic traffic patterns:

// Realistic daily pattern
export let options = {
  stages: [
    // Morning ramp
    { duration: '2h', target: 500 },    // 08:00-10:00
    { duration: '2h', target: 1000 },   // 10:00-12:00 (peak)
    // Lunch dip
    { duration: '1h', target: 600 },    // 12:00-13:00
    // Afternoon peak
    { duration: '2h', target: 1200 },   // 13:00-15:00 (peak)
    { duration: '2h', target: 800 },    // 15:00-17:00
    // Evening drop
    { duration: '2h', target: 300 },    // 17:00-19:00
  ]
}

Anti-Patterns Catalog

❌ Coordinated Omission

Symptom: Per-iteration pacing (request, then fixed sleep) waits for slow responses, so the load generator silently sends fewer requests and underestimates latency

Why bad: Hides real latency impact when system slows down

Fix: Use arrival rate (requests/sec) not iteration rate

// ❌ Bad - coordinated omission
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  http.get('https://api.example.com')
  sleep(1)  // Wait 1s between requests - slow responses silently reduce the load sent
}

// ✅ Good - arrival rate pacing
export let options = {
  scenarios: {
    constant_arrival_rate: {
      executor: 'constant-arrival-rate',
      rate: 100,  // 100 RPS regardless of response time
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 50,
      maxVUs: 200,
    }
  }
}

❌ Cold Start Testing

Symptom: Running load test immediately after deployment without warm-up

Why bad: JIT compilation, cache warming, connection pooling haven't stabilized

Fix: Warm-up phase before measurement

// ✅ Good - warm-up phase
export let options = {
  stages: [
    { duration: '2m', target: 50 },    // Warm-up (ignore these results when analyzing)
    { duration: '10m', target: 100 },  // Actual test
  ]
}

❌ Unrealistic Test Data

Symptom: Using same user ID, same query parameters for all virtual users

Why bad: Caches give unrealistic performance, doesn't test real database load

Fix: Parameterized, realistic data

// ❌ Bad - same data
http.get('https://api.example.com/users/123')

// ✅ Good - parameterized data
import http from 'k6/http'
import { SharedArray } from 'k6/data'
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js'

const csvData = new SharedArray('users', function () {
  return papaparse.parse(open('./users.csv'), { header: true }).data
})

export default function() {
  const user = csvData[__VU % csvData.length]
  http.get(`https://api.example.com/users/${user.id}`)
}

❌ Constant Load Pattern

Symptom: Running with constant VUs instead of realistic traffic pattern

Why bad: Real traffic has peaks and valleys, not a flat line

Fix: Use realistic daily/hourly patterns, as sketched below
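
A minimal sketch of the fix (stage targets are illustrative; Step 3 above shows a fuller daily pattern):

// ✅ Good - peaks and valleys instead of a flat line of VUs
export let options = {
  stages: [
    { duration: '30m', target: 500 },   // Ramp toward peak
    { duration: '1h', target: 1000 },   // Peak hours
    { duration: '30m', target: 300 },   // Off-peak lull
    { duration: '30m', target: 0 },     // Wind down
  ],
}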


❌ Ignoring Think Time

Symptom: No delays between requests, hammering API as fast as possible

Why bad: Unrealistic user behavior, overestimates load

Fix: Add realistic think time based on user behavior

// ✅ Good - realistic think time
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  http.get('https://api.example.com/products')
  sleep(Math.random() * 3 + 2)  // 2-5 seconds browsing

  http.post('https://api.example.com/cart', { /* cart payload */ })
  sleep(Math.random() * 5 + 5)  // 5-10 seconds deciding

  http.post('https://api.example.com/checkout', { /* checkout payload */ })
}

Result Interpretation Guide

Latency Degradation Patterns

| Pattern | Cause | What to Check |
|---|---|---|
| Linear growth (2x users → 2x latency) | CPU-bound | Thread pool, CPU usage |
| Exponential growth (2x users → 10x latency) | Resource saturation | Connection pools, locks, queues |
| Sudden cliff (works until X, then fails) | Hard limit hit | Max connections, memory, file descriptors |
| Gradual degradation (slow increase over time) | Memory leak, cache pollution | Memory trends, GC activity |

Bottleneck Classification

Symptom: p95 latency 10x at 2x load → Resource saturation (database connection pool, thread pool, queue)

Symptom: Errors increase with load → Hard limit (connection limit, rate limiting, timeout)

Symptom: Latency grows over time at constant load → Memory leak or cache pollution

Symptom: High variance (p50 good, p99 terrible) → GC pauses, lock contention, or slow queries
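
To see which endpoint degrades first, tag requests by name and set per-tag thresholds; a sketch (URLs and limits are illustrative):

import http from 'k6/http'

export let options = {
  thresholds: {
    'http_req_duration{name:search}': ['p(95)<300'],
    'http_req_duration{name:checkout}': ['p(95)<800'],
  }
}

export default function() {
  // The `name` tag groups latency metrics per endpoint in results and thresholds
  http.get('https://api.example.com/search?q=shoes', { tags: { name: 'search' } })
  http.post('https://api.example.com/checkout', { /* payload */ }, { tags: { name: 'checkout' } })
}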

What to Monitor

| Layer | Metrics to Track |
|---|---|
| Application | Request rate, error rate, p50/p95/p99 latency, active requests |
| Runtime | GC pauses (JVM, .NET), thread pool usage, heap/memory |
| Database | Connection pool usage, query latency, lock waits, slow queries |
| Infrastructure | CPU %, memory %, disk I/O, network throughput |
| External | Third-party API latency, rate limit hits |

Capacity Planning Formula

Safe Capacity = (Breaking Point × Degradation Factor) × Safety Margin

Breaking Point = VUs where p95 latency > threshold
Degradation Factor = 0.7 (start degradation before break)
Safety Margin = 0.5-0.7 (handle traffic spikes)

Example:
- System breaks at 1000 VUs (p95 > 1s)
- Degradation starts around 700 VUs (1000 × 0.7 degradation factor)
- Safe capacity: 700 × 0.7 (safety margin) = 490 VUs

Authentication and Session Management

Problem: Real APIs require authentication. Can't use same token for all virtual users.

Token Strategy Decision Framework

| Scenario | Strategy | Why |
|---|---|---|
| Short test (<10 min) | Pre-generate tokens | Fast, simple, no login load |
| Long test (soak) | Login during test + refresh | Realistic, tests auth system |
| Testing auth system | Simulate login flow | Auth is part of load |
| Read-only testing | Shared token (single user) | Simplest, adequate for API-only tests |

Default: Pre-generate tokens for load tests, simulate login for auth system tests

Pre-Generated Tokens Pattern

Best for: API testing where auth system isn't being tested

// k6 with pre-generated JWT tokens
import http from 'k6/http'
import { SharedArray } from 'k6/data'

// Load tokens from file (generated externally)
const tokens = new SharedArray('auth tokens', function () {
  return JSON.parse(open('./tokens.json'))
})

export default function() {
  const token = tokens[__VU % tokens.length]

  const headers = {
    'Authorization': `Bearer ${token}`
  }

  http.get('https://api.example.com/protected', { headers })
}

Generate tokens externally:

# Script to generate 1000 tokens as a JSON array (the format tokens.json above expects)
for i in {1..1000}; do
  curl -s -X POST https://api.example.com/login \
    -d "username=loadtest_user_$i&password=test" \
    | jq -r '.token'
done | jq -R . | jq -s . > tokens.json

Pros: No login load, fast test setup
Cons: Tokens may expire during long tests, not testing auth flow


Login Flow Simulation Pattern

Best for: Testing auth system, soak tests where tokens expire

// k6 with login simulation
import http from 'k6/http'
import { SharedArray } from 'k6/data'

const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'))  // [{username, password}, ...]
})

export default function() {
  const user = users[__VU % users.length]

  // Login to get token
  const loginRes = http.post('https://api.example.com/login', {
    username: user.username,
    password: user.password
  })

  const token = loginRes.json('token')

  // Use token for subsequent requests
  const headers = { 'Authorization': `Bearer ${token}` }

  http.get('https://api.example.com/protected', { headers })
  http.post('https://api.example.com/data', {}, { headers })
}

Token refresh for long tests:

// k6 with token refresh
import http from 'k6/http'
import { sleep } from 'k6'

// Module-scope variables are initialized once per VU and persist across that VU's iterations
let token = null
let tokenExpiry = 0

export default function() {
  const now = Date.now() / 1000

  // Refresh token if expired or about to expire
  if (!token || now > tokenExpiry - 300) {  // Refresh 5 min before expiry
    const loginRes = http.post('https://api.example.com/login', { /* credentials */ })
    token = loginRes.json('token')
    tokenExpiry = loginRes.json('expires_at')
  }

  http.get('https://api.example.com/protected', {
    headers: { 'Authorization': `Bearer ${token}` }
  })

  sleep(1)
}

Session Cookie Management

For cookie-based auth:

// k6 with session cookies
import http from 'k6/http'

export default function() {
  // k6 keeps a per-VU cookie jar and sends cookies automatically;
  // grab it explicitly only if you need to inspect or modify cookies
  const jar = http.cookieJar()

  // Login (sets session cookie)
  http.post('https://example.com/login', {
    username: 'user',
    password: 'pass'
  })

  // Subsequent requests use session cookie automatically
  http.get('https://example.com/dashboard')
  http.get('https://example.com/profile')
}

Rate Limiting Detection

Pattern: Detect when hitting rate limits during load test

// k6 rate limit detection
import http from 'k6/http'
import { check } from 'k6'
import { Rate } from 'k6/metrics'

const rateLimited = new Rate('rate_limited')  // fraction of responses that were 429

export default function() {
  const res = http.get('https://api.example.com/data')

  check(res, {
    'not rate limited': (r) => r.status !== 429
  })
  rateLimited.add(res.status === 429)

  if (res.status === 429) {
    console.warn(`Rate limited at VU ${__VU}, iteration ${__ITER}`)
    const retryAfter = res.headers['Retry-After']
    console.warn(`Retry-After: ${retryAfter} seconds`)
  }
}

Thresholds for rate limiting:

export let options = {
  thresholds: {
    rate_limited: ['rate<0.01']  // <1% of requests rate limited
  }
}

Third-Party Dependency Handling

Problem: APIs call external services (payment, email, third-party APIs). Should you mock them?

Mock vs Real Decision Framework

| External Service | Mock or Real? | Why |
|---|---|---|
| Payment gateway | Real (sandbox) | Need to test integration, has sandbox mode |
| Email provider | Mock | Cost ($0.001/email × 1000 VUs adds up fast), no value in load testing it |
| Third-party API (has staging) | Real (staging) | Test integration, realistic latency |
| Third-party API (no staging) | Mock | Can't load test production, rate limits |
| Internal microservices | Real | Testing real integration points |
| Analytics/tracking | Mock | High volume, no functional impact |

Rule: Use real services if they have sandbox/staging. Mock if expensive, rate-limited, or no test environment.


Service Virtualization with WireMock

Best for: Mocking HTTP APIs with realistic responses

// k6 test pointing to WireMock
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  // WireMock running on localhost:8080 mocks the external API
  // POST matches the stub definition below
  const res = http.post('http://localhost:8080/api/payment/process', '{}')

  check(res, {
    'payment mock responds': (r) => r.status === 200
  })
}

WireMock stub setup:

{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 200,
    "jsonBody": {
      "transaction_id": "{{randomValue type='UUID'}}",
      "status": "approved"
    },
    "headers": {
      "Content-Type": "application/json"
    },
    "transformers": ["response-template"],
    "fixedDelayMilliseconds": 200
  }
}

Why WireMock: Realistic latency simulation, dynamic responses, stateful mocking


Partial Mocking Pattern

Pattern: Mock some services, use real for others

// k6 with partial mocking
import http from 'k6/http'

export default function() {
  // Real API (points to staging)
  const productRes = http.get('https://staging-api.example.com/products')

  // Mock email service (points to WireMock)
  http.post('http://localhost:8080/mock/email/send', {
    to: 'user@example.com',
    subject: 'Order confirmation'
  })

  // Real payment sandbox
  http.post('https://sandbox-payment.stripe.com/charge', {
    amount: 1000,
    currency: 'usd',
    source: 'tok_visa'
  })
}

Decision criteria:

  • Real: Services with sandbox, need integration validation, low cost
  • Mock: No sandbox, expensive, rate-limited, testing failure scenarios
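
One way to keep a single script and flip individual dependencies between mock and real is to drive base URLs from environment variables; a sketch (variable names are illustrative):

import http from 'k6/http'

// Defaults point at local mocks; override per run, e.g.
//   k6 run -e API_BASE=https://staging-api.example.com script.js
const API_BASE = __ENV.API_BASE || 'http://localhost:8080'
const EMAIL_BASE = __ENV.EMAIL_BASE || 'http://localhost:8080/mock'

export default function() {
  http.get(`${API_BASE}/products`)
  http.post(`${EMAIL_BASE}/email/send`, { to: 'user@example.com' })
}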

Testing External Service Failures

Use mocks to simulate failures:

// WireMock stub for failure scenarios
{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 503,
    "jsonBody": {
      "error": "Service temporarily unavailable"
    },
    "fixedDelayMilliseconds": 5000  // Slow failure
  }
}

k6 test for resilience:

import http from 'k6/http'
import { check } from 'k6'

export default function() {
  const res = http.post('http://localhost:8080/api/payment/process', {})

  // Verify app handles payment failures gracefully
  check(res, {
    'handles payment failure': (r) => r.status === 503,
    'returns within timeout': (r) => r.timings.duration < 6000
  })
}

Cost and Compliance Guardrails

Before testing with real external services:

| Check | Why |
|---|---|
| Sandbox mode exists? | Avoid production costs/rate limits |
| Cost per request? | 1000 VUs × 10 req/s × 600s = 6M requests |
| Rate limits? | Will you hit external service limits? |
| Terms of service? | Does load testing violate TOS? |
| Data privacy? | Using real user emails/PII? |

Example cost calculation:

Email service: $0.001/email
Load test: 100 VUs × ~600 sessions each (one per second over 10 minutes) × 5 emails/session = 300,000 emails
Cost: 300,000 × $0.001 = $300

Decision: Mock email service, use real payment sandbox (free)

Compliance:

  • Don't use real user data in load tests (GDPR, privacy)
  • Check third-party TOS (some prohibit load testing)
  • Use synthetic test data only

Your First Load Test

Goal: Basic load test in one day

Hour 1-2: Install tool and write smoke test

# Install k6
brew install k6  # macOS
# or snap install k6  # Linux

# Create test.js
cat > test.js <<'EOF'
import http from 'k6/http'
import { check, sleep } from 'k6'

export let options = {
  vus: 1,
  duration: '30s'
}

export default function() {
  let res = http.get('https://your-api.com/health')
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response < 500ms': (r) => r.timings.duration < 500
  })
  sleep(1)
}
EOF

# Run smoke test
k6 run test.js

Hour 3-4: Calculate target load

Your DAU: 10,000
Concurrency: 10%
Peak multiplier: 1.5
Target: 10,000 × 0.10 × 1.5 = 1,500 VUs

Hour 5-6: Write load test with ramp-up

export let options = {
  stages: [
    { duration: '5m', target: 750 },   // Ramp to normal (50%)
    { duration: '10m', target: 750 },  // Hold normal
    { duration: '5m', target: 1500 },  // Ramp to peak
    { duration: '10m', target: 1500 }, // Hold peak
    { duration: '5m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05']  // < 5% errors
  }
}

Hour 7-8: Run test and analyze

# Run load test
k6 run --out json=results.json test.js

# Check summary output for:
# - p95/p99 latency trends
# - Error rates
# - When degradation started

If test fails: Check thresholds, adjust targets, investigate bottlenecks

Common Mistakes

❌ Testing Production Without Safeguards

Fix: Use feature flags, test environment, or controlled percentage
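
One common safeguard is to label synthetic traffic so production systems can identify, throttle, or exclude it from analytics; a sketch (the header name is illustrative, not a standard):

import http from 'k6/http'

export default function() {
  // Mark every request as load-test traffic so WAF rules, sampling,
  // and dashboards can treat it separately from real users
  http.get('https://api.example.com/products', {
    headers: { 'X-Load-Test': 'true' }
  })
}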


❌ No Baseline Performance Metrics

Fix: Run smoke test first to establish baseline before load testing


❌ Using Iteration Duration Instead of Arrival Rate

Fix: Use constant-arrival-rate executor in k6


❌ Not Warming Up Caches/JIT

Fix: 2-5 minute warm-up phase before measurement

Quick Reference

Tool Selection:

  • Modern API: k6
  • Enterprise: JMeter
  • Python team: Locust

Test Patterns:

  • Smoke: 1 VU, 1 min
  • Load: Ramp-up → peak → ramp-down
  • Stress: Increase until break
  • Spike: Sudden 10x surge
  • Soak: 4-8 hours constant

Load Calculation:

Concurrent = DAU × 0.10 × 1.5
RPS = (Concurrent × Requests/Session) / (Duration × Think Time)

Anti-Patterns:

  • Coordinated omission (use arrival rate)
  • Cold start (warm-up first)
  • Unrealistic data (parameterize)
  • Constant load (use realistic patterns)

Result Interpretation:

  • Linear growth → CPU-bound
  • Exponential growth → Resource saturation
  • Sudden cliff → Hard limit
  • Gradual degradation → Memory leak

Authentication:

  • Short tests: Pre-generate tokens
  • Long tests: Login + refresh
  • Testing auth: Simulate login flow

Third-Party Dependencies:

  • Has sandbox: Use real (staging/sandbox)
  • Expensive/rate-limited: Mock (WireMock)
  • No sandbox: Mock

Bottom Line

Start with smoke test (1 VU). Calculate realistic load from DAU. Use ramp-up pattern (never start at peak). Monitor p95/p99 latency. Find breaking point before users do.

Test realistic scenarios with think time, not hammer tests.