| name | load-testing-patterns |
| description | Use when designing load tests, choosing tools (k6, JMeter, Gatling), calculating concurrent users from DAU, interpreting latency degradation, identifying bottlenecks, or running spike/soak/stress tests - provides test patterns, anti-patterns, and load calculation frameworks |
Load Testing Patterns
Overview
Core principle: Test realistic load patterns, not constant artificial load. Find limits before users do.
Rule: Load testing reveals system behavior under stress. Without it, production is your load test.
Tool Selection Decision Tree
| Your Need | Protocol | Team Skills | Use | Why |
|---|---|---|---|---|
| Modern API testing | HTTP/REST/GraphQL | JavaScript | k6 | Best dev experience, CI/CD friendly |
| Enterprise/complex protocols | HTTP/SOAP/JMS/JDBC | Java/GUI comfort | JMeter | Mature, comprehensive protocols |
| Python team | HTTP/WebSocket | Python | Locust | Pythonic, easy scripting |
| High performance/complex scenarios | HTTP/gRPC | Scala/Java | Gatling | Best reports, high throughput |
| Cloud-native at scale | HTTP/WebSocket | Any (SaaS) | Artillery, Flood.io | Managed, distributed |
First choice: k6 (modern, scriptable, excellent CI/CD integration)
Why not ApacheBench/wrk: too simple for realistic scenarios - they generate raw request throughput but can't model multi-step user flows
Test Pattern Library
| Pattern | Purpose | Duration | When to Use |
|---|---|---|---|
| Smoke Test | Verify test works | 1-2 min | Before every test run |
| Load Test | Normal/peak capacity | 10-30 min | Regular capacity validation |
| Stress Test | Find breaking point | 20-60 min | Understand limits |
| Spike Test | Sudden traffic surge | 5-15 min | Black Friday, launch events |
| Soak Test | Memory leaks, stability | 1-8 hours | Pre-release validation |
| Capacity Test | Max sustainable load | Variable | Capacity planning |
Smoke Test
Goal: Verify test script works with minimal load
// k6 smoke test
import http from 'k6/http'
import { sleep } from 'k6'

export let options = {
  vus: 1,
  duration: '1m',
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% < 500ms
    http_req_failed: ['rate<0.01'],   // <1% errors
  }
}

export default function() {
  http.get('https://api.example.com/health') // illustrative endpoint
  sleep(1)
}
Purpose: Catch test script bugs before running expensive full tests
Load Test (Ramp-Up Pattern)
Goal: Test normal and peak expected load
// k6 load test with ramp-up
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Ramp to normal load
{ duration: '10m', target: 100 }, // Hold at normal
{ duration: '5m', target: 200 }, // Ramp to peak
{ duration: '10m', target: 200 }, // Hold at peak
{ duration: '5m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<500', 'p(99)<1000'],
http_req_failed: ['rate<0.05'],
}
}
Pattern: Gradual ramp-up → sustain → ramp down. Never start at peak.
Stress Test (Breaking Point)
Goal: Find system limits
// k6 stress test
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Normal
{ duration: '5m', target: 300 }, // Above peak
{ duration: '5m', target: 600 }, // 2x peak
{ duration: '5m', target: 900 }, // 3x peak (expect failure)
{ duration: '10m', target: 0 }, // Recovery
]
}
Success: Identify the load at which the system degrades (it doesn't have to break outright)
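Optionally, let k6 abort the run automatically once degradation crosses a limit, so the 3x-peak stage doesn't keep hammering an already-failing system. A minimal sketch - the 2s limit is an assumed SLO, and delayAbortEval lets the first ramp settle before the threshold is evaluated:
// k6 stress test with auto-abort on degradation
export let options = {
  stages: [
    { duration: '5m', target: 100 },
    { duration: '5m', target: 300 },
    { duration: '5m', target: 600 },
    { duration: '5m', target: 900 },
    { duration: '10m', target: 0 },
  ],
  thresholds: {
    http_req_duration: [
      { threshold: 'p(95)<2000', abortOnFail: true, delayAbortEval: '1m' },
    ],
  }
}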
Spike Test (Sudden Surge)
Goal: Test sudden traffic bursts (viral post, email campaign)
// k6 spike test
export let options = {
stages: [
{ duration: '1m', target: 100 }, // Normal
{ duration: '30s', target: 1000 }, // SPIKE to 10x
{ duration: '5m', target: 1000 }, // Hold spike
{ duration: '2m', target: 100 }, // Back to normal
{ duration: '5m', target: 100 }, // Recovery check
]
}
Tests: Auto-scaling, circuit breakers, rate limiting
Soak Test (Endurance)
Goal: Find memory leaks, resource exhaustion over time
// k6 soak test
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Ramp
{ duration: '4h', target: 100 }, // Soak (sustained load)
{ duration: '5m', target: 0 }, // Ramp down
]
}
Monitor: Memory growth, connection leaks, disk space, file descriptors
Duration: Minimum 1 hour, ideally 4-8 hours
Load Calculation Framework
Problem: Convert "10,000 daily active users" to concurrent load
Step 1: DAU to Concurrent Users
Concurrent Users = DAU × Concurrency Ratio × Peak Multiplier
Concurrency Ratios by App Type:
- Web apps: 5-10%
- Social media: 10-20%
- Business apps: 20-30% (work hours)
- Gaming: 15-25%
Peak Multiplier: 1.5-2x for safety margin
Example:
DAU = 10,000
Concurrency = 10% (web app)
Peak Multiplier = 1.5
Concurrent Users = 10,000 × 0.10 × 1.5 = 1,500 concurrent users
Step 2: Concurrent Users to Requests/Second
RPS = (Concurrent Users × Requests per Session) / (Session Duration × Think Time Ratio)
Think Time Ratio:
- Active browsing: 0.3-0.5 (30-50% time clicking/typing)
- Reading-heavy: 0.1-0.2 (10-20% active)
- API clients: 0.8-1.0 (80-100% active)
Example:
Concurrent Users = 1,500
Requests per Session = 20
Session Duration = 10 minutes = 600 seconds
Think Time Ratio = 0.3 (web browsing)
RPS = (1,500 × 20) / (600 × 0.3) = 30,000 / 180 = 167 RPS
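Both steps are plain arithmetic, so they are easy to script. An illustrative helper - the function names are ours, not part of any k6 API:
// Sketch of the Step 1-2 formulas (illustrative names)
function concurrentUsers(dau, concurrencyRatio, peakMultiplier) {
  return dau * concurrencyRatio * peakMultiplier
}

function targetRps(concurrent, requestsPerSession, sessionSeconds, thinkTimeRatio) {
  return (concurrent * requestsPerSession) / (sessionSeconds * thinkTimeRatio)
}

console.log(concurrentUsers(10000, 0.10, 1.5)) // 1500
console.log(targetRps(1500, 20, 600, 0.3))     // ~167 RPS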
Step 3: Model Realistic Patterns
Don't use constant load. Use realistic traffic patterns:
// Realistic daily pattern
export let options = {
stages: [
// Morning ramp
{ duration: '2h', target: 500 }, // 08:00-10:00
{ duration: '2h', target: 1000 }, // 10:00-12:00 (peak)
// Lunch dip
{ duration: '1h', target: 600 }, // 12:00-13:00
// Afternoon peak
{ duration: '2h', target: 1200 }, // 13:00-15:00 (peak)
{ duration: '2h', target: 800 }, // 15:00-17:00
// Evening drop
{ duration: '2h', target: 300 }, // 17:00-19:00
]
}
Anti-Patterns Catalog
❌ Coordinated Omission
Symptom: Closed-loop pacing (request → sleep → repeat) sends fewer requests whenever responses slow down
Why bad: The generator backs off exactly when the system struggles, so measured latency omits the queuing delay real users would see
Fix: Use an open model - a fixed arrival rate (requests/sec), not iteration pacing
// ❌ Bad - coordinated omission (closed loop)
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  http.get('https://api.example.com')
  sleep(1) // next request waits for this response + 1s
}
// ✅ Good - arrival rate pacing
export let options = {
scenarios: {
constant_arrival_rate: {
executor: 'constant-arrival-rate',
rate: 100, // 100 RPS regardless of response time
timeUnit: '1s',
duration: '10m',
preAllocatedVUs: 50,
maxVUs: 200,
}
}
}
❌ Cold Start Testing
Symptom: Running load test immediately after deployment without warm-up
Why bad: JIT compilation, caches, and connection pools haven't stabilized yet
Fix: Warm-up phase before measurement
// ✅ Good - warm-up phase
export let options = {
stages: [
{ duration: '2m', target: 50 },   // Warm-up (k6 still records it - discard when analyzing)
{ duration: '10m', target: 100 }, // Actual test
]
}
❌ Unrealistic Test Data
Symptom: Using same user ID, same query parameters for all virtual users
Why bad: Caches serve everything, giving unrealistically good numbers, and the real database load is never exercised
Fix: Parameterized, realistic data
// ❌ Bad - same data
http.get('https://api.example.com/users/123')
// ✅ Good - parameterized data
import http from 'k6/http'
import { SharedArray } from 'k6/data'
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js'
const csvData = new SharedArray('users', function () {
return papaparse.parse(open('./users.csv'), { header: true }).data
})
export default function() {
const user = csvData[__VU % csvData.length]
http.get(`https://api.example.com/users/${user.id}`)
}
❌ Constant Load Pattern
Symptom: Running with constant VUs instead of a realistic traffic pattern
Why bad: Real traffic has peaks and valleys, not a flat line
Fix: Use realistic daily/hourly patterns (see the daily pattern under Step 3 above)
❌ Ignoring Think Time
Symptom: No delays between requests, hammering API as fast as possible
Why bad: Unrealistic user behavior, overestimates load
Fix: Add realistic think time based on user behavior
// ✅ Good - realistic think time
import http from 'k6/http'
import { sleep } from 'k6'
export default function() {
http.get('https://api.example.com/products')
sleep(Math.random() * 3 + 2) // 2-5 seconds browsing
http.post('https://api.example.com/cart', {...})
sleep(Math.random() * 5 + 5) // 5-10 seconds deciding
http.post('https://api.example.com/checkout', {...})
}
Result Interpretation Guide
Latency Degradation Patterns
| Pattern | Cause | What to Check |
|---|---|---|
| Linear growth (2x users → 2x latency) | CPU-bound | Thread pool, CPU usage |
| Exponential growth (2x users → 10x latency) | Resource saturation | Connection pools, locks, queues |
| Sudden cliff (works until X, then fails) | Hard limit hit | Max connections, memory, file descriptors |
| Gradual degradation (slow increase over time) | Memory leak, cache pollution | Memory trends, GC activity |
Bottleneck Classification
Symptom: p95 latency grows 10x when load merely doubles → Resource saturation (database connection pool, thread pool, queue)
Symptom: Errors increase with load → Hard limit (connection limit, rate limiting, timeout)
Symptom: Latency grows over time at constant load → Memory leak or cache pollution
Symptom: High variance (p50 good, p99 terrible) → GC pauses, lock contention, or slow queries
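Thresholds can flag the high-variance case automatically by pinning both the median and the tail. A minimal sketch - the limits are assumed SLOs, not universal values:
// Catch tail-latency variance: tight median, bounded tail
export let options = {
  thresholds: {
    http_req_duration: ['p(50)<200', 'p(99)<2000'],
  }
}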
What to Monitor
| Layer | Metrics to Track |
|---|---|
| Application | Request rate, error rate, p50/p95/p99 latency, active requests |
| Runtime | GC pauses (JVM, .NET), thread pool usage, heap/memory |
| Database | Connection pool usage, query latency, lock waits, slow queries |
| Infrastructure | CPU %, memory %, disk I/O, network throughput |
| External | Third-party API latency, rate limit hits |
Capacity Planning Formula
Safe Capacity = (Breaking Point × Degradation Factor) × Safety Margin
Breaking Point = VUs where p95 latency > threshold
Degradation Factor = 0.7 (degradation typically starts below the breaking point)
Safety Margin = 0.5-0.7 (headroom for traffic spikes)
Example:
- System breaks at 1000 VUs (p95 > 1s)
- Degradation starts at 700 VUs (1,000 × 0.7 degradation factor)
- Safe capacity: 700 × 0.7 safety margin = 490 VUs
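As a quick sketch (the function name is ours, purely illustrative):
// Capacity formula from above
function safeCapacity(breakingPointVUs, degradationFactor = 0.7, safetyMargin = 0.7) {
  return Math.floor(breakingPointVUs * degradationFactor * safetyMargin)
}

console.log(safeCapacity(1000)) // 490 VUs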
Authentication and Session Management
Problem: Real APIs require authentication. Can't use same token for all virtual users.
Token Strategy Decision Framework
| Scenario | Strategy | Why |
|---|---|---|
| Short test (<10 min) | Pre-generate tokens | Fast, simple, no login load |
| Long test (soak) | Login during test + refresh | Realistic, tests auth system |
| Testing auth system | Simulate login flow | Auth is part of load |
| Read-only testing | Shared token (single user) | Simplest, adequate for API-only tests |
Default: Pre-generate tokens for load tests, simulate login for auth system tests
Pre-Generated Tokens Pattern
Best for: API testing where auth system isn't being tested
// k6 with pre-generated JWT tokens
import http from 'k6/http'
import { SharedArray } from 'k6/data'
// Load tokens from file (generated externally)
const tokens = new SharedArray('auth tokens', function () {
return JSON.parse(open('./tokens.json'))
})
export default function() {
const token = tokens[__VU % tokens.length]
const headers = {
'Authorization': `Bearer ${token}`
}
http.get('https://api.example.com/protected', { headers })
}
Generate tokens externally:
# Script to generate 1000 tokens as a JSON array
for i in {1..1000}; do
  curl -s -X POST https://api.example.com/login \
    -d "username=loadtest_user_$i&password=test" \
    | jq -r '.token'
done | jq -R . | jq -s . > tokens.json
# jq -R/-s wrap the raw token lines into the JSON array that SharedArray expects
Pros: No login load, fast test setup
Cons: Tokens may expire during long tests, not testing auth flow
Login Flow Simulation Pattern
Best for: Testing auth system, soak tests where tokens expire
// k6 with login simulation
import http from 'k6/http'
import { SharedArray } from 'k6/data'
const users = new SharedArray('users', function () {
return JSON.parse(open('./users.json')) // [{username, password}, ...]
})
export default function() {
const user = users[__VU % users.length]
// Login to get token
const loginRes = http.post('https://api.example.com/login', {
username: user.username,
password: user.password
})
const token = loginRes.json('token')
// Use token for subsequent requests
const headers = { 'Authorization': `Bearer ${token}` }
http.get('https://api.example.com/protected', { headers })
http.post('https://api.example.com/data', {}, { headers })
}
Token refresh for long tests:
// k6 with token refresh
import http from 'k6/http'
import { sleep } from 'k6'
let token = null
let tokenExpiry = 0
export default function() {
const now = Date.now() / 1000
// Refresh token if expired or about to expire
if (!token || now > tokenExpiry - 300) { // Refresh 5 min before expiry
const loginRes = http.post('https://api.example.com/login', {...})
token = loginRes.json('token')
tokenExpiry = loginRes.json('expires_at')
}
http.get('https://api.example.com/protected', {
headers: { 'Authorization': `Bearer ${token}` }
})
sleep(1)
}
Session Cookie Management
For cookie-based auth:
// k6 with session cookies
import http from 'k6/http'
export default function() {
  // k6 keeps a per-VU cookie jar and applies it automatically;
  // grab a reference only if you need to inspect or modify cookies
  const jar = http.cookieJar()
// Login (sets session cookie)
http.post('https://example.com/login', {
username: 'user',
password: 'pass'
})
// Subsequent requests use session cookie automatically
http.get('https://example.com/dashboard')
http.get('https://example.com/profile')
}
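To confirm the login actually established a session, inspect the VU's jar before relying on it. A sketch assuming the server sets a cookie named session:
// Verify the login set a session cookie (cookie name is an assumption)
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  http.post('https://example.com/login', { username: 'user', password: 'pass' })
  const cookies = http.cookieJar().cookiesForURL('https://example.com/')
  check(cookies, {
    'session cookie set': (c) => c.session && c.session.length > 0,
  })
}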
Rate Limiting Detection
Pattern: Detect when hitting rate limits during load test
// k6 rate limit detection
import http from 'k6/http'
import { check } from 'k6'
export default function() {
const res = http.get('https://api.example.com/data')
check(res, {
'not rate limited': (r) => r.status !== 429
})
if (res.status === 429) {
console.warn(`Rate limited at VU ${__VU}, iteration ${__ITER}`)
const retryAfter = res.headers['Retry-After']
console.warn(`Retry-After: ${retryAfter} seconds`)
}
}
Thresholds for rate limiting:
export let options = {
thresholds: {
'http_req_failed{status:429}': ['rate<0.01'] // <1% rate limited
}
}
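If 429s do appear, consider having VUs honor the server's backoff instead of hammering on. A sketch assuming the API returns Retry-After in seconds:
// Back off on 429 (assumes a Retry-After header in seconds)
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  const res = http.get('https://api.example.com/data')
  if (res.status === 429) {
    const retryAfter = parseInt(res.headers['Retry-After'] || '1', 10)
    sleep(retryAfter)
  }
}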
Third-Party Dependency Handling
Problem: APIs call external services (payment, email, third-party APIs). Should you mock them?
Mock vs Real Decision Framework
| External Service | Mock or Real? | Why |
|---|---|---|
| Payment gateway | Real (sandbox) | Need to test integration, has sandbox mode |
| Email provider | Mock | Costly at load-test volume ($0.001/email across thousands of sends), testing it adds no value |
| Third-party API (has staging) | Real (staging) | Test integration, realistic latency |
| Third-party API (no staging) | Mock | Can't load test production, rate limits |
| Internal microservices | Real | Testing real integration points |
| Analytics/tracking | Mock | High volume, no functional impact |
Rule: Use real services if they have sandbox/staging. Mock if expensive, rate-limited, or no test environment.
Service Virtualization with WireMock
Best for: Mocking HTTP APIs with realistic responses
// k6 test pointing to WireMock
import http from 'k6/http'
import { check } from 'k6'

export default function() {
// WireMock running on localhost:8080 mocks external API
const res = http.post('http://localhost:8080/api/payment/process', null)
check(res, {
'payment mock responds': (r) => r.status === 200
})
}
WireMock stub setup:
{
"request": {
"method": "POST",
"url": "/api/payment/process"
},
"response": {
"status": 200,
"jsonBody": {
"transaction_id": "{{randomValue type='UUID'}}",
"status": "approved"
},
"headers": {
"Content-Type": "application/json"
},
"fixedDelayMilliseconds": 200
}
}
Why WireMock: Realistic latency simulation, dynamic responses (via the response-template transformer enabled above), stateful mocking
Partial Mocking Pattern
Pattern: Mock some services, use real for others
// k6 with partial mocking
import http from 'k6/http'
export default function() {
// Real API (points to staging)
const productRes = http.get('https://staging-api.example.com/products')
// Mock email service (points to WireMock)
http.post('http://localhost:8080/mock/email/send', {
to: 'user@example.com',
subject: 'Order confirmation'
})
// Real payment sandbox
http.post('https://sandbox-payment.stripe.com/charge', {
amount: 1000,
currency: 'usd',
source: 'tok_visa'
})
}
Decision criteria:
- Real: Services with sandbox, need integration validation, low cost
- Mock: No sandbox, expensive, rate-limited, testing failure scenarios
Testing External Service Failures
Use mocks to simulate failures - here, a slow 503:
// WireMock stub for failure scenarios
{
"request": {
"method": "POST",
"url": "/api/payment/process"
},
"response": {
"status": 503,
"jsonBody": {
"error": "Service temporarily unavailable"
},
"fixedDelayMilliseconds": 5000 // Slow failure
}
}
k6 test for resilience:
// k6 resilience test
import http from 'k6/http'
import { check } from 'k6'

export default function() {
const res = http.post('http://localhost:8080/api/payment/process', {})
// Verify app handles payment failures gracefully
check(res, {
'handles payment failure': (r) => r.status === 503,
'returns within timeout': (r) => r.timings.duration < 6000
})
}
Cost and Compliance Guardrails
Before testing with real external services:
| Check | Why |
|---|---|
| Sandbox mode exists? | Avoid production costs/rate limits |
| Cost per request? | 1000 VUs × 10 req/s × 600s = 6M requests |
| Rate limits? | Will you hit external service limits? |
| Terms of service? | Does load testing violate TOS? |
| Data privacy? | Using real user emails/PII? |
Example cost calculation:
Email service: $0.001/email
Load test: 100 VUs × 5 emails/session × 600 sessions per VU = 300,000 emails
Cost: 300,000 × $0.001 = $300
Decision: Mock email service, use real payment sandbox (free)
Compliance:
- Don't use real user data in load tests (GDPR, privacy)
- Check third-party TOS (some prohibit load testing)
- Use synthetic test data only
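k6's built-in __VU and __ITER counters make synthetic data easy to generate inline. A minimal sketch - the signup endpoint is illustrative:
// Synthetic, PII-free test data from VU/iteration counters
import http from 'k6/http'

export default function() {
  const user = {
    email: `loadtest+${__VU}-${__ITER}@example.com`, // never a real address
    name: `Test User ${__VU}`,
  }
  http.post('https://api.example.com/signup', JSON.stringify(user), {
    headers: { 'Content-Type': 'application/json' },
  })
}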
Your First Load Test
Goal: Basic load test in one day
Hour 1-2: Install tool and write smoke test
# Install k6
brew install k6 # macOS
# or snap install k6 # Linux
# Create test.js
cat > test.js <<'EOF'
import http from 'k6/http'
import { check, sleep } from 'k6'
export let options = {
vus: 1,
duration: '30s'
}
export default function() {
let res = http.get('https://your-api.com/health')
check(res, {
'status is 200': (r) => r.status === 200,
'response < 500ms': (r) => r.timings.duration < 500
})
sleep(1)
}
EOF
# Run smoke test
k6 run test.js
Hour 3-4: Calculate target load
Your DAU: 10,000
Concurrency: 10%
Peak multiplier: 1.5
Target: 10,000 × 0.10 × 1.5 = 1,500 VUs
Hour 5-6: Write load test with ramp-up
export let options = {
stages: [
{ duration: '5m', target: 750 }, // Ramp to normal (50%)
{ duration: '10m', target: 750 }, // Hold normal
{ duration: '5m', target: 1500 }, // Ramp to peak
{ duration: '10m', target: 1500 }, // Hold peak
{ duration: '5m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<500', 'p(99)<1000'],
http_req_failed: ['rate<0.05'] // < 5% errors
}
}
Hour 7-8: Run test and analyze
# Run load test
k6 run --out json=results.json test.js
# Check summary output for:
# - p95/p99 latency trends
# - Error rates
# - When degradation started
If test fails: Check thresholds, adjust targets, investigate bottlenecks
Common Mistakes
❌ Testing Production Without Safeguards
Fix: Use feature flags, test environment, or controlled percentage
❌ No Baseline Performance Metrics
Fix: Run smoke test first to establish baseline before load testing
❌ Using Iteration Duration Instead of Arrival Rate
Fix: Use constant-arrival-rate executor in k6
❌ Not Warming Up Caches/JIT
Fix: 2-5 minute warm-up phase before measurement
Quick Reference
Tool Selection:
- Modern API: k6
- Enterprise: JMeter
- Python team: Locust
Test Patterns:
- Smoke: 1 VU, 1 min
- Load: Ramp-up → peak → ramp-down
- Stress: Increase until break
- Spike: Sudden 10x surge
- Soak: 4-8 hours constant
Load Calculation:
Concurrent = DAU × 0.10 × 1.5
RPS = (Concurrent × Requests/Session) / (Duration × Think Time)
Anti-Patterns:
- Coordinated omission (use arrival rate)
- Cold start (warm-up first)
- Unrealistic data (parameterize)
- Constant load (use realistic patterns)
Result Interpretation:
- Linear growth → CPU-bound
- Exponential growth → Resource saturation
- Sudden cliff → Hard limit
- Gradual degradation → Memory leak
Authentication:
- Short tests: Pre-generate tokens
- Long tests: Login + refresh
- Testing auth: Simulate login flow
Third-Party Dependencies:
- Has sandbox: Use real (staging/sandbox)
- Expensive/rate-limited: Mock (WireMock)
- No sandbox: Mock
Bottom Line
Start with smoke test (1 VU). Calculate realistic load from DAU. Use ramp-up pattern (never start at peak). Monitor p95/p99 latency. Find breaking point before users do.
Test realistic scenarios with think time, not hammer tests.