| name | performance-testing-standards |
| category | testing |
| difficulty | intermediate |
| tags | performance, load-testing, stress-testing, k6, jmeter, monitoring |
| prerequisites | api-testing-standards, monitoring-observability-standards |
| related_skills | security-testing-standards, ci-cd-pipeline-standards |
| estimated_time | 45 minutes |
| last_updated | 2025-10-17 |
| applies_to | backend, frontend, api, microservices, data-pipelines |
| description | Standards for performance testing: load, stress, spike, soak, and breakpoint tests with k6 and JMeter, plus SLI/SLO definition, monitoring, and CI/CD integration |
Performance Testing Standards
📋 Metadata
Category: Testing
Difficulty: Intermediate
Prerequisites: API testing basics, monitoring fundamentals
Estimated Time: 45 minutes
Last Updated: 2025-10-17
🎯 Level 1: Quick Reference
Core Concepts
Performance Test Types:
- Load Testing: Validate system behavior under expected load
- Stress Testing: Find breaking point and failure modes
- Spike Testing: Validate sudden traffic surge handling
- Soak Testing: Verify stability under sustained load (memory leaks, resource exhaustion)
- Breakpoint Testing: Incrementally increase load to find capacity limits
Key Metrics:
- Latency Percentiles: p50 (median), p95, p99, p99.9
- Throughput: Requests per second (RPS), transactions per second (TPS)
- Error Rate: Percentage of failed requests
- Concurrent Users: Virtual users (VUs) active simultaneously
- Resource Utilization: CPU, memory, disk I/O, network bandwidth
Critical Success Indicators:
- p95 latency < SLO threshold
- Error rate < 0.1%
- System stability during sustained load
- Graceful degradation under stress
- Recovery after traffic spikes
Essential Checklist
Before Testing:
☐ Define SLIs/SLOs (Service Level Indicators/Objectives)
☐ Identify critical user journeys
☐ Set up monitoring (APM, metrics, logs)
☐ Establish baseline performance
☐ Configure realistic test scenarios
☐ Prepare production-like test environment
During Testing:
☐ Monitor system resources in real-time
☐ Track error logs and exceptions
☐ Validate data integrity
☐ Capture network traces for bottlenecks
☐ Document anomalies and failures
After Testing:
☐ Analyze latency distributions
☐ Identify performance bottlenecks
☐ Compare against SLOs
☐ Generate executive summary
☐ Create remediation plan
☐ Update capacity planning docs
Quick Start: k6 Load Test
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
stages: [
{ duration: '2m', target: 50 }, // Ramp-up
{ duration: '5m', target: 50 }, // Steady state
{ duration: '2m', target: 0 }, // Ramp-down
],
thresholds: {
http_req_duration: ['p(95)<500'], // 95% < 500ms
http_req_failed: ['rate<0.01'], // Error rate < 1%
},
};
export default function () {
const res = http.get('https://api.example.com/health');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 500ms': (r) => r.timings.duration < 500,
});
sleep(1);
}
Run:
k6 run load-test.js
Quick Metrics Reference
| Metric | Good | Warning | Critical |
|---|---|---|---|
| p50 latency | < 100ms | 100-200ms | > 200ms |
| p95 latency | < 500ms | 500-1000ms | > 1000ms |
| p99 latency | < 1s | 1-2s | > 2s |
| Error rate | < 0.1% | 0.1-1% | > 1% |
| CPU usage | < 60% | 60-80% | > 80% |
| Memory usage | < 70% | 70-85% | > 85% |
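Percentiles are read off the sorted latency sample. A minimal sketch of the arithmetic (nearest-rank method; tools such as k6 and JMeter apply their own interpolation internally, and the sample values here are hypothetical):
// percentile-sketch.js - illustrative only
function percentile(samplesMs, p) {
// Sort ascending, then take the nearest-rank index for the requested percentile
const sorted = [...samplesMs].sort((a, b) => a - b);
const rank = Math.ceil((p / 100) * sorted.length) - 1;
return sorted[Math.max(0, rank)];
}
const samples = [87, 92, 110, 145, 234, 310, 412, 574, 890, 1247];
console.log('p50:', percentile(samples, 50), 'ms'); // 234 ms
console.log('p95:', percentile(samples, 95), 'ms'); // 1247 ms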
🔧 Level 2: Implementation Guide
1. Test Strategy & Planning
Define Performance Requirements
Service Level Indicators (SLIs):
# performance-requirements.yaml
service: user-api
environment: production
slis:
availability:
target: 99.9%
measurement: successful_requests / total_requests
latency:
p50: 100ms
p95: 500ms
p99: 1000ms
measurement: response time at percentile
throughput:
target: 1000 rps
measurement: requests per second
error_rate:
target: 0.1%
measurement: failed_requests / total_requests
critical_user_journeys:
- name: user_login
weight: 30%
slo_p95: 300ms
- name: fetch_dashboard
weight: 40%
slo_p95: 500ms
- name: submit_transaction
weight: 20%
slo_p95: 1000ms
- name: search_products
weight: 10%
slo_p95: 400ms
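To drive this journey mix from k6, one option is weighted random selection per iteration. A minimal sketch assuming the weights above (the endpoint URLs are placeholders):
// journey-mix.js - weighted journey selection per iteration (illustrative sketch)
import http from 'k6/http';
import { sleep } from 'k6';
const journeys = [
{ name: 'user_login', weight: 0.3, exec: () => http.post('https://api.example.com/auth/login') },
{ name: 'fetch_dashboard', weight: 0.4, exec: () => http.get('https://api.example.com/api/dashboard/summary') },
{ name: 'submit_transaction', weight: 0.2, exec: () => http.post('https://api.example.com/api/transactions') },
{ name: 'search_products', weight: 0.1, exec: () => http.get('https://api.example.com/api/search?q=test') },
];
export default function () {
// Pick one journey per iteration, proportional to its weight
let r = Math.random();
for (const journey of journeys) {
r -= journey.weight;
if (r <= 0) {
journey.exec();
break;
}
}
sleep(1);
}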
Test Scenario Design
Load Profile Patterns:
// 1. Ramp-Up Pattern (Gradual Load Increase)
export const rampUpOptions = {
stages: [
{ duration: '5m', target: 100 }, // Ramp to 100 users
{ duration: '10m', target: 100 }, // Hold at 100
{ duration: '5m', target: 200 }, // Ramp to 200
{ duration: '10m', target: 200 }, // Hold at 200
{ duration: '5m', target: 0 }, // Ramp down
],
};
// 2. Stress Test Pattern (Beyond Normal Capacity)
export const stressOptions = {
stages: [
{ duration: '2m', target: 200 }, // Fast ramp-up
{ duration: '5m', target: 200 }, // Stay at 200
{ duration: '2m', target: 400 }, // Push to 400
{ duration: '5m', target: 400 }, // Hold at 400
{ duration: '10m', target: 0 }, // Long ramp-down
],
};
// 3. Spike Test Pattern (Sudden Traffic Surge)
export const spikeOptions = {
stages: [
{ duration: '1m', target: 50 }, // Baseline
{ duration: '10s', target: 500 }, // Sudden spike
{ duration: '3m', target: 500 }, // Hold spike
{ duration: '10s', target: 50 }, // Drop back
{ duration: '3m', target: 50 }, // Recovery
{ duration: '10s', target: 0 }, // End
],
};
// 4. Soak Test Pattern (Sustained Load)
export const soakOptions = {
stages: [
{ duration: '5m', target: 100 }, // Ramp-up
{ duration: '4h', target: 100 }, // Soak for 4 hours
{ duration: '5m', target: 0 }, // Ramp-down
],
};
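Level 1 also lists breakpoint testing, which has no fixed stage profile above because the load keeps climbing until the system fails. A sketch using k6's ramping-arrival-rate executor with an aborting threshold (the rates and limits are illustrative):
// 5. Breakpoint Test Pattern (Find Capacity Limit)
export const breakpointOptions = {
scenarios: {
breakpoint: {
executor: 'ramping-arrival-rate',
startRate: 50, // iterations per second at the start
timeUnit: '1s',
preAllocatedVUs: 1000, // VU pool sized for the expected peak
stages: [
{ duration: '30m', target: 2000 }, // keep climbing until failure
],
},
},
thresholds: {
// Abort the run once the error rate marks the breaking point
http_req_failed: [{ threshold: 'rate<0.05', abortOnFail: true }],
},
};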
2. k6 Implementation
Advanced k6 Test Structure
// advanced-load-test.js
import http from 'k6/http';
import { check, group, sleep } from 'k6';
import { Rate, Trend, Counter } from 'k6/metrics';
import { htmlReport } from 'https://raw.githubusercontent.com/benc-uk/k6-reporter/main/dist/bundle.js';
// Custom metrics
const errorRate = new Rate('errors');
const customTrend = new Trend('custom_duration');
const requestCount = new Counter('request_count');
// Test configuration
export const options = {
stages: [
{ duration: '2m', target: 100 },
{ duration: '5m', target: 100 },
{ duration: '2m', target: 200 },
{ duration: '5m', target: 200 },
{ duration: '2m', target: 0 },
],
thresholds: {
// HTTP-specific thresholds
'http_req_duration': ['p(95)<500', 'p(99)<1000'],
'http_req_duration{type:api}': ['p(95)<300'],
'http_req_duration{type:static}': ['p(95)<100'],
'http_req_failed': ['rate<0.01'],
// Custom metric thresholds
'errors': ['rate<0.01'],
'custom_duration': ['p(95)<400'],
// Group-specific thresholds
'group_duration{group:::Login Flow}': ['p(95)<800'],
'group_duration{group:::Dashboard Load}': ['p(95)<600'],
},
// Graceful stop configuration
gracefulStop: '30s',
// Trend stats to include in the end-of-test summary
summaryTrendStats: ['min', 'med', 'avg', 'p(90)', 'p(95)', 'p(99)', 'max'],
};
// Test data
const BASE_URL = __ENV.BASE_URL || 'https://api.example.com';
const users = JSON.parse(open('./data/test-users.json'));
// Setup function (runs once)
export function setup() {
// Authenticate and get token
// Send credentials as JSON (a bare object would be form-encoded by k6)
const loginRes = http.post(
`${BASE_URL}/auth/login`,
JSON.stringify({ username: 'test-user', password: 'test-password' }),
{ headers: { 'Content-Type': 'application/json' } }
);
return {
token: loginRes.json('token'),
timestamp: new Date().toISOString(),
};
}
// Main test scenario
export default function (data) {
const params = {
headers: {
'Authorization': `Bearer ${data.token}`,
'Content-Type': 'application/json',
},
tags: { type: 'api' },
};
// User login flow
group('Login Flow', () => {
const loginStart = new Date();
const res = http.get(`${BASE_URL}/api/user/profile`, params);
const success = check(res, {
'login status is 200': (r) => r.status === 200,
'response has user data': (r) => r.json('id') !== undefined,
});
errorRate.add(!success);
customTrend.add(new Date() - loginStart);
requestCount.add(1);
});
sleep(1);
// Dashboard load flow
group('Dashboard Load', () => {
const responses = http.batch([
['GET', `${BASE_URL}/api/dashboard/summary`, null, params],
['GET', `${BASE_URL}/api/notifications`, null, params],
['GET', `${BASE_URL}/api/user/settings`, null, params],
]);
responses.forEach((res) => {
check(res, {
'dashboard status is 200': (r) => r.status === 200,
});
});
});
sleep(2);
// Transaction submission
group('Submit Transaction', () => {
const payload = JSON.stringify({
amount: 100.00,
currency: 'USD',
description: `Test transaction ${__VU}-${__ITER}`,
});
const res = http.post(`${BASE_URL}/api/transactions`, payload, params);
check(res, {
'transaction created': (r) => r.status === 201,
'transaction ID returned': (r) => r.json('transactionId') !== undefined,
});
});
sleep(1);
}
// Teardown function (runs once)
export function teardown(data) {
console.log(`Test completed at ${new Date().toISOString()}`);
console.log(`Test started at ${data.timestamp}`);
}
// Custom HTML report generation
export function handleSummary(data) {
return {
'summary.html': htmlReport(data),
'summary.json': JSON.stringify(data),
};
}
k6 with Prometheus Integration
// k6-prometheus.js
import http from 'k6/http';
import { check } from 'k6';
import { textSummary } from 'https://jslib.k6.io/k6-summary/0.0.1/index.js';
export const options = {
stages: [
{ duration: '5m', target: 100 },
{ duration: '10m', target: 100 },
{ duration: '5m', target: 0 },
],
// Include p(99) explicitly; k6's default trend stats stop at p(95)
summaryTrendStats: ['min', 'med', 'avg', 'p(90)', 'p(95)', 'p(99)', 'max'],
// Note: this `ext` block tags the run for k6 Cloud (formerly LoadImpact); it does
// not ship metrics to Prometheus. For native remote write, run the script with:
// k6 run -o experimental-prometheus-rw k6-prometheus.js
ext: {
loadimpact: {
projectID: 123456,
name: 'API Load Test',
},
},
};
export default function () {
const res = http.get('https://api.example.com/endpoint');
check(res, {
'status is 200': (r) => r.status === 200,
});
}
export function handleSummary(data) {
// Export metrics for Prometheus
const prometheusMetrics = generatePrometheusMetrics(data);
return {
'stdout': textSummary(data, { indent: ' ', enableColors: true }),
'prometheus.txt': prometheusMetrics,
};
}
function generatePrometheusMetrics(data) {
let output = '';
// Request duration
output += `# HELP http_req_duration HTTP request duration\n`;
output += `# TYPE http_req_duration summary\n`;
output += `http_req_duration{quantile="0.5"} ${data.metrics.http_req_duration.values['p(50)']}\n`;
output += `http_req_duration{quantile="0.95"} ${data.metrics.http_req_duration.values['p(95)']}\n`;
output += `http_req_duration{quantile="0.99"} ${data.metrics.http_req_duration.values['p(99)']}\n`;
// Request rate
output += `# HELP http_reqs_total Total HTTP requests\n`;
output += `# TYPE http_reqs_total counter\n`;
output += `http_reqs_total ${data.metrics.http_reqs.values.count}\n`;
// Error rate
output += `# HELP http_req_failed HTTP request failure rate\n`;
output += `# TYPE http_req_failed gauge\n`;
output += `http_req_failed ${data.metrics.http_req_failed.values.rate}\n`;
return output;
}
3. JMeter Implementation
JMeter Test Plan Structure
<!-- jmeter-test-plan.jmx (simplified) -->
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0">
<hashTree>
<TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="API Load Test">
<stringProp name="TestPlan.comments">Performance test for User API</stringProp>
<boolProp name="TestPlan.functional_mode">false</boolProp>
<boolProp name="TestPlan.serialize_threadgroups">false</boolProp>
<elementProp name="TestPlan.user_defined_variables" elementType="Arguments">
<collectionProp name="Arguments.arguments">
<elementProp name="BASE_URL" elementType="Argument">
<stringProp name="Argument.name">BASE_URL</stringProp>
<stringProp name="Argument.value">${__P(base_url,https://api.example.com)}</stringProp>
</elementProp>
<elementProp name="USERS" elementType="Argument">
<stringProp name="Argument.name">USERS</stringProp>
<stringProp name="Argument.value">${__P(users,100)}</stringProp>
</elementProp>
</collectionProp>
</elementProp>
</TestPlan>
<hashTree>
<!-- Thread Group -->
<ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="User Load">
<stringProp name="ThreadGroup.num_threads">${USERS}</stringProp>
<stringProp name="ThreadGroup.ramp_time">120</stringProp>
<stringProp name="ThreadGroup.duration">600</stringProp>
<boolProp name="ThreadGroup.scheduler">true</boolProp>
</ThreadGroup>
<hashTree>
<!-- HTTP Request Defaults -->
<ConfigTestElement guiclass="HttpDefaultsGui" testclass="ConfigTestElement">
<stringProp name="HTTPSampler.domain">${BASE_URL}</stringProp>
<stringProp name="HTTPSampler.protocol">https</stringProp>
<stringProp name="HTTPSampler.connect_timeout">10000</stringProp>
<stringProp name="HTTPSampler.response_timeout">30000</stringProp>
</ConfigTestElement>
<!-- HTTP Header Manager -->
<HeaderManager guiclass="HeaderPanel" testclass="HeaderManager">
<collectionProp name="HeaderManager.headers">
<elementProp name="" elementType="Header">
<stringProp name="Header.name">Content-Type</stringProp>
<stringProp name="Header.value">application/json</stringProp>
</elementProp>
</collectionProp>
</HeaderManager>
<!-- HTTP Requests -->
<HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="Get User Profile">
<stringProp name="HTTPSampler.path">/api/user/profile</stringProp>
<stringProp name="HTTPSampler.method">GET</stringProp>
</HTTPSamplerProxy>
<!-- Response Assertions ("Asserion" below is JMeter's historical property spelling, kept for compatibility) -->
<ResponseAssertion guiclass="AssertionGui" testclass="ResponseAssertion">
<collectionProp name="Asserion.test_strings">
<stringProp name="49586">200</stringProp>
</collectionProp>
<stringProp name="Assertion.test_field">Assertion.response_code</stringProp>
</ResponseAssertion>
<!-- Duration Assertion -->
<DurationAssertion guiclass="DurationAssertionGui" testclass="DurationAssertion">
<stringProp name="DurationAssertion.duration">500</stringProp>
</DurationAssertion>
</hashTree>
</hashTree>
</hashTree>
</jmeterTestPlan>
Run JMeter from CLI:
# Run test plan
jmeter -n -t test-plan.jmx \
-Jbase_url=api.example.com \
-Jusers=200 \
-l results.jtl \
-e -o report/
# Generate HTML report from existing results
jmeter -g results.jtl -o report/
4. Metrics & Monitoring
Key Performance Indicators
// performance-metrics.js
export const PerformanceMetrics = {
// Latency metrics (in milliseconds)
latency: {
p50: 'Median response time - 50% of requests are faster than this',
p90: '90th percentile - 90% of requests faster than this',
p95: '95th percentile - common SLO threshold',
p99: '99th percentile - worst-case for most users',
p99_9: '99.9th percentile - absolute worst-case',
},
// Throughput metrics
throughput: {
rps: 'Requests per second',
tps: 'Transactions per second',
data_transfer: 'MB/s transferred',
},
// Error metrics
errors: {
error_rate: 'Failed requests / Total requests',
timeout_rate: 'Timed out requests / Total requests',
error_types: 'Breakdown by HTTP status code',
},
// Resource utilization
resources: {
cpu_usage: 'Percentage of CPU capacity',
memory_usage: 'Percentage of memory capacity',
disk_io: 'Read/write operations per second',
network_bandwidth: 'MB/s network throughput',
},
// Application-specific metrics
application: {
db_query_time: 'Database query duration',
cache_hit_rate: 'Cache hits / Total cache requests',
queue_depth: 'Number of items in processing queue',
connection_pool: 'Active database connections',
},
};
SLI/SLO/SLA Framework
# slo-definitions.yaml
service: user-api
version: 1.0.0
# Service Level Indicators (What we measure)
slis:
- name: availability
description: "Percentage of successful requests"
query: "sum(rate(http_requests_total{status=~'2..'}[5m])) / sum(rate(http_requests_total[5m]))"
- name: latency_p95
description: "95th percentile response time"
query: "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))"
- name: error_rate
description: "Percentage of failed requests"
query: "sum(rate(http_requests_total{status=~'5..'}[5m])) / sum(rate(http_requests_total[5m]))"
# Service Level Objectives (Targets we aim for)
slos:
- sli: availability
target: 99.9
unit: percent
window: 30d
- sli: latency_p95
target: 500
unit: milliseconds
window: 30d
- sli: error_rate
target: 0.1
unit: percent
window: 30d
# Service Level Agreements (Contractual commitments)
slas:
- name: "Gold Tier SLA"
availability: 99.95
latency_p95: 300
support_response: 1h
credits: "10% monthly fee per 0.1% below SLA"
- name: "Silver Tier SLA"
availability: 99.9
latency_p95: 500
support_response: 4h
credits: "5% monthly fee per 0.5% below SLA"
# Error Budget
error_budget:
calculation: "(1 - SLO_target) * total_requests"
availability_budget: 43.2m # 99.9% uptime = 43.2 min downtime/month
burn_rate_alert: 10x # Alert if burning budget 10x faster
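To make the error-budget arithmetic concrete, a small sketch (plain JavaScript; the numbers mirror the YAML above):
// error-budget.js - illustrative error budget and burn-rate arithmetic
const sloTarget = 0.999; // 99.9% availability over a 30-day window
const windowMinutes = 30 * 24 * 60; // 43,200 minutes
// Allowed downtime: (1 - SLO) * window = 43.2 minutes per month
const budgetMinutes = (1 - sloTarget) * windowMinutes;
// Burn rate: observed failure rate relative to the budgeted rate.
// At 10x, the monthly budget is exhausted in roughly three days.
const observedErrorRate = 0.01; // hypothetical: 1% of requests failing
const burnRate = observedErrorRate / (1 - sloTarget);
console.log(`Error budget: ${budgetMinutes.toFixed(1)} minutes/month`); // 43.2
console.log(`Burn rate: ${burnRate}x`); // 10x -> trips the burn_rate_alert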
5. CI/CD Integration
GitHub Actions Performance Testing
# .github/workflows/performance-test.yml
name: Performance Testing
on:
pull_request: # enables the "Comment on PR" step below
schedule:
- cron: '0 2 * * *' # Daily at 2 AM
workflow_dispatch:
inputs:
duration:
description: 'Test duration (minutes)'
required: false
default: '10'
vus:
description: 'Virtual users'
required: false
default: '100'
jobs:
performance-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install k6
run: |
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
- name: Run Load Test
env:
BASE_URL: ${{ secrets.PERF_TEST_URL }}
DURATION: ${{ github.event.inputs.duration || '10' }}
VUS: ${{ github.event.inputs.vus || '100' }}
run: |
k6 run \
--out json=results.json \
--summary-export=summary.json \
tests/performance/load-test.js
- name: Parse Results
id: results
run: |
# --summary-export writes flat metric objects, e.g.
# {"metrics":{"http_req_duration":{"p(95)":...},"http_req_failed":{"value":...}}}
p95=$(jq -r '.metrics.http_req_duration["p(95)"]' summary.json)
error_rate=$(jq -r '.metrics.http_req_failed.value' summary.json)
echo "p95_latency=$p95" >> $GITHUB_OUTPUT
echo "error_rate=$error_rate" >> $GITHUB_OUTPUT
# Check thresholds
if (( $(echo "$p95 > 500" | bc -l) )); then
echo "threshold_status=failed" >> $GITHUB_OUTPUT
exit 1
fi
if (( $(echo "$error_rate > 0.01" | bc -l) )); then
echo "threshold_status=failed" >> $GITHUB_OUTPUT
exit 1
fi
echo "threshold_status=passed" >> $GITHUB_OUTPUT
- name: Upload Results
uses: actions/upload-artifact@v3
with:
name: performance-results
path: |
results.json
summary.json
summary.html
- name: Comment on PR
if: github.event_name == 'pull_request'
uses: actions/github-script@v6
with:
script: |
const p95 = '${{ steps.results.outputs.p95_latency }}';
const errorRate = '${{ steps.results.outputs.error_rate }}';
const status = '${{ steps.results.outputs.threshold_status }}';
const body = `## Performance Test Results
**Status:** ${status === 'passed' ? '✅ PASSED' : '❌ FAILED'}
| Metric | Value | Threshold | Status |
|--------|-------|-----------|--------|
| p95 Latency | ${p95}ms | < 500ms | ${p95 < 500 ? '✅' : '❌'} |
| Error Rate | ${(errorRate * 100).toFixed(2)}% | < 1% | ${errorRate < 0.01 ? '✅' : '❌'} |
`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: body
});
- name: Send Slack Notification
if: failure()
uses: slackapi/slack-github-action@v1
with:
payload: |
{
"text": "Performance test failed!",
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": ":warning: *Performance Test Failed*\n\nWorkflow: ${{ github.workflow }}\nRun: ${{ github.run_id }}"
}
}
]
}
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
6. Profiling & Bottleneck Analysis
Application Profiling
# Node.js profiling with clinic.js
npm install -g clinic
# Flame graph (CPU profiling)
clinic flame -- node server.js
# Drive load against the app, then stop the server (Ctrl+C);
# clinic generates and opens the HTML report automatically
# Bubbleprof (async operations)
clinic bubbleprof -- node server.js
# Doctor (event loop monitoring)
clinic doctor -- node server.js
# Re-open a previously collected profile
clinic flame --visualize-only <pid>.clinic-flame
Database Query Profiling
-- PostgreSQL: Enable query logging
ALTER DATABASE mydb SET log_min_duration_statement = 100; -- Log queries > 100ms
-- Identify slow queries via pg_stat_statements
-- (PostgreSQL 13+ renames these columns to total_exec_time, mean_exec_time, max_exec_time)
SELECT
query,
calls,
total_time,
mean_time,
max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 20;
-- MySQL: Enable slow query log (log to table so it can be queried below)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL log_output = 'TABLE';
SET GLOBAL long_query_time = 0.1; -- 100ms threshold
-- Analyze slow queries
SELECT * FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 20;
Network Trace Analysis
# Capture network traces during load test
tcpdump -i any -w performance-test.pcap port 8080
# Analyze with tshark
tshark -r performance-test.pcap -Y "http" -T fields \
-e frame.time_relative \
-e ip.src \
-e ip.dst \
-e http.request.method \
-e http.response.code \
-e http.time
# Visualize with Wireshark
wireshark performance-test.pcap
7. Results Analysis & Reporting
Performance Report Template
# Performance Test Report
**Test Date:** 2025-10-17
**Environment:** Staging
**Test Duration:** 30 minutes
**Peak Load:** 500 concurrent users
## Executive Summary
- ✅ System handled target load successfully
- ⚠️ p95 latency exceeded SLO by 15% during peak
- ❌ Memory leak detected during soak test
- ✅ Error rate remained below 0.1%
## Test Results
### Load Test Results
| Metric | Target | Actual | Status |
|--------|--------|--------|--------|
| p50 Latency | < 100ms | 87ms | ✅ PASS |
| p95 Latency | < 500ms | 574ms | ❌ FAIL |
| p99 Latency | < 1000ms | 1247ms | ❌ FAIL |
| Throughput | > 1000 RPS | 1342 RPS | ✅ PASS |
| Error Rate | < 0.1% | 0.08% | ✅ PASS |
| CPU Usage | < 80% | 72% | ✅ PASS |
| Memory Usage | < 85% | 91% | ❌ FAIL |
### Latency Distribution
p50:   87ms ████████████████░░░░░░░░░░░░░░░░░░░░ (50%)
p75:  234ms ████████████████████████████░░░░░░░░ (75%)
p90:  412ms ████████████████████████████████████░ (90%)
p95:  574ms ████████████████████████████████████▓ (95%)
p99: 1247ms ████████████████████████████████████▓ (99%)
## Bottlenecks Identified
1. **Database Connection Pool Exhaustion**
- Symptom: p95 latency spike at 400+ concurrent users
- Root cause: Connection pool size limited to 50
- Recommendation: Increase to 200, implement connection pooling retry logic
2. **Memory Leak in User Session Handler**
- Symptom: Memory usage grows 2% per hour during soak test
- Root cause: Event listeners not properly cleaned up
- Recommendation: Implement proper cleanup in session middleware
3. **N+1 Query Problem in Dashboard Endpoint**
- Symptom: Dashboard load time > 1s at high load
- Root cause: Fetching related data in loop vs. single query
- Recommendation: Implement eager loading with JOIN queries (see the sketch after this list)
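A before/after sketch of that fix, assuming a generic async data layer (`db.query` here is a placeholder, not a specific ORM API):
// N+1: one query per user (slow at high load)
async function getDashboardsNPlusOne(db, userIds) {
const results = [];
for (const id of userIds) {
results.push(await db.query('SELECT * FROM dashboards WHERE user_id = $1', [id]));
}
return results;
}
// Eager loading: one JOIN query for the whole batch
async function getDashboardsEager(db, userIds) {
return db.query(
`SELECT u.id, d.*
FROM users u
JOIN dashboards d ON d.user_id = u.id
WHERE u.id = ANY($1)`,
[userIds]
);
}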
## Recommendations
### Immediate (P0)
- [ ] Fix memory leak in session handler
- [ ] Increase database connection pool size
- [ ] Add connection pool monitoring alerts
### Short-term (P1)
- [ ] Optimize dashboard N+1 queries
- [ ] Implement response caching for static data
- [ ] Add circuit breakers for external API calls
### Long-term (P2)
- [ ] Implement database read replicas
- [ ] Add CDN for static assets
- [ ] Consider horizontal scaling strategy
## Appendices
- [Grafana Dashboard](https://grafana.example.com/d/perf-test-2025-10-17)
- [Raw k6 Results](./results/k6-results.json)
- [Flame Graph](./results/flamegraph.svg)
- [Database Slow Query Log](./results/slow-queries.log)
📚 Level 3: Deep Dive Resources
Official Documentation
k6: https://k6.io/docs/
JMeter: https://jmeter.apache.org/usermanual/
Gatling: https://docs.gatling.io/
Books & Courses
- "The Art of Application Performance Testing" by Ian Molyneaux
- "Web Performance in Action" by Jeremy Wagner
- "High Performance Browser Networking" by Ilya Grigorik
- Google's Web Performance Optimization Course
Tools & Services
Load Testing:
- k6 - Modern load testing tool
- Apache JMeter - Java-based load testing
- Gatling - Scala-based load testing
- Locust - Python-based load testing
- Artillery - Modern performance testing toolkit
APM & Monitoring:
Profiling:
- Clinic.js - Node.js profiling
- py-spy - Python profiling
- pprof - Go profiling
- Java Flight Recorder
Community Resources
- Performance Testing Guidance for Web Applications (Microsoft)
- Web Performance Working Group (W3C)
- r/webdev Performance Discussions
- k6 Community Forum
Bundled Resources
This skill includes 6 ready-to-use resources in this directory:
- templates/k6-load-test.js - Production-ready k6 load test script
- templates/k6-stress-test.js - k6 stress testing script
- config/jmeter-test-plan.jmx - JMeter test plan template
- templates/grafana-dashboard.json - Performance monitoring dashboard
- scripts/run-perf-tests.sh - Automated test execution script
- resources/performance-checklist.md - Comprehensive SLI/SLO/SLA guide
🎓 Learning Path
Week 1-2: Fundamentals
- Study performance testing types and metrics
- Set up k6 and run basic load tests
- Learn to interpret latency percentiles
Week 3-4: Implementation
- Create realistic test scenarios
- Implement custom metrics and thresholds
- Integrate with monitoring tools (Prometheus/Grafana)
Week 5-6: Analysis & Optimization
- Profile application bottlenecks
- Optimize database queries
- Implement caching strategies
Week 7-8: CI/CD & Production
- Automate performance testing in CI/CD
- Set up SLI/SLO monitoring
- Create alerting and incident response plans
✅ Competency Checklist
By completing this skill, you should be able to:
- Design performance test scenarios for different load patterns
- Implement load tests using k6 and JMeter
- Define and measure SLIs, SLOs, and SLAs
- Analyze latency distributions and identify outliers
- Integrate performance testing into CI/CD pipelines
- Profile applications to identify bottlenecks
- Optimize database queries and connection pools
- Set up APM monitoring and alerting
- Generate performance reports with actionable recommendations
- Implement graceful degradation and circuit breakers
Examples
Basic Usage
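A minimal smoke-style k6 test (a hedged sketch; the endpoint URL is a placeholder):
// basic-example.js - one VU, one endpoint, one threshold
import http from 'k6/http';
import { check } from 'k6';
export const options = {
vus: 1,
duration: '30s',
thresholds: { http_req_duration: ['p(95)<500'] },
};
export default function () {
const res = http.get('https://api.example.com/health');
check(res, { 'status is 200': (r) => r.status === 200 });
}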
Advanced Usage
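A production-flavoured sketch using a constant-arrival-rate scenario and a tagged threshold (endpoints and rates are illustrative):
// advanced-example.js - open-model load with per-endpoint thresholds
import http from 'k6/http';
import { check } from 'k6';
export const options = {
scenarios: {
steady_api_load: {
executor: 'constant-arrival-rate',
rate: 100, // 100 iterations/s regardless of response times
timeUnit: '1s',
duration: '10m',
preAllocatedVUs: 200,
},
},
thresholds: {
'http_req_duration{endpoint:search}': ['p(95)<400'],
http_req_failed: ['rate<0.01'],
},
};
export default function () {
const res = http.get('https://api.example.com/api/search?q=test', {
tags: { endpoint: 'search' },
});
check(res, { 'status is 200': (r) => r.status === 200 });
}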
Integration Example
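A sketch of wiring a test into the CI gate from Section 5: handleSummary writes a JSON summary the pipeline can parse (the file name and URL are assumptions):
// integration-example.js - emit machine-readable results for CI/CD
import http from 'k6/http';
export const options = {
vus: 50,
duration: '5m',
thresholds: { http_req_duration: ['p(95)<500'] },
};
export default function () {
http.get(`${__ENV.BASE_URL || 'https://api.example.com'}/health`);
}
export function handleSummary(data) {
// Note: handleSummary nests stats under metrics.<name>.values,
// unlike the flat --summary-export format
return { 'summary.json': JSON.stringify(data, null, 2) };
}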
See examples/performance-testing/ for complete working examples.
Integration Points
This skill integrates with:
Upstream Dependencies
- Tools: k6, JMeter, Prometheus, Grafana (see Bundled Resources above)
- Prerequisites: api-testing-standards, monitoring-observability-standards
Downstream Consumers
- Applications: Production services whose latency and capacity targets these tests validate
- CI/CD Pipelines: Automated testing and deployment workflows
- Monitoring Systems: Observability and logging platforms
Related Skills
- security-testing-standards
- ci-cd-pipeline-standards
Common Integration Patterns
- Development Workflow: How this skill fits into daily development
- Production Deployment: Integration with production systems
- Monitoring & Alerting: Observability integration points
Common Pitfalls
Pitfall 1: Insufficient Testing
Problem: Not testing edge cases and error conditions leads to production bugs
Solution: Implement comprehensive test coverage including:
- Happy path scenarios
- Error handling and edge cases
- Integration points with external systems
Prevention: Enforce minimum code coverage (80%+) in CI/CD pipeline
Pitfall 2: Hardcoded Configuration
Problem: Hardcoding values makes applications inflexible and environment-dependent
Solution: Use environment variables and configuration management:
- Separate config from code
- Use environment-specific configuration files
- Never commit secrets to version control
Prevention: Use tools like dotenv, config validators, and secret scanners
Pitfall 3: Ignoring Security Best Practices
Problem: Security vulnerabilities from not following established security patterns
Solution: Follow security guidelines:
- Input validation and sanitization
- Proper authentication and authorization
- Encrypted data transmission (TLS/SSL)
- Regular security audits and updates
Prevention: Use security linters, SAST tools, and regular dependency updates
Best Practices:
- Follow established patterns and conventions for performance-testing
- Keep dependencies up to date and scan for vulnerabilities
- Write comprehensive documentation and inline comments
- Use linting and formatting tools consistently
- Implement proper error handling and logging
- Regular code reviews and pair programming
- Monitor production metrics and set up alerts
Last Updated: 2025-10-17 Version: 1.0.0 Maintainer: Testing Standards Team