| name | Deployment Strategies |
| description | Comprehensive guide to modern deployment strategies including blue-green, canary, rolling updates, and zero-downtime deployments |
Deployment Strategies
Purpose
Master production deployment strategies:
- Zero-downtime deployments
- Risk mitigation
- Rollback procedures
- Progressive delivery
- Infrastructure patterns
When to Use
Invoke this skill when:
- Planning production deployments
- Designing deployment pipelines
- Troubleshooting failed deploys
- Implementing new deployment strategy
- Reducing deployment risk
Deployment Strategies
1. Blue-Green Deployment
Concept: Two identical environments (Blue=current, Green=new)
┌─────────────┐
│ Load │
│ Balancer │
└──────┬──────┘
│
├─────────────┐
│ │
┌────▼───┐ ┌───▼────┐
│ Blue │ │ Green │
│ (v1.0) │ │ (v2.0) │
│ LIVE │ │ IDLE │
└────────┘ └────────┘
After testing Green:
Switch traffic → Green becomes LIVE
If issues:
Switch back → Instant rollback
Implementation:
// Example using AWS ELB
async function blueGreenDeploy(version: string): Promise<void> {
// 1. Deploy to green environment
await deployToEnvironment('green', version);
// 2. Run smoke tests on green
const testsPass = await runSmokeTests('green');
if (!testsPass) {
throw new Error('Green environment failed tests');
}
// 3. Gradually shift traffic
await shiftTraffic('blue', 'green', {
steps: [10, 25, 50, 100],
intervalMinutes: 5,
});
// 4. Monitor for errors
const errors = await monitorErrors('green', { duration: '10m' });
if (errors > THRESHOLD) {
await rollback('blue');
throw new Error('Rollback triggered due to errors');
}
// 5. Decommission blue
await decommission('blue');
}
Pros: ✅ Instant rollback, ✅ Zero downtime, ✅ Full testing before switch Cons: ❌ 2x infrastructure cost, ❌ Database migrations tricky
2. Canary Deployment
Concept: Gradually roll out to small percentage of users
Phase 1: 5% traffic → v2.0
95% traffic → v1.0
Phase 2: 25% traffic → v2.0
75% traffic → v1.0
Phase 3: 50% traffic → v2.0
50% traffic → v1.0
Phase 4: 100% traffic → v2.0
Implementation:
interface CanaryConfig {
stages: number[]; // [5, 25, 50, 100]
stageDuration: string; // '30m'
successCriteria: {
errorRate: number; // <1%
latencyP95: number; // <500ms
cpuUsage: number; // <70%
};
}
async function canaryDeploy(version: string, config: CanaryConfig): Promise<void> {
for (const percentage of config.stages) {
console.log(`Canary: Routing ${percentage}% to ${version}`);
await routeTraffic(version, percentage);
await sleep(config.stageDuration);
const metrics = await collectMetrics(version, config.stageDuration);
if (!meetsSuccessCriteria(metrics, config.successCriteria)) {
await rollback(version);
throw new Error(`Canary failed at ${percentage}%: ${metrics}`);
}
}
console.log('Canary successful - deployed to 100%');
}
function meetsSuccessCriteria(metrics: Metrics, criteria: SuccessCriteria): boolean {
return (
metrics.errorRate < criteria.errorRate &&
metrics.latencyP95 < criteria.latencyP95 &&
metrics.cpuUsage < criteria.cpuUsage
);
}
Pros: ✅ Low risk, ✅ Real user feedback, ✅ Gradual rollout Cons: ❌ Slower deployment, ❌ Complex routing, ❌ Inconsistent UX
3. Rolling Update
Concept: Update instances one-by-one or in batches
Initial State:
[v1.0] [v1.0] [v1.0] [v1.0] [v1.0]
Step 1: Update first instance
[v2.0] [v1.0] [v1.0] [v1.0] [v1.0]
Step 2: Update second instance
[v2.0] [v2.0] [v1.0] [v1.0] [v1.0]
Step 3: Update third instance
[v2.0] [v2.0] [v2.0] [v1.0] [v1.0]
...continue until all updated
Kubernetes Implementation:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
replicas: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # Max extra pods during update
maxUnavailable: 1 # Max unavailable pods
template:
spec:
containers:
- name: app
image: my-app:2.0
readinessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
Pros: ✅ Simple, ✅ No extra infrastructure, ✅ Gradual Cons: ❌ Slow rollback, ❌ Mixed versions during deploy
4. Feature Flags (Dark Launch)
Concept: Deploy code but control feature activation
// Deploy new code but keep feature off
class PaymentService {
async processPayment(amount: number): Promise<void> {
if (featureFlags.isEnabled('new-payment-provider')) {
// New code path
return this.newPaymentProvider.process(amount);
} else {
// Old code path (still running)
return this.oldPaymentProvider.process(amount);
}
}
}
// Gradually enable for users
featureFlags.enableFor('new-payment-provider', {
percentage: 10, // 10% of users
userIds: [123, 456], // Specific test users
countries: ['US', 'CA'], // Specific countries
});
Progressive Rollout:
// Week 1: Internal testing
await featureFlags.enable('new-feature', { internal: true });
// Week 2: Beta users
await featureFlags.enable('new-feature', { beta: true });
// Week 3: 5% of production
await featureFlags.enable('new-feature', { percentage: 5 });
// Week 4: 25% of production
await featureFlags.enable('new-feature', { percentage: 25 });
// Week 5: 100%
await featureFlags.enable('new-feature', { percentage: 100 });
// Remove flag after stability
await featureFlags.remove('new-feature');
Pros: ✅ Instant rollback, ✅ A/B testing, ✅ Gradual rollout Cons: ❌ Code complexity, ❌ Technical debt if not cleaned up
5. A/B Testing Deployment
Concept: Run multiple versions for comparison
interface ABTest {
name: string;
variants: {
control: string; // Current version
treatment: string; // New version
};
traffic: {
control: number; // 50%
treatment: number; // 50%
};
metrics: string[]; // ['conversion', 'engagement']
duration: string; // '7days'
}
async function deployABTest(test: ABTest): Promise<void> {
// Deploy both versions
await deploy(test.variants.control);
await deploy(test.variants.treatment);
// Route traffic
await routeTraffic({
[test.variants.control]: test.traffic.control,
[test.variants.treatment]: test.traffic.treatment,
});
// Collect data
const results = await collectMetrics(test.metrics, test.duration);
// Analyze and decide winner
const winner = analyzeResults(results);
console.log(`Winner: ${winner}`);
// Deploy winner to 100%
await deploy(winner, { percentage: 100 });
}
Zero-Downtime Patterns
Health Checks
// Readiness check (ready to receive traffic?)
app.get('/health/ready', (req, res) => {
const checks = [
database.isConnected(),
cache.isConnected(),
// Other dependencies
];
const ready = checks.every(check => check);
res.status(ready ? 200 : 503).json({ ready, checks });
});
// Liveness check (is app alive?)
app.get('/health/live', (req, res) => {
res.status(200).json({ alive: true });
});
Graceful Shutdown
let isShuttingDown = false;
// Stop accepting new requests
process.on('SIGTERM', async () => {
console.log('SIGTERM received, starting graceful shutdown');
isShuttingDown = true;
// Stop accepting new connections
server.close(() => {
console.log('Server closed, no new connections');
});
// Wait for existing requests to complete (with timeout)
await waitForCompletion({ timeout: 30000 });
// Close database connections
await database.close();
// Exit
process.exit(0);
});
// Reject new requests during shutdown
app.use((req, res, next) => {
if (isShuttingDown) {
res.status(503).send('Server is shutting down');
} else {
next();
}
});
Database Migrations
Backward-Compatible Pattern:
Phase 1: Add new column (nullable)
Phase 2: Deploy code that writes to both old and new
Phase 3: Backfill data
Phase 4: Deploy code that reads from new
Phase 5: Remove old column
Never do destructive changes in same deploy as code!
Rollback Strategies
Automatic Rollback Triggers
interface RollbackCriteria {
errorRate: number; // >5% errors
latency: number; // >2000ms p95
availability: number; // <99%
customMetric?: () => boolean;
}
async function deployWithAutoRollback(
version: string,
criteria: RollbackCriteria
): Promise<void> {
const previousVersion = await getCurrentVersion();
try {
await deploy(version);
// Monitor for 10 minutes
const metrics = await monitorDeployment(version, '10m');
if (shouldRollback(metrics, criteria)) {
throw new Error('Metrics exceeded thresholds');
}
} catch (error) {
console.error('Deployment failed, rolling back:', error);
await deploy(previousVersion);
throw error;
}
}
function shouldRollback(metrics: Metrics, criteria: RollbackCriteria): boolean {
return (
metrics.errorRate > criteria.errorRate ||
metrics.latencyP95 > criteria.latency ||
metrics.availability < criteria.availability ||
criteria.customMetric?.()
);
}
Deployment Checklist
Pre-Deployment
- [ ] Code reviewed and approved
- [ ] Tests passing (unit, integration, E2E)
- [ ] Database migrations tested
- [ ] Rollback plan documented
- [ ] Stakeholders notified
- [ ] Feature flags configured
- [ ] Monitoring/alerts set up
- [ ] Load testing completed (if major change)
- [ ] Security scan passed
- [ ] Deployment window scheduled
During Deployment
- [ ] Announcement sent to team
- [ ] Deployment started (timestamp)
- [ ] Health checks passing
- [ ] Metrics monitored
- [ ] Error rates normal
- [ ] Logs reviewed
- [ ] User reports monitored
- [ ] No rollback needed
Post-Deployment
- [ ] Smoke tests passed
- [ ] Metrics stable for 30 minutes
- [ ] No unusual errors
- [ ] Performance acceptable
- [ ] Documentation updated
- [ ] Deployment announcement (success)
- [ ] Post-mortem (if issues)
- [ ] Remove old infrastructure (if blue-green)
Related Skills
incident-response: For handling deployment issuesmonitoring-setup: For deployment metricsdatabase-migrations: For schema changesfeature-flags: For progressive rollout