Claude Code Plugins

Community-maintained marketplace

Handle production incidents effectively. Use when responding to outages, conducting post-mortems, or improving reliability. Covers incident response and blameless culture.

Install Skill

1. Download skill

2. Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3. Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Verify the skill by reading through its instructions before using it.

SKILL.md

name: incident-management
description: Handle production incidents effectively. Use when responding to outages, conducting post-mortems, or improving reliability. Covers incident response and blameless culture.
allowed-tools: Read, Write, Glob, Grep

Incident Management

Incident Severity

| Level | Impact | Response Time |
|-------|--------|---------------|
| SEV1 | Complete outage | Immediate |
| SEV2 | Major degradation | < 15 min |
| SEV3 | Minor degradation | < 1 hour |
| SEV4 | Low impact | Next business day |
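
The severity levels above can be encoded so that tooling computes a response deadline from the detection time. The following is a minimal sketch, not a prescribed implementation: the names `Severity`, `RESPONSE_TARGET`, and `respond_by` are hypothetical, "Immediate" is modeled as a zero delay, and "Next business day" is left as `None` because it depends on a business calendar that this skill does not define.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Severity(Enum):
    SEV1 = "Complete outage"
    SEV2 = "Major degradation"
    SEV3 = "Minor degradation"
    SEV4 = "Low impact"


# Response-time targets taken from the severity table.
# Assumption: SEV1 "Immediate" is a zero delay; SEV4 "Next business day"
# needs a business calendar, so it is None rather than a guessed duration.
RESPONSE_TARGET = {
    Severity.SEV1: timedelta(0),
    Severity.SEV2: timedelta(minutes=15),
    Severity.SEV3: timedelta(hours=1),
    Severity.SEV4: None,
}


def respond_by(severity: Severity, detected_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable response time, or None for SEV4."""
    target = RESPONSE_TARGET[severity]
    return None if target is None else detected_at + target
```

For example, `respond_by(Severity.SEV2, detected_at)` returns a timestamp 15 minutes after `detected_at`, which an alerting script could compare against the time of the first acknowledgement.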

Incident Response

1. Detect

  • Monitoring alerts
  • Customer reports
  • Error logs

2. Triage

  • Assess severity
  • Assign incident commander
  • Create communication channel

3. Investigate

  • Check recent changes
  • Review logs and metrics
  • Identify root cause

4. Mitigate

  • Apply quick fix
  • Roll back if needed
  • Communicate status

5. Resolve

  • Confirm fix
  • Monitor for recurrence
  • Close incident

6. Learn

  • Hold post-mortem meeting
  • Document findings
  • Create action items
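
The six phases above can be tracked as a simple ordered lifecycle. The sketch below is one possible way to do that, with hypothetical names (`Phase`, `Incident`, `advance`). It assumes phases are visited strictly in the listed order, which is a simplification: in practice, mitigation may start before the investigation finishes.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Phase(IntEnum):
    DETECT = 1
    TRIAGE = 2
    INVESTIGATE = 3
    MITIGATE = 4
    RESOLVE = 5
    LEARN = 6


@dataclass
class Incident:
    title: str
    phase: Phase = Phase.DETECT
    log: list = field(default_factory=list)

    def advance(self, note: str) -> Phase:
        """Record a note for the current phase, then move to the next one.

        LEARN is the final phase, so the incident stays there once reached.
        """
        self.log.append((self.phase.name, note))
        if self.phase is not Phase.LEARN:
            self.phase = Phase(self.phase + 1)
        return self.phase
```

Each call to `advance` stores what happened in the current phase, so the resulting `log` can later feed the post-mortem timeline.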

Post-Mortem Template

# Post-Mortem: [Incident Title]

## Summary
[Brief description of what happened]

## Timeline
- HH:MM - [Event]
- HH:MM - [Event]
- HH:MM - [Resolution]

## Impact
- Duration: [X hours]
- Users affected: [X]
- Revenue impact: [if applicable]

## Root Cause
[What caused this incident]

## Contributing Factors
- [Factor 1]
- [Factor 2]

## What Went Well
- [Positive 1]
- [Positive 2]

## What Could Be Improved
- [Improvement 1]
- [Improvement 2]

## Action Items
- [ ] [Action 1] - Owner: [Name]
- [ ] [Action 2] - Owner: [Name]
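
The Timeline and Duration fields of the template above can be filled in from timestamped events instead of by hand. This is a rough sketch under stated assumptions: the helper names `render_timeline` and `duration_hours` are hypothetical, events are `(datetime, description)` pairs, and the duration is measured from the earliest to the latest recorded event.

```python
from datetime import datetime


def render_timeline(events):
    """Format (datetime, description) pairs as '- HH:MM - description' lines."""
    ordered = sorted(events, key=lambda e: e[0])  # oldest event first
    return "\n".join(f"- {ts:%H:%M} - {text}" for ts, text in ordered)


def duration_hours(events):
    """Hours between the earliest and latest event, rounded to one decimal."""
    times = [ts for ts, _ in events]
    return round((max(times) - min(times)).total_seconds() / 3600, 1)
```

The string returned by `render_timeline` can be pasted under `## Timeline`, and the value from `duration_hours` under `## Impact`.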

Blameless Culture

  • Focus on systems, not people
  • "What failed?" not "Who failed?"
  • Share learnings openly
  • Celebrate near-misses