name	infrastructure-skill-builder
description	Transform infrastructure documentation, runbooks, and operational knowledge into reusable Claude Code skills. Convert Proxmox configs, Docker setups, Kubernetes deployments, and cloud infrastructure patterns into structured, actionable skills.
license	MIT
tags	infrastructure, documentation, skill-creation, knowledge-capture, iac

Infrastructure Skill Builder

Convert your infrastructure documentation into powerful, reusable Claude Code skills.

Overview

Infrastructure knowledge is often scattered across:

README files
Runbooks and wiki pages
Configuration files
Troubleshooting guides
Team Slack/Discord history
Mental models of senior engineers

This skill helps you systematically capture that knowledge as Claude Code skills for:

Faster onboarding
Consistent operations
Disaster recovery
Knowledge preservation
Team scaling

When to Use

Use this skill when:

Documenting complex infrastructure setups
Creating runbooks for operations teams
Onboarding new team members to infrastructure
Preserving expert knowledge before team changes
Standardizing infrastructure operations
Building organizational infrastructure library
Migrating from manual to automated operations

Skill Extraction Process

Step 1: Identify Infrastructure Domains

Common domains:

Container Orchestration: Docker, Kubernetes, Proxmox LXC
Cloud Platforms: AWS, GCP, Azure, DigitalOcean
Databases: PostgreSQL, MongoDB, Redis, MySQL
Web Servers: Nginx, Apache, Caddy, Traefik
Monitoring: Prometheus, Grafana, ELK Stack
CI/CD: Jenkins, GitLab CI, GitHub Actions
Networking: VPNs, Load Balancers, DNS, Firewalls
Storage: S3, MinIO, NFS, Ceph
Security: Authentication, SSL/TLS, Firewalls

Step 2: Extract Core Operations

For each domain, document:

Setup/Provisioning: How to create new instances
Configuration: How to configure for different use cases
Operations: Day-to-day management tasks
Troubleshooting: Common issues and resolutions
Scaling: How to scale up/down
Backup/Recovery: Disaster recovery procedures
Monitoring: Health checks and alerts
Security: Security best practices

Infrastructure Skill Template

---
name: [infrastructure-component]-manager
description: Expert guidance for [component] management, provisioning, troubleshooting, and operations
license: MIT
tags: [infrastructure, [component], operations, troubleshooting]
---

# [Component] Manager

Expert knowledge for managing [component] infrastructure.

## Authentication & Access

### Access Methods
```bash
# How to access the infrastructure component
ssh user@host
# or
kubectl config use-context cluster-name

Credentials & Configuration

Where credentials are stored
How to configure access
Common authentication issues

Architecture Overview

Component Topology

How components are organized
Network topology
Resource allocation
Redundancy setup

Key Resources

Resource 1: Purpose and specs
Resource 2: Purpose and specs
Resource 3: Purpose and specs

Common Operations

Operation 1: [e.g., Create New Instance]

# Step-by-step commands
command1 --flags
command2 --flags

# Verification
verify-command

Operation 2: [e.g., Update Configuration]

# Commands and explanations

Operation 3: [e.g., Scale Resources]

# Commands and explanations

Monitoring & Health Checks

Check System Status

# Health check commands
status-command

# Expected output
# What healthy output looks like

Common Metrics

Metric 1: What it means, normal range
Metric 2: What it means, normal range
Metric 3: What it means, normal range

Troubleshooting

Issue 1: [Common Problem]

Symptoms: What you observe Cause: Why it happens Fix: Step-by-step resolution

# Fix commands

Issue 2: [Another Problem]

Symptoms: Cause: Fix:

# Fix commands

Backup & Recovery

Backup Procedures

# How to backup
backup-command

# Verification
verify-backup

Recovery Procedures

# How to restore
restore-command

# Verification
verify-restore

Security Best Practices

Security practice 1
Security practice 2
Security practice 3

Quick Reference

Task	Command
Task 1	`command1`
Task 2	`command2`
Task 3	`command3`

Additional Resources

Official documentation links
Related skills
External references


## Real-World Example: Proxmox Skill

Based on the proxmox-auth skill in this repository:

### Extracted Knowledge

**From**: Proxmox VE cluster documentation + operational experience

**Structured as**:
1. **Authentication**: SSH access patterns, node IPs
2. **Architecture**: Cluster topology (2 nodes, resources)
3. **Operations**: Container/VM management commands
4. **Troubleshooting**: Common errors and fixes
5. **Networking**: Bridge configuration, IP management
6. **GPU Passthrough**: Special container configurations

**Result**: Comprehensive skill covering:
- Quick access to any node
- Container lifecycle management
- GPU-accelerated containers
- Network troubleshooting
- Backup procedures
- Common gotchas and solutions

## Extraction Scripts

### Extract from Runbooks

```bash
#!/bin/bash
# extract-from-runbook.sh - Convert runbook to skill

RUNBOOK_FILE="$1"
SKILL_NAME="$2"

if [ -z "$RUNBOOK_FILE" ] || [ -z "$SKILL_NAME" ]; then
    echo "Usage: $0 <runbook.md> <skill-name>"
    exit 1
fi

SKILL_DIR="skills/$SKILL_NAME"
mkdir -p "$SKILL_DIR"

# Extract sections from runbook
cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: $SKILL_NAME
description: $(head -5 "$RUNBOOK_FILE" | grep -v "^#" | head -1 | xargs)
license: MIT
extracted-from: $RUNBOOK_FILE
---

# ${SKILL_NAME^}

$(cat "$RUNBOOK_FILE")

---

**Note**: This skill was auto-extracted from runbook documentation.
Review and refine before use.
EOF

echo "✓ Created skill: $SKILL_DIR/SKILL.md"
echo "Review and edit to add:"
echo "  - Metadata and tags"
echo "  - Troubleshooting section"
echo "  - Quick reference"
echo "  - Examples"

Extract from Docker Compose

#!/bin/bash
# docker-compose-to-skill.sh - Extract skill from docker-compose.yaml

COMPOSE_FILE="${1:-docker-compose.yaml}"
PROJECT_NAME=$(basename $(pwd))

SKILL_DIR="skills/docker-$PROJECT_NAME"
mkdir -p "$SKILL_DIR"

# Extract services
SERVICES=$(yq eval '.services | keys | .[]' "$COMPOSE_FILE")

cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: docker-$PROJECT_NAME
description: Docker Compose configuration and management for $PROJECT_NAME
license: MIT
---

# Docker $PROJECT_NAME

Manage Docker Compose stack for $PROJECT_NAME.

## Services

$(yq eval '.services | to_entries | .[] | "### " + .key + "\n" + (.value.image // "custom") + "\n"' "$COMPOSE_FILE")

## Quick Start

\`\`\`bash
# Start all services
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down
\`\`\`

## Service Details

### Ports

$(yq eval '.services | to_entries | .[] | select(.value.ports) | "- **" + .key + "**: " + (.value.ports | join(", "))' "$COMPOSE_FILE")

### Volumes

$(yq eval '.services | to_entries | .[] | select(.value.volumes) | "- **" + .key + "**: " + (.value.volumes | join(", "))' "$COMPOSE_FILE")

## Configuration

See \`$COMPOSE_FILE\` for full configuration.

## Common Operations

### Restart Service

\`\`\`bash
docker-compose restart SERVICE_NAME
\`\`\`

### Update Service

\`\`\`bash
docker-compose pull SERVICE_NAME
docker-compose up -d SERVICE_NAME
\`\`\`

### View Service Logs

\`\`\`bash
docker-compose logs -f SERVICE_NAME
\`\`\`

## Troubleshooting

### Service Won't Start

1. Check logs: \`docker-compose logs SERVICE_NAME\`
2. Verify ports not in use: \`netstat -tulpn | grep PORT\`
3. Check disk space: \`df -h\`

### Network Issues

\`\`\`bash
# Recreate network
docker-compose down
docker network prune
docker-compose up -d
\`\`\`
EOF

echo "✓ Created skill from docker-compose.yaml"

Extract from Kubernetes Manifests

#!/bin/bash
# k8s-to-skill.sh - Extract skill from Kubernetes manifests

K8S_DIR="${1:-.}"
APP_NAME="${2:-$(basename $(pwd))}"

SKILL_DIR="skills/k8s-$APP_NAME"
mkdir -p "$SKILL_DIR"

cat > "$SKILL_DIR/SKILL.md" << EOF
---
name: k8s-$APP_NAME
description: Kubernetes deployment and management for $APP_NAME
license: MIT
---

# Kubernetes $APP_NAME

Manage Kubernetes resources for $APP_NAME.

## Resources

$(find "$K8S_DIR" -name "*.yaml" -o -name "*.yml" | while read file; do
    KIND=$(yq eval '.kind' "$file" 2>/dev/null)
    NAME=$(yq eval '.metadata.name' "$file" 2>/dev/null)
    echo "- **$KIND**: $NAME ($(basename $file))"
done)

## Deployment

### Apply All Resources

\`\`\`bash
kubectl apply -f $K8S_DIR/
\`\`\`

### Check Status

\`\`\`bash
# Pods
kubectl get pods -l app=$APP_NAME

# Services
kubectl get svc -l app=$APP_NAME

# Deployments
kubectl get deploy -l app=$APP_NAME
\`\`\`

## Common Operations

### Scale Deployment

\`\`\`bash
kubectl scale deployment $APP_NAME --replicas=3
\`\`\`

### Update Image

\`\`\`bash
kubectl set image deployment/$APP_NAME container=new-image:tag
\`\`\`

### View Logs

\`\`\`bash
kubectl logs -f deployment/$APP_NAME
\`\`\`

### Port Forward

\`\`\`bash
kubectl port-forward svc/$APP_NAME 8080:80
\`\`\`

## Troubleshooting

### Pod Not Starting

\`\`\`bash
# Check pod events
kubectl describe pod POD_NAME

# Check logs
kubectl logs POD_NAME

# Previous instance logs
kubectl logs POD_NAME --previous
\`\`\`

### Service Not Reachable

\`\`\`bash
# Check endpoints
kubectl get endpoints $APP_NAME

# Check service
kubectl describe svc $APP_NAME

# Test from another pod
kubectl run -it --rm debug --image=busybox --restart=Never -- wget -O- http://$APP_NAME
\`\`\`

## Quick Reference

| Task | Command |
|------|---------|
| Apply | \`kubectl apply -f $K8S_DIR/\` |
| Status | \`kubectl get all -l app=$APP_NAME\` |
| Logs | \`kubectl logs -f deployment/$APP_NAME\` |
| Scale | \`kubectl scale deployment $APP_NAME --replicas=N\` |
| Delete | \`kubectl delete -f $K8S_DIR/\` |
EOF

echo "✓ Created Kubernetes skill"

Infrastructure Patterns to Capture

Pattern 1: SSH Access Matrix

## SSH Access

| Host | IP | Purpose | Access |
|------|--------|---------|--------|
| node1 | 192.168.1.10 | Primary | `ssh node1` |
| node2 | 192.168.1.11 | Secondary | `ssh node2` |
| bastion | 203.0.113.5 | Jump host | `ssh -J bastion node1` |

Pattern 2: Service Port Mapping

## Service Ports

| Service | Internal | External | Protocol |
|---------|----------|----------|----------|
| Web | 8080 | 80 | HTTP |
| API | 3000 | 443 | HTTPS |
| DB | 5432 | - | TCP |

Pattern 3: Configuration Files

## Configuration Locations

### Application Config
- Path: `/etc/app/config.yaml`
- Format: YAML
- Requires restart: Yes

### Database Config
- Path: `/var/lib/postgres/postgresql.conf`
- Format: INI
- Requires restart: Yes

Pattern 4: Command Workflows

## Deployment Workflow

1. **Backup current state**
   ```bash
   ./backup.sh

Pull latest code
```
git pull origin main
```
Build application
```
docker build -t app:latest .
```
Deploy
```
docker-compose up -d
```
Verify
```
curl http://localhost/health
```


## Best Practices

### ✅ DO

1. **Document assumptions** - What's required before operations
2. **Include verification** - How to verify each operation succeeded
3. **Add troubleshooting** - Common issues and fixes
4. **Show outputs** - Expected command outputs
5. **Link resources** - Related documentation and skills
6. **Version information** - Software versions, configurations
7. **Security notes** - Security implications of operations
8. **Update regularly** - Keep skills current with infrastructure

### ❌ DON'T

1. **Don't hardcode secrets** - Use placeholders or env vars
2. **Don't skip context** - Explain why, not just how
3. **Don't assume knowledge** - Explain terminology
4. **Don't omit edge cases** - Document special scenarios
5. **Don't forget cleanup** - Include teardown procedures
6. **Don't ignore dependencies** - Document prerequisites
7. **Don't skip testing** - Verify all commands work
8. **Don't leave TODO** - Complete all sections

## Quality Checklist

- [ ] Clear component description
- [ ] Authentication/access documented
- [ ] Architecture overview provided
- [ ] Common operations with examples
- [ ] Troubleshooting section complete
- [ ] Health checks documented
- [ ] Backup/recovery procedures
- [ ] Security considerations noted
- [ ] Quick reference table
- [ ] All commands tested
- [ ] No hardcoded secrets
- [ ] Links to resources

## Quick Start Workflow

```bash
# 1. Identify infrastructure component
COMPONENT="nginx-reverse-proxy"

# 2. Gather documentation
# - Collect README files
# - Export wiki pages
# - Capture team knowledge
# - Document current setup

# 3. Create skill structure
mkdir -p skills/$COMPONENT

# 4. Fill in template
# Use the Infrastructure Skill Template above

# 5. Test all commands
# Verify every command in skill works

# 6. Review and refine
# Have team review for completeness

# 7. Commit to repository
git add skills/$COMPONENT
git commit -m "docs: Add $COMPONENT infrastructure skill"

Version: 1.0.0 Author: Harvested from proxmox-auth skill pattern Last Updated: 2025-11-18 License: MIT Key Principle: Convert tribal knowledge into structured, searchable, actionable skills.

Examples in This Repository

proxmox-auth: Proxmox VE cluster management
docker-*: Docker-based infrastructure
cloudflare-*: Cloudflare infrastructure services

Transform your infrastructure documentation into skills today! 🏗️

Install Skill

SKILL.md