name	Kubernetes Operations
description	Kubernetes operations including deployment, management, troubleshooting, kubectl mastery, and cluster stability. Automatically assists with K8s workloads, networking, storage, and debugging.
allowed-tools	Glob, Grep, Read, Bash, Edit, Write, TodoWrite, WebFetch

Kubernetes Operations

Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.

Core Expertise

Kubernetes Operations

Workload Management: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
Networking: Services, Ingress, NetworkPolicies, and DNS configuration
Configuration & Storage: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
Troubleshooting: Debugging pods, analyzing logs, and inspecting cluster events

Cluster Operations Process

Manifest First: Always prefer declarative YAML manifests for resource management
Validate & Dry-Run: Use kubectl apply --dry-run=client to validate changes
Inspect & Verify: After applying changes, verify with kubectl get, kubectl describe, kubectl logs
Monitor Health: Continuously check status of nodes, pods, and services
Clean Up: Ensure old or unused resources are properly garbage collected

Essential Commands

# Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash

# Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80

# Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>

# Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources

Key Debugging Patterns

Pod Debugging

# Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous

# Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80

Networking Troubleshooting

# Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>

# Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
# Inside pod: nslookup, wget, nc commands

Common Issues

# CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>

# Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits

# State management
kubectl state list
kubectl state show <resource>

Best Practices

Resource Definitions

Use declarative YAML manifests
Implement proper labels and selectors
Define resource requests and limits
Configure health checks (liveness/readiness probes)

Security

Use NetworkPolicies to restrict traffic
Implement RBAC for access control
Store sensitive data in Secrets
Run containers as non-root users

Monitoring

Configure proper logging and metrics
Set up alerts for critical conditions
Use health checks and readiness probes
Monitor resource usage and quotas

For detailed debugging commands, troubleshooting patterns, Helm workflows, and advanced K8s operations, see REFERENCE.md.

Kubernetes Operations

Install Skill

SKILL.md

Kubernetes Operations

Core Expertise

Cluster Operations Process

Essential Commands

Key Debugging Patterns

Best Practices