| name | Kubernetes Operations |
| description | Kubernetes operations including deployment, management, troubleshooting, kubectl mastery, and cluster stability. Automatically assists with K8s workloads, networking, storage, and debugging. |
| allowed-tools | Glob, Grep, Read, Bash, Edit, Write, TodoWrite, WebFetch |
Kubernetes Operations
Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.
Core Expertise
Kubernetes Operations
- Workload Management: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- Networking: Services, Ingress, NetworkPolicies, and DNS configuration
- Configuration & Storage: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
- Troubleshooting: Debugging pods, analyzing logs, and inspecting cluster events
Cluster Operations Process
- Manifest First: Always prefer declarative YAML manifests for resource management
- Validate & Dry-Run: Use
kubectl apply --dry-run=clientto validate changes - Inspect & Verify: After applying changes, verify with
kubectl get,kubectl describe,kubectl logs - Monitor Health: Continuously check status of nodes, pods, and services
- Clean Up: Ensure old or unused resources are properly garbage collected
Essential Commands
# Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash
# Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80
# Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>
# Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources
Key Debugging Patterns
Pod Debugging
# Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous
# Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80
Networking Troubleshooting
# Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>
# Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
# Inside pod: nslookup, wget, nc commands
Common Issues
# CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>
# Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits
# State management
kubectl state list
kubectl state show <resource>
Best Practices
Resource Definitions
- Use declarative YAML manifests
- Implement proper labels and selectors
- Define resource requests and limits
- Configure health checks (liveness/readiness probes)
Security
- Use NetworkPolicies to restrict traffic
- Implement RBAC for access control
- Store sensitive data in Secrets
- Run containers as non-root users
Monitoring
- Configure proper logging and metrics
- Set up alerts for critical conditions
- Use health checks and readiness probes
- Monitor resource usage and quotas
For detailed debugging commands, troubleshooting patterns, Helm workflows, and advanced K8s operations, see REFERENCE.md.