| name | kubernetes-health |
| description | Comprehensive Kubernetes cluster health diagnostics using dynamic API discovery. Use when checking cluster health, troubleshooting K8s issues, or running health assessments. |
| allowed-tools | Read, Grep, Glob, Bash, Task |
Kubernetes Health Diagnostics
Dynamic, discovery-driven health checks for any Kubernetes cluster configuration
BEFORE YOU START
| Impact | Value |
|---|---|
| Token Savings | ~70% vs manual kubectl exploration |
| Setup Time | 0 min (uses existing kubectl config) |
| Coverage | Adapts to installed operators automatically |
Known Issues Prevented
| Problem | Root Cause | How This Skill Helps |
|---|---|---|
| Missing operator health | Static checklists miss CRDs | Dynamic API discovery detects all installed operators |
| Stale diagnostics | Manual checks become outdated | Real-time cluster API interrogation |
| Incomplete coverage | Unknown cluster configuration | Automatically activates relevant sub-agents |
Quick Start
- Verify cluster access: Ensure kubectl is configured and can reach your cluster
- Run discovery: Execute discover_apis.py to detect installed operators
- Dispatch agents: Use the orchestrator to run health checks based on discovery
# Step 1: Verify kubectl context
kubectl config current-context
kubectl cluster-info
# Step 2: Run API discovery
uv run .claude/skills/kubernetes-health/scripts/discover_apis.py
# Step 3: Review detected operators and dispatch health agents
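The discovery step can be pictured with a minimal Python sketch. It is not the contents of discover_apis.py, which this document does not show; it only assumes that discovery starts from the output of `kubectl api-versions`, where each line has the form `group/version` and the core group appears as a bare version such as `v1`. The function names `parse_api_groups` and `discover_api_groups` are hypothetical.

```python
import subprocess


def parse_api_groups(api_versions_output: str) -> set[str]:
    """Return the API group names found in `kubectl api-versions` output.

    Each line looks like "group/version" (for example "apps/v1"). The core
    group is listed as a bare version ("v1") and is reported here as "".
    """
    groups = set()
    for line in api_versions_output.splitlines():
        line = line.strip()
        if not line:
            continue
        group, sep, _version = line.rpartition("/")
        # No "/" means the line is a core-group version such as "v1".
        groups.add(group if sep else "")
    return groups


def discover_api_groups() -> set[str]:
    """Query the current kubectl context (a read-only call) and parse it."""
    result = subprocess.run(
        ["kubectl", "api-versions"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_api_groups(result.stdout)
```

Keeping the parsing separate from the kubectl call means the parser can be exercised against saved output without a live cluster.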
Critical Rules
Always
- Verify kubectl context before running health checks
- Use read-only kubectl commands (get, describe, logs)
- Run core health checks before operator-specific checks
- Aggregate results using the provided scoring methodology
Never
- Modify cluster resources during health checks
- Expose secret values in health reports (metadata only)
- Skip context verification for production clusters
- Assume operator presence without API discovery
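Two of these rules lend themselves to a small guard in code. The sketch below is illustrative only and is not part of the bundled scripts: `is_read_only`, `secret_metadata_only`, and the `dataKeys` field are hypothetical names, and the allowed verb set simply mirrors the three verbs listed above.

```python
# Verbs taken from the "Always" list above; extend deliberately if needed.
READ_ONLY_VERBS = {"get", "describe", "logs"}


def is_read_only(kubectl_args: list[str]) -> bool:
    """Return True only if the first argument is an allowed read-only verb.

    Deliberately strict: anything that does not start with a known verb
    (including leading global flags) is rejected rather than guessed at.
    """
    return bool(kubectl_args) and kubectl_args[0] in READ_ONLY_VERBS


def secret_metadata_only(secret: dict) -> dict:
    """Return a report-safe summary of a Secret manifest without its values.

    Only name and namespace are copied from metadata, because annotations
    such as kubectl.kubernetes.io/last-applied-configuration can themselves
    contain the secret data.
    """
    metadata = secret.get("metadata") or {}
    return {
        "kind": secret.get("kind"),
        "type": secret.get("type"),
        "metadata": {
            "name": metadata.get("name"),
            "namespace": metadata.get("namespace"),
        },
        # Key names only; the base64-encoded values are dropped.
        "dataKeys": sorted(secret.get("data") or {}),
    }
```

Rejecting unknown shapes by default is the design choice here: a false rejection costs a retry, while a false acceptance could modify a production cluster.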
Common Mistakes
| Mistake | Why It's Wrong | Correct Approach |
|---|---|---|
| Hardcoding operator checks | Misses operators that are installed and checks for ones that are not | Use API discovery to detect what's installed |
| Sequential agent dispatch | Slow for multi-operator clusters | Run operator agents in parallel (same priority) |
| Raw kubectl output | Token inefficient, hard to parse | Use scripts for condensed JSON output |
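The parallel-dispatch row can be illustrated with a generic pattern. In this skill the agents are dispatched by the orchestrator, so the sketch below is only an analogy: plain Python callables stand in for agents, and `run_agents_in_parallel` is a hypothetical helper, not a function from the bundled scripts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_agents_in_parallel(
    agents: dict[str, Callable[[], dict]],
) -> dict[str, dict]:
    """Run every agent callable concurrently and collect results by name."""
    if not agents:
        return {}
    results: dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # Submit everything first so the checks overlap in time.
        futures = {name: pool.submit(fn) for name, fn in agents.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing agent should not hide the other results.
                results[name] = {"error": str(exc)}
    return results
```

Recording a failure as an `error` entry, rather than letting the exception propagate, keeps the remaining operator results available for aggregation.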
Bundled Resources
Scripts
| Script | Purpose |
|---|---|
| scripts/discover_apis.py | Discovers all API groups and detects installed operators |
| scripts/health_orchestrator.py | Maps discovered APIs to specialized health agents |
| scripts/aggregate_report.py | Aggregates multi-agent results into unified report |
References
| File | Contents |
|---|---|
| references/operator-checks.md | Detailed health checks for each supported operator |
| references/health-scoring.md | Scoring methodology and weight assignments |
Templates
| File | Purpose |
|---|---|
| templates/health-report.json | JSON schema for health report output |
Dependencies
Required
| Package | Version | Purpose |
|---|---|---|
| kubectl | Latest | Cluster interaction |
| Python | >= 3.11 | Script execution |
| uv | Latest | Python script runner |
Optional
| Package | Version | Purpose |
|---|---|---|
| kubernetes | >= 28.1.0 | Python client (for advanced discovery) |
Supported Operators
The skill automatically detects and dispatches specialized agents for:
| Operator | API Group | Agent |
|---|---|---|
| Core K8s | (always) | k8s-core-health-agent |
| Crossplane | crossplane.io | k8s-crossplane-health-agent |
| ArgoCD | argoproj.io | k8s-argocd-health-agent |
| Cert-Manager | cert-manager.io | k8s-certmanager-health-agent |
| Prometheus | monitoring.coreos.com | k8s-prometheus-health-agent |
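The mapping in this table can be expressed as a small lookup. The sketch below is a guess at the shape of that logic, not the code in health_orchestrator.py: `select_agents` is a hypothetical name, and matching sub-groups by suffix (so that a group such as `pkg.crossplane.io` counts as Crossplane) is an assumption about how the trigger works.

```python
# API group domain -> agent name, copied from the table above.
OPERATOR_AGENTS = {
    "crossplane.io": "k8s-crossplane-health-agent",
    "argoproj.io": "k8s-argocd-health-agent",
    "cert-manager.io": "k8s-certmanager-health-agent",
    "monitoring.coreos.com": "k8s-prometheus-health-agent",
}
CORE_AGENT = "k8s-core-health-agent"


def select_agents(api_groups: set[str]) -> list[str]:
    """Return the agents to run for a set of discovered API groups.

    The core agent always runs first. An operator agent is added when a
    discovered group equals its domain or is a sub-group of it (assumed).
    """
    agents = [CORE_AGENT]
    for domain, agent in OPERATOR_AGENTS.items():
        if any(g == domain or g.endswith("." + domain) for g in api_groups):
            agents.append(agent)
    return agents
```

For example, a cluster exposing `apps`, `pkg.crossplane.io`, and `cert-manager.io` would select the core, Crossplane, and Cert-Manager agents, in that order.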
Health Scoring
| Status | Score Range | Criteria |
|---|---|---|
| HEALTHY | 90-100 | All checks pass, no warnings |
| DEGRADED | 60-89 | Some warnings, no critical issues |
| CRITICAL | 0-59 | Critical issues affecting availability |
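The score ranges translate directly into a threshold check. This is a minimal sketch with a hypothetical function name, `classify_score`. It covers only the range-to-status mapping from the table, not the weighting described in references/health-scoring.md, and it assumes that fractional scores fall into the band below the next threshold.

```python
def classify_score(score: float) -> str:
    """Map a 0-100 health score to the status bands in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "HEALTHY"
    if score >= 60:
        return "DEGRADED"
    return "CRITICAL"
```

Under these thresholds a score of 89 is DEGRADED and a score of 59 is CRITICAL, matching the upper bounds of each range.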
Troubleshooting
kubectl connection issues
# Verify context
kubectl config current-context
# Test connectivity
kubectl cluster-info
# Check permissions
kubectl auth can-i get pods --all-namespaces
Discovery returns empty results
- Ensure cluster is reachable
- Check RBAC permissions for API discovery
- Verify kubectl version compatibility
Agent dispatch failures
- Confirm discovered API group matches agent trigger
- Check agent file exists in .claude/agents/specialized/kubernetes/
- Review agent tool restrictions
Setup Checklist