Claude Code Plugins

Community-maintained marketplace

k8s

@muzhicaomingwang/ai-ideas

1

0

Kubernetes ops skill for deploying, operating, and troubleshooting services on Kubernetes. Use for tasks like writing manifests/Helm, configuring deployments/services/ingress, autoscaling, observability, RBAC, secrets/configmaps, rollout/rollback, incident debugging, and production readiness checks.

Install Skill

1Download skill

2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name	k8s
description	Kubernetes ops skill for deploying, operating, and troubleshooting services on Kubernetes. Use for tasks like writing manifests/Helm, configuring deployments/services/ingress, autoscaling, observability, RBAC, secrets/configmaps, rollout/rollback, incident debugging, and production readiness checks.

k8s

Use this skill for Kubernetes 运维与发布相关工作。

Defaults / assumptions to confirm

Cluster type: managed (EKS/GKE/ACK) vs self-hosted
Packaging: raw YAML vs Helm vs Kustomize
Ingress: NGINX/ALB/APISIX/Istio
Observability stack: Prometheus/Grafana, Loki/ELK, tracing

Workflow

Understand service requirements

Ports, protocols, health checks, resources (CPU/mem), storage needs.
SLOs: latency, availability, RPO/RTO.
Dependencies: DB, cache, MQ, external APIs.

Deployment design

Use Deployment for stateless; StatefulSet for stable identities/storage.
Define readinessProbe and livenessProbe (and startupProbe if needed).
Set resources.requests/limits and choose appropriate QoS.
Use PodDisruptionBudget for availability during maintenance.

Config & secrets

Config: ConfigMap (non-sensitive), mounted or env.
Secrets: Secret (sensitive) + external secret manager if available.
Never commit plaintext secrets; prefer sealed/external secrets.

Networking

Service types and DNS.
Ingress/Gateway routing, TLS termination, timeouts.
NetworkPolicy if cluster enforces it.

Scaling & resilience

HPA based on CPU/memory/custom metrics.
Graceful shutdown (preStop, terminationGracePeriodSeconds).
Retry/backoff at client; avoid retry storms.

Observability

Standard logs with correlation IDs.
Metrics: RPS, p95 latency, error rate, saturation.
Alerts and dashboards; runbook links.

Release operations

Rolling updates, canary/blue-green if needed.
kubectl rollout status + rollback plan.
Post-deploy verification checks and smoke tests.

Troubleshooting checklist

kubectl get/describe pods, events, and logs.
Check probes, image pull, env/config, DNS, network, and resource throttling.
For performance: node pressure, HPA behavior, GC/heap, connection pool limits.

Output expectations when making changes

Provide manifests (or Helm values/templates) + brief deployment notes.
Include resource sizing rationale and probe settings.
Include rollback instructions and verification steps.