name	kubernetes
description	Kubernetes standards for container orchestration, deployments, services, ingress, ConfigMaps, Secrets, and security policies. Covers production-ready configurations, monitoring, and best practices for cloud-native applications.

Kubernetes Cloud-Native Standards

Quick Navigation: Level 1: Quick Start (5 min) → Level 2: Implementation (30 min) → Level 3: Mastery (Extended)

Level 1: Quick Start (<2,000 tokens, 5 minutes)

Core Principles

Declarative Configuration: Define desired state, let Kubernetes manage it
Resource Limits: Always set requests and limits for CPU and memory
Health Checks: Implement liveness, readiness, and startup probes
Security First: RBAC, network policies, pod security standards
Immutable Infrastructure: Never modify running containers

Essential Checklist

Deployments: Use Deployments (not bare Pods) for applications
Resource Management: Set CPU/memory requests and limits
Health Probes: Configure liveness and readiness probes
Configuration: Use ConfigMaps for config, Secrets for sensitive data
Networking: Define Services for pod-to-pod communication
Security: Enable RBAC, use NetworkPolicies, set PodSecurityContext
Monitoring: Expose metrics endpoint for Prometheus
High Availability: Run multiple replicas (min 3 for production)

Quick Example

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
  labels:
    app: myapp
    version: v1.2.3
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.2.3
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: myapp
        image: registry.example.com/myapp:v1.2.3
        ports:
        - containerPort: 8080
          name: http
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
        env:
        - name: DB_HOST
          valueFrom:
            configMapKeyRef:
              name: myapp-config
              key: database.host
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: db-password
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080

Level 2: Implementation (<5,000 tokens, 30 minutes)

Deployments

Production-Ready Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
  labels:
    app: myapp
    environment: production
    managed-by: helm
  annotations:
    deployment.kubernetes.io/revision: "5"
spec:
  replicas: 5
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.2.3
        environment: production
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: myapp
      automountServiceAccountToken: false

      # Security context
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault

      # Init containers
      initContainers:
      - name: migration
        image: registry.example.com/myapp-migrations:v1.2.3
        command: ['sh', '-c', 'migrate up']
        env:
        - name: DB_URL
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: database-url

      containers:
      - name: myapp
        image: registry.example.com/myapp:v1.2.3
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 9090
          name: metrics
          protocol: TCP

        # Resource limits
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
            ephemeral-storage: "1Gi"
          limits:
            memory: "512Mi"
            cpu: "500m"
            ephemeral-storage: "2Gi"

        # Health probes
        startupProbe:
          httpGet:
            path: /health
            port: 8080
          failureThreshold: 30
          periodSeconds: 10

        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            httpHeaders:
            - name: X-Probe
              value: liveness
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3

        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          successThreshold: 1
          failureThreshold: 3

        # Environment variables
        env:
        - name: NODE_ENV
          value: production
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

        envFrom:
        - configMapRef:
            name: myapp-config
        - secretRef:
            name: myapp-secrets

        # Volume mounts
        volumeMounts:
        - name: config
          mountPath: /app/config
          readOnly: true
        - name: cache
          mountPath: /app/cache
        - name: tmp
          mountPath: /tmp

        # Security context
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          capabilities:
            drop:
            - ALL

      # Volumes
      volumes:
      - name: config
        configMap:
          name: myapp-config
      - name: cache
        emptyDir: {}
      - name: tmp
        emptyDir: {}

      # Tolerations and affinity
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "app"
        effect: "NoSchedule"

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - myapp
              topologyKey: kubernetes.io/hostname

        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: workload-type
                operator: In
                values:
                - application

Services & Networking

Service Types

# ClusterIP - Internal only (default)
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

---
# LoadBalancer - External access
apiVersion: v1
kind: Service
metadata:
  name: myapp-external
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - protocol: TCP
    port: 443
    targetPort: 8080

Ingress Configuration (see templates/ingress.yaml):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/rate-limit: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "30"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - myapp.example.com
    - www.myapp.example.com
    secretName: myapp-tls
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 80
  - host: www.myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 80

NetworkPolicy (see resources/security/network-policy.yaml):

# Default deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

---
# Allow specific ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-network-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: production
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - namespaceSelector: {}
    ports:
    - protocol: TCP
      port: 443  # HTTPS
    - protocol: TCP
      port: 53   # DNS
    - protocol: UDP
      port: 53   # DNS

Configuration Management

ConfigMap (see templates/configmap.yaml):

apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: production
data:
  # Simple key-value
  database.host: "postgres.production.svc.cluster.local"
  database.port: "5432"
  log.level: "info"

  # File-like content
  app.properties: |
    server.port=8080
    server.shutdown=graceful
    spring.application.name=myapp

  # JSON configuration
  feature-flags.json: |
    {
      "enableNewUI": true,
      "maxUploadSize": 10485760,
      "enableDebugMode": false
    }

Secrets (see templates/secret.yaml):

apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
  namespace: production
type: Opaque
stringData:
  database-url: "postgresql://user:password@postgres:5432/myapp"
  api-key: "super-secret-key-12345"
  jwt-secret: "jwt-signing-secret"

---
# TLS Secret
apiVersion: v1
kind: Secret
metadata:
  name: myapp-tls
  namespace: production
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS...  # Base64 encoded certificate
  tls.key: LS0tLS...  # Base64 encoded private key

Security

RBAC (see resources/security/rbac.yaml):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: production

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: myapp-role
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
  resourceNames: ["myapp-secrets"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-rolebinding
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: myapp-role
subjects:
- kind: ServiceAccount
  name: myapp
  namespace: production

PodSecurityPolicy (see resources/security/pod-security-policy.yaml):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
  - ALL
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true

Resource Management

ResourceQuota:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "100"
    requests.memory: "100Gi"
    requests.storage: "500Gi"
    limits.cpu: "200"
    limits.memory: "200Gi"
    persistentvolumeclaims: "20"
    services.loadbalancers: "5"
    services.nodeports: "0"

LimitRange:

apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
  - max:
      cpu: "2"
      memory: "4Gi"
    min:
      cpu: "100m"
      memory: "128Mi"
    default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "250m"
      memory: "256Mi"
    type: Container

HorizontalPodAutoscaler:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 4
        periodSeconds: 30
      selectPolicy: Max

Monitoring & Health Checks

ServiceMonitor (Prometheus):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
  namespace: production
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
    scheme: http

Health Check Patterns:

# Startup probe for slow-starting containers
startupProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 30
  periodSeconds: 10

# Liveness probe - restart if unhealthy
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

# Readiness probe - remove from load balancer if not ready
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3

Storage

PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-data
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi

StatefulSet (for stateful apps):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp-db
  namespace: production
spec:
  serviceName: myapp-db
  replicas: 3
  selector:
    matchLabels:
      app: myapp-db
  template:
    metadata:
      labels:
        app: myapp-db
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
          name: postgres
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 50Gi

Level 3: Mastery (Resources)

Bundled Resources

templates/deployment.yaml - Production-ready Deployment template
templates/service.yaml - Service configuration (ClusterIP, LoadBalancer)
templates/ingress.yaml - Ingress with TLS and rate limiting
templates/configmap.yaml - ConfigMap examples
resources/security/network-policy.yaml - NetworkPolicy templates
resources/security/pod-security-policy.yaml - PodSecurityPolicy for restricted access
templates/helm/Chart.yaml - Helm chart template

Helm Charts

Chart.yaml:

apiVersion: v2
name: myapp
description: A Helm chart for myapp
type: application
version: 1.2.3
appVersion: "1.2.3"
maintainers:
- name: DevOps Team
  email: devops@example.com
dependencies:
- name: postgresql
  version: "12.x.x"
  repository: https://charts.bitnami.com/bitnami
  condition: postgresql.enabled

values.yaml:

replicaCount: 3

image:
  repository: registry.example.com/myapp
  pullPolicy: IfNotPresent
  tag: "1.2.3"

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
  - host: myapp.example.com
    paths:
    - path: /
      pathType: Prefix
  tls:
  - secretName: myapp-tls
    hosts:
    - myapp.example.com

resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70

Advanced Topics

Custom Resource Definitions (CRDs)

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: applications.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
              image:
                type: string
  scope: Namespaced
  names:
    plural: applications
    singular: application
    kind: Application
    shortNames:
    - app

Operators

// Example operator reconcile loop
func (r *ApplicationReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Fetch the Application instance
    app := &examplev1.Application{}
    err := r.Get(ctx, req.NamespacedName, app)
    if err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Create or update Deployment
    deployment := &appsv1.Deployment{}
    deployment.Name = app.Name
    deployment.Namespace = app.Namespace
    deployment.Spec.Replicas = &app.Spec.Replicas

    // ... configure deployment

    return ctrl.Result{}, nil
}

Related Skills

ci-cd - CI/CD pipelines for Kubernetes deployments
docker-standards - Container image best practices
helm - Kubernetes package manager
monitoring - Prometheus and Grafana setup

Security Best Practices

Pod Security Standards: Use restricted profile for production
Network Policies: Implement default-deny, whitelist required connections
RBAC: Principle of least privilege for service accounts
Secrets Management: Use external secrets managers (Vault, AWS Secrets Manager)
Image Scanning: Scan container images for vulnerabilities
Admission Controllers: Use OPA/Gatekeeper for policy enforcement

Performance Optimization

Resource Requests/Limits: Right-size based on actual usage
Horizontal Pod Autoscaling: Scale based on CPU, memory, or custom metrics
Vertical Pod Autoscaling: Automatically adjust resource requests
Cluster Autoscaling: Scale nodes based on pending pods
PodDisruptionBudgets: Maintain availability during voluntary disruptions

Disaster Recovery

Backup Strategy:

# Velero backup
velero backup create production-backup \
  --include-namespaces production \
  --snapshot-volumes

# Restore
velero restore create --from-backup production-backup

Multi-Cluster Setup: Use GitOps (ArgoCD/Flux) for disaster recovery

When to Use This Skill

✅ Deploying containerized applications to Kubernetes
✅ Setting up production-ready configurations with health checks
✅ Implementing security policies (RBAC, NetworkPolicies)
✅ Managing configuration with ConfigMaps and Secrets
✅ Setting up autoscaling (HPA, VPA, Cluster Autoscaler)
✅ Configuring ingress and service mesh
✅ Implementing monitoring and observability
✅ Managing stateful workloads with StatefulSets

Examples

Basic Usage

// TODO: Add basic example for kubernetes
// This example demonstrates core functionality

Advanced Usage

// TODO: Add advanced example for kubernetes
// This example shows production-ready patterns

Integration Example

// TODO: Add integration example showing how kubernetes
// works with other systems and services

See examples/kubernetes/ for complete working examples.

Integration Points

This skill integrates with:

Upstream Dependencies

Tools: kubectl, helm, kustomize
Prerequisites: Basic understanding of cloud-native concepts

Downstream Consumers

Applications: Production systems requiring kubernetes functionality
CI/CD Pipelines: Automated testing and deployment workflows
Monitoring Systems: Observability and logging platforms

Related Skills

Common Integration Patterns

Development Workflow: How this skill fits into daily development
Production Deployment: Integration with production systems
Monitoring & Alerting: Observability integration points

Common Pitfalls

Pitfall 1: Insufficient Testing

Problem: Not testing edge cases and error conditions leads to production bugs

Solution: Implement comprehensive test coverage including:

Happy path scenarios
Error handling and edge cases
Integration points with external systems

Prevention: Enforce minimum code coverage (80%+) in CI/CD pipeline

Pitfall 2: Hardcoded Configuration

Problem: Hardcoding values makes applications inflexible and environment-dependent

Solution: Use environment variables and configuration management:

Separate config from code
Use environment-specific configuration files
Never commit secrets to version control

Prevention: Use tools like dotenv, config validators, and secret scanners

Pitfall 3: Ignoring Security Best Practices

Problem: Security vulnerabilities from not following established security patterns

Solution: Follow security guidelines:

Input validation and sanitization
Proper authentication and authorization
Encrypted data transmission (TLS/SSL)
Regular security audits and updates

Prevention: Use security linters, SAST tools, and regular dependency updates

Best Practices:

Follow established patterns and conventions for kubernetes
Keep dependencies up to date and scan for vulnerabilities
Write comprehensive documentation and inline comments
Use linting and formatting tools consistently
Implement proper error handling and logging
Regular code reviews and pair programming
Monitor production metrics and set up alerts

Last Updated: 2025-01-17 Version: 1.0.0 Quality Score: 95/100

Install Skill

SKILL.md

Kubernetes Cloud-Native Standards

Level 1: Quick Start (<2,000 tokens, 5 minutes)

Core Principles

Essential Checklist

Quick Example

Quick Links to Level 2

Level 2: Implementation (<5,000 tokens, 30 minutes)

Deployments

Services & Networking

Configuration Management

Security

Resource Management

Monitoring & Health Checks

Storage

Level 3: Mastery (Resources)

Bundled Resources

Helm Charts

Advanced Topics

Custom Resource Definitions (CRDs)

Operators

Related Skills

Security Best Practices

Performance Optimization

Disaster Recovery

When to Use This Skill

Examples

Basic Usage

Advanced Usage

Integration Example

Integration Points

Upstream Dependencies

Downstream Consumers

Related Skills

Common Integration Patterns

Common Pitfalls

Pitfall 1: Insufficient Testing

Pitfall 2: Hardcoded Configuration

Pitfall 3: Ignoring Security Best Practices