SKILL.md

name: containerization
description: Docker, Kubernetes, container orchestration, and cloud-native deployment for data applications
sasmp_version: 1.3.0
bonded_agent: 03-devops-engineer
bond_type: PRIMARY_BOND
skill_version: 2.0.0
last_updated: 2025-01
complexity: intermediate
estimated_mastery_hours: 120
prerequisites: python-programming, cloud-platforms
unlocks: mlops, big-data

Containerization & Kubernetes

Production-grade container orchestration for data engineering workloads with Docker and Kubernetes.

Quick Start

# Dockerfile for PySpark data application
FROM python:3.12-slim

# Install Java for Spark (headless JDK only, skipping recommended extras)
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-17-jdk-headless && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install dependencies first (cache optimization)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY src/ ./src/
COPY config/ ./config/

# Non-root user for security
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser

ENV PYTHONPATH=/app
# Architecture-specific path: amd64 here; on arm64 the directory is java-17-openjdk-arm64
ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

ENTRYPOINT ["python", "-m", "src.main"]

Core Concepts

1. Multi-Stage Builds

# Build stage
FROM python:3.12 AS builder

WORKDIR /build
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Runtime stage
FROM python:3.12-slim AS runtime

COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels

COPY src/ /app/src/
WORKDIR /app

USER 1000
CMD ["python", "-m", "src.main"]

2. Kubernetes Deployment

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: etl-worker
  labels:
    app: etl-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: etl-worker
  template:
    metadata:
      labels:
        app: etl-worker
    spec:
      containers:
      - name: etl-worker
        image: company/etl-worker:v1.2.0
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        - name: LOG_LEVEL
          value: "INFO"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: etl-worker
              topologyKey: kubernetes.io/hostname
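
The liveness and readiness probes above expect the container to answer HTTP GET requests on /health and /ready at port 8080. The skill does not show the application side, so the following is only a minimal sketch using the Python standard library. The names `ProbeHandler`, `start_probe_server`, and `ready` are hypothetical, and a real worker would likely tie readiness to its own startup checks.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set this once startup work (connections, config loading) has finished.
ready = threading.Event()


class ProbeHandler(BaseHTTPRequestHandler):
    """Answers the probe paths used in the Deployment manifest."""

    def do_GET(self):
        if self.path == "/health":
            # Liveness: the process is up and able to answer.
            self._respond(200, b"ok")
        elif self.path == "/ready":
            # Readiness: report success only after startup has completed.
            if ready.is_set():
                self._respond(200, b"ready")
            else:
                self._respond(503, b"starting")
        else:
            self._respond(404, b"not found")

    def _respond(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging; probes fire every few seconds.
        pass


def start_probe_server(port=8080):
    """Serve the probe endpoints on a background thread and return the server."""
    server = ThreadingHTTPServer(("0.0.0.0", port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A worker would call `start_probe_server()` early in `src.main` and `ready.set()` once it can accept work. Until then the readiness probe receives 503 and the pod is kept out of service endpoints, while the liveness probe still passes.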

3. Kubernetes CronJob for ETL

# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-etl
spec:
  schedule: "0 2 * * *"  # 2 AM daily, in the controller's time zone unless spec.timeZone is set
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 7200  # 2 hour timeout
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: etl-job
            image: company/etl-pipeline:v1.0.0
            resources:
              requests:
                memory: "4Gi"
                cpu: "2000m"
              limits:
                memory: "8Gi"
                cpu: "4000m"
            env:
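            # Kubernetes does not expand template expressions in env values.
            # Render this placeholder with a templating tool before applying,
            # or let the job derive the date itself.
            - name: EXECUTION_DATE
              value: "{{ .Date }}"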
            volumeMounts:
            - name: config
              mountPath: /app/config
              readOnly: true
          volumes:
          - name: config
            configMap:
              name: etl-config
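
The EXECUTION_DATE variable in the manifest above needs a concrete value by the time the job runs. One option is to resolve it inside the job. The sketch below assumes an ISO 8601 date format and a fallback to the current UTC date; neither is specified by the skill, and `resolve_execution_date` is a hypothetical helper name.

```python
import os
from datetime import date, datetime, timezone


def resolve_execution_date(env=None):
    """Return the run date from EXECUTION_DATE, else today's date in UTC."""
    env = os.environ if env is None else env
    raw = env.get("EXECUTION_DATE", "").strip()
    # An unset value or an unrendered template placeholder counts as missing.
    if not raw or raw.startswith("{{"):
        return datetime.now(timezone.utc).date()
    # Assumed format: ISO 8601 calendar date, for example 2025-01-31.
    return date.fromisoformat(raw)
```

Passing an explicit date this way keeps backfills simple: rerunning the job for a past day only requires overriding the environment variable.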

4. Helm Chart Structure

# Chart.yaml
apiVersion: v2
name: data-pipeline
version: 1.0.0
appVersion: "2.0.0"
description: Data pipeline Helm chart

# values.yaml
replicaCount: 3

image:
  repository: company/data-pipeline
  tag: "2.0.0"  # pinned version (matches appVersion); avoid "latest"
  pullPolicy: IfNotPresent

resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "4Gi"
    cpu: "2000m"

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

env:
  LOG_LEVEL: INFO
  BATCH_SIZE: "1000"

secrets:
  - name: DATABASE_URL
    secretName: db-credentials
    key: url
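
The autoscaling values above presumably feed a HorizontalPodAutoscaler template in the chart (an assumption, since the template is not shown). To build intuition for how minReplicas, maxReplicas, and targetCPUUtilizationPercentage interact, here is a sketch of the scaling formula given in the Kubernetes documentation, desired = ceil(current × currentMetric / targetMetric), clamped to the configured bounds. It ignores the controller's tolerance band and stabilization windows, and `desired_replicas` is a hypothetical name.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent,
                     target_cpu_percent=70, min_replicas=2, max_replicas=10):
    """Approximate the HPA replica calculation for a CPU utilization target."""
    # Utilization is measured relative to the container's CPU request.
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(3, 140))   # 6: load is double the target
print(desired_replicas(10, 200))  # 10: capped at maxReplicas
print(desired_replicas(2, 10))    # 2: floored at minReplicas
```

Because utilization is computed against the CPU request, a 70% target with a 500m request corresponds to an average of about 350m per pod.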

5. Docker Compose for Local Dev

# docker-compose.yml
# The top-level "version" key is obsolete in Compose v2, so it is omitted here.

services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: datawarehouse
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U admin"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  airflow-webserver:
    image: apache/airflow:2.8.0-python3.11
    command: webserver
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      AIRFLOW__CORE__EXECUTOR: CeleryExecutor
      AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://admin:${DB_PASSWORD}@postgres/datawarehouse
      AIRFLOW__CELERY__BROKER_URL: redis://redis:6379/0
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
      - ./plugins:/opt/airflow/plugins

volumes:
  postgres_data:
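
Compose substitutes ${DB_PASSWORD} into the connection string verbatim, so a password containing characters such as @, / or : would yield a malformed URL. If the URL is instead assembled in application code, the credentials can be percent-encoded first. A minimal sketch, assuming the same user, host, and database names as the Compose file; `build_sqlalchemy_url` is a hypothetical helper.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_url(user, password, host, database, port=5432):
    """Build a PostgreSQL connection URL with percent-encoded credentials."""
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


print(build_sqlalchemy_url("admin", "p@ss/w:rd", "postgres", "datawarehouse"))
# postgresql+psycopg2://admin:p%40ss%2Fw%3Ard@postgres:5432/datawarehouse
```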

Tools & Technologies

| Tool       | Purpose               | Version (2025) |
|------------|-----------------------|----------------|
| Docker     | Containerization      | 25+            |
| Kubernetes | Orchestration         | 1.29+          |
| Helm       | K8s package manager   | 3.14+          |
| ArgoCD     | GitOps deployment     | 2.10+          |
| Kustomize  | K8s config management | Built-in       |
| containerd | Container runtime     | 1.7+           |
| Podman     | Docker alternative    | 4.8+           |

Troubleshooting Guide

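| Issue            | Symptoms                    | Root Cause                                     | Fix                                              |
|------------------|-----------------------------|------------------------------------------------|--------------------------------------------------|
| OOMKilled        | Pod restarts, exit code 137 | Memory limit exceeded                          | Increase limits, optimize code                   |
| CrashLoopBackOff | Pod keeps restarting        | App crash, bad config                          | Check logs: kubectl logs <pod-name> --previous   |
| ImagePullBackOff | Pod stuck in Pending        | Image not found, registry auth failure         | Check image name and tag, pull secrets           |
| Pending Pod      | Pod won't schedule          | Insufficient resources, node selector mismatch | Check resource requests, affinity rules          |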
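
Exit code 137 in the OOMKilled row follows the common convention that a process killed by a signal reports 128 plus the signal number, and 137 is 128 + 9 (SIGKILL). The sketch below decodes exit codes on that assumption. `describe_exit_code` is a hypothetical helper, and signal names are resolved for the platform running the script (Linux assumed).

```python
import signal


def describe_exit_code(code):
    """Give a rough interpretation of a container exit code."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"application error (exit code {code})"


print(describe_exit_code(137))  # terminated by SIGKILL
print(describe_exit_code(143))  # terminated by SIGTERM
```

An exit code of 137 alone does not prove an out-of-memory kill, since anything that sends SIGKILL produces it. The termination reason shown by kubectl describe pod is the more reliable indicator.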

Debug Commands

# Check pod status and events
kubectl describe pod <pod-name>

# View logs from the previous (crashed) instance of a container
kubectl logs <pod-name> -c <container-name> --previous

# Execute shell in container
kubectl exec -it <pod-name> -- /bin/sh

# Check resource usage (requires metrics-server in the cluster)
kubectl top pods

# Debug networking
kubectl run debug --image=busybox -it --rm -- sh

Best Practices

# ✅ DO: Use specific image tags
FROM python:3.12.1-slim

# ✅ DO: Use non-root user
USER 1000

# ✅ DO: Use multi-stage builds
# ✅ DO: Set resource limits
# ✅ DO: Use health checks

# ❌ DON'T: Run as root
# ❌ DON'T: Use latest tag
# ❌ DON'T: Store secrets in images
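
The tag rules above can be checked mechanically, for example in a CI step that scans manifests. The following sketch treats an image reference as pinned when it carries a digest or any tag other than latest. `is_pinned` is a hypothetical helper, and the parsing is deliberately simplified rather than a full implementation of the image reference grammar.

```python
def is_pinned(image_ref):
    """Return True if the reference has a digest or a tag other than 'latest'."""
    if "@" in image_ref:
        # A digest (for example @sha256:...) pins the exact image content.
        return True
    # The tag, if any, follows a ':' in the final path segment; a ':' earlier
    # in the reference belongs to a registry port such as localhost:5000.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # An untagged reference resolves to 'latest'.
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "latest"


print(is_pinned("python:3.12.1-slim"))            # True
print(is_pinned("company/data-pipeline:latest"))  # False
print(is_pinned("localhost:5000/etl-worker"))     # False
```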

Skill Certification Checklist:

  • Can write production Dockerfiles
  • Can deploy applications to Kubernetes
  • Can create Helm charts
  • Can debug container issues
  • Can implement health checks and probes