Claude Code Plugins

Community-maintained marketplace

Feedback

helm-charts-audit

@huseyindeniz/gitops-lab
4
0

Audits Helm charts for anti-patterns, security issues, and best practice violations. Use when asked to audit, review, or check Helm chart quality. Generates a comprehensive report under reports/YYYY-MM-DD/helm-charts-audit.md. (project)

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name helm-charts-audit
description Audits Helm charts for anti-patterns, security issues, and best practice violations. Use when asked to audit, review, or check Helm chart quality. Generates a comprehensive report under reports/YYYY-MM-DD/helm-charts-audit.md. (project)

Purpose

Enforce Helm chart quality and security standards across the helm-charts/ directory through automated checks.

What it checks (13 checks):

  1. Image Tags (no latest, mutable tags) - HIGH
  2. Security Context (runAsNonRoot, no privileged) - HIGH
  3. Resource Limits (requests, limits, memory) - HIGH
  4. RBAC Wildcards (no * permissions) - HIGH
  5. Health Probes (liveness, readiness) - HIGH
  6. Helm Lint (official helm validation) - HIGH
  7. Chart Metadata (apiVersion, version, maintainers) - MEDIUM
  8. Chart Structure (README, NOTES.txt, _helpers.tpl) - MEDIUM
  9. Dependencies (pinned versions, Chart.lock) - MEDIUM
  10. Deprecated APIs (no v1beta1, use stable APIs) - MEDIUM
  11. Argo Rollouts (strategy, analysis, steps) - MEDIUM
  12. Ingress TLS (certificates, annotations) - MEDIUM
  13. GPU Resources (nvidia.com/gpu, tolerations) - LOW

Running Checks

Full audit (all checks):

node .claude/skills/helm-charts-audit/scripts/run_all_checks.mjs

Generate report (all checks + markdown report):

node .claude/skills/helm-charts-audit/scripts/generate_report.mjs

Report saved to: reports/YYYY-MM-DD/helm-charts-audit.md

Individual checks:

node .claude/skills/helm-charts-audit/scripts/check_image_tags.mjs
node .claude/skills/helm-charts-audit/scripts/check_security_context.mjs
node .claude/skills/helm-charts-audit/scripts/check_resource_limits.mjs
node .claude/skills/helm-charts-audit/scripts/check_rbac_wildcards.mjs
node .claude/skills/helm-charts-audit/scripts/check_health_probes.mjs
node .claude/skills/helm-charts-audit/scripts/check_helm_lint.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_metadata.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_structure.mjs
node .claude/skills/helm-charts-audit/scripts/check_dependencies.mjs
node .claude/skills/helm-charts-audit/scripts/check_deprecated_apis.mjs
node .claude/skills/helm-charts-audit/scripts/check_argo_rollouts.mjs
node .claude/skills/helm-charts-audit/scripts/check_ingress_tls.mjs
node .claude/skills/helm-charts-audit/scripts/check_gpu_resources.mjs

Quality Rules

1. Image Tags (HIGH)

RULE: Never use mutable tags. latest tag = unpredictable deployments + rollback failures.

Violations:

  • image: nginx:latest - mutable, changes without notice
  • image: nginx - defaults to :latest
  • tag: "" - empty tag in values.yaml
  • tag: head, tag: canary, tag: dev - mutable branch tags

Fix: Use immutable tags like v1.2.3, SHA digests sha256:abc123, or SemVer 1.21.0.


2. Security Context (HIGH)

RULE: Containers must run with minimal privileges. Privileged containers = cluster takeover risk.

Violations:

  • privileged: true - full host access, container escape trivial
  • runAsNonRoot: false - runs as root user UID 0
  • runAsUser: 0 - explicitly root
  • allowPrivilegeEscalation: true - can gain more privileges
  • hostNetwork: true - shares host network namespace
  • hostPID: true - can see/kill host processes
  • hostIPC: true - can access host shared memory
  • readOnlyRootFilesystem: false - malware can write anywhere
  • capabilities.add: [SYS_ADMIN] - near-root level access
  • capabilities.add: [ALL] - equivalent to privileged

Fix: Add proper securityContext with runAsNonRoot: true, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities.drop: [ALL].


3. Resource Limits (HIGH)

RULE: All containers must have resource requests and limits. No limits = node OOM + noisy neighbor issues.

Violations:

  • resources: {} - empty resources block
  • Missing requests.cpu - scheduler can't make decisions
  • Missing requests.memory - OOM killer may terminate unexpectedly
  • Missing limits.memory - container can consume all node memory
  • requests > limits - invalid configuration

Fix: Define resources.requests.cpu, resources.requests.memory, resources.limits.memory. Note: CPU limits often intentionally omitted for better performance.


4. RBAC Wildcards (HIGH)

RULE: Follow least-privilege principle. Wildcard permissions = privilege escalation path.

Violations:

  • verbs: ["*"] - grants all actions
  • resources: ["*"] - access to all resource types
  • apiGroups: ["*"] - access across all API groups
  • roleRef.name: cluster-admin - full cluster access
  • verbs: [impersonate] - can act as other users
  • verbs: [escalate, bind] - can grant additional privileges
  • Access to secrets resource - can read all secrets

Fix: Use explicit verbs like [get, list, watch], explicit resources like [pods, services], avoid cluster-admin bindings.


5. Health Probes (HIGH)

RULE: All workloads must have health probes. No probes = stuck containers not restarted + traffic to unready pods.

Violations:

  • Deployment without livenessProbe - stuck containers won't restart
  • Deployment without readinessProbe - traffic sent to unready pods
  • initialDelaySeconds: 0 - probes start immediately, false failures
  • timeoutSeconds: 1 - too short, may cause false failures
  • successThreshold > 1 on livenessProbe - should always be 1
  • failureThreshold > 10 - delays detecting actual failures

Fix: Add livenessProbe and readinessProbe with reasonable initialDelaySeconds (10-30s), periodSeconds (10s), timeoutSeconds (5s).


6. Helm Lint (HIGH)

RULE: Charts must pass official helm lint validation. Lint failures = deployment failures.

Violations:

  • Template syntax errors
  • Missing required fields in Chart.yaml
  • Invalid YAML structure
  • Broken template references

Fix: Run helm lint <chart-path> and fix reported issues.


7. Chart Metadata (MEDIUM)

RULE: Chart.yaml must have complete metadata. Missing metadata = maintenance nightmare.

Violations:

  • apiVersion: v1 - Helm 2 format, upgrade to v2
  • Missing or invalid version - must be SemVer
  • Missing appVersion - hard to track what's deployed
  • Missing description - unclear what chart does
  • Missing maintainers - no ownership
  • name doesn't match directory name - confusing

Fix: Use apiVersion: v2, SemVer version, add description and maintainers with email.


8. Chart Structure (MEDIUM)

RULE: Follow standard Helm chart structure. Non-standard = user confusion + missing features.

Violations:

  • Missing README.md - no documentation
  • Missing templates/NOTES.txt - no post-install instructions
  • Missing templates/_helpers.tpl - no template helpers
  • Missing .helmignore - unnecessary files in package
  • Missing values.schema.json - no values validation
  • Empty templates/ directory

Fix: Create missing files following Helm chart best practices.


9. Dependencies (MEDIUM)

RULE: Pin dependency versions. Floating versions = non-reproducible builds.

Violations:

  • No version on dependency - unpinned
  • version: "*" or version: "^1.0" - floating version
  • Missing Chart.lock - dependency versions not locked
  • repository: file:// - local reference, breaks when published
  • repository: http:// - insecure, use HTTPS
  • Deprecated repository URLs (charts.helm.sh/stable)

Fix: Pin exact versions, run helm dependency update to generate Chart.lock.


10. Deprecated APIs (MEDIUM)

RULE: Use stable Kubernetes APIs. Deprecated APIs = upgrade failures.

Violations:

  • extensions/v1beta1 - removed in K8s 1.22
  • apps/v1beta1, apps/v1beta2 - removed in K8s 1.16
  • networking.k8s.io/v1beta1 Ingress - removed in K8s 1.22
  • batch/v1beta1 CronJob - removed in K8s 1.25
  • policy/v1beta1 PodSecurityPolicy - removed in K8s 1.25

Fix: Update to stable APIs: apps/v1, networking.k8s.io/v1, batch/v1. Run kubectl convert if needed.


11. Argo Rollouts (MEDIUM)

RULE: Rollouts must have valid strategy configuration. Invalid config = failed deployments.

Violations:

  • Rollout without strategy - no deployment strategy
  • Canary without steps - no gradual rollout
  • Canary without analysis - no automated validation
  • BlueGreen without activeService - no active service defined
  • BlueGreen without previewService - can't preview before promotion
  • Missing revisionHistoryLimit - old ReplicaSets accumulate
  • Missing progressDeadlineSeconds - stuck rollouts don't timeout

Fix: Configure proper canary steps with analysis, or blueGreen with activeService/previewService.


12. Ingress TLS (MEDIUM)

RULE: Ingress must have TLS configuration. No TLS = unencrypted traffic.

Violations:

  • Ingress with hosts but no TLS - traffic unencrypted
  • TLS without secretName - certificate source unclear
  • No ingressClassName - may use wrong controller
  • Missing cert-manager annotations - no automated certificates
  • Deprecated kubernetes.io/ingress.class annotation
  • No SSL redirect annotation - HTTP doesn't redirect to HTTPS

Fix: Add TLS section with secretName, use cert-manager.io/cluster-issuer annotation for automated certs.


13. GPU Resources (LOW)

RULE: GPU workloads need proper configuration. Missing config = scheduling failures.

Violations:

  • GPU limits without matching requests - should be equal
  • No GPU toleration - won't schedule on GPU nodes
  • No GPU nodeSelector/affinity - relies only on resource availability
  • No runtimeClassName - may need nvidia runtime

Fix: Set nvidia.com/gpu in both requests and limits (equal values), add GPU tolerations and nodeSelector.


Detection Philosophy

This skill uses VALUE-BASED detection:

  • Detects issues by actual values and patterns, not by variable/field names
  • Future-proof: new charts with issues are automatically detected
  • No need to update scripts when new charts are added

Parsing Strategy

  • Chart.yaml, values.yaml: YAML content parsed via regex patterns
  • templates/*.yaml: Regex-based parsing (Go template syntax breaks YAML parsers)
  • Multi-document YAML: Handles --- separators

Safety

  • Read-only operation (except report generation)
  • No Helm releases modified
  • No cluster changes