| name | helm-charts-audit |
| description | Audits Helm charts for anti-patterns, security issues, and best practice violations. Use when asked to audit, review, or check Helm chart quality. Generates a comprehensive report under reports/YYYY-MM-DD/helm-charts-audit.md. (project) |
Purpose
Enforce Helm chart quality and security standards across the helm-charts/ directory through automated checks.
What it checks (13 checks):
- Image Tags (no latest or other mutable tags) - HIGH
- Security Context (runAsNonRoot, no privileged) - HIGH
- Resource Limits (requests, limits, memory) - HIGH
- RBAC Wildcards (no * permissions) - HIGH
- Health Probes (liveness, readiness) - HIGH
- Helm Lint (official helm validation) - HIGH
- Chart Metadata (apiVersion, version, maintainers) - MEDIUM
- Chart Structure (README, NOTES.txt, _helpers.tpl) - MEDIUM
- Dependencies (pinned versions, Chart.lock) - MEDIUM
- Deprecated APIs (no v1beta1, use stable APIs) - MEDIUM
- Argo Rollouts (strategy, analysis, steps) - MEDIUM
- Ingress TLS (certificates, annotations) - MEDIUM
- GPU Resources (nvidia.com/gpu, tolerations) - LOW
Running Checks
Full audit (all checks):
node .claude/skills/helm-charts-audit/scripts/run_all_checks.mjs
Generate report (all checks + markdown report):
node .claude/skills/helm-charts-audit/scripts/generate_report.mjs
Report saved to: reports/YYYY-MM-DD/helm-charts-audit.md
Individual checks:
node .claude/skills/helm-charts-audit/scripts/check_image_tags.mjs
node .claude/skills/helm-charts-audit/scripts/check_security_context.mjs
node .claude/skills/helm-charts-audit/scripts/check_resource_limits.mjs
node .claude/skills/helm-charts-audit/scripts/check_rbac_wildcards.mjs
node .claude/skills/helm-charts-audit/scripts/check_health_probes.mjs
node .claude/skills/helm-charts-audit/scripts/check_helm_lint.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_metadata.mjs
node .claude/skills/helm-charts-audit/scripts/check_chart_structure.mjs
node .claude/skills/helm-charts-audit/scripts/check_dependencies.mjs
node .claude/skills/helm-charts-audit/scripts/check_deprecated_apis.mjs
node .claude/skills/helm-charts-audit/scripts/check_argo_rollouts.mjs
node .claude/skills/helm-charts-audit/scripts/check_ingress_tls.mjs
node .claude/skills/helm-charts-audit/scripts/check_gpu_resources.mjs
Quality Rules
1. Image Tags (HIGH)
RULE: Never use mutable tags. latest tag = unpredictable deployments + rollback failures.
Violations:
- image: nginx:latest - mutable, changes without notice
- image: nginx - defaults to :latest
- tag: "" - empty tag in values.yaml
- tag: head, tag: canary, tag: dev - mutable branch tags
Fix: Use immutable tags like v1.2.3, SHA digests sha256:abc123, or SemVer 1.21.0.
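The violations above can be approximated with a small classifier. This is a minimal sketch, not the skill's actual check: the helper name isMutableImageRef and the hard-coded tag list are assumptions made for illustration.

```javascript
// Hypothetical helper; check_image_tags.mjs may work differently.
// Tag names taken from the violation list above.
const MUTABLE_TAGS = new Set(["latest", "head", "canary", "dev"]);

function isMutableImageRef(image) {
  // A digest pins exact content, so it is always immutable.
  if (image.includes("@sha256:")) return false;
  // Look for a tag only in the last path segment, so a registry port
  // such as "registry:5000/nginx" is not mistaken for a tag.
  const lastSegment = image.slice(image.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  if (colon === -1) return true; // no tag: defaults to :latest
  const tag = lastSegment.slice(colon + 1).toLowerCase();
  return tag === "" || MUTABLE_TAGS.has(tag);
}
```

For example, isMutableImageRef("registry:5000/nginx") is flagged because the reference carries no tag, while isMutableImageRef("nginx:1.21.0") is not.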
2. Security Context (HIGH)
RULE: Containers must run with minimal privileges. Privileged containers = cluster takeover risk.
Violations:
- privileged: true - full host access, container escape trivial
- runAsNonRoot: false - container is allowed to run as root (UID 0)
- runAsUser: 0 - explicitly root
- allowPrivilegeEscalation: true - can gain more privileges
- hostNetwork: true - shares host network namespace
- hostPID: true - can see/kill host processes
- hostIPC: true - can access host shared memory
- readOnlyRootFilesystem: false - malware can write to the container filesystem
- capabilities.add: [SYS_ADMIN] - near-root level access
- capabilities.add: [ALL] - equivalent to privileged
Fix: Add proper securityContext with runAsNonRoot: true, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities.drop: [ALL].
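A line-by-line regex scan is one way to flag several of these settings by value. The sketch below is an assumption about how such a scan could look; the pattern table and the name findRiskySecuritySettings are hypothetical, and values supplied through Go template expressions would not be matched.

```javascript
// Hypothetical pattern table; covers only literal values, not templated ones.
const RISKY_SETTINGS = [
  { pattern: /^\s*privileged:\s*true\b/, reason: "privileged container" },
  { pattern: /^\s*runAsNonRoot:\s*false\b/, reason: "root user allowed" },
  { pattern: /^\s*runAsUser:\s*0\s*$/, reason: "explicitly root" },
  { pattern: /^\s*allowPrivilegeEscalation:\s*true\b/, reason: "privilege escalation allowed" },
  { pattern: /^\s*host(?:Network|PID|IPC):\s*true\b/, reason: "host namespace shared" },
  { pattern: /^\s*readOnlyRootFilesystem:\s*false\b/, reason: "writable root filesystem" },
];

function findRiskySecuritySettings(text) {
  const findings = [];
  text.split("\n").forEach((line, index) => {
    for (const { pattern, reason } of RISKY_SETTINGS) {
      // Report 1-based line numbers so findings map back to the file.
      if (pattern.test(line)) findings.push({ line: index + 1, reason });
    }
  });
  return findings;
}
```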
3. Resource Limits (HIGH)
RULE: All containers must have resource requests and limits. No limits = node OOM + noisy neighbor issues.
Violations:
- resources: {} - empty resources block
- Missing requests.cpu - scheduler can't make decisions
- Missing requests.memory - OOM killer may terminate unexpectedly
- Missing limits.memory - container can consume all node memory
- requests > limits - invalid configuration
Fix: Define resources.requests.cpu, resources.requests.memory, resources.limits.memory. Note: CPU limits are often intentionally omitted to avoid CPU throttling.
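Detecting requests > limits requires comparing Kubernetes quantities such as 500m or 128Mi. The following is a rough sketch that handles only a common subset of suffixes; parseQuantity and requestExceedsLimit are hypothetical names, not functions from the skill's scripts.

```javascript
// Subset of Kubernetes quantity suffixes; other suffixes are not handled here.
const MULTIPLIERS = {
  "": 1, m: 1e-3,
  k: 1e3, M: 1e6, G: 1e9, T: 1e12,
  Ki: 2 ** 10, Mi: 2 ** 20, Gi: 2 ** 30, Ti: 2 ** 40,
};

function parseQuantity(quantity) {
  const match = /^(\d+(?:\.\d+)?)(m|k|M|G|T|Ki|Mi|Gi|Ti)?$/.exec(String(quantity).trim());
  if (!match) return NaN; // unknown format, e.g. a template expression
  return Number(match[1]) * MULTIPLIERS[match[2] ?? ""];
}

function requestExceedsLimit(request, limit) {
  // Comparisons with NaN are false, so unparseable values are never flagged.
  return parseQuantity(request) > parseQuantity(limit);
}
```

Under these assumptions, requestExceedsLimit("256Mi", "128Mi") reports a violation, while equal values such as "1Gi" and "1024Mi" do not.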
4. RBAC Wildcards (HIGH)
RULE: Follow least-privilege principle. Wildcard permissions = privilege escalation path.
Violations:
- verbs: ["*"] - grants all actions
- resources: ["*"] - access to all resource types
- apiGroups: ["*"] - access across all API groups
- roleRef.name: cluster-admin - full cluster access
- verbs: [impersonate] - can act as other users
- verbs: [escalate, bind] - can grant additional privileges
- Access to secrets resource - can read all secrets
Fix: Use explicit verbs like [get, list, watch], explicit resources like [pods, services], avoid cluster-admin bindings.
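Applied to a single already-parsed RBAC rule, the checks above might look like the sketch below. The function name auditRbacRule and the finding strings are illustrative assumptions; the cluster-admin roleRef check is left out because it lives on the binding, not the rule.

```javascript
// Verbs from the violation list that enable privilege escalation.
const ESCALATION_VERBS = new Set(["impersonate", "escalate", "bind"]);

// Hypothetical helper: takes one entry of a Role/ClusterRole "rules" array.
function auditRbacRule(rule) {
  const findings = [];
  for (const field of ["apiGroups", "resources", "verbs"]) {
    if ((rule[field] ?? []).includes("*")) findings.push(`wildcard in ${field}`);
  }
  for (const verb of rule.verbs ?? []) {
    if (ESCALATION_VERBS.has(verb)) findings.push(`escalation verb: ${verb}`);
  }
  if ((rule.resources ?? []).includes("secrets")) findings.push("access to secrets");
  return findings;
}
```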
5. Health Probes (HIGH)
RULE: All workloads must have health probes. No probes = stuck containers not restarted + traffic to unready pods.
Violations:
- Deployment without livenessProbe - stuck containers won't restart
- Deployment without readinessProbe - traffic sent to unready pods
- initialDelaySeconds: 0 - probes start immediately, false failures
- timeoutSeconds: 1 - too short, may cause false failures
- successThreshold > 1 on livenessProbe - should always be 1
- failureThreshold > 10 - delays detecting actual failures
Fix: Add livenessProbe and readinessProbe with reasonable initialDelaySeconds (10-30s), periodSeconds (10s), timeoutSeconds (5s).
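The probe thresholds above can be checked on a parsed container spec. This sketch assumes that omitted fields should be evaluated at their Kubernetes defaults (initialDelaySeconds 0, timeoutSeconds 1, failureThreshold 3, successThreshold 1); whether the skill's script treats omissions this way is not stated, and auditProbes is a hypothetical name.

```javascript
// Hypothetical helper operating on one container object from a pod spec.
function auditProbes(container) {
  const findings = [];
  for (const kind of ["livenessProbe", "readinessProbe"]) {
    const probe = container[kind];
    if (!probe) {
      findings.push(`missing ${kind}`);
      continue;
    }
    // Omitted fields fall back to the Kubernetes defaults (assumption above).
    if ((probe.initialDelaySeconds ?? 0) === 0) findings.push(`${kind}: initialDelaySeconds is 0`);
    if ((probe.timeoutSeconds ?? 1) <= 1) findings.push(`${kind}: timeoutSeconds is too short`);
    if ((probe.failureThreshold ?? 3) > 10) findings.push(`${kind}: failureThreshold above 10`);
  }
  if ((container.livenessProbe?.successThreshold ?? 1) > 1) {
    findings.push("livenessProbe: successThreshold must be 1");
  }
  return findings;
}
```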
6. Helm Lint (HIGH)
RULE: Charts must pass official helm lint validation. Lint failures = deployment failures.
Violations:
- Template syntax errors
- Missing required fields in Chart.yaml
- Invalid YAML structure
- Broken template references
Fix: Run helm lint <chart-path> and fix reported issues.
7. Chart Metadata (MEDIUM)
RULE: Chart.yaml must have complete metadata. Missing metadata = maintenance nightmare.
Violations:
- apiVersion: v1 - Helm 2 format, upgrade to v2
- Missing or invalid version - must be SemVer
- Missing appVersion - hard to track what's deployed
- Missing description - unclear what chart does
- Missing maintainers - no ownership
- name doesn't match directory name - confusing
Fix: Use apiVersion: v2, SemVer version, add description and maintainers with email.
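As an illustration, the metadata rules can be expressed as a check over an already-parsed Chart.yaml object. The SemVer pattern here is deliberately simplified (it does not reject leading zeros, for instance), and auditChartMetadata with its dirName parameter is a hypothetical helper.

```javascript
// Simplified SemVer pattern: MAJOR.MINOR.PATCH with optional pre-release and build parts.
const SEMVER = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

// chart: parsed Chart.yaml contents; dirName: name of the chart directory.
function auditChartMetadata(chart, dirName) {
  const findings = [];
  if (chart.apiVersion !== "v2") findings.push("apiVersion should be v2");
  if (!SEMVER.test(String(chart.version ?? ""))) findings.push("version is missing or not SemVer");
  for (const field of ["appVersion", "description"]) {
    if (!chart[field]) findings.push(`missing ${field}`);
  }
  if (!Array.isArray(chart.maintainers) || chart.maintainers.length === 0) {
    findings.push("missing maintainers");
  }
  if (chart.name !== dirName) findings.push("name does not match directory name");
  return findings;
}
```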
8. Chart Structure (MEDIUM)
RULE: Follow standard Helm chart structure. Non-standard = user confusion + missing features.
Violations:
- Missing README.md - no documentation
- Missing templates/NOTES.txt - no post-install instructions
- Missing templates/_helpers.tpl - no template helpers
- Missing .helmignore - unnecessary files in package
- Missing values.schema.json - no values validation
- Empty templates/ directory
Fix: Create missing files following Helm chart best practices.
9. Dependencies (MEDIUM)
RULE: Pin dependency versions. Floating versions = non-reproducible builds.
Violations:
- No version on dependency - unpinned
- version: "*" or version: "^1.0" - floating version
- Missing Chart.lock - dependency versions not locked
- repository: file:// - local reference, breaks when published
- repository: http:// - insecure, use HTTPS
- Deprecated repository URLs (charts.helm.sh/stable)
Fix: Pin exact versions, run helm dependency update to generate Chart.lock.
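One way to separate pinned from floating versions is to accept only an exact MAJOR.MINOR.PATCH string and treat everything else as a range. The sketch below does that for a single parsed dependency entry; auditDependency is a hypothetical name, and the Chart.lock and deprecated-URL checks are omitted.

```javascript
// Exact version only: ranges such as "^1.0", "~1.2", ">=1.0" or "*" do not match.
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

// dep: one entry of the "dependencies" array in a parsed Chart.yaml.
function auditDependency(dep) {
  const findings = [];
  const version = String(dep.version ?? "");
  if (version === "") findings.push("no version");
  else if (!EXACT_VERSION.test(version)) findings.push("floating version");
  const repository = String(dep.repository ?? "");
  if (repository.startsWith("file://")) findings.push("local repository");
  if (repository.startsWith("http://")) findings.push("insecure repository");
  return findings;
}
```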
10. Deprecated APIs (MEDIUM)
RULE: Use stable Kubernetes APIs. Deprecated APIs = upgrade failures.
Violations:
- extensions/v1beta1 - removed in K8s 1.22 for Ingress (workload kinds such as Deployment were removed in 1.16)
- apps/v1beta1, apps/v1beta2 - removed in K8s 1.16
- networking.k8s.io/v1beta1 Ingress - removed in K8s 1.22
- batch/v1beta1 CronJob - removed in K8s 1.25
- policy/v1beta1 PodSecurityPolicy - removed in K8s 1.25
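Fix: Update to stable APIs: apps/v1, networking.k8s.io/v1, batch/v1. The kubectl convert plugin can rewrite manifests where a stable equivalent exists. PodSecurityPolicy has no replacement API; migrate to Pod Security Admission instead.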
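A regex scan over apiVersion lines is enough to flag the API groups listed above, even in templates that a YAML parser would reject. This is a sketch under that assumption; the lookup table and findDeprecatedApis are illustrative and match on apiVersion alone, without inspecting kind.

```javascript
// Removal releases as listed in the violations above.
const REMOVED_APIS = new Map([
  ["extensions/v1beta1", "1.22 for Ingress, 1.16 for workload kinds"],
  ["apps/v1beta1", "1.16"],
  ["apps/v1beta2", "1.16"],
  ["networking.k8s.io/v1beta1", "1.22"],
  ["batch/v1beta1", "1.25"],
  ["policy/v1beta1", "1.25"],
]);

function findDeprecatedApis(text) {
  const findings = [];
  // Only top-level "apiVersion:" lines are considered; quotes are optional.
  for (const match of text.matchAll(/^apiVersion:\s*["']?([^\s"'#]+)/gm)) {
    const removedIn = REMOVED_APIS.get(match[1]);
    if (removedIn) findings.push({ apiVersion: match[1], removedIn });
  }
  return findings;
}
```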
11. Argo Rollouts (MEDIUM)
RULE: Rollouts must have valid strategy configuration. Invalid config = failed deployments.
Violations:
- Rollout without strategy - no deployment strategy
- Canary without steps - no gradual rollout
- Canary without analysis - no automated validation
- BlueGreen without activeService - no active service defined
- BlueGreen without previewService - can't preview before promotion
- Missing revisionHistoryLimit - old ReplicaSets accumulate
- Missing progressDeadlineSeconds - stuck rollouts don't time out
Fix: Configure proper canary steps with analysis, or blueGreen with activeService/previewService.
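The Rollout rules can be sketched as a check over a parsed Rollout spec. The helper auditRollout is hypothetical, and it assumes that either a background analysis block on the canary strategy or an inline analysis step counts as automated validation.

```javascript
// spec: the "spec" object of a parsed Argo Rollouts Rollout resource.
function auditRollout(spec) {
  const findings = [];
  const { canary, blueGreen } = spec.strategy ?? {};
  if (!canary && !blueGreen) findings.push("no strategy");
  if (canary) {
    const steps = canary.steps ?? [];
    if (steps.length === 0) findings.push("canary without steps");
    // Analysis may be a background block or an inline step (assumption above).
    const hasAnalysis = Boolean(canary.analysis) || steps.some((step) => step.analysis);
    if (!hasAnalysis) findings.push("canary without analysis");
  }
  if (blueGreen) {
    for (const field of ["activeService", "previewService"]) {
      if (!blueGreen[field]) findings.push(`blueGreen without ${field}`);
    }
  }
  for (const field of ["revisionHistoryLimit", "progressDeadlineSeconds"]) {
    if (spec[field] === undefined) findings.push(`missing ${field}`);
  }
  return findings;
}
```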
12. Ingress TLS (MEDIUM)
RULE: Ingress must have TLS configuration. No TLS = unencrypted traffic.
Violations:
- Ingress with hosts but no TLS - traffic unencrypted
- TLS without secretName - certificate source unclear
- No ingressClassName - may use wrong controller
- Missing cert-manager annotations - no automated certificates
- Deprecated kubernetes.io/ingress.class annotation
- No SSL redirect annotation - HTTP doesn't redirect to HTTPS
Fix: Add TLS section with secretName, use cert-manager.io/cluster-issuer annotation for automated certs.
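For a parsed Ingress object, most of the checks above reduce to a few field lookups. In this sketch, auditIngress is a hypothetical helper, and accepting either cert-manager.io/cluster-issuer or cert-manager.io/issuer is an assumption. The SSL redirect check is omitted because the annotation name depends on the ingress controller.

```javascript
// ingress: a parsed networking.k8s.io/v1 Ingress resource.
function auditIngress(ingress) {
  const findings = [];
  const spec = ingress.spec ?? {};
  const annotations = ingress.metadata?.annotations ?? {};
  const hosts = (spec.rules ?? []).map((rule) => rule.host).filter(Boolean);
  const tls = spec.tls ?? [];
  if (hosts.length > 0 && tls.length === 0) findings.push("hosts without TLS");
  if (tls.some((entry) => !entry.secretName)) findings.push("TLS without secretName");
  if (!spec.ingressClassName) findings.push("no ingressClassName");
  if ("kubernetes.io/ingress.class" in annotations) {
    findings.push("deprecated ingress.class annotation");
  }
  const hasIssuer =
    "cert-manager.io/cluster-issuer" in annotations || "cert-manager.io/issuer" in annotations;
  if (!hasIssuer) findings.push("no cert-manager annotation");
  return findings;
}
```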
13. GPU Resources (LOW)
RULE: GPU workloads need proper configuration. Missing config = scheduling failures.
Violations:
- GPU limits without matching requests - should be equal
- No GPU toleration - won't schedule on GPU nodes
- No GPU nodeSelector/affinity - relies only on resource availability
- No runtimeClassName - may need nvidia runtime
Fix: Set nvidia.com/gpu in both requests and limits (equal values), add GPU tolerations and nodeSelector.
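A sketch of the GPU checks on a parsed pod spec follows. It assumes GPU nodes are tainted with the key nvidia.com/gpu, which is a common convention but depends on the cluster; auditGpu is a hypothetical name, and the runtimeClassName check is left out.

```javascript
const GPU = "nvidia.com/gpu";

// podSpec: the pod template "spec" object of a workload.
function auditGpu(podSpec) {
  const findings = [];
  const usesGpu = (container) =>
    container.resources?.limits?.[GPU] !== undefined ||
    container.resources?.requests?.[GPU] !== undefined;
  const gpuContainers = (podSpec.containers ?? []).filter(usesGpu);
  if (gpuContainers.length === 0) return findings; // not a GPU workload
  for (const container of gpuContainers) {
    const { requests = {}, limits = {} } = container.resources;
    // Compare as strings so 1 and "1" are treated as equal.
    if (String(requests[GPU]) !== String(limits[GPU])) {
      findings.push(`${container.name}: GPU requests and limits differ`);
    }
  }
  // Assumed taint key; adjust to the taint actually used on GPU nodes.
  const tolerated = (podSpec.tolerations ?? []).some((toleration) => toleration.key === GPU);
  if (!tolerated) findings.push("no GPU toleration");
  if (!podSpec.nodeSelector && !podSpec.affinity) findings.push("no GPU nodeSelector/affinity");
  return findings;
}
```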
Detection Philosophy
This skill uses VALUE-BASED detection:
- Detects issues by actual values and patterns, not by variable/field names
- Future-proof: new charts with issues are automatically detected
- No need to update scripts when new charts are added
Parsing Strategy
- Chart.yaml, values.yaml: YAML content parsed via regex patterns
- templates/*.yaml: Regex-based parsing (Go template syntax breaks YAML parsers)
- Multi-document YAML: Handles --- separators
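Splitting on --- separators can be done without a YAML parser, which keeps Go template syntax from getting in the way. A minimal sketch, assuming separators sit alone on their line; splitYamlDocuments is a hypothetical name, not necessarily what the scripts use.

```javascript
// Split a multi-document manifest on lines consisting only of "---".
function splitYamlDocuments(text) {
  return text
    .split(/^---[ \t]*$/m)
    .map((doc) => doc.trim())
    .filter((doc) => doc.length > 0); // drop empty documents
}
```

Each returned string can then be scanned independently, so a finding in one document is not attributed to its neighbor.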
Safety
- Read-only operation (except report generation)
- No Helm releases modified
- No cluster changes