SKILL.md

name: ecs
description: AWS ECS container orchestration for running Docker containers. Use when deploying containerized applications, configuring task definitions, setting up services, managing clusters, or troubleshooting container issues.
last_updated: 2026-01-07
doc_source: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/

AWS ECS

Amazon Elastic Container Service (ECS) is a fully managed container orchestration service. Run containers on AWS Fargate (serverless) or EC2 instances.

Core Concepts

Cluster

Logical grouping of tasks or services. Can contain Fargate tasks, EC2 instances, or both.

Task Definition

Blueprint for your application. Defines containers, resources, networking, and IAM roles.

Task

Running instance of a task definition. Can run standalone or as part of a service.

Service

Maintains desired count of tasks. Handles deployments, load balancing, and auto scaling.
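
To see how these pieces nest in a live account, the following is a minimal sketch that prints each service in a cluster with the tasks it currently runs. `show_cluster_tree` is a hypothetical helper name, and the cluster name passed to it is a placeholder.

```shell
# Sketch: print cluster -> service -> task for one cluster.
# show_cluster_tree is a hypothetical helper, not an AWS CLI command.
show_cluster_tree() {
  local cluster="$1" svc name
  # list-services returns tab-separated ARNs with --output text.
  for svc in $(aws ecs list-services --cluster "$cluster" \
      --query 'serviceArns[]' --output text); do
    name="${svc##*/}"            # keep only the service name after the last "/"
    echo "service: ${name}"
    aws ecs list-tasks --cluster "$cluster" --service-name "$name" \
      --query 'taskArns[]' --output text \
      | tr '\t' '\n' | grep . | sed 's/^/  task: /'
  done
}
```

Usage: `show_cluster_tree my-cluster`. Standalone tasks started with run-task do not belong to a service, so they do not appear in this listing.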

Launch Types

| Type | Description | Use Case |
|------|-------------|----------|
| Fargate | Serverless, pay per task | Most workloads |
| EC2 | Self-managed instances | GPU, Windows, specific requirements |

Common Patterns

Create a Fargate Cluster

AWS CLI:

# Create a cluster with default settings
aws ecs create-cluster --cluster-name my-cluster

# Or create it with the Fargate and Fargate Spot capacity providers, weighted equally
aws ecs create-cluster \
  --cluster-name my-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1 \
    capacityProvider=FARGATE_SPOT,weight=1

Register Task Definition

cat > task-definition.json << 'EOF'
{
  "family": "web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "NODE_ENV", "value": "production"}
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
EOF

aws ecs register-task-definition --cli-input-json file://task-definition.json
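
Fargate only accepts certain task-level "cpu" and "memory" pairs, and registration fails on an unsupported combination. The following is a minimal sketch of a pre-flight check, assuming the CPU/memory table published in the ECS developer guide at the time of writing. `fargate_size_ok` is a hypothetical helper name.

```shell
# Sketch: validate a Fargate task size (CPU units, memory in MiB).
# fargate_size_ok is a hypothetical helper, not an AWS CLI command.
fargate_size_ok() {
  local cpu="$1" mem="$2" min max step
  case "$cpu" in
    256)
      # 0.25 vCPU accepts exactly these three memory values.
      case "$mem" in 512|1024|2048) return 0 ;; *) return 1 ;; esac ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *) return 1 ;;                      # unsupported CPU value
  esac
  # Memory must be numeric, inside the range, and on a step boundary.
  case "$mem" in ''|*[!0-9]*) return 1 ;; esac
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] \
    && [ $(( ($mem - $min) % $step )) -eq 0 ]
}
```

Usage: `fargate_size_ok 256 512 && aws ecs register-task-definition --cli-input-json file://task-definition.json`.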

Create Service with Load Balancer

aws ecs create-service \
  --cluster my-cluster \
  --service-name web-service \
  --task-definition web-app:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={
    subnets=[subnet-12345678,subnet-87654321],
    securityGroups=[sg-12345678],
    assignPublicIp=DISABLED
  }" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/1234567890123456,containerName=web,containerPort=8080" \
  --health-check-grace-period-seconds 60

Run Standalone Task

aws ecs run-task \
  --cluster my-cluster \
  --task-definition my-batch-job:1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={
    subnets=[subnet-12345678],
    securityGroups=[sg-12345678],
    assignPublicIp=ENABLED
  }"
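
For batch jobs it is often useful to block until the task finishes and surface its exit code. The following is a minimal sketch, assuming a single-container task on Fargate in a public subnet. `run_task_and_wait` is a hypothetical helper name, and all four arguments are placeholders.

```shell
# Sketch: run a standalone task, wait for it to stop, report the exit code.
# run_task_and_wait is a hypothetical helper, not an AWS CLI command.
run_task_and_wait() {
  local cluster="$1" task_def="$2" subnet="$3" sg="$4" task_arn code

  # Start the task and keep its ARN for the follow-up calls.
  task_arn=$(aws ecs run-task \
    --cluster "$cluster" \
    --task-definition "$task_def" \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[${subnet}],securityGroups=[${sg}],assignPublicIp=ENABLED}" \
    --query 'tasks[0].taskArn' --output text) || return 1
  [ -n "$task_arn" ] && [ "$task_arn" != "None" ] || return 1

  # The waiter polls describe-tasks and gives up after a fixed number of attempts.
  aws ecs wait tasks-stopped --cluster "$cluster" --tasks "$task_arn" || return 1

  # Read the exit code of the first container.
  code=$(aws ecs describe-tasks --cluster "$cluster" --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' --output text)
  echo "task ${task_arn} exited with ${code}"
  [ "$code" = "0" ]
}
```

Usage: `run_task_and_wait my-cluster my-batch-job:1 subnet-12345678 sg-12345678`. The function returns non-zero when the container exit code is not 0, so it can gate a pipeline step.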

Update Service (Deploy New Image)

# Register new task definition with updated image
aws ecs register-task-definition --cli-input-json file://task-definition.json

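# Update the service to the new revision (web-app:2 assumes the registration above produced revision 2).
# --force-new-deployment is only required when the task definition is unchanged,
# for example to re-pull a mutable tag such as :latest.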
aws ecs update-service \
  --cluster my-cluster \
  --service web-service \
  --task-definition web-app:2 \
  --force-new-deployment
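
A hard-coded revision number only works if the registration produced exactly that revision. The following is a minimal sketch that captures the ARN of the revision just registered, deploys it, and waits for the service to settle. `deploy_revision` is a hypothetical helper name, and its arguments are placeholders.

```shell
# Sketch: register a new revision and roll the service onto it.
# deploy_revision is a hypothetical helper, not an AWS CLI command.
deploy_revision() {
  local cluster="$1" service="$2" file="$3" arn

  # Capture the ARN of the revision that was just registered.
  arn=$(aws ecs register-task-definition \
    --cli-input-json "file://${file}" \
    --query 'taskDefinition.taskDefinitionArn' \
    --output text) || return 1

  # Point the service at that exact revision; this starts a new deployment.
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$arn" \
    --query 'service.taskDefinition' \
    --output text || return 1

  # Block until the deployment settles or the waiter times out.
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
}
```

Usage: `deploy_revision my-cluster web-service task-definition.json`.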

Auto Scaling

# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/my-cluster/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 \
  --max-capacity 10

# Target tracking policy
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/my-cluster/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 120
  }'
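
After registering the policy, it can help to confirm that scaling actually happens. The following is a minimal sketch that lists recent scaling activity for a service. `recent_scaling_activity` is a hypothetical helper name, and the cluster and service names are placeholders.

```shell
# Sketch: show the latest scaling activities for an ECS service.
# recent_scaling_activity is a hypothetical helper, not an AWS CLI command.
recent_scaling_activity() {
  local cluster="$1" service="$2"
  aws application-autoscaling describe-scaling-activities \
    --service-namespace ecs \
    --resource-id "service/${cluster}/${service}" \
    --max-items 5 \
    --query 'ScalingActivities[].{time:StartTime,status:StatusCode,cause:Cause}' \
    --output table
}
```

Usage: `recent_scaling_activity my-cluster web-service`. An empty table usually just means the metric has not crossed the target yet.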

CLI Reference

Cluster Management

| Command | Description |
|---------|-------------|
| aws ecs create-cluster | Create cluster |
| aws ecs describe-clusters | Get cluster details |
| aws ecs list-clusters | List clusters |
| aws ecs delete-cluster | Delete cluster |

Task Definitions

| Command | Description |
|---------|-------------|
| aws ecs register-task-definition | Create task definition |
| aws ecs describe-task-definition | Get task definition |
| aws ecs list-task-definitions | List task definitions |
| aws ecs deregister-task-definition | Deregister version |

Services

| Command | Description |
|---------|-------------|
| aws ecs create-service | Create service |
| aws ecs update-service | Update service |
| aws ecs describe-services | Get service details |
| aws ecs delete-service | Delete service |

Tasks

| Command | Description |
|---------|-------------|
| aws ecs run-task | Run standalone task |
| aws ecs stop-task | Stop running task |
| aws ecs describe-tasks | Get task details |
| aws ecs list-tasks | List tasks |

Best Practices

Security

  • Use task roles for AWS API access (not access keys)
  • Use execution roles for ECR/Secrets access
  • Store secrets in Secrets Manager or Parameter Store
  • Use private subnets with NAT gateway
  • Enable CloudTrail for API auditing
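
The execution role needs permission to read each secret referenced in the task definition. The following is a minimal sketch that prints a least-privilege policy statement for one secret. `secret_read_statement` is a hypothetical helper name, and the ARN passed in is a placeholder.

```shell
# Sketch: emit an IAM statement letting the execution role read one secret.
# secret_read_statement is a hypothetical helper, not an AWS CLI command.
secret_read_statement() {
  local secret_arn="$1" action
  case "$secret_arn" in
    arn:aws:secretsmanager:*) action="secretsmanager:GetSecretValue" ;;
    arn:aws:ssm:*)            action="ssm:GetParameters" ;;
    *) echo "unsupported ARN: ${secret_arn}" >&2; return 1 ;;
  esac
  # Print the statement; paste it into the execution role's policy document.
  cat <<EOF
{
  "Effect": "Allow",
  "Action": ["${action}"],
  "Resource": "${secret_arn}"
}
EOF
}
```

Usage: `secret_read_statement arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password`. If the secret is encrypted with a customer managed KMS key, the role likely also needs kms:Decrypt on that key.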

Performance

  • Right-size CPU/memory: monitor utilization and adjust
  • Use Fargate Spot for fault-tolerant workloads (up to 70% savings compared with on-demand Fargate)
  • Enable Container Insights for monitoring
  • Use service discovery for internal communication
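
Container Insights is a per-cluster setting. The following is a minimal sketch for turning it on for an existing cluster. `enable_container_insights` is a hypothetical helper name, and the cluster name is a placeholder.

```shell
# Sketch: enable CloudWatch Container Insights on an existing cluster.
# enable_container_insights is a hypothetical helper, not an AWS CLI command.
enable_container_insights() {
  local cluster="$1"
  aws ecs update-cluster-settings \
    --cluster "$cluster" \
    --settings name=containerInsights,value=enabled \
    --query 'cluster.settings' \
    --output json
}
```

Usage: `enable_container_insights my-cluster`. Container Insights metrics are billed as CloudWatch custom metrics, so it is worth checking the cost on large clusters.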

Reliability

  • Deploy across multiple AZs
  • Configure health checks properly
  • Set appropriate deregistration delay
  • Use circuit breaker for deployments

Enable the circuit breaker with automatic rollback:

aws ecs update-service \
  --cluster my-cluster \
  --service web-service \
  --deployment-configuration '{
    "deploymentCircuitBreaker": {
      "enable": true,
      "rollback": true
    }
  }'
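
The deregistration delay is an attribute of the load balancer target group, not of the ECS service. The following is a minimal sketch for lowering it from the 300-second default. `set_deregistration_delay` is a hypothetical helper name, and the target group ARN is a placeholder.

```shell
# Sketch: set how long the load balancer drains connections before removing a task.
# set_deregistration_delay is a hypothetical helper, not an AWS CLI command.
set_deregistration_delay() {
  local target_group_arn="$1" seconds="$2"
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$target_group_arn" \
    --attributes "Key=deregistration_delay.timeout_seconds,Value=${seconds}"
}
```

Usage: `set_deregistration_delay arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/1234567890123456 30`. A value slightly above the longest expected request duration is a reasonable starting point.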

Cost Optimization

  • Use Fargate Spot for batch workloads
  • Right-size task resources
  • Scale to zero when not needed
  • Use capacity providers for mixed Fargate/Spot
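
Two of these points can be sketched with update-service. The example assumes the cluster already has the FARGATE and FARGATE_SPOT capacity providers attached and a recent AWS CLI version. `scale_service` and `use_mixed_fargate_capacity` are hypothetical helper names, and the 1:3 weighting is only an illustration.

```shell
# Sketch: set the desired task count (0 stops all tasks but keeps the service).
# scale_service is a hypothetical helper, not an AWS CLI command.
scale_service() {
  local cluster="$1" service="$2" count="$3"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --desired-count "$count" \
    --query 'service.desiredCount' \
    --output text
}

# Sketch: keep one task on regular Fargate (base=1), then place roughly
# three Spot tasks for every additional on-demand task.
# use_mixed_fargate_capacity is a hypothetical helper, not an AWS CLI command.
use_mixed_fargate_capacity() {
  local cluster="$1" service="$2"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --capacity-provider-strategy \
      capacityProvider=FARGATE,weight=1,base=1 \
      capacityProvider=FARGATE_SPOT,weight=3 \
    --force-new-deployment
}
```

Usage: `scale_service my-cluster web-service 0` outside working hours, then `scale_service my-cluster web-service 2` to resume. If a scalable target is registered for the service, its minimum capacity may also need lowering for the count to stay at zero.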

Troubleshooting

Task Fails to Start

Check:

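# View a stopped task and why it stopped (ECS only keeps stopped tasks visible for a short time)
TASK_ARN=$(aws ecs list-tasks --cluster my-cluster --desired-status STOPPED \
  --query 'taskArns[0]' --output text)
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks "$TASK_ARN" \
  --query 'tasks[0].{stoppedReason:stoppedReason,containers:containers[].{name:name,reason:reason,exitCode:exitCode}}'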

Common causes:

  • Image not found (ECR permissions)
  • Secrets access denied
  • Network configuration (subnets, security groups)
  • Resource limits exceeded

Container Keeps Restarting

Debug:

# Check CloudWatch logs
aws logs get-log-events \
  --log-group-name /ecs/web-app \
  --log-stream-name "ecs/web/abc123"

# Check task details
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks task-arn \
  --query 'tasks[0].containers[0].{reason:reason,exitCode:exitCode}'

Causes:

  • Health check failing
  • Application crashing
  • Out of memory (typically exit code 137 with an OutOfMemoryError reason)
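
The exit code returned by describe-tasks often narrows down which of these causes applies. The following is a minimal sketch that maps common codes to a likely explanation, using the usual 128 + signal number convention. `explain_exit_code` is a hypothetical helper name, and the hints are heuristics rather than a diagnosis.

```shell
# Sketch: translate a container exit code into a likely cause.
# explain_exit_code is a hypothetical helper, not an AWS CLI command.
explain_exit_code() {
  case "$1" in
    0)   echo "clean exit: an essential container that exits still stops the task" ;;
    137) echo "SIGKILL (128+9): often the memory limit, or the stop timeout expiring" ;;
    139) echo "SIGSEGV (128+11): the process crashed with a segmentation fault" ;;
    143) echo "SIGTERM (128+15): the container was asked to stop, e.g. during a deployment" ;;
    *)   echo "application-defined exit code: check the container logs" ;;
  esac
}
```

Usage: `explain_exit_code 137`.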

Service Stuck Deploying

# Check deployment status
aws ecs describe-services \
  --cluster my-cluster \
  --services web-service \
  --query 'services[0].deployments'

# Check events
aws ecs describe-services \
  --cluster my-cluster \
  --services web-service \
  --query 'services[0].events[:5]'

Causes:

  • Health check failing on new tasks
  • Not enough capacity
  • Target group health checks failing
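
When the target group is the suspect, the load balancer's own view of each task is usually the quickest check. The following is a minimal sketch that lists only the targets that are not healthy. `unhealthy_targets` is a hypothetical helper name, and the target group ARN is a placeholder.

```shell
# Sketch: list targets the load balancer does not consider healthy, with the reason.
# unhealthy_targets is a hypothetical helper, not an AWS CLI command.
unhealthy_targets() {
  local target_group_arn="$1"
  aws elbv2 describe-target-health \
    --target-group-arn "$target_group_arn" \
    --query "TargetHealthDescriptions[?TargetHealth.State!='healthy'].{target:Target.Id,state:TargetHealth.State,reason:TargetHealth.Reason,detail:TargetHealth.Description}" \
    --output table
}
```

Usage: `unhealthy_targets arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-tg/1234567890123456`. Targets that time out often point at a security group rule or a wrong health check path or port.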

Cannot Pull Image from ECR

Check execution role has:

{
  "Effect": "Allow",
  "Action": [
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage"
  ],
  "Resource": "*"
}

Also check:

  • Private subnets need a route to ECR: either a NAT gateway, or VPC endpoints for ECR (ecr.api and ecr.dkr) plus an S3 gateway endpoint for image layers
  • Security group allows HTTPS outbound
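
For tasks in private subnets without a NAT gateway, the endpoint check can be sketched as below. The example assumes the ECR API, ECR Docker, and S3 endpoints are the ones needed for an image pull. `list_vpc_endpoint_services` and `missing_ecr_endpoints` are hypothetical helper names, and the VPC ID and region are placeholders.

```shell
# Sketch: list the service names of all endpoints in a VPC, one per line.
# list_vpc_endpoint_services is a hypothetical helper, not an AWS CLI command.
list_vpc_endpoint_services() {
  local vpc_id="$1"
  aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'VpcEndpoints[].ServiceName' \
    --output text | tr '\t' '\n'
}

# Sketch: print the endpoints an ECR pull needs that are absent from the list.
# Arguments: region, newline-separated list of existing endpoint service names.
missing_ecr_endpoints() {
  local region="$1" present="$2" svc
  for svc in ecr.api ecr.dkr s3; do
    printf '%s\n' "$present" | grep -qxF "com.amazonaws.${region}.${svc}" \
      || echo "com.amazonaws.${region}.${svc}"
  done
}
```

Usage: `missing_ecr_endpoints us-east-1 "$(list_vpc_endpoint_services vpc-12345678)"`. Empty output means all three endpoints exist. Tasks that use awslogs or Secrets Manager without a NAT gateway would need endpoints for those services as well.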
