ECS Production Deployment with Task Definitions

TL;DR: Step-by-step guide for deploying a service to AWS ECS, covering tool setup, ECR setup, task definitions, and CodePipeline integration.

Prerequisites

Tool Setup

  1. Install jq for JSON processing:
brew install jq
  2. Install AWS CLI (v2.4+ recommended):
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

Note: Older AWS CLI versions may fail. This guide was verified with aws-cli/2.4.6; check the installed version with: aws --version

  3. Configure AWS Profile:
aws configure --profile prodecs

Required inputs:

  • AWS Access Key ID
  • AWS Secret Access Key
  • Default region: ap-south-1
  • Default output format: json
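
The tool setup above can be sanity-checked before going further. The following is a minimal sketch, not part of the original scripts: the function names (aws_cli_version, version_ge, check_aws_setup) and the MIN_AWS_CLI variable are hypothetical, and the 2.4.0 floor is an assumption derived from the "v2.4+ recommended" note. It parses the banner printed by aws --version and then asks STS who the prodecs profile resolves to.

```shell
# Minimum AWS CLI version (assumption taken from the "v2.4+ recommended" note).
MIN_AWS_CLI="2.4.0"

# Extract "2.4.6" from a banner such as "aws-cli/2.4.6 Python/3.8.8 Darwin/21.1.0".
aws_cli_version() {
  printf '%s\n' "$1" | sed -n 's|^aws-cli/\([0-9][0-9.]*\).*|\1|p'
}

# Return 0 when version $1 >= version $2 (numeric comparison of major.minor.patch).
version_ge() {
  local IFS=.
  local -a a=($1) b=($2)
  local i x y
  for i in 0 1 2; do
    x=${a[i]:-0}
    y=${b[i]:-0}
    if (( 10#$x > 10#$y )); then return 0; fi
    if (( 10#$x < 10#$y )); then return 1; fi
  done
  return 0
}

# Check the installed CLI version, then confirm the profile's credentials resolve.
check_aws_setup() {
  local profile="${1:-prodecs}"
  local found
  found="$(aws_cli_version "$(aws --version 2>&1)")"
  if ! version_ge "$found" "$MIN_AWS_CLI"; then
    echo "aws-cli '$found' is older than $MIN_AWS_CLI" >&2
    return 1
  fi
  aws sts get-caller-identity --profile "$profile" --output json
}
```

Usage: source the snippet and run check_aws_setup prodecs. A JSON document with the account ID and caller ARN indicates the profile is usable.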

ECR Repository Setup

Create an ECR repository for your service. The repository URI has the following form:

<account-id>.dkr.ecr.ap-south-1.amazonaws.com/<service-name>
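
One possible way to create the repository and push a first image from the command line is sketched below. The function names (ecr_repo_uri, create_and_push) are hypothetical, the prodecs profile and ap-south-1 region are taken from the setup section, and the sketch assumes Docker is installed and a local image tagged <service-name>:latest has already been built.

```shell
# Compose the repository URI in the format shown above.
ecr_repo_uri() {
  local account_id="$1" region="$2" service="$3"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s\n' "$account_id" "$region" "$service"
}

# Create the repository, log Docker in to the registry, then tag and push.
# Assumes a local image named "$service:latest" already exists.
create_and_push() {
  local account_id="$1" service="$2"
  local region="${3:-ap-south-1}" profile="${4:-prodecs}"
  local uri
  uri="$(ecr_repo_uri "$account_id" "$region" "$service")"

  aws ecr create-repository --repository-name "$service" \
    --region "$region" --profile "$profile" || return 1

  # "${uri%%/*}" strips the repository name, leaving only the registry host.
  aws ecr get-login-password --region "$region" --profile "$profile" |
    docker login --username AWS --password-stdin "${uri%%/*}" || return 1

  docker tag "$service:latest" "$uri:latest" &&
    docker push "$uri:latest"
}
```

Usage: create_and_push 123456789012 my-service, where both arguments are placeholders for your own account ID and service name.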

Task Definition Configuration

Sample task definition for a Java service:

{
  "executionRoleArn": "arn:aws:iam::<account-id>:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "service-name",
      "image": "<account-id>.dkr.ecr.ap-south-1.amazonaws.com/service-name:latest",
      "essential": true,
      "logConfiguration": {
        "logDriver": "gelf",
        "options": {
          "gelf-address": "udp://<log-lb-url>:12201/"
        }
      },
      "environment": [
        { "name": "application_environment", "value": "prod" },
        { "name": "JAVA_OPTS", "value": "-Xms2048m" },
        { "name": "spring.profiles.active", "value": "prod" },
        { "name": "jasypt.encryptor.password", "value": "<secret>" }
      ],
      "portMappings": [
        { "hostPort": 8080, "protocol": "tcp", "containerPort": 8080 }
      ]
    }
  ],
  "requiresCompatibilities": ["EC2"],
  "cpu": "1900",
  "memory": "4096",
  "family": "service-name"
}

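Security Note: Replace the <secret> placeholder for jasypt.encryptor.password with the actual production secret at deploy time, and never commit the filled-in task definition to version control. Where possible, supply the value through the task definition's secrets field (backed by AWS Secrets Manager or SSM Parameter Store) instead of a plaintext environment value.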
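
If the task definition is registered by hand rather than by the deployment script, a pre-flight check along these lines can catch a template that still contains placeholders. This is a sketch under assumptions: find_placeholders and register_taskdef are hypothetical helpers, the file path is whatever you saved the JSON as, and the placeholder pattern only covers the lowercase <...> style used in the sample above.

```shell
# List any unreplaced <placeholder> tokens left in a task definition file.
# Prints matching lines with line numbers; returns 0 if at least one is found.
find_placeholders() {
  grep -nE '<[a-z][a-z-]*>' "$1"
}

# Validate the file, then register it as a new revision of its family.
register_taskdef() {
  local file="$1" region="${2:-ap-south-1}" profile="${3:-prodecs}"

  if find_placeholders "$file" >&2; then
    echo "Refusing to register: placeholders remain in $file" >&2
    return 1
  fi

  # jq exits non-zero on malformed JSON.
  jq empty "$file" || return 1

  # Prints the ARN of the newly registered revision.
  aws ecs register-task-definition \
    --cli-input-json "file://$file" \
    --region "$region" --profile "$profile" \
    --query 'taskDefinition.taskDefinitionArn' --output text
}
```

Usage: register_taskdef taskdef.json. The function stops before calling AWS if any <account-id>, <log-lb-url>, or <secret> token is still present.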

Deployment Steps

  1. Navigate to the deployment scripts folder
  2. Execute the deployment script:
bash ecs_deployment.sh
  3. Follow the prompts for environment-specific values
  4. Monitor AWS CodePipeline for build status
  5. Provide manual approval when prompted (the first build takes ~27 minutes)
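
The monitoring step can also be done from a terminal instead of the console. The sketch below is an assumption about how one might poll, not something ecs_deployment.sh is known to do: the function names are hypothetical and the pipeline name must be supplied by you. It reads per-stage status from aws codepipeline get-pipeline-state and reduces it to one verdict. A stage waiting on manual approval reports InProgress, so the loop keeps polling until someone approves.

```shell
# Print "<stage><TAB><status>" for every stage of a pipeline.
pipeline_stage_status() {
  local pipeline="$1" region="${2:-ap-south-1}" profile="${3:-prodecs}"
  aws codepipeline get-pipeline-state --name "$pipeline" \
    --region "$region" --profile "$profile" \
    --query 'stageStates[].[stageName, latestExecution.status]' --output text
}

# Reduce stage statuses (read from stdin) to a single verdict:
# the first Failed/Stopped/Cancelled stage wins, otherwise InProgress
# if any stage has not succeeded yet, otherwise Succeeded.
pipeline_verdict() {
  local stage status verdict="Succeeded"
  while IFS=$'\t' read -r stage status; do
    case "$status" in
      Failed|Stopped|Cancelled) echo "$status"; return 0 ;;
      Succeeded) ;;
      *) verdict="InProgress" ;;
    esac
  done
  echo "$verdict"
}

# Poll until the pipeline leaves the InProgress state.
# Note: start polling after the new execution has begun, otherwise the
# statuses of the previous run are reported.
watch_pipeline() {
  local pipeline="$1" interval="${2:-60}" verdict
  while :; do
    verdict="$(pipeline_stage_status "$pipeline" | pipeline_verdict)"
    echo "$(date '+%H:%M:%S') $pipeline: $verdict"
    [ "$verdict" = "InProgress" ] || break
    sleep "$interval"
  done
  [ "$verdict" = "Succeeded" ]
}
```

Usage: watch_pipeline my-service-pipeline 60, where the pipeline name is a placeholder.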

Post-Deployment Verification

  1. ✅ Verify new instances are running
  2. ✅ Check that the logging service receives logs
  3. ✅ Confirm APM metrics are flowing
  4. ✅ Verify traffic shifts to the new instances
  5. ✅ Stop old instances after traffic migration
  6. ✅ Disable CloudWatch alarms for decommissioned instances
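
The first item on this checklist can be backed by a quick look at the service's deployments. The following is a minimal sketch with hypothetical function names; the cluster and service names are whatever your deployment created. It treats a rollout as complete when only the PRIMARY deployment remains and its running count matches its desired count.

```shell
# Print "<status> <runningCount> <desiredCount>" for each deployment of a service.
show_deployments() {
  local cluster="$1" service="$2" region="${3:-ap-south-1}" profile="${4:-prodecs}"
  aws ecs describe-services --cluster "$cluster" --services "$service" \
    --region "$region" --profile "$profile" \
    --query 'services[0].deployments[].[status, runningCount, desiredCount]' \
    --output text
}

# Read deployment lines from stdin; succeed only when exactly one deployment
# is listed, it is PRIMARY, and running == desired.
rollout_complete() {
  local status running desired lines=0 ok=1
  while read -r status running desired; do
    lines=$((lines + 1))
    if [ "$status" != "PRIMARY" ] || [ "$running" != "$desired" ]; then
      ok=0
    fi
  done
  [ "$lines" -eq 1 ] && [ "$ok" -eq 1 ]
}
```

Usage: show_deployments my-cluster my-service | rollout_complete && echo "rollout complete". The built-in waiter aws ecs wait services-stable --cluster my-cluster --services my-service is an alternative that blocks until the service settles.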

Validation Curls

# Health check
curl --location --request GET 'http://172.xx.xx.xx:8080/actuator/health'

# Cache clear (if needed)
curl --location --request GET 'http://172.xx.xx.xx:8080/service/evictCache'
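
A freshly started Java service may need some time before the health endpoint responds, so a retry loop around the health check can be convenient. This is a sketch under assumptions: is_up and wait_for_health are hypothetical helpers, and the body check assumes the actuator returns compact JSON whose first key is the overall status, for example {"status":"UP"}.

```shell
# True when an actuator health body reports an overall status of UP.
# Anchored at the start so a nested component's "UP" does not count.
is_up() {
  printf '%s' "$1" | grep -Eq '^\{"status":"UP"'
}

# Poll the health endpoint until it reports UP or the attempts run out.
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}"
  local i body
  for (( i = 1; i <= attempts; i++ )); do
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
    if body="$(curl --silent --fail --max-time 5 "$url")" && is_up "$body"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts: $url" >&2
  return 1
}
```

Usage: wait_for_health 'http://172.xx.xx.xx:8080/actuator/health' 30 10, with the instance's private IP substituted in.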

Cleanup Checklist

When decommissioning old infrastructure, remove or clean up each of the following:

  • CodeBuild service role
  • CodePipeline
  • CodeBuild project
  • Capacity Provider
  • Task definition (deregister)
  • ECS service
  • ECS cluster
  • Launch template
  • Auto scaling group
  • CloudWatch alarms
  • Target group (deregister the old targets)
  • EC2 instances
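
The ECS entries on this checklist can be scripted. The sketch below covers only the service, task definition, and cluster; the CodeBuild, CodePipeline, ASG, and CloudWatch items are left to the console or separate commands. The run and teardown_ecs names are hypothetical, and the ordering (scale to zero, delete service, deregister the revision, delete cluster) is an assumption based on the fact that a cluster cannot be deleted while it still has services. Commands are only printed unless DRY_RUN=0 is set.

```shell
# Echo commands instead of running them unless DRY_RUN=0 is set explicitly.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Tear down the ECS pieces of an old deployment.
# $3 is a task definition revision such as "service-name:3".
teardown_ecs() {
  local cluster="$1" service="$2" taskdef="$3"
  local region="${4:-ap-south-1}" profile="${5:-prodecs}"
  local common=(--region "$region" --profile "$profile")

  # A service must be scaled to zero (or force-deleted) before deletion.
  run aws ecs update-service --cluster "$cluster" --service "$service" \
    --desired-count 0 "${common[@]}"
  run aws ecs delete-service --cluster "$cluster" --service "$service" "${common[@]}"
  run aws ecs deregister-task-definition --task-definition "$taskdef" "${common[@]}"
  # Fails while container instances are still registered to the cluster.
  run aws ecs delete-cluster --cluster "$cluster" "${common[@]}"
}
```

Usage: run teardown_ecs old-cluster old-service old-service:3 first to review the printed commands, then repeat with DRY_RUN=0 in front to execute them.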

Scaling Considerations

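  • Instance type changes: Modify at the ASG level, not in the launch template
  • Task count: Configure the desired count in the service definition
  • Scale-in policy: Consider a step of -1 at a time for gradual reduction
  • Minimum healthy percent: 100% keeps the full desired task count running throughout a deployment, which is what makes it zero-downtime
  • Maximum percent: 200% lets ECS start a complete set of new tasks before stopping the old ones during a rolling update, provided the cluster has spare capacity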
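
To make the two percentages concrete, the sketch below computes how many tasks may run during a deployment and applies the settings with aws ecs update-service. The function names are hypothetical. ECS rounds the minimum up and the maximum down, which the arithmetic mirrors. Since the sample task definition pins hostPort 8080 and does not set a network mode, each additional task most likely needs its own instance, so the upper bound is only reachable if the ASG has that many instances available.

```shell
# Print "<min running> <max running>" task counts during a rolling deployment.
deployment_bounds() {
  local desired="$1" min_pct="${2:-100}" max_pct="${3:-200}"
  local lower=$(( (desired * min_pct + 99) / 100 ))   # rounded up
  local upper=$(( desired * max_pct / 100 ))          # rounded down
  echo "$lower $upper"
}

# Set the desired count and the deployment percentages from the list above.
apply_scaling() {
  local cluster="$1" service="$2" desired="$3"
  local region="${4:-ap-south-1}" profile="${5:-prodecs}"
  aws ecs update-service --cluster "$cluster" --service "$service" \
    --desired-count "$desired" \
    --deployment-configuration "minimumHealthyPercent=100,maximumPercent=200" \
    --region "$region" --profile "$profile"
}
```

Usage: deployment_bounds 4 prints "4 8", meaning a service with four tasks never drops below four and may briefly run eight while new tasks replace old ones.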