# Run a container on a schedule with ECS
I've got a Docker container that I want to run periodically to fetch data and store it in a database. Since this needs to keep running on a schedule indefinitely, I'm using Terraform to manage the infrastructure. It took me a while to figure out all the required resources and permissions, so I thought I'd share my solution here.
To securely run a container on a schedule in ECS, I needed to:
- Initialise Terraform
- Configure any secrets that the container needs
- Create an ECS Fargate cluster, and a VPC for the container to run in
- Create an IAM role for the container to execute as
- Create an IAM role for EventBridge to trigger the ECS task
- Define an EventBridge schedule trigger
- Define a task
## Initialise Terraform
Create `providers.tf` containing the AWS profile and region you want to use:

```hcl
provider "aws" {
  region  = "us-east-2"
  profile = "default"
}
```

Then run `terraform init`.
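Once the resources below are defined, the usual loop is to preview and then apply the changes:

```bash
terraform plan    # preview the resources that will be created
terraform apply   # create (or update) them
```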
## Secrets
The container that I'm running needs access to some secret values as environment variables. I read the two most sensitive items from `tfvars`, and hard-code the role since the Terraform repo is private anyway. The `test/area/role` secret is really just configuration rather than a true secret:
```hcl
variable "AREA_USERNAME" {
  description = "Username for AREA access"
  type        = string
  sensitive   = true
}

variable "AREA_PASSWORD" {
  description = "Password for AREA access"
  type        = string
  sensitive   = true
}

locals {
  my_secrets = {
    "test/area/username" = var.AREA_USERNAME
    "test/area/password" = var.AREA_PASSWORD
    "test/area/role"     = "some_role"
  }
}

resource "aws_secretsmanager_secret" "my_secrets" {
  for_each = local.my_secrets
  name     = each.key
}

resource "aws_secretsmanager_secret_version" "my_secret_values" {
  for_each      = local.my_secrets
  secret_id     = aws_secretsmanager_secret.my_secrets[each.key].id
  secret_string = each.value
}
```
## ECS Cluster and VPC
ECS allows you to scale to zero, but you still need a VPC to spin up containers in. My containers need internet access, so here's how I create an ECS cluster and a VPC with a route out to the internet:
```hcl
# Create an ECS cluster
resource "aws_ecs_cluster" "my_cluster" {
  name = "demo-fargate-cluster"
}

# Create a VPC and network, allowing all egress traffic
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-2a"
  map_public_ip_on_launch = true
}

# You could also use a NAT gateway instead. We use an internet gateway
# due to speed / cost reasons in this example
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}

resource "aws_route_table_association" "public_assoc" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

resource "aws_security_group" "ecs_tasks" {
  name        = "ecs-scheduled-tasks-sg"
  description = "Allow outbound traffic"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```
## Execution IAM role
The container that runs does not have access to AWS Secrets Manager by default. To allow access, I create a new IAM role that ECS assumes (via `sts:AssumeRole`) before launching the container. This role has a policy attached that allows access to the specific secrets in AWS Secrets Manager.
```hcl
# Allow ECS to assume this role
resource "aws_iam_role" "ecs_task_execution" {
  name = "ecs-task-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect = "Allow",
      Principal = {
        Service = "ecs-tasks.amazonaws.com"
      },
      Action = "sts:AssumeRole"
    }]
  })
}

# Create a policy that allows access to the secrets we defined earlier
resource "aws_iam_policy" "ecs_execution_secrets_access" {
  name = "ecs-exec-secrets-access"

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect = "Allow",
      Action = [
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret"
      ],
      Resource = [
        aws_secretsmanager_secret.my_secrets["test/area/username"].arn,
        aws_secretsmanager_secret.my_secrets["test/area/password"].arn,
        aws_secretsmanager_secret.my_secrets["test/area/role"].arn,
      ]
    }]
  })
}

# Attach the above policy to the IAM role
resource "aws_iam_role_policy_attachment" "ecs_exec_secrets_access_attach" {
  role       = aws_iam_role.ecs_task_execution.name
  policy_arn = aws_iam_policy.ecs_execution_secrets_access.arn
}

# Attach the ECS Task execution policy
resource "aws_iam_role_policy_attachment" "ecs_task_execution_attach" {
  role       = aws_iam_role.ecs_task_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```
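Note that this is the *execution* role, which ECS itself uses to pull the image and inject the secrets. If your container also needed to call AWS APIs at runtime, you'd create a separate task role and reference it via `task_role_arn` on the task definition. A minimal sketch, not needed for this example:

```hcl
# Hypothetical task role - only needed if the container itself calls AWS APIs
resource "aws_iam_role" "ecs_task_role" {
  name = "ecs-task-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect    = "Allow",
      Principal = { Service = "ecs-tasks.amazonaws.com" },
      Action    = "sts:AssumeRole"
    }]
  })
}
```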
## Trigger IAM role
Amazon EventBridge also needs an IAM role in order to trigger the ECS task. Here's an IAM role with explicit permission to run the defined tasks.

It took me a long time to realise that granting `ecs:RunTask` on the task definition wasn't enough; I also needed to allow `ecs:RunTask` on the ECS cluster being used.
```hcl
# Allow EventBridge to assume this role
resource "aws_iam_role" "eventbridge_invoke_ecs" {
  name = "eventbridge-ecs-invoke-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect = "Allow",
      Principal = {
        Service = "events.amazonaws.com"
      },
      Action = "sts:AssumeRole"
    }]
  })
}

# Create a new IAM policy that allows us to run all of the defined tasks
# We need to explicitly allow ecs:RunTask for the ecs:cluster too
resource "aws_iam_role_policy" "eventbridge_ecs_policy" {
  name = "invoke-ecs"
  role = aws_iam_role.eventbridge_invoke_ecs.id

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect   = "Allow",
        Action   = "ecs:RunTask",
        Resource = aws_ecs_task_definition.scheduled_my_command.arn
      },
      {
        Effect   = "Allow",
        Action   = "iam:PassRole",
        Resource = aws_iam_role.ecs_task_execution.arn
      },
      {
        Effect   = "Allow",
        Action   = "ecs:RunTask",
        Resource = "*",
        Condition = {
          ArnEquals = {
            "ecs:cluster" = aws_ecs_cluster.my_cluster.arn
          }
        }
      }
    ]
  })
}
```
## EventBridge schedule trigger
We need to define an event rule that runs the container on a schedule. I also create a CloudWatch log group to capture task logs.
```hcl
# Run the task at 23:59 every night
resource "aws_cloudwatch_event_rule" "run_at_23_59" {
  name                = "run-ecs-task-schedule"
  schedule_expression = "cron(59 23 * * ? *)"
}

# And create a CloudWatch log group to send logs to
resource "aws_cloudwatch_log_group" "my_scheduled_task" {
  name              = "/ecs/my-scheduled-task-logs"
  retention_in_days = 1
}
```
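One gotcha: EventBridge cron expressions use six fields (minute, hour, day-of-month, month, day-of-week, year) rather than the five-field Unix format. If you don't care about the exact time of day, a rate expression is simpler; a hypothetical alternative to the rule above:

```hcl
# Alternative: run roughly once a day rather than at a fixed time
resource "aws_cloudwatch_event_rule" "run_daily" {
  name                = "run-ecs-task-daily"
  schedule_expression = "rate(1 day)"
}
```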
## Task Definition
Finally, we need to define the task to run. You'll need to upload your Docker image to ECR before defining the task, and specify all of the secrets that the container needs access to.
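As a rough sketch, pushing an image to ECR looks something like this (the repository name `my-scheduled-task` and the account ID `111111111111` are placeholders; the task definition below just uses the public `hello-world` image):

```bash
# Create the repository (one-off)
aws ecr create-repository --repository-name my-scheduled-task

# Authenticate Docker against your private registry
aws ecr get-login-password --region us-east-2 \
  | docker login --username AWS --password-stdin 111111111111.dkr.ecr.us-east-2.amazonaws.com

# Tag and push the image
docker tag my-scheduled-task:latest 111111111111.dkr.ecr.us-east-2.amazonaws.com/my-scheduled-task:latest
docker push 111111111111.dkr.ecr.us-east-2.amazonaws.com/my-scheduled-task:latest
```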
```hcl
# Define the container and command to run, plus CPU/Memory usage
resource "aws_ecs_task_definition" "scheduled_my_command" {
  family                   = "scheduled-my-command"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = aws_iam_role.ecs_task_execution.arn

  container_definitions = jsonencode([{
    name      = "demo",
    image     = "hello-world",
    essential = true,
    secrets = [
      {
        name      = "AREA_USER",
        valueFrom = aws_secretsmanager_secret.my_secrets["test/area/username"].arn
      },
      {
        name      = "AREA_PASSWORD",
        valueFrom = aws_secretsmanager_secret.my_secrets["test/area/password"].arn
      },
      {
        name      = "AREA_ROLE",
        valueFrom = aws_secretsmanager_secret.my_secrets["test/area/role"].arn
      }
    ],
    logConfiguration = {
      logDriver = "awslogs",
      options = {
        awslogs-group         = "/ecs/my-scheduled-task-logs"
        awslogs-region        = "us-east-2"
        awslogs-stream-prefix = "demo"
      }
    }
  }])
}

# Trigger this task using the scheduled event
resource "aws_cloudwatch_event_target" "ecs_my_command_target" {
  rule      = aws_cloudwatch_event_rule.run_at_23_59.name
  role_arn  = aws_iam_role.eventbridge_invoke_ecs.arn
  target_id = "ecs-task-my-command"
  arn       = aws_ecs_cluster.my_cluster.arn

  ecs_target {
    task_definition_arn = aws_ecs_task_definition.scheduled_my_command.arn
    launch_type         = "FARGATE"

    network_configuration {
      subnets          = [aws_subnet.public.id]
      security_groups  = [aws_security_group.ecs_tasks.id]
      assign_public_ip = true
    }
  }

  # This is only needed for debugging. See the final section
  # dead_letter_config {
  #   arn = aws_sqs_queue.failed_invocations.arn
  # }
}
```
## Help, it's not working!
Last, but not least, debugging! If your task isn't triggering as expected, you can configure a dead-letter queue (DLQ) to receive the error messages from AWS. If you choose to do this, uncomment the `dead_letter_config` section of the event target above.
```hcl
resource "aws_sqs_queue" "failed_invocations" {
  name = "eventbridge-ecs-dlq"
}

resource "aws_sqs_queue_policy" "allow_eventbridge" {
  queue_url = aws_sqs_queue.failed_invocations.id

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Sid    = "AllowEventBridgeToSendMessages",
      Effect = "Allow",
      Principal = {
        Service = "events.amazonaws.com"
      },
      Action   = "sqs:SendMessage",
      Resource = aws_sqs_queue.failed_invocations.arn,
      Condition = {
        ArnEquals = {
          "aws:SourceArn" = aws_cloudwatch_event_rule.run_at_23_59.arn
        }
      }
    }]
  })
}
```
To read messages from the dead-letter queue, use the `aws` CLI tool:

```bash
# Fetch the queue URL
aws sqs get-queue-url --queue-name eventbridge-ecs-dlq

# Read messages - change the URL for your account ID
aws sqs receive-message \
  --queue-url https://sqs.us-east-2.amazonaws.com/111111111111/eventbridge-ecs-dlq \
  --max-number-of-messages 10 \
  --wait-time-seconds 5 \
  --message-attribute-names All \
  --attribute-names All
```
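If the task is triggering but failing at runtime, the container's output lands in the CloudWatch log group we created earlier. With AWS CLI v2 you can tail it directly:

```bash
aws logs tail /ecs/my-scheduled-task-logs --follow
```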
## Conclusion
The parts of this that gave me the most trouble were figuring out how to debug using a dead-letter queue, and realising that the IAM policy for `ecs:RunTask` needed access to the cluster too.
Hopefully this has helped you (or will help me again in the future) to deploy scheduled tasks on ECS.