1

We would like to launch single Docker container tasks on AWS in a versioned and logged, but ad-hoc, fashion. Each container task requires significant vCPU and RAM (perhaps 16 vCPU and 64 GB RAM).

I can see several services that might help with this, but none is ideally suited:

  1. CloudFormation + EC2: launch, provision and execute containers on instances (plus VPC etc.)
  2. Batch (great, but we don't need parallel containers)
  3. Fargate (great, but limited to 4 vCPU and 30 GB RAM)
  4. Elastic Beanstalk (single container environment): the task isn't a web app, so the load balancers etc. aren't required.

Can anyone offer any experience with similar workloads? Are there some obvious services that I've missed?

MLu
danodonovan
  • Hi @danodonovan if the below response answers your question please upvote and accept it. It's the way to say thanks to people who spent their free time helping others on ServerFault. Thanks :) – MLu Nov 21 '18 at 00:30

2 Answers

2

Use CloudFormation to set up the stack in a consistent, reproducible way every time you need to run it. The CloudFormation template will create:

  • ECS Cluster
  • EC2 Instance to join the ECS Cluster (either a single instance or an Auto Scaling group)
  • ECS Task Definition (the container image URL can be a parameter for the template)
  • Any supporting resources - IAM Role, Security Group, etc.
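As a rough illustration of the resource list above, here is a minimal sketch that builds such a template as a Python dict and renders it to JSON. The parameter names (`ContainerImage`, `EcsAmiId`, `InstanceProfileName`), the output names, the `m5.4xlarge` default and the 60 GiB memory figure are all assumptions made for the example, not something from the answer. The IAM role and security group are left out and would need to be added for a real stack.

```python
import json


def build_template(vcpus=16, memory_gib=60, instance_type="m5.4xlarge"):
    """Return a minimal CloudFormation template as a dict.

    memory_gib defaults to slightly less than the instance's nominal RAM,
    on the assumption that the OS and ECS agent need some of it.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "ContainerImage": {"Type": "String"},           # image URL, passed per run
            "EcsAmiId": {"Type": "AWS::EC2::Image::Id"},     # an ECS-optimized AMI
            "InstanceProfileName": {"Type": "String"},       # profile with the ECS instance role
        },
        "Resources": {
            "Cluster": {"Type": "AWS::ECS::Cluster"},
            "Instance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "EcsAmiId"},
                    "InstanceType": instance_type,
                    "IamInstanceProfile": {"Ref": "InstanceProfileName"},
                    # Tell the ECS agent which cluster to join.
                    "UserData": {"Fn::Base64": {"Fn::Sub": (
                        "#!/bin/bash\n"
                        "echo ECS_CLUSTER=${Cluster} >> /etc/ecs/ecs.config\n"
                    )}},
                },
            },
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "ContainerDefinitions": [{
                        "Name": "task",
                        "Image": {"Ref": "ContainerImage"},
                        "Cpu": vcpus * 1024,          # 1024 CPU units per vCPU
                        "Memory": memory_gib * 1024,  # hard limit in MiB
                        "Essential": True,
                    }],
                },
            },
        },
        "Outputs": {
            "ClusterName": {"Value": {"Ref": "Cluster"}},
            "TaskDefinitionArn": {"Value": {"Ref": "TaskDefinition"}},
        },
    }


def render_template(**kwargs):
    """Serialize the template to a JSON string suitable for create-stack."""
    return json.dumps(build_template(**kwargs), indent=2)
```

Keeping the image URL as a parameter means the same template can be reused for every version of the container.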

You can then have a simple shell script or Ansible playbook that creates the stack from the CloudFormation template, triggers the task run, waits for it to finish and tears the stack down again so you're not paying for idle resources. Optionally, if you need to run the task on a schedule (e.g. every morning), you can use a CloudWatch Events Rule to trigger it periodically.
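The create / run / wait / tear-down cycle could look roughly like the sketch below, written in Python with boto3 clients rather than shell. It assumes the template exposes stack outputs named `ClusterName` and `TaskDefinitionArn`; those names, the function name and the polling intervals are made up for the example.

```python
import time


def run_task_once(cfn, ecs, stack_name, template_body, parameters,
                  poll_seconds=15, max_polls=40):
    """Create the stack, run the task once, wait for it, then delete the stack.

    cfn and ecs are boto3 clients, e.g. boto3.client("cloudformation") and
    boto3.client("ecs"). Returns the exit code of the task's first container.
    """
    cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=[{"ParameterKey": k, "ParameterValue": v}
                    for k, v in parameters.items()],
        Capabilities=["CAPABILITY_IAM"],  # only needed if the template creates IAM resources
    )
    try:
        cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
        stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
        outputs = {o["OutputKey"]: o["OutputValue"] for o in stack["Outputs"]}
        cluster = outputs["ClusterName"]         # assumed output name
        task_def = outputs["TaskDefinitionArn"]  # assumed output name

        # The instance may only register with the cluster some time after
        # the stack itself reports completion, so poll until it shows up.
        for _ in range(max_polls):
            if ecs.list_container_instances(cluster=cluster)["containerInstanceArns"]:
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError("no container instance registered with the cluster")

        resp = ecs.run_task(cluster=cluster, taskDefinition=task_def,
                            launchType="EC2", count=1)
        if resp["failures"]:
            raise RuntimeError(f"run_task failed: {resp['failures']}")
        task_arn = resp["tasks"][0]["taskArn"]

        # The default waiter gives up after about 10 minutes; allow ~2 hours.
        ecs.get_waiter("tasks_stopped").wait(
            cluster=cluster, tasks=[task_arn],
            WaiterConfig={"Delay": 30, "MaxAttempts": 240},
        )
        task = ecs.describe_tasks(cluster=cluster, tasks=[task_arn])["tasks"][0]
        return task["containers"][0].get("exitCode")
    finally:
        # Always tear down so no idle instance is left running.
        cfn.delete_stack(StackName=stack_name)
        cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

With boto3 installed and credentials configured, this would be called with `boto3.client("cloudformation")` and `boto3.client("ecs")` plus a dict of template parameters. Stack deletion may need extra handling if the cluster still has a registered container instance when CloudFormation tries to remove it.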

If the container needs to be rebuilt from source from time to time, you can also set up a CI/CD pipeline using CodePipeline, CodeBuild, etc. that rebuilds your container in a consistent way every time you make a code change.

Hope that helps :)

MLu
  • I would go with this approach, with one improvement. Provision the EC2 instance that joins ECS through an Auto Scaling group (ASG). Then, instead of tearing down the whole CloudFormation stack, simply scale the ASG down to 0 instances when you are not running your task and back up to 1 when you need to run it again. All resources other than EC2 are free of charge, and you will save some provisioning time. – Andrey Nov 18 '18 at 15:25
  • Hey, one issue I faced when I used this approach for my workload: in ECS, the hard CPU limit you can specify in the task definition tops out at 10000 CPU units, roughly equal to 10 vCPU per task. Is there any way to overcome that? – Piyush Baderia Apr 19 '19 at 20:00
1

Try Terraform + ECS. Simple and easy.

Sirex