Kubernetes vs ECS: An Enterprise Container Orchestration Comparison
When to choose Kubernetes and when to choose ECS, based on real experience operating both at enterprise scale
We run both Kubernetes and ECS in production. Not because we wanted to operate two container orchestration platforms, but because different workloads have different requirements, and the honest answer to "Kubernetes or ECS?" is "it depends." After a year of operating both at a major entertainment company, I have opinions backed by production experience on when each platform is the right choice.
The Landscape in 2017
Kubernetes has enormous momentum. The community is thriving, the ecosystem is growing, and the major cloud providers are investing heavily in managed Kubernetes offerings. AWS announced EKS at re:Invent last month, though it is not generally available yet. For now, running Kubernetes on AWS means self-managing the control plane on EC2 instances, which is a non-trivial operational burden.
ECS is AWS-native, fully managed, and tightly integrated with the rest of the AWS ecosystem. It does not have Kubernetes' extensibility or community, but it also does not have Kubernetes' operational complexity. For organizations that are already invested in AWS, ECS provides a container orchestration platform that works without a dedicated platform team to operate it.
ECS: Where It Excels
Operational simplicity. ECS has no control plane to manage. No etcd cluster to back up, no API server to scale, no certificate rotation to handle. AWS manages all of that. You define task definitions, create services, and ECS handles placement, health checking, and replacement. For teams that want to run containers without becoming Kubernetes operators, this is significant.
AWS integration depth. ECS integrates natively with ALB, CloudWatch, IAM, ECR, and Systems Manager Parameter Store. Load balancing, logging, and configuration injection work out of the box with minimal configuration. The integration is not bolted on; it is part of the platform.
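As one concrete illustration of that integration, here is a sketch of a service definition that wires a container to an ALB target group (cluster, service, and ARN values are hypothetical, not our actual configuration):

```json
{
  "cluster": "prod",
  "serviceName": "content-api",
  "taskDefinition": "content-api:42",
  "desiredCount": 3,
  "loadBalancers": [
    {
      "targetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/content-api/abc123",
      "containerName": "content-api",
      "containerPort": 8080
    }
  ]
}
```

With this in place, ECS registers each task with the target group as it starts and deregisters it on stop; the ALB health checks drive task replacement.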
Consider IAM task roles. Each ECS task can assume its own IAM role, scoped to exactly the permissions that task needs. No shared node-level credentials, no kube2iam hacks, no OIDC provider configuration. You define the role in your task definition and it works.
{
  "taskRoleArn": "arn:aws:iam::123456789012:role/content-api-task",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecs-task-execution"
}
Fargate. AWS launched Fargate at re:Invent alongside the EKS announcement, and it eliminates EC2 instance management entirely for ECS workloads. You define CPU and memory requirements, and Fargate provisions the compute. No AMI management, no instance scaling, no patch management. For many workloads, Fargate is the right abstraction: you define what you want to run, and AWS handles where it runs.
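A minimal Fargate task definition might look like the following (the family name, image, and sizes are illustrative). Note that Fargate requires the awsvpc network mode and task-level CPU and memory from a fixed set of valid combinations:

```json
{
  "family": "content-api",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [
    {
      "name": "content-api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/content-api:latest",
      "portMappings": [{"containerPort": 8080}]
    }
  ]
}
```

The same task definition can usually run on EC2 capacity as well; requiresCompatibilities declares where it may run, not where it must.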
Cost for small to medium deployments. Running ECS with Fargate or a small EC2 cluster is straightforward and cost-effective. You do not need a platform team. Your existing AWS engineers can operate ECS services alongside the rest of your infrastructure.
Kubernetes: Where It Excels
Workload portability. Kubernetes runs on every major cloud provider and on-premises. If you have a multi-cloud strategy or need to run the same workloads in your data center and in AWS, Kubernetes provides a consistent API and operational model across environments. ECS is AWS-only.
For our organization, where we are migrating from on-premises to AWS, Kubernetes provides a bridge. We can run the same deployment manifests in our data center during migration and in AWS after migration. The application teams do not need to learn a new deployment model for each environment.
Ecosystem and extensibility. The Kubernetes ecosystem is vast. Helm charts, operators, custom resource definitions, admission controllers, service meshes. If you need Istio for traffic management, Prometheus for monitoring, cert-manager for TLS automation, or Vault for secrets management, the Kubernetes ecosystem has mature solutions.
ECS has nothing comparable. Its extensibility is limited to what AWS provides. If your requirements fit within that boundary, it is fine. If they do not, you are stuck.
Complex scheduling requirements. Kubernetes' scheduler is sophisticated. Affinity and anti-affinity rules, taints and tolerations, pod disruption budgets, priority classes. If you need fine-grained control over where and how your workloads are placed, Kubernetes provides it.
Our GPU-accelerated rendering workloads use Kubernetes node affinity to ensure they land on GPU instances, taints to prevent non-GPU workloads from being scheduled on expensive GPU nodes, and priority classes to preempt lower-priority batch jobs when high-priority rendering jobs arrive.
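A pod spec combining these mechanisms looks roughly like this sketch. The label key, taint, and priority class name here are illustrative assumptions, not our actual configuration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: render-job
spec:
  # Preempt lower-priority batch jobs when capacity is tight.
  priorityClassName: rendering-high
  # Require placement on nodes labeled as GPU instances.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: accelerator
            operator: In
            values:
            - nvidia-gpu
  # Tolerate the taint that keeps ordinary workloads off GPU nodes.
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  containers:
  - name: renderer
    image: example/renderer:latest
```

This assumes the GPU nodes are labeled accelerator=nvidia-gpu and tainted with dedicated=gpu:NoSchedule, and that a rendering-high PriorityClass exists in the cluster.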
StatefulSets for stateful workloads. Kubernetes StatefulSets provide stable network identities, ordered deployment and scaling, and persistent volume claims. Running stateful workloads like Kafka, Elasticsearch, or databases on Kubernetes is possible, though whether you should in production is still debated. ECS has no equivalent abstraction for stateful workloads.
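A minimal StatefulSet sketch shows what the abstraction buys you. Names, image, and storage size are illustrative, and the apiVersion depends on your cluster version (apps/v1beta2 on 1.8, apps/v1 from 1.9):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka        # headless service providing stable DNS names
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: example/kafka:latest
        volumeMounts:
        - name: data
          mountPath: /var/lib/kafka
  # Each replica gets its own PersistentVolumeClaim that survives rescheduling.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi
```

Each replica gets a stable identity (kafka-0, kafka-1, kafka-2) and its own volume; if kafka-1 is rescheduled to another node, it comes back with the same name and the same data.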
The Operational Reality
Here is what nobody tells you about Kubernetes on AWS in 2017: running your own control plane is a full-time job.
Our Kubernetes clusters run on EC2 with kops managing the lifecycle. The control plane consists of three master nodes running the API server, controller manager, scheduler, and etcd. We are responsible for:
- Upgrades: Kubernetes releases a new minor version every three months. Upgrading requires careful planning, testing, and execution. Skipping versions is not supported; you must upgrade sequentially.
- etcd operations: etcd is the persistent store for all cluster state. If etcd fails and you do not have backups, you lose your cluster. We run etcd backups every 15 minutes to S3.
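The backup job itself does not need to be elaborate; ours amounts to a cron entry along these lines (endpoints, bucket name, and paths are assumptions for illustration, and the etcd TLS client flags are omitted for brevity):

```shell
# /etc/cron.d/etcd-backup — snapshot etcd every 15 minutes and ship it to S3
*/15 * * * * root ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 snapshot save /tmp/etcd.db && aws s3 cp /tmp/etcd.db s3://example-etcd-backups/$(hostname)/etcd-$(date +\%s).db
```

The snapshot is consistent at a point in time; restoring it with etcdctl snapshot restore rebuilds the cluster state, which is exactly what you will be doing at 3 a.m. if you skipped this step.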
- Certificate management: Kubernetes uses TLS certificates extensively for component authentication. These certificates expire, and when they do, things break. We automated certificate rotation, but building that automation took weeks.
- Networking: We use Calico for network policy enforcement and the AWS VPC CNI plugin for pod networking. The CNI plugin has its own set of operational concerns, particularly around IP address exhaustion in VPC subnets (each pod gets a real VPC IP address).
- Monitoring the monitors: The Kubernetes control plane itself needs monitoring: API server latency, etcd disk performance, scheduler queue depth, controller manager sync times.
ECS has none of these concerns. The control plane is managed by AWS. You do not think about it.
Our Decision Framework
After a year of operating both platforms, here is how we decide:
Choose ECS when:
- The workload is straightforward: web services, APIs, batch jobs.
- The team is small and does not have dedicated platform engineers.
- Deep AWS integration (IAM roles, ALB, CloudWatch) is important.
- You want to use Fargate to eliminate instance management.
- There is no multi-cloud or hybrid requirement.
Choose Kubernetes when:
- Workload portability across environments is required.
- The workload has complex scheduling requirements (GPU affinity, preemption).
- You need ecosystem tools (service mesh, operators, custom controllers).
- You have a platform team that can operate and upgrade clusters.
- The workload is stateful and benefits from StatefulSets.
Avoid Kubernetes when:
- You are choosing it because it is popular rather than because you need its capabilities.
- You do not have the team to operate it. Kubernetes does not run itself.
- Your workloads are simple and ECS/Fargate would serve them equally well with less overhead.
What EKS Will Change
AWS announced EKS at re:Invent, and when it becomes available, it will change the calculus significantly. A managed Kubernetes control plane eliminates the hardest part of running Kubernetes on AWS: operating the control plane. If EKS delivers on its promise, the operational gap between Kubernetes and ECS will narrow considerably.
The remaining difference will be complexity versus flexibility. ECS will remain simpler for straightforward workloads. Kubernetes will remain more powerful for complex workloads. But the operational cost of choosing Kubernetes will decrease, making it accessible to smaller teams.
The Honest Take
Both platforms work. Both run production workloads reliably. The choice is not about which technology is better in the abstract; it is about which technology is better for your specific situation, your team, your workloads, and your organizational constraints.
We will continue running both. ECS for the majority of our workloads where simplicity and AWS integration matter most. Kubernetes for the subset of workloads that need its power and flexibility. And when EKS launches, we will evaluate whether consolidating on Kubernetes makes sense given the reduced operational burden.
The worst decision is not choosing the wrong orchestrator. It is spending months debating the choice instead of shipping containers.