Why Right-Sizing Is Not a One-Time Activity
Introduction
If you’ve worked in production long enough, you’ve probably heard this: “Let’s right-size the services and reduce the AWS bill.” So we do it.
- We check CPU and memory metrics for a week.
- We reduce task sizes.
- Costs drop.
Everyone’s happy. And then, six months later, the bill increases again. Nothing “dramatic” changed. No massive traffic spike. No big architecture shift. Just normal growth. Normal engineering. Normal entropy. That’s when you realize something important: Right-sizing is not a project. It’s a habit.
Production Never Stays the Same
The mistake most teams make is assuming workloads are stable. They’re not.
- Traffic patterns change.
- Features get added/removed.
- Payload sizes increase.
- Retries get introduced.
- More logging gets enabled.
- Tracing & logging agents get added.
- Background jobs grow quietly.
All of this shifts resource usage. That service that was perfectly tuned at 0.5 vCPU and 2GB RAM last quarter? It might now need more — or sometimes less. But if no one checks, it just drifts.
In ECS/Fargate, Drift Is Expensive
In Fargate, you don’t pay for what you use. You pay for what you allocate. For example, if your task is allocated 2GB but consistently uses 800MB, you’re burning money 24/7.
Now multiply that by:
- 20 microservices
- 2 environments
- 365 days
Small inefficiencies compound quietly. The dangerous part? Nothing breaks. Overprovisioning doesn’t alert you. It just invoices you.
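To put a rough number on that, here is a back-of-the-envelope sketch in Python. The per-GB-hour price is an assumed, illustrative figure (check current Fargate pricing for your region), and the calculation assumes one always-on task per service per environment, which is a simplification. The function name and parameters are made up for this example.

```python
# Rough estimate of what unused Fargate memory costs per year.
# ASSUMPTIONS (not taken from a real bill): the price below is illustrative,
# and each service runs exactly one always-on task per environment.

HOURS_PER_YEAR = 24 * 365
PRICE_PER_GB_HOUR = 0.004445  # assumed on-demand rate in USD; verify for your region


def yearly_memory_waste(allocated_gb: float, used_gb: float,
                        services: int = 1, environments: int = 1) -> float:
    """Return the yearly cost (USD) of memory that is allocated but unused."""
    idle_gb = max(allocated_gb - used_gb, 0.0)  # never report negative waste
    tasks = services * environments
    return idle_gb * PRICE_PER_GB_HOUR * HOURS_PER_YEAR * tasks


if __name__ == "__main__":
    # The example from the text: 2GB allocated, roughly 0.8GB used.
    per_task = yearly_memory_waste(allocated_gb=2.0, used_gb=0.8)
    fleet = yearly_memory_waste(2.0, 0.8, services=20, environments=2)
    print(f"One task: ${per_task:,.2f} per year")
    print(f"40 tasks: ${fleet:,.2f} per year")
```

The per-task number looks small, which is exactly why it goes unnoticed. The same arithmetic applies to vCPU, and real services usually run more than one task. Keep in mind that Fargate only accepts specific CPU and memory combinations, so the achievable saving depends on the next valid size down.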
Underprovisioning Is Just as Risky
The opposite mistake is just as risky, and arguably worse. If you aggressively downsize without monitoring:
- CPU throttling increases
- Latency spikes
- Autoscaling becomes unstable
- OOM kills start appearing
And these issues are subtle. They don’t always cause immediate outages. They degrade performance slowly. Right-sizing isn’t just about saving money. It’s about maintaining predictable performance.
What Actually Changes Over Time
Here’s what I’ve seen in real production systems:
- A new feature adds heavier database queries.
- The marketing team launches a campaign, and traffic patterns shift.
- Logging level moves from INFO to DEBUG during an incident… and never goes back.
- An SDK update increases the memory footprint.
- A sidecar gets added for observability.
No one plans for these as “capacity changes.” But they are.
And if you don’t revisit sizing regularly, you’re either overpaying or running closer to instability than you think.
The Mindset Shift
Right-sizing should be part of the operational rhythm.
- Not a cost-cutting exercise.
- Not a quarterly panic.
- Not a one-time optimization sprint.
It should be a recurring review:
- What’s the p95 CPU?
- What’s the p95 memory?
- Are we scaling too aggressively?
- Are we allocating far more than we consume?
- Did a recent deployment change behavior?
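The first two questions are easy to automate. Below is a minimal Python sketch, assuming you have already exported utilization samples (as a percentage of the allocation) from your metrics system. The `review` helper and its 40/80 thresholds are hypothetical defaults for illustration, not a recommendation.

```python
import math
from dataclasses import dataclass


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with pct% of data at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


@dataclass
class SizingVerdict:
    p95: float
    verdict: str  # "overprovisioned", "underprovisioned", or "ok"


def review(utilization_pct: list[float],
           low: float = 40.0, high: float = 80.0) -> SizingVerdict:
    """Classify one resource (CPU or memory) from its utilization samples.

    `low` and `high` are hypothetical thresholds; tune them per service.
    """
    p95 = percentile(utilization_pct, 95)
    if p95 < low:
        verdict = "overprovisioned"
    elif p95 > high:
        verdict = "underprovisioned"
    else:
        verdict = "ok"
    return SizingVerdict(p95=p95, verdict=verdict)
```

A task allocated 2GB that peaks around 800MB sits near 39% utilization, so with these thresholds `review` would flag it as overprovisioned. Using p95 rather than the average keeps short bursts in view, which matters when you are deciding how far down you can safely go.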
If you’re already doing incident, cost, and performance reviews, sizing should sit alongside them. Infrastructure is never static. Left alone, systems drift.
- They drift in complexity.
- They drift in cost.
- They drift in performance characteristics.
Right-sizing is how you fight that drift. It’s not glamorous work. No one gets excited about trimming 512MB from a task definition. But across dozens of services, that discipline makes a real difference.
AWS Compute Optimizer: Your Continuous Right-Sizing Assistant
One AWS service that really helps turn right-sizing from a “one-time task” into an ongoing habit is AWS Compute Optimizer. It analyzes resources such as ECS services on Fargate, EC2 instances, Lambda functions, and RDS databases, and recommends CPU and memory allocations based on actual usage patterns.
The beauty? It’s not just a report you run once. Compute Optimizer continuously tracks your workloads, so you can catch inefficiencies before they become a cost leak. Pair it with regular operational reviews, and suddenly right-sizing isn’t guesswork — it’s a data-driven routine that keeps your cloud spend aligned with reality.
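As a sketch of how this could feed a recurring review, the Python below pulls ECS service recommendations with boto3 and keeps only the services that are not marked as optimized. It assumes boto3 is installed, credentials are configured, and the account has opted in to Compute Optimizer. The response field names (`ecsServiceRecommendations`, `serviceArn`, `finding`) reflect my understanding of the API and should be verified against the current boto3 documentation. Both helper function names are hypothetical.

```python
# Sketch: list ECS services that Compute Optimizer flags as not optimized.
# ASSUMPTIONS: boto3 is installed, credentials are configured, the account has
# opted in to Compute Optimizer, and the response fields match the names used
# below. Verify them against the current boto3 documentation before relying on this.


def fetch_ecs_recommendations() -> list[dict]:
    """Fetch all ECS service recommendations, following pagination tokens."""
    import boto3  # imported lazily so the filtering helper works without it

    client = boto3.client("compute-optimizer")
    recommendations, token = [], None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.get_ecs_service_recommendations(**kwargs)
        recommendations.extend(page.get("ecsServiceRecommendations", []))
        token = page.get("nextToken")
        if not token:
            return recommendations


def needs_attention(recommendations: list[dict]) -> list[tuple[str, str]]:
    """Return (serviceArn, finding) pairs for services not marked Optimized."""
    return [
        (rec.get("serviceArn", "unknown"), rec.get("finding", "unknown"))
        for rec in recommendations
        if rec.get("finding") != "Optimized"
    ]
```

Running `needs_attention(fetch_ecs_recommendations())` on a schedule and posting the result to the team channel is one low-effort way to keep sizing in front of people between reviews.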
The Real Lesson
Right-sizing is not about squeezing every dollar out of AWS. It’s about alignment. Your infrastructure should reflect:
- Current traffic
- Current architecture
- Current behavior
Not assumptions from six months ago. If you treat right-sizing as a one-time task, it will silently undo itself. If you treat it as ongoing routine work, it becomes one of the easiest long-term wins in cloud and FinOps operations.
Right-sizing isn’t something you do once and forget. It’s a habit that keeps your systems running smoothly and your cloud bills under control.
If you want help optimizing your cloud costs and keeping your workloads efficient, reach out to us. Our DevOps and FinOps engineers at To The New have the experience to make it happen.
