Most AWS bills carry 20–40% of avoidable waste, and the fastest ROI comes from a short list of high-leverage moves: right-size over-provisioned EC2/RDS, buy Savings Plans or Reserved Instances for steady baseline workloads, move eligible workloads to Graviton (ARM) for roughly 20% lower cost, run fault-tolerant batch on Spot for up to ~90% off on-demand, tier cold data in S3 Intelligent-Tiering, and delete idle/orphaned resources (unattached EBS volumes, idle RDS instances, stale snapshots, unused NAT gateways and Elastic IPs). Do those first, because they typically reclaim the biggest savings with the least risk, then institutionalize the gains with tagging, budgets and a lightweight FinOps cadence.
This guide walks through each lever in priority order and ends with a prioritized quick-wins checklist and an FAQ. The goal is durable ROI: lower cost and better performance, without compromising reliability.
Start with right-sizing and AWS Compute Optimizer
The single most common source of waste is over-provisioning — instances bought for peak that idle most of the time. AWS Compute Optimizer analyzes CloudWatch utilization across EC2, Auto Scaling groups, EBS, Lambda and ECS-on-Fargate, then recommends better-fit instance types and sizes (enable memory metrics via the CloudWatch agent for accuracy). Look for instances under ~40% sustained CPU/memory and step them down a size, or switch to a newer generation that is cheaper and faster.
Right-sizing is non-disruptive for stateless tiers and should be a recurring review, not a one-off. Treat the recommendations as a starting point and validate against your own latency and headroom SLOs before applying.
# Pull right-sizing recommendations for EC2 across the account
aws compute-optimizer get-ec2-instance-recommendations \
--query 'instanceRecommendations[?finding!=`Optimized`].[instanceArn,currentInstanceType,recommendationOptions[0].instanceType,recommendationOptions[0].performanceRisk]' \
--output table
# Check one instance's daily average CPU over the last 14 days (repeat per instance to find downsizing candidates; the date syntax is GNU date)
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 --metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
--start-time "$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--period 86400 --statistics Average
Savings Plans vs. Reserved Instances vs. Spot: when each fits
Once you have right-sized, commit to your steady-state baseline to lock in discounts. The three pricing models solve different problems:
| Purchase model | Discount vs. on-demand | Best for | Trade-off |
|---|---|---|---|
| Compute Savings Plans | up to ~66% | Predictable baseline compute that may shift across EC2, Fargate and Lambda | 1- or 3-year commitment; flexible across family/region/OS |
| EC2 Instance Savings Plans | up to ~72% | Stable workloads pinned to one instance family in a region | Less flexible than Compute SP |
| Reserved Instances (RDS/ElastiCache/Redshift/OpenSearch) | up to ~72% | Steady managed-service capacity not covered by Savings Plans | Service/family commitment |
| Spot Instances | up to ~90% | Fault-tolerant, interruptible work: batch, CI, rendering, stateless web, big-data, ML training | 2-minute interruption notice; needs retry/checkpointing |
A practical pattern: cover your always-on baseline with Compute Savings Plans (start with ~60–70% coverage so you keep flexibility), use RIs for RDS/ElastiCache, and run bursty or interruptible capacity on Spot via Auto Scaling or EC2 Fleet with mixed instance types. Review coverage and utilization monthly in Cost Explorer so commitments track real usage.
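The sizing arithmetic and the monthly coverage check can be sketched as follows. This is a minimal sketch, not an official sizing method: the helper names `sp_hourly_commitment` and `sp_coverage` are hypothetical, and the discount you pass in is your own assumption, since the real rate depends on plan type, term and payment option.

```shell
# Hypothetical helper: estimate an hourly Savings Plans commitment.
# A commitment is expressed in post-discount dollars per hour, so:
#   commitment = baseline on-demand $/hr * target coverage * (1 - discount)
# Args: baseline_usd_per_hour coverage_fraction discount_fraction
sp_hourly_commitment() {
  awk -v base="$1" -v cov="$2" -v disc="$3" \
    'BEGIN { printf "%.2f\n", base * cov * (1 - disc) }'
}

# Hypothetical wrapper: monthly Savings Plans coverage from Cost Explorer.
# Args: start_date end_date (YYYY-MM-DD, end date is exclusive)
sp_coverage() {
  aws ce get-savings-plans-coverage \
    --time-period "Start=$1,End=$2" \
    --granularity MONTHLY \
    --query 'SavingsPlansCoverages[].[TimePeriod.Start,Coverage.CoveragePercentage]' \
    --output table
}
```

For example, `sp_hourly_commitment 10 0.65 0.30` models a $10/hour on-demand baseline, 65% target coverage and an assumed 30% discount, and prints `4.55`. Compare that figure with the recommendation Cost Explorer produces before buying anything.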
Migrate to Graviton (ARM) for better price-performance
AWS Graviton processors typically deliver around 20% lower cost and up to ~40% better price-performance than comparable x86 instances for many workloads. Graviton powers EC2 (the g-suffixed types like m7g, c7g, r7g), plus managed services — RDS/Aurora, ElastiCache, OpenSearch, Lambda and Fargate all offer ARM options.
Migration is straightforward for interpreted and managed runtimes (Python, Node.js, Java, .NET), for Go, which cross-compiles to arm64, and for most container images that publish arm64 variants. Often it takes no more than changing the instance type or setting architecture: arm64 on Lambda. Validate any code with native dependencies, rebuild multi-arch container images, and benchmark before cutting production over. For Lambda and Fargate this is one of the lowest-effort wins available.
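The two mechanical steps, rebuilding an image for both architectures and switching a Lambda function to arm64, might look like the sketch below. The function names, image reference and zip path are hypothetical, and the sketch assumes Docker Buildx is installed and that the Lambda function is deployed from a zip package.

```shell
# Hypothetical helper: build and push one image tag for both x86_64 and arm64.
# Assumes Docker Buildx is available and you are logged in to the registry.
# Args: image_ref [build_context]
build_multiarch_image() {
  docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag "$1" \
    --push \
    "${2:-.}"
}

# Hypothetical helper: redeploy a zip-packaged Lambda function on arm64.
# The package must not contain x86-only native binaries.
# Args: function_name path_to_zip
lambda_to_arm64() {
  aws lambda update-function-code \
    --function-name "$1" \
    --zip-file "fileb://$2" \
    --architectures arm64
}
```

Try it on a non-production alias first and compare duration and error rate before shifting production traffic.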
Auto-scaling and serverless: pay for what you actually use
Static fleets sized for peak waste money overnight and on weekends. Auto Scaling (target-tracking on CPU, request count or a custom metric) shrinks capacity when demand drops; schedule scale-down for predictable off-hours, and use predictive scaling for cyclical traffic. Stopping non-production EC2/RDS environments nights and weekends alone can cut their cost by more than half.
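To make the "more than half" claim concrete, here is a small sketch of the arithmetic together with a scheduled scale-down for a non-production Auto Scaling group. The helper names, action names and the 08:00 to 20:00 weekday window are assumptions chosen for illustration.

```shell
# Hypothetical helper: percent of instance-hours saved when an environment
# runs only `hours_on` hours out of the 168 hours in a week.
weekly_hours_saved_pct() {
  awk -v on="$1" 'BEGIN { printf "%.0f\n", (1 - on / 168) * 100 }'
}

# Hypothetical helper: scale a group to zero at 20:00 Mon-Fri and back up
# at 08:00 Mon-Fri. Max size is left unchanged, so daytime_capacity must
# not exceed the group's existing maximum.
# Args: asg_name daytime_capacity [time_zone]
schedule_business_hours() {
  asg="$1"; cap="$2"; tz="${3:-UTC}"
  aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name "$asg" \
    --scheduled-action-name stop-nightly \
    --recurrence "0 20 * * 1-5" --time-zone "$tz" \
    --min-size 0 --desired-capacity 0
  aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name "$asg" \
    --scheduled-action-name start-morning \
    --recurrence "0 8 * * 1-5" --time-zone "$tz" \
    --min-size "$cap" --desired-capacity "$cap"
}
```

Running 12 hours on 5 weekdays is 60 of 168 hours, and `weekly_hours_saved_pct 60` prints `64`, which is where the saving of more than half comes from.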
For spiky or event-driven workloads, serverless removes idle cost entirely. Lambda bills per millisecond — right-size memory (which also scales CPU) and use the Lambda Power Tuning tool to find the cost/latency sweet spot. Fargate removes EC2 capacity management for containers, and Fargate Spot cuts that further for interruptible tasks. For our own autoscaling patterns, see Autoscaling Application with Auto Scaling Groups and AWS Load Balancer and our AWS Lambda best practices.
Storage tiering: S3, EBS and snapshot hygiene
Storage quietly compounds. Apply these in order:
- S3 Intelligent-Tiering for data with unknown or changing access patterns: it moves objects between frequent, infrequent and archive tiers automatically, with no retrieval fees, in exchange for a small per-object monitoring charge. Use explicit lifecycle policies to expire or transition logs and backups to Glacier/Deep Archive.
- EBS gp3 over gp2 — gp3 is ~20% cheaper per GB and lets you provision IOPS and throughput independently, so you stop paying for capacity just to get performance. Migrating gp2→gp3 is an online volume modification.
- Snapshot and AMI cleanup — old EBS snapshots and deregistered AMIs accumulate silently. Use Data Lifecycle Manager or Recycle Bin policies to retain only what you need.
- Delete unattached EBS volumes — volumes left after instance termination keep billing for nothing.
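An explicit lifecycle policy of the kind mentioned in the first bullet could be applied as in this sketch. The function name, rule ID and the 30/90/365-day thresholds are illustrative assumptions; pick values that match your own retention policy.

```shell
# Hypothetical helper: tier and then expire objects under one prefix.
# Note: this call replaces the bucket's entire lifecycle configuration,
# so merge in any existing rules before using it on a real bucket.
# Args: bucket prefix
apply_log_lifecycle() {
  cfg=$(mktemp)
  cat > "$cfg" <<EOF
{
  "Rules": [
    {
      "ID": "tier-and-expire",
      "Status": "Enabled",
      "Filter": { "Prefix": "$2" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
EOF
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$1" \
    --lifecycle-configuration "file://$cfg"
  rc=$?
  rm -f "$cfg"
  return "$rc"
}
```

A call such as `apply_log_lifecycle my-bucket logs/` (both names hypothetical) would move objects under `logs/` to Standard-IA after 30 days, to Glacier after 90, and delete them after a year.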
# Find unattached (available) EBS volumes still costing money
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query 'Volumes[].{ID:VolumeId,Size:Size,Type:VolumeType,AZ:AvailabilityZone}' \
--output table
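# Migrate a gp2 volume to cheaper gp3 (online, no downtime)
# 3000 IOPS and 125 MiB/s are the gp3 baseline; raise them if the gp2 volume delivered more, as large volumes can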
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 \
--volume-type gp3 --iops 3000 --throughput 125
# Find EBS snapshots older than a retention threshold to review/delete
aws ec2 describe-snapshots --owner-ids self \
--query 'Snapshots[?StartTime<=`2025-01-01`].[SnapshotId,VolumeId,StartTime,VolumeSize]' \
--output table
Cut data transfer and egress costs with CloudFront
Data transfer out to the internet is one of the least-watched line items. Key moves:
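- Serve content through Amazon CloudFront. Data transfer from AWS origins (S3, EC2, load balancers) to CloudFront is free, CloudFront's per-GB egress rates are generally at or below direct S3/EC2 egress, and caching at the edge reduces origin load and latency at the same time, so it is both a cost and a performance win.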
- Keep traffic in-AZ and in-region. Cross-AZ and cross-region transfer is billed; co-locate chatty services and use VPC endpoints (Gateway endpoints for S3/DynamoDB are free) to avoid routing internal traffic over NAT gateways or the public internet.
- Watch NAT Gateway costs. NAT gateways charge both an hourly fee and per-GB processing. Consolidate them, route S3/DynamoDB through gateway endpoints, and remove NAT in subnets that don't need outbound internet.
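The NAT cost structure and the gateway-endpoint fix from the bullets above can be sketched like this. The function names are hypothetical, rates are passed in rather than hard-coded because they vary by region, and a 730-hour month is an assumption.

```shell
# Hypothetical helper: monthly cost of one NAT gateway, assuming a
# 730-hour month. Look up both rates for your region.
# Args: hourly_rate_usd per_gb_rate_usd gb_processed_per_month
nat_monthly_cost() {
  awk -v h="$1" -v g="$2" -v gb="$3" \
    'BEGIN { printf "%.2f\n", h * 730 + g * gb }'
}

# Hypothetical helper: add a gateway endpoint so S3 or DynamoDB traffic
# from the given route table bypasses the NAT gateway.
# Args: vpc_id route_table_id region service   (service: s3 or dynamodb)
create_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --vpc-id "$1" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$3.$4" \
    --route-table-ids "$2"
}
```

With assumed rates of $0.045 per hour and $0.045 per GB, `nat_monthly_cost 0.045 0.045 1000` prints `77.85`, of which 45.00 is data processing that a gateway endpoint can remove if that traffic is S3 or DynamoDB.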
Kill idle and orphaned resources
A recurring sweep for zombie resources reclaims spend fast and is almost pure savings:
- Unattached EBS volumes and unassociated Elastic IPs. Since February 2024 AWS bills every public IPv4 address hourly, whether or not it is attached, so an idle EIP is pure waste.
- Idle RDS/Aurora instances — dev databases left running, or instances with near-zero connections.
- Stale EBS snapshots and old AMIs — automate retention with Data Lifecycle Manager.
- Unused load balancers, NAT gateways and provisioned-but-empty endpoints.
- Old, low-traffic CloudWatch log groups without retention — set retention policies so logs expire.
- Forgotten non-production environments — schedule them to stop outside business hours.
Automate the audit so it runs continuously rather than relying on memory.
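A starting point for such an audit, under the assumption that it runs per region from a scheduled job, might be a handful of small wrappers like these. The function names are hypothetical; the first two are read-only and the third changes a log group's retention.

```shell
# Elastic IPs that are allocated but not associated with anything.
list_unassociated_eips() {
  aws ec2 describe-addresses \
    --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
    --output text
}

# CloudWatch log groups with no retention policy (logs are kept forever).
list_log_groups_without_retention() {
  aws logs describe-log-groups \
    --query 'logGroups[?!retentionInDays].logGroupName' \
    --output text
}

# Set a retention period on one log group.
# Args: log_group_name retention_days (a value CloudWatch accepts, e.g. 30)
set_log_retention() {
  aws logs put-retention-policy \
    --log-group-name "$1" \
    --retention-in-days "$2"
}
```

Reviewing the lists before deleting or changing anything keeps the sweep low-risk; only automate the destructive step once the output has been trustworthy for a few cycles.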
Optimize RDS and Aurora
Databases are often the biggest non-compute line item and the easiest to over-provision:
- Right-size and modernize — move to current Graviton-based instance classes and gp3 storage; downsize instances showing low CPU and freeable memory headroom.
- Reserved Instances for steady production databases.
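- Stop dev/test databases off-hours. RDS restarts a stopped instance automatically after 7 days, so automate start/stop on a schedule rather than relying on a one-off manual stop; storage and backups are still billed while an instance is stopped.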
- Aurora Serverless v2 for variable or unpredictable workloads scales capacity in fine-grained increments so you stop paying for idle headroom.
- Tune before scaling up — add the right indexes, fix N+1 queries and slow queries, and add read replicas or ElastiCache instead of buying a bigger primary. Performance Insights pinpoints the expensive queries.
- Storage and backups — review allocated storage, enable autoscaling with a sane cap, and prune backup retention to policy.
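For the inventory and off-hours steps in the list above, a minimal sketch could look like the following. The function names are hypothetical, and the scheduler that calls `stop_db` and `start_db` (cron, a CI job, or something else) is left to you.

```shell
# Inventory: identifier, class, storage type and status for every instance,
# useful for spotting older instance classes and gp2 storage.
list_db_instances() {
  aws rds describe-db-instances \
    --query 'DBInstances[].[DBInstanceIdentifier,DBInstanceClass,StorageType,DBInstanceStatus]' \
    --output table
}

# Stop or start one non-production instance.
# Args: db_instance_identifier
stop_db() {
  aws rds stop-db-instance --db-instance-identifier "$1"
}

start_db() {
  aws rds start-db-instance --db-instance-identifier "$1"
}
```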
Get visibility: Cost Explorer, Budgets, CUR and tagging
You can't optimize what you can't see, and you can't allocate what you can't tag.
- Cost Explorer for trend analysis, anomaly detection and Savings Plans/RI recommendations and coverage.
- AWS Budgets with alerts on cost and usage thresholds, plus Budget Actions to automatically rein in spend. Enable Cost Anomaly Detection to catch runaway spend within days, not at month-end.
- Cost and Usage Report (CUR) for granular, resource-level data — query it with Athena or feed a BI dashboard for deep cost allocation.
- A tagging strategy (e.g. env, team, project, cost-center) is the foundation of FinOps. Enforce it with AWS Organizations tag policies and activate cost allocation tags so every dollar maps to an owner. Untagged spend is unaccountable spend.
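Once a cost allocation tag is active, spend per owner can be pulled from Cost Explorer. The sketch below assumes a tag key such as `team`; the function name is hypothetical.

```shell
# Hypothetical helper: monthly unblended cost grouped by one cost allocation tag.
# Args: start_date end_date tag_key (dates YYYY-MM-DD, end date is exclusive)
cost_by_tag() {
  aws ce get-cost-and-usage \
    --time-period "Start=$1,End=$2" \
    --granularity MONTHLY \
    --metrics UnblendedCost \
    --group-by "Type=TAG,Key=$3" \
    --query 'ResultsByTime[].Groups[].[Keys[0],Metrics.UnblendedCost.Amount]' \
    --output text
}
```

For example, `cost_by_tag 2025-01-01 2025-02-01 team` lists one row per tag value. The row with an empty tag value is the untagged spend, which is a useful number to track toward zero.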
Trusted Advisor and the Well-Architected Cost pillar
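AWS Trusted Advisor continuously flags cost-optimization opportunities (idle load balancers, underutilized instances, unassociated EIPs, low RI/Savings Plans coverage) alongside performance, security and fault-tolerance checks. The cost optimization checks are not included in the Basic or Developer support plans; they require Business Support or higher. On accounts that already have such a plan, they are an always-on second pair of eyes at no extra charge.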
The AWS Well-Architected Framework Cost Optimization pillar gives you the discipline behind the tactics: adopt a consumption model, measure efficiency, stop spending on undifferentiated heavy lifting, attribute expenditure to owners, and continually analyze and improve. Run a Well-Architected Review (the free Well-Architected Tool guides it) to surface gaps systematically rather than ad hoc.
Make it stick with FinOps
One-time cleanups regress without an operating model. FinOps brings engineering, finance and product together to manage cloud spend as an ongoing practice:
- Inform — accurate showback/chargeback from tags and the CUR so teams see their own spend.
- Optimize — continuous right-sizing, commitment management, and anomaly response.
- Operate — embed cost into the engineering workflow: budgets in CI/CD, cost gates in code review, and a regular (monthly) cost review with clear ownership.
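A cost gate in a pipeline can be as simple as comparing a spend figure with a budget and failing the job when it is exceeded. The sketch below is a hypothetical illustration of that idea; how the month-to-date figure is obtained (Cost Explorer, a CUR query, or another source) is deliberately left outside the function.

```shell
# Hypothetical CI cost gate: return non-zero when month-to-date spend
# exceeds the budget. Both values are plain numbers in the same currency.
# Args: month_to_date_spend budget
cost_gate() {
  if awk -v spend="$1" -v budget="$2" 'BEGIN { exit !(spend > budget) }'; then
    echo "FAIL: spend $1 exceeds budget $2" >&2
    return 1
  fi
  echo "OK: spend $1 within budget $2"
}
```

In a pipeline step, `cost_gate "$MTD_SPEND" 5000 || exit 1` would stop the job once spend passes 5000, where `MTD_SPEND` is a variable your own tooling populates.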
The payoff is cultural: when engineers see the cost of their architecture decisions, optimization becomes continuous rather than a quarterly fire drill — and savings compound.
Prioritized quick-wins checklist
Start at the top — these are ordered roughly by ROI (impact vs. effort):
| # | Quick win | Effort | Typical impact |
|---|---|---|---|
| 1 | Delete unattached EBS volumes, idle EIPs, stale snapshots/AMIs | Low | Immediate, pure savings |
| 2 | Right-size over-provisioned EC2/RDS via Compute Optimizer | Low–Med | High |
| 3 | Migrate gp2 → gp3 EBS volumes | Low | ~20% on those volumes |
| 4 | Buy Compute Savings Plans for steady baseline | Low | Up to ~66% on covered compute |
| 5 | Stop non-prod EC2/RDS nights & weekends | Low | 50%+ on those environments |
| 6 | Enable S3 Intelligent-Tiering + lifecycle policies | Low | High on cold data |
| 7 | Move eligible Lambda/Fargate/EC2 to Graviton (ARM) | Med | ~20% lower cost |
| 8 | Run interruptible/batch workloads on Spot | Med | Up to ~90% on those jobs |
| 9 | Serve egress via CloudFront; add S3/DynamoDB VPC gateway endpoints | Med | Cuts egress + NAT cost |
| 10 | Set CloudWatch log retention; clean up idle log groups | Low | Recurring savings |
| 11 | Enable Budgets + Cost Anomaly Detection + tagging | Low | Prevents future waste |
| 12 | Run a Well-Architected Cost review; stand up a FinOps cadence | Med | Durable, compounding |
Done in order, these commonly reduce a typical AWS bill by 20–30% or more while improving — not degrading — performance.
How MicroPyramid helps
MicroPyramid has run production AWS workloads for startups and enterprises for over a decade. Our AWS consulting services cover cost-optimization audits, right-sizing, Savings Plans and Graviton migrations, and Well-Architected reviews — turning cloud waste into measurable ROI without sacrificing reliability. If you're moving to AWS or between accounts, our cloud migration services build cost discipline in from day one.
Frequently Asked Questions
What is the fastest way to reduce my AWS bill?
Start with zero-risk cleanups: delete unattached EBS volumes, release idle Elastic IPs, remove stale snapshots and unused load balancers, and stop non-production environments outside business hours. Then right-size over-provisioned instances with AWS Compute Optimizer. These steps reclaim the most savings with the least effort and rarely affect anything in production.
How much can AWS cost optimization actually save?
Most organizations carry 20–40% of avoidable waste, and a disciplined optimization pass — right-sizing, Savings Plans, storage tiering and killing idle resources — commonly reduces the bill by around 20–30%, sometimes more. The exact figure depends on how over-provisioned and uncommitted the environment is to begin with.
Should I use Savings Plans, Reserved Instances, or Spot?
Use Compute Savings Plans for your always-on baseline compute (flexible across EC2, Fargate and Lambda), Reserved Instances for steady managed services like RDS and ElastiCache, and Spot Instances for fault-tolerant, interruptible work such as batch jobs, CI and ML training. Most environments use all three together.
Is migrating to Graviton (ARM) worth it?
For most modern workloads, yes. Graviton typically cuts cost by around 20% with better price-performance. Interpreted runtimes and containers with arm64 images migrate easily; the main caveat is code with native x86 dependencies, so benchmark and test before cutting production over. Lambda and Fargate are especially low-effort wins.
Does optimizing cost hurt performance?
Not when done right — many cost optimizations improve performance. Newer instance generations and Graviton are cheaper and faster, gp3 lets you provision IOPS independently, CloudFront caching lowers both egress cost and latency, and right-sizing removes waste rather than headroom. The aim is efficiency, not under-provisioning, so you always validate against your latency and reliability SLOs.
What tools does AWS provide for cost optimization?
AWS Cost Explorer (trends, anomaly detection, Savings Plans recommendations), AWS Budgets and Cost Anomaly Detection (alerts and automated actions), Compute Optimizer (right-sizing), Trusted Advisor (continuous cost checks), the Cost and Usage Report with Athena (granular analysis), and the Well-Architected Tool for structured reviews — all underpinned by a consistent tagging strategy.