AWS Cost & Performance Optimization Guide

Most AWS bills carry 20–40% of avoidable waste, and the fastest ROI comes from a short list of high-leverage moves: right-size over-provisioned EC2/RDS, buy Savings Plans or Reserved Instances for steady baseline workloads, move eligible workloads to Graviton (ARM) for roughly 20% lower cost and up to ~40% better price-performance, run fault-tolerant batch on Spot for up to ~90% off on-demand, tier cold data with S3 Intelligent-Tiering, and delete idle/orphaned resources (unattached EBS volumes, idle RDS instances, stale snapshots, unused NAT gateways and Elastic IPs). Do those first, since they typically reclaim the biggest savings with the least risk, then institutionalize the gains with tagging, budgets and a lightweight FinOps cadence.

This guide walks through each lever in priority order and ends with a prioritized quick-wins checklist and an FAQ. The goal is durable ROI: lower cost and better performance, without compromising reliability.

Start with right-sizing and AWS Compute Optimizer

The single most common source of waste is over-provisioning — instances bought for peak that idle most of the time. AWS Compute Optimizer analyzes CloudWatch utilization across EC2, Auto Scaling groups, EBS, Lambda and ECS-on-Fargate, then recommends better-fit instance types and sizes (enable memory metrics via the CloudWatch agent for accuracy). Look for instances under ~40% sustained CPU/memory and step them down a size, or switch to a newer generation that is cheaper and faster.

Right-sizing is non-disruptive for stateless tiers and should be a recurring review, not a one-off. Treat the recommendations as a starting point and validate against your own latency and headroom SLOs before applying.

# Pull right-sizing recommendations for EC2 across the account
aws compute-optimizer get-ec2-instance-recommendations \
  --query 'instanceRecommendations[?finding!=`Optimized`].[instanceArn,currentInstanceType,recommendationOptions[0].instanceType,recommendationOptions[0].performanceRisk]' \
  --output table

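# Check one instance's average daily CPU over the last 14 days (repeat per instance to find downsize candidates)
# Note: `date -d` is GNU date syntax; on macOS/BSD use `date -u -v-14d` instead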
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time "$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 86400 --statistics Average
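
To apply the ~40% rule of thumb across many instances, the per-instance check can be wrapped in a small filter. The following is a minimal sketch, not a definitive tool: `flag_low_cpu` and `avg_cpu_14d` are hypothetical helper names, the average of daily averages is only a rough proxy for sustained load, and the wrapper assumes GNU `date` and configured AWS CLI credentials.

```shell
# Pure filter: reads "instance-id average-cpu" lines on stdin and prints the
# ids whose average is below the limit (default 40, the rule of thumb above).
flag_low_cpu() {
  awk -v limit="${1:-40}" 'NF >= 2 && ($2 + 0) < (limit + 0) { print $1 }'
}

# Hypothetical wrapper: prints "instance-id average-cpu" for one instance by
# averaging the daily CloudWatch averages over the last 14 days.
# Needs AWS CLI credentials and GNU date; defined here but not called.
avg_cpu_14d() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$1" \
    --start-time "$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 86400 --statistics Average \
    --query 'Datapoints[].Average' --output text |
    tr '\t' '\n' |
    awk -v id="$1" 'NF { s += $1; n++ } END { if (n) printf "%s %.1f\n", id, s / n }'
}

# Example: for id in i-0123456789abcdef0; do avg_cpu_14d "$id"; done | flag_low_cpu 40
```

Keeping the threshold logic separate from the AWS call makes it easy to feed the filter from a saved report, and the result is still only a candidate list to validate against your SLOs.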

Savings Plans vs. Reserved Instances vs. Spot — when each fits

Once you have right-sized, commit to your steady-state baseline to lock in discounts. The three pricing models solve different problems:

Purchase model	Discount vs. on-demand	Best for	Trade-off
Compute Savings Plans	up to ~66%	Predictable baseline compute that may shift across EC2, Fargate and Lambda	1- or 3-year commitment; flexible across family/region/OS
EC2 Instance Savings Plans	up to ~72%	Stable workloads pinned to one instance family in a region	Less flexible than Compute SP
Reserved Instances (RDS/ElastiCache/Redshift/OpenSearch)	up to ~72%	Steady managed-service capacity not covered by Savings Plans	Service/family commitment
Spot Instances	up to ~90%	Fault-tolerant, interruptible work: batch, CI, rendering, stateless web, big-data, ML training	2-minute interruption notice; needs retry/checkpointing

A practical pattern: cover your always-on baseline with Compute Savings Plans (start with ~60–70% coverage so you keep flexibility), use RIs for RDS/ElastiCache, and run bursty or interruptible capacity on Spot via Auto Scaling or EC2 Fleet with mixed instance types. Review coverage and utilization monthly in Cost Explorer so commitments track real usage.
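
One detail that is easy to miss when sizing a commitment: Savings Plans are purchased as an hourly dollar commitment measured after the discount, not as an amount of on-demand spend. A rough sketch of the arithmetic follows. The function name `sp_hourly_commitment` and the sample figures (a $10/hour baseline, 65% coverage, a 30% discount rate) are illustrative assumptions, and the real discount varies by instance family, region, term and payment option.

```shell
# commitment = on-demand baseline ($/hour) x coverage x (1 - discount rate)
# Usage: sp_hourly_commitment BASELINE_USD_PER_HOUR COVERAGE_PCT DISCOUNT_PCT
sp_hourly_commitment() {
  awk -v base="$1" -v cov="$2" -v disc="$3" \
    'BEGIN { printf "%.2f\n", base * (cov / 100) * (1 - disc / 100) }'
}

# Illustrative numbers only: $10/hour baseline, 65% coverage, 30% discount.
sp_hourly_commitment 10 65 30   # prints 4.55
```

Cost Explorer's own Savings Plans recommendations remain the better source for a real purchase; the formula is mainly useful for sanity-checking them.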

Migrate to Graviton (ARM) for better price-performance

AWS Graviton processors typically deliver around 20% lower cost and up to ~40% better price-performance than comparable x86 instances for many workloads. Graviton powers EC2 (the g-suffixed types like m7g, c7g, r7g), plus managed services — RDS/Aurora, ElastiCache, OpenSearch, Lambda and Fargate all offer ARM options.

Migration is straightforward for interpreted and managed runtimes (Python, Node.js, Java, .NET), for Go, which cross-compiles to arm64, and for most container images that publish arm64 variants. Often it only takes changing the instance type or setting architecture: arm64 on Lambda. Validate any code with native dependencies, rebuild multi-arch container images, and benchmark before cutting production over. For Lambda and Fargate this is one of the lowest-effort wins available.
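
Before planning a migration it helps to know which running instances are still on x86. A sketch, assuming the `Architecture` field that `describe-instances` returns for each instance. The helper names `x86_candidates` and `list_instance_arch` are hypothetical, and the output is only a candidate list, since each workload still needs the dependency checks and benchmarks described above.

```shell
# Pure filter: reads "instance-id instance-type architecture" lines on stdin
# and prints the x86_64 ones, i.e. instances not yet on Graviton (arm64).
x86_candidates() {
  awk '$3 == "x86_64" { print $1 "\t" $2 }'
}

# Hypothetical wrapper (needs AWS CLI credentials); defined but not called here.
list_instance_arch() {
  aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].[InstanceId,InstanceType,Architecture]' \
    --output text
}

# Example: list_instance_arch | x86_candidates
```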

Auto-scaling and serverless: pay for what you actually use

Static fleets sized for peak waste money overnight and on weekends. Auto Scaling (target-tracking on CPU, request count or a custom metric) shrinks capacity when demand drops; schedule scale-down for predictable off-hours, and use predictive scaling for cyclical traffic. Stopping non-production EC2/RDS environments nights and weekends alone can cut their cost by more than half.
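
The "more than half" figure follows from simple arithmetic on instance-hours. A quick sketch, where `offhours_saving_pct` is a hypothetical helper and the 12-hours-by-5-days schedule is an assumed example. It counts compute hours only, since EBS and RDS storage keep billing while instances are stopped.

```shell
# Percentage of weekly instance-hours avoided when an environment runs only
# HOURS_PER_DAY hours on DAYS_PER_WEEK days (a week has 168 hours).
offhours_saving_pct() {
  awk -v h="$1" -v d="$2" 'BEGIN { printf "%.1f\n", (1 - (h * d) / 168) * 100 }'
}

# Assumed schedule: 12 hours a day, Monday to Friday.
offhours_saving_pct 12 5   # prints 64.3
```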

For spiky or event-driven workloads, serverless removes idle cost entirely. Lambda bills per millisecond — right-size memory (which also scales CPU) and use the Lambda Power Tuning tool to find the cost/latency sweet spot. Fargate removes EC2 capacity management for containers, and Fargate Spot cuts that further for interruptible tasks. For our own autoscaling patterns, see Autoscaling Application with Auto Scaling Groups and AWS Load Balancer and our AWS Lambda best practices.

Storage tiering: S3, EBS and snapshot hygiene

Storage quietly compounds. Apply these in order:

S3 Intelligent-Tiering for data with unknown or changing access patterns: it moves objects between frequent, infrequent and archive tiers automatically and charges no retrieval fees, in exchange for a small per-object monitoring fee. Use explicit lifecycle policies to expire or transition logs and backups to Glacier/Deep Archive.
EBS gp3 over gp2 — gp3 is ~20% cheaper per GB and lets you provision IOPS and throughput independently, so you stop paying for capacity just to get performance. Migrating gp2→gp3 is an online volume modification.
Snapshot and AMI cleanup: old EBS snapshots, including those left behind when an AMI is deregistered, accumulate silently. Use Data Lifecycle Manager to retain only what you need, with Recycle Bin rules as a safety net against accidental deletion.
Delete unattached EBS volumes — volumes left after instance termination keep billing for nothing.

# Find unattached (available) EBS volumes still costing money
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[].{ID:VolumeId,Size:Size,Type:VolumeType,AZ:AvailabilityZone}' \
  --output table

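# Migrate a gp2 volume to cheaper gp3 (online, no downtime)
# 3000 IOPS / 125 MiB/s is the gp3 baseline; large gp2 volumes may already exceed it, so raise these values to match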
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 \
  --volume-type gp3 --iops 3000 --throughput 125

# Find EBS snapshots older than a retention threshold to review/delete
aws ec2 describe-snapshots --owner-ids self \
  --query 'Snapshots[?StartTime<=`2025-01-01`].[SnapshotId,VolumeId,StartTime,VolumeSize]' \
  --output table
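
One caveat to the gp2 to gp3 migration above: gp2 performance grows with volume size (3 IOPS per GiB, from a floor of 100 up to 16,000), while gp3 starts at a flat 3,000 IOPS. For volumes larger than about 1,000 GiB, the gp3 baseline would be a step down unless IOPS are provisioned to match. A sketch of that sizing rule, with `gp3_iops_for` as a hypothetical helper name. Provisioned IOPS above 3,000 are billed separately, and throughput needs the same kind of check.

```shell
# Suggest a gp3 --iops value that is at least the gp2 baseline for a volume.
# gp2 baseline: 3 IOPS per GiB, capped at 16000; gp3 baseline: 3000 IOPS.
gp3_iops_for() {
  awk -v gib="$1" 'BEGIN {
    iops = 3 * gib
    if (iops > 16000) iops = 16000
    if (iops < 3000)  iops = 3000
    printf "%d\n", iops
  }'
}

gp3_iops_for 500    # prints 3000
gp3_iops_for 2000   # prints 6000
```

The result can be passed to `--iops` in the `modify-volume` call; whether the extra IOPS are worth paying for depends on what the volume actually uses, which CloudWatch EBS metrics can show.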

Cut data transfer and egress costs with CloudFront

Data transfer out to the internet is one of the least-watched line items. Key moves:

Serve content through Amazon CloudFront. CDN egress is cheaper than direct S3/EC2 egress, and caching at the edge reduces origin load and latency at the same time — a cost and performance win.
Keep traffic in-AZ and in-region. Cross-AZ and cross-region transfer is billed; co-locate chatty services and use VPC endpoints (Gateway endpoints for S3/DynamoDB are free) to avoid routing internal traffic over NAT gateways or the public internet.
Watch NAT Gateway costs. NAT gateways charge both an hourly fee and per-GB processing. Consolidate them, route S3/DynamoDB through gateway endpoints, and remove NAT in subnets that don't need outbound internet.
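
Because NAT gateways bill on two axes, it is worth estimating which one dominates before deciding whether to consolidate gateways or to divert traffic through gateway endpoints. A minimal sketch: `nat_monthly_cost` is a hypothetical helper, and the rates passed in below are placeholders to be replaced with the current prices for your region.

```shell
# Usage: nat_monthly_cost HOURS HOURLY_RATE_USD GB_PROCESSED PER_GB_RATE_USD
# Prints "hourly-part processing-part total" in dollars.
nat_monthly_cost() {
  awk -v h="$1" -v hr="$2" -v gb="$3" -v pg="$4" 'BEGIN {
    fixed = h * hr
    data  = gb * pg
    printf "%.2f %.2f %.2f\n", fixed, data, fixed + data
  }'
}

# Placeholder rates (assumptions, not quoted prices): one gateway for 730 hours,
# 1000 GB processed, 0.045 USD per hour and 0.045 USD per GB.
nat_monthly_cost 730 0.045 1000 0.045
```

When the processing part dominates, moving S3 and DynamoDB traffic to gateway endpoints tends to help more than removing a gateway; when the hourly part dominates, consolidation is the bigger lever.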

Kill idle and orphaned resources

A recurring sweep for zombie resources reclaims spend fast and is almost pure savings:

Unattached EBS volumes and unassociated Elastic IPs (public IPv4 addresses are billed hourly whether or not they are attached, so an idle EIP is pure waste).
Idle RDS/Aurora instances — dev databases left running, or instances with near-zero connections.
Stale EBS snapshots and old AMIs — automate retention with Data Lifecycle Manager.
Unused load balancers, NAT gateways and provisioned-but-empty endpoints.
Old, low-traffic CloudWatch log groups without retention — set retention policies so logs expire.
Forgotten non-production environments — schedule them to stop outside business hours.

Automate the audit so it runs continuously rather than relying on memory.
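
One low-risk way to automate the sweep is to have the audit print the cleanup commands for review instead of running them. The sketch below takes that approach for two of the items above. The function names are hypothetical, the 30-day retention default is an assumption to replace with your own policy, and the generated commands should be read before anything is piped to a shell.

```shell
# Turn Elastic IP allocation ids (one per line on stdin) into release commands.
plan_eip_release() {
  awk 'NF { printf "aws ec2 release-address --allocation-id %s\n", $1 }'
}

# Turn log group names (one per line on stdin) into retention commands.
# \047 is a single quote, so names containing "#" or "/" stay shell-safe.
plan_log_retention() {
  awk -v days="${1:-30}" \
    'NF { printf "aws logs put-retention-policy --log-group-name \047%s\047 --retention-in-days %d\n", $1, days }'
}

# Hypothetical wrappers (need AWS CLI credentials); defined but not called here.
unassociated_eips() {
  aws ec2 describe-addresses \
    --query 'Addresses[?AssociationId==`null`].AllocationId' --output text | tr '\t' '\n'
}
log_groups_without_retention() {
  aws logs describe-log-groups \
    --query 'logGroups[?retentionInDays==`null`].logGroupName' --output text | tr '\t' '\n'
}

# Example: unassociated_eips | plan_eip_release
#          log_groups_without_retention | plan_log_retention 30
```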

Optimize RDS and Aurora

Databases are often the biggest non-compute line item and the easiest to over-provision:

Right-size and modernize — move to current Graviton-based instance classes and gp3 storage; downsize instances showing low CPU and freeable memory headroom.
Reserved Instances for steady production databases.
Stop dev/test databases off-hours (a stopped RDS instance restarts automatically after 7 days, so automate start/stop on a schedule rather than stopping by hand).
Aurora Serverless v2 for variable or unpredictable workloads scales capacity in fine-grained increments so you stop paying for idle headroom.
Tune before scaling up — add the right indexes, fix N+1 queries and slow queries, and add read replicas or ElastiCache instead of buying a bigger primary. Performance Insights pinpoints the expensive queries.
Storage and backups — review allocated storage, enable autoscaling with a sane cap, and prune backup retention to policy.

Get visibility: Cost Explorer, Budgets, CUR and tagging

You can't optimize what you can't see, and you can't allocate what you can't tag.

Cost Explorer for trend analysis, anomaly detection and Savings Plans/RI recommendations and coverage.
AWS Budgets with alerts on cost and usage thresholds, plus Budget Actions to automatically rein in spend. Enable Cost Anomaly Detection to catch runaway spend within days, not at month-end.
Cost and Usage Report (CUR) for granular, resource-level data — query it with Athena or feed a BI dashboard for deep cost allocation.
A tagging strategy (e.g. env, team, project, cost-center) is the foundation of FinOps. Enforce it with AWS Organizations tag policies and activate cost allocation tags so every dollar maps to an owner. Untagged spend is unaccountable spend.
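
Tag policies prevent new drift, but existing resources still need an audit. A sketch of a report that lists which required tag keys each resource lacks: `missing_tags` and `list_resource_tags` are hypothetical names, the required keys are the example ones from the text, and the wrapper relies on the Resource Groups Tagging API, which does not return resources that have never been tagged. Tag keys containing spaces would need a different field separator.

```shell
# Reads "resource-arn key1,key2,..." lines on stdin; prints "arn<TAB>missing keys"
# for every resource that lacks at least one required tag key.
missing_tags() {
  awk -v required="${1:-env,team,project,cost-center}" '
    BEGIN { n = split(required, req, ",") }
    NF {
      split("", have)                      # clear the per-resource key set
      m = split($2, keys, ",")
      for (i = 1; i <= m; i++) have[keys[i]] = 1
      miss = ""
      for (i = 1; i <= n; i++) {
        if (!(req[i] in have)) {
          sep = (miss == "") ? "" : ","
          miss = miss sep req[i]
        }
      }
      if (miss != "") print $1 "\t" miss
    }'
}

# Hypothetical wrapper (needs AWS CLI credentials); defined but not called here.
list_resource_tags() {
  aws resourcegroupstaggingapi get-resources \
    --query "ResourceTagMappingList[].[ResourceARN, join(',', Tags[].Key)]" \
    --output text
}

# Example: list_resource_tags | missing_tags env,team,project,cost-center
```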

Trusted Advisor and the Well-Architected Cost pillar

AWS Trusted Advisor continuously flags cost-optimization opportunities (idle load balancers, underutilized instances, unassociated EIPs, low RI/Savings Plans coverage) alongside performance, security and fault-tolerance checks. The Cost Optimization checks require a Business or Enterprise Support plan; where they are available, they act as an always-on second pair of eyes.

The AWS Well-Architected Framework Cost Optimization pillar gives you the discipline behind the tactics: adopt a consumption model, measure efficiency, stop spending on undifferentiated heavy lifting, attribute expenditure to owners, and continually analyze and improve. Run a Well-Architected Review (the free Well-Architected Tool guides it) to surface gaps systematically rather than ad hoc.

Make it stick with FinOps

One-time cleanups regress without an operating model. FinOps brings engineering, finance and product together to manage cloud spend as an ongoing practice:

Inform — accurate showback/chargeback from tags and the CUR so teams see their own spend.
Optimize — continuous right-sizing, commitment management, and anomaly response.
Operate — embed cost into the engineering workflow: budgets in CI/CD, cost gates in code review, and a regular (monthly) cost review with clear ownership.

The payoff is cultural: when engineers see the cost of their architecture decisions, optimization becomes continuous rather than a quarterly fire drill — and savings compound.
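
As one possible shape for a cost gate, a pipeline step can compare month-to-date spend with a budget and fail the job when it is exceeded. This is a sketch under assumptions: `cost_gate` and `month_to_date_cost` are hypothetical helpers, the 5000 figure in the example is a placeholder, and the wrapper needs Cost Explorer access. Cost Explorer API requests are themselves billed per call, so this suits a scheduled job better than every commit.

```shell
# Returns 0 when ACTUAL <= BUDGET, 1 otherwise, and prints a one-line verdict.
cost_gate() {
  awk -v actual="$1" -v budget="$2" 'BEGIN {
    if (actual + 0 > budget + 0) { printf "FAIL %.2f > %.2f\n", actual, budget; exit 1 }
    printf "OK %.2f <= %.2f\n", actual, budget
  }'
}

# Hypothetical wrapper (needs AWS CLI credentials); defined but not called here.
# On the first day of a month Start equals End, which the API rejects, so skip that day.
month_to_date_cost() {
  aws ce get-cost-and-usage \
    --time-period "Start=$(date -u +%Y-%m-01),End=$(date -u +%Y-%m-%d)" \
    --granularity MONTHLY --metrics UnblendedCost \
    --query 'ResultsByTime[0].Total.UnblendedCost.Amount' --output text
}

# Example CI step: cost_gate "$(month_to_date_cost)" 5000 || exit 1
```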

Prioritized quick-wins checklist

Start at the top — these are ordered roughly by ROI (impact vs. effort):

#	Quick win	Effort	Typical impact
1	Delete unattached EBS volumes, idle EIPs, stale snapshots/AMIs	Low	Immediate, pure savings
2	Right-size over-provisioned EC2/RDS via Compute Optimizer	Low–Med	High
3	Migrate gp2 → gp3 EBS volumes	Low	~20% on those volumes
4	Buy Compute Savings Plans for steady baseline	Low	Up to ~66% on covered compute
5	Stop non-prod EC2/RDS nights & weekends	Low	50%+ on those environments
6	Enable S3 Intelligent-Tiering + lifecycle policies	Low	High on cold data
7	Move eligible Lambda/Fargate/EC2 to Graviton (ARM)	Med	~20% lower cost on migrated workloads
8	Run interruptible/batch workloads on Spot	Med	Up to ~90% on those jobs
9	Serve egress via CloudFront; add S3/DynamoDB VPC gateway endpoints	Med	Cuts egress + NAT cost
10	Set CloudWatch log retention; clean up idle log groups	Low	Recurring savings
11	Enable Budgets + Cost Anomaly Detection + tagging	Low	Prevents future waste
12	Run a Well-Architected Cost review; stand up a FinOps cadence	Med	Durable, compounding

Done in order, these commonly reduce a typical AWS bill by 20–30% or more while improving — not degrading — performance.

How MicroPyramid helps

MicroPyramid has run production AWS workloads for startups and enterprises for over a decade. Our AWS consulting services cover cost-optimization audits, right-sizing, Savings Plans and Graviton migrations, and Well-Architected reviews — turning cloud waste into measurable ROI without sacrificing reliability. If you're moving to AWS or between accounts, our cloud migration services build cost discipline in from day one.

Frequently Asked Questions

What is the fastest way to reduce my AWS bill?

Start with zero-risk cleanups: delete unattached EBS volumes, release idle Elastic IPs, remove stale snapshots and unused load balancers, and stop non-production environments outside business hours. Then right-size over-provisioned instances with AWS Compute Optimizer. These steps reclaim the most savings with the least effort and rarely affect anything in production.

How much can AWS cost optimization actually save?

Most organizations carry 20–40% of avoidable waste, and a disciplined optimization pass — right-sizing, Savings Plans, storage tiering and killing idle resources — commonly reduces the bill by around 20–30%, sometimes more. The exact figure depends on how over-provisioned and uncommitted the environment is to begin with.

Should I use Savings Plans, Reserved Instances, or Spot?

Use Compute Savings Plans for your always-on baseline compute (flexible across EC2, Fargate and Lambda), Reserved Instances for steady managed services like RDS and ElastiCache, and Spot Instances for fault-tolerant, interruptible work such as batch jobs, CI and ML training. Most environments use all three together.

Is migrating to Graviton (ARM) worth it?

For most modern workloads, yes. Graviton typically cuts cost by around 20% with better price-performance. Interpreted runtimes and containers with arm64 images migrate easily; the main caveat is code with native x86 dependencies, so benchmark and test before cutting production over. Lambda and Fargate are especially low-effort wins.

Does optimizing cost hurt performance?

Not when done right — many cost optimizations improve performance. Newer instance generations and Graviton are cheaper and faster, gp3 lets you provision IOPS independently, CloudFront caching lowers both egress cost and latency, and right-sizing removes waste rather than headroom. The aim is efficiency, not under-provisioning, so you always validate against your latency and reliability SLOs.

What tools does AWS provide for cost optimization?

AWS Cost Explorer (trends, anomaly detection, Savings Plans recommendations), AWS Budgets and Cost Anomaly Detection (alerts and automated actions), Compute Optimizer (right-sizing), Trusted Advisor (continuous cost checks), the Cost and Usage Report with Athena (granular analysis), and the Well-Architected Tool for structured reviews — all underpinned by a consistent tagging strategy.
