Cloud Cost Explained Without Gibberish: Why Your AWS Bill Is So Confusing and How to Read It
What your AWS bill is really showing you, why costs spike unexpectedly, and how to trace every dollar back to your infrastructure.
You open your AWS bill and actually understand every line item.
You know exactly which service is burning money.
You can trace that spike in spend back to the specific change that caused it.
That’s the goal. But most teams never get there.
I’ve seen a client stare at a $47,000 bill he expected to be around $12,000, scrolling through pages of line items that might as well have been written in ✨Elvish✨. Data transfer charges he didn’t know existed, NAT Gateway costs that tripled overnight, and something called “EBS:VolumeUsage.gp3” appearing in regions he’d never heard of.
The bill wasn’t wrong.
AWS billed exactly what he used; he just had no idea what he was actually using.
This happens constantly. Not because engineers are careless, but because AWS billing is genuinely, deliberately complex. It’s designed to capture every possible resource interaction with microsecond precision across hundreds of services. That precision is great for AWS, but it’s terrible for humans trying to figure out why their bill doubled.
To get there, you need to understand what’s happening under the hood, so that when it breaks, and it will break, you know where to look.
Hi, I’m Maxine, a cloud infrastructure engineer who spends my days scaling databases, debugging production incidents, and writing about what actually works in production.
You can get a copy of my LLMs for Humans: From Prompts to Production (at 30% off right now)
Or for free when you become a paid subscriber.
It’s 20 chapters of practical applied AI with real production context, not theory. And it’ll help you get smarter about using AI tools in infrastructure workflows.
Check out my work:
Plus, if you’re thinking about making a career move into cloud or DevOps and want a structured path to get there, get a copy of my The DevOps Career Switch Blueprint.
Okay, let’s get into it.
Why AWS Bills Are Structured to Confuse You
Let me be clear about something: AWS doesn’t make bills confusing on purpose to trick you. They make bills confusing because their billing model accurately reflects how their infrastructure actually works, and their infrastructure is mind-bendingly complex.
Every AWS bill is organized around a few core concepts that seem simple until you try to use them.
Service
The product you’re using. EC2, S3, RDS, Lambda. Easy enough.
Usage Type
This is where it gets weird. A single service might have dozens of usage types. EC2 alone has separate line items for instance hours, EBS storage, EBS IOPS, data transfer in, data transfer out, data transfer between availability zones, Elastic IP addresses, and about forty other things.
Operation
What you actually did. GetObject versus PutObject in S3. Different prices, same service, same usage type.
Region
Services cost different amounts in different regions. US East is almost always cheapest because that’s where AWS built first and has the most capacity.
Here’s a single line item from a real bill:
$0.00 per GB - US East (Northern Virginia) data transfer
from US East (Northern Virginia) to Amazon EC2 instances
in the same Availability Zone
And here’s another:
$0.01 per GB - US East (Northern Virginia) data transfer
from US East (Northern Virginia) to US West (Oregon)
Same service. Same general action. One is free. One costs money.
Multiply this by every service you use and you start to understand the problem.
The bill isn’t lying to you. It’s telling you exactly what happened, just in a language optimized for machine parsing, not human comprehension.
The Anatomy of Your Cost and Usage Report
The Cost Explorer in the AWS console is fine for high-level trends. But if you actually want to understand your spend, you need the Cost and Usage Report, or CUR.
This is the raw billing data that AWS generates. It’s a CSV file, sometimes millions of rows, dumped into an S3 bucket. Most teams never look at it. That’s a big mistake.
The CUR contains columns you won’t see anywhere else:
lineItem/UsageType
The specific thing being billed. This is where you find gems like “USW2-DataTransfer-Regional-Bytes”, which tells you data is moving between availability zones in us-west-2.
lineItem/ResourceId
The ARN or ID of the actual resource. This is how you trace a cost back to a specific EC2 instance or S3 bucket.
lineItem/BlendedRate vs lineItem/UnblendedRate
Blended averages your Reserved Instance discounts across all usage. Unblended shows you the actual rate for each line item. Most teams should look at unblended to understand what’s really happening.
product/region
Where the resource lives. Essential for understanding data transfer costs.
Here’s the thing nobody tells you about the CUR: it’s designed for data warehouses, not spreadsheets. You’re supposed to load this into Athena or Redshift and query it with SQL.
Setting up CUR delivery to Athena:
resource "aws_cur_report_definition" "main" {
  # CUR definitions can only be created in us-east-1, so make sure
  # this resource uses a us-east-1 provider.
  report_name                = "detailed-cost-report"
  time_unit                  = "HOURLY"
  format                     = "Parquet"
  compression                = "Parquet"
  additional_schema_elements = ["RESOURCES"]
  s3_bucket                  = aws_s3_bucket.cur_bucket.id
  s3_region                  = "us-east-1"
  s3_prefix                  = "cur"
  report_versioning          = "OVERWRITE_REPORT"
  refresh_closed_reports     = true
}
That additional_schema_elements = ["RESOURCES"] part is critical. Without it, you don’t get resource IDs and you can’t trace costs back to specific infrastructure.
The CUR takes 24 hours to start populating after you enable it, and historical data isn’t backfilled. Start it now, even if you’re not ready to analyze it yet.
The Five Cost Categories That Hide in Plain Sight
After looking at hundreds of AWS bills across different organizations, I’ve found that nearly all confusion falls into five categories. Master these and you’ll understand most of what you’re paying for.
Data Transfer
The silent killer. AWS doesn’t charge much for moving data in; they charge a lot for moving it out or between regions. Every time your application in us-east-1 talks to your database in us-west-2, you’re paying for traffic in both directions: the request is billed as outbound from us-east-1, and the response as outbound from us-west-2.
Data transfer between availability zones in the same region costs $0.01 per GB in each direction. That sounds trivial until your microservices are chattering back and forth constantly. I’ve seen this single cost category exceed compute costs for high-traffic applications.
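To see how fast this compounds, here’s a back-of-the-envelope sketch using the $0.01/GB-each-direction rate quoted above. The traffic volume is a made-up example, not a measurement:

```python
# Rough cross-AZ data transfer cost estimate.
CROSS_AZ_RATE_PER_GB_EACH_WAY = 0.01  # us-east-1 rate quoted above
HOURS_PER_MONTH = 730

def monthly_cross_az_cost(gb_per_hour: float) -> float:
    """Cost of chatty service-to-service traffic crossing an AZ boundary."""
    # Each GB is billed once leaving the source AZ and once entering the other.
    return gb_per_hour * HOURS_PER_MONTH * CROSS_AZ_RATE_PER_GB_EACH_WAY * 2

# A "trivial" 20 GB/hour between two microservices:
print(f"${monthly_cross_az_cost(20):,.2f}/month")  # → $292.00/month
```

Twenty gigabytes an hour sounds like a lot until you remember it includes every retry, every health check payload, and every serialized response between services that happen to land in different zones.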
NAT Gateway
You need this to let private subnets talk to the internet. AWS charges $0.045 for every hour it exists, which adds up to around $33 per month just for having it running, but they also charge $0.045 per GB that flows through it.
That second charge is the one that surprises people. If your instances are pulling container images, downloading packages, or calling external APIs, all of that goes through NAT and you pay for every byte.
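Assuming the us-east-1 rates quoted above ($0.045 per hour plus $0.045 per GB processed), here’s a quick sketch of what a single NAT Gateway really costs; the traffic figure is illustrative:

```python
NAT_HOURLY_RATE = 0.045   # per hour the gateway exists (us-east-1)
NAT_PER_GB_RATE = 0.045   # per GB processed through it
HOURS_PER_MONTH = 730

def monthly_nat_cost(gb_processed: float) -> float:
    """Fixed hourly charge plus the per-GB processing charge."""
    return HOURS_PER_MONTH * NAT_HOURLY_RATE + gb_processed * NAT_PER_GB_RATE

print(f"idle gateway:       ${monthly_nat_cost(0):,.2f}")     # → $32.85
print(f"plus 1 TB of pulls: ${monthly_nat_cost(1000):,.2f}")  # → $77.85
```

One terabyte a month is easy to hit when every deployment pulls fresh container images through the gateway.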
EBS Volumes
You’re charged for provisioned capacity, not used capacity, so if you provision a 500GB volume and use 50GB, you pay for 500GB. And you keep paying even when the instance it’s attached to is stopped.
Worse: gp3 volumes have separate charges for IOPS and throughput if you provision above the baseline. The baseline is 3,000 IOPS and 125 MB/s. Most people don’t need more. But Terraform examples online often show provisioned IOPS because the author was benchmarking something and forgot to remove it.
Elastic IPs
These used to be free while attached to a running instance, but since February 2024 AWS charges $0.005 per hour for every public IPv4 address, attached or not. That IP address you’re “saving for later” costs about $3.65 per month, the same as one doing real work.
S3 Request Costs
Storage is cheap. Requests are not. GET requests cost $0.0004 per 1,000 requests. PUT/COPY/POST/LIST requests cost $0.005 per 1,000. If you have a static site serving millions of requests directly from S3 instead of through CloudFront, your request costs will dwarf your storage costs.
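To make that concrete, here’s a sketch comparing the two charges for a small static site. It uses the GET rate quoted above plus an assumed $0.023/GB-month for S3 Standard storage; the traffic numbers are invented:

```python
GET_RATE_PER_1000 = 0.0004    # us-east-1 GET request rate quoted above
STORAGE_RATE_PER_GB = 0.023   # assumed S3 Standard us-east-1 rate

def monthly_s3_cost(storage_gb: float, get_requests: int) -> tuple[float, float]:
    """Return (storage cost, request cost) per month."""
    storage = storage_gb * STORAGE_RATE_PER_GB
    requests = get_requests / 1000 * GET_RATE_PER_1000
    return storage, requests

# A 5 GB static site serving 100 million GETs straight from S3:
storage, requests = monthly_s3_cost(5, 100_000_000)
print(f"storage: ${storage:.2f}, requests: ${requests:.2f}")
```

The storage side is pocket change; the request side is a real bill, and putting CloudFront in front absorbs most of those GETs at its own (cheaper, cacheable) rates.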
Reading the Cost Explorer Like It’s Telling You Something
Cost Explorer isn’t useless. It’s just easy to use wrong.
The default view shows you total spend over time. This tells you almost nothing about what’s actually happening. The power is in the filters and groupings.
Always group by Usage Type rather than by Service. Services are too broad. Knowing you spent $5,000 on EC2 doesn’t help. Knowing you spent $3,200 on EC2-EBS:VolumeUsage.gp3 and $1,100 on EC2-NatGateway-Hours tells you exactly where to look.
Filter to a specific time range when something changed. If your bill spiked on March 15th, set the range to March 14th through March 16th. Then switch to hourly granularity. Find the exact hour it started.
Use the Linked Account filter if you’re in an organization. Cost allocation across accounts is how you figure out which team or project is actually responsible.
The most useful query I run regularly:
Group by: Usage Type
Filter: Service = EC2
Granularity: Daily
Time Range: Last 30 days
This shows you exactly which EC2-related charges are growing. The first time I ran this on a client’s account, we found data transfer costs growing exponentially week over week. A developer had accidentally configured cross-region replication on an application database and nobody noticed for two months.
When Bills Go Wrong
The Phantom Volume Problem
Symptom: EBS charges keep growing even though you haven’t launched new infrastructure.
Technical cause: When you terminate an EC2 instance, attached EBS volumes are not automatically deleted unless you explicitly configured that behavior. The default for root volumes is to delete. The default for additional volumes is to persist. Every quick test server with an extra data volume leaves behind a ghost.
Fix: Run this query against the AWS CLI regularly:
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
  --output table
Any volume in “available” status isn’t attached to anything. It’s just costing you money.
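Once you have that list, a small script can put a dollar figure on the waste. This sketch assumes gp3 pricing at $0.08/GB-month and parses the CLI’s JSON output (run the command above with --output json instead of --output table); the sample data is made up:

```python
import json

GP3_RATE_PER_GB_MONTH = 0.08  # assumed us-east-1 gp3 rate

def orphaned_volume_cost(describe_volumes_json: str) -> float:
    """Monthly spend on unattached EBS volumes, from the JSON output of
    `aws ec2 describe-volumes --filters Name=status,Values=available`."""
    data = json.loads(describe_volumes_json)
    return sum(v["Size"] for v in data["Volumes"]) * GP3_RATE_PER_GB_MONTH

sample = '{"Volumes": [{"VolumeId": "vol-0abc", "Size": 500}, {"VolumeId": "vol-0def", "Size": 100}]}'
print(f"${orphaned_volume_cost(sample):.2f}/month")  # → $48.00/month
```

Run it monthly and chart the number; if it only ever goes up, you have a cleanup-discipline problem, not a billing problem.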
The Cross-AZ Surprise
Symptom: Data transfer costs spike after deploying a new service.
Technical cause: Your load balancer is in three availability zones. Your application is in one. Every health check and every request that routes to a different AZ than where the request came in generates cross-AZ data transfer charges.
Fix: Ensure your application runs in every AZ where your load balancer accepts traffic. Or reduce the number of AZs if you don’t actually need the redundancy.
The Sleeping NAT
Symptom: NAT Gateway charges are huge but your instances barely make external calls.
Technical cause: Something in your private subnet is making calls you don’t know about. Common culprits: CloudWatch agents sending metrics, SSM agents checking for commands, container runtimes pulling images on every deployment.
Fix: VPC Flow Logs. Enable them temporarily on your NAT Gateway’s ENI. Find out what’s actually flowing through. I once found a background job polling an external API every 5 seconds for a status check that never changed.
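If you pull the raw flow log records, even a tiny script can rank who’s chattiest. This is a sketch assuming the default flow log record format; the two records below are made up:

```python
from collections import Counter

def bytes_by_source(flow_log_lines: list[str]) -> Counter:
    """Sum bytes per source address from VPC Flow Log records in the
    default format: version account-id interface-id srcaddr dstaddr
    srcport dstport protocol packets bytes start end action log-status."""
    totals = Counter()
    for line in flow_log_lines:
        fields = line.split()
        srcaddr, nbytes = fields[3], int(fields[9])  # srcaddr and bytes columns
        totals[srcaddr] += nbytes
    return totals

# Made-up records: a modest API call vs. a heavy image pull.
records = [
    "2 123456789012 eni-0a1b 10.0.1.15 93.184.216.34 44321 443 6 120 52430 1670000000 1670000060 ACCEPT OK",
    "2 123456789012 eni-0a1b 10.0.1.22 151.101.1.63 51010 443 6 9000 81000000 1670000000 1670000060 ACCEPT OK",
]
for addr, total in bytes_by_source(records).most_common():
    print(addr, total)
```

Sorting by total bytes usually surfaces the culprit in the first two or three rows, which is exactly how you catch a 5-second poller that never needed to poll.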
The Terraform Cost Traps
Infrastructure as code doesn’t save you from cost surprises. Sometimes it causes them.
The gp3 IOPS Default Problem:
resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 100
  type              = "gp3"
  iops              = 16000 # ~$65/month above the free 3,000 IOPS baseline
  throughput        = 1000  # ~$35/month above the free 125 MB/s baseline
}
Those performance settings came from a blog post about high-performance databases. You copied them for a logging volume. The baseline is free, everything above it is not.
The Old Snapshot Accumulation:
resource "aws_ami_from_instance" "golden" {
  source_instance_id = aws_instance.base.id
  name               = "golden-image-${timestamp()}"
}
Every time Terraform runs, this creates a new AMI with new snapshots. Old ones aren’t deleted. They accumulate forever.
Fix: Use a lifecycle rule to clean up, or better, build AMIs in a separate process with explicit retention.
The Multi-Region Provider Multiplication:
provider "aws" {
  alias  = "dr"
  region = "us-west-2"
}

resource "aws_s3_bucket" "logs" {
  provider = aws.dr
  bucket   = "company-logs-dr"
}

resource "aws_s3_bucket_replication_configuration" "main" {
  # ...replication to dr region
}
This looks like proper disaster recovery. It’s also doubling your storage costs and adding data transfer charges for every object replicated. Sometimes that’s worth it. But I’ve seen teams replicate test environment logs to a DR region because nobody questioned the pattern.
The Parts I’m Still Figuring Out
Honestly? Cost allocation tags are a disaster in practice.
The theory is beautiful. Tag every resource with cost center, project, environment, team. Then slice your bill by those dimensions. Perfect accountability.
The reality is that support for cost allocation tags is inconsistent across AWS services, tags get fat-fingered or forgotten, and enforcement is a constant battle. I’ve seen organizations spend more engineering time maintaining tagging compliance than they save in cost optimization.
Some teams make it work. They have automation that refuses to provision untagged resources. They have weekly reports that shame teams with low tag coverage. It requires ongoing investment.
For smaller teams, I’ve honestly seen better results from account-level separation. Put each project in its own AWS account. The bill for that account is the project’s cost. No tagging required.
Neither approach is wrong, but pick one and commit to it. Halfway tagging is worse than no tagging because it gives you false confidence in incomplete data.
What Understanding Your Bill Actually Gets You
When you can actually read your AWS bill, things change.
Cost alerts become actionable instead of just scary
Capacity planning starts from real numbers instead of guesses
Architecture discussions include actual price implications
Finance stops asking you to “explain this charge” every month
You catch runaway costs in days instead of quarters
The biggest win I ever had was finding $4,000+ per month in cross-region data transfer that existed because someone had hardcoded a region in a config file three years earlier. The service had been migrated. The config hadn’t. Nobody knew.
Finding it took about an hour with Cost Explorer once I knew what to look for.
The AWS bill isn’t designed to be understood. It’s designed to be accurate. Those aren’t the same thing. But accuracy means the information is there if you know how to extract it.
What’s the most surprising charge you’ve ever found buried in your AWS bill?
I’d love to hear about it in the comments.
With Love and DevOps,
Maxine
If you made it this far and you’re managing cloud infrastructure with Terraform, you might want to keep this one close too.
What Is Infrastructure as Code? A Beginner’s Guide to Terraform and Cloud Infrastructure
is where I start people who are new to IaC or who understand it conceptually but haven’t had to debug it in a real environment yet. It covers the mental model behind declarative infrastructure so that articles like this one make sense end to end, not just the code snippets.
And if you’re working with AI in your stack or trying to understand where LLMs actually fit in a production system without the hype, LLMs for Humans: From Prompts to Production is the guide I wish existed when I started. Written by an engineer for engineers, covering RAG, function calling, and the operational reality of running AI in real systems.
Last Updated: May 2026
Sources and Further Reading
AWS Cost and Usage Report Data Dictionary
Understanding Your AWS Billing and Cost Management Dashboard