May 8, 2026 9 min read

RDS: the quiet doubler on the AWS bill

Multi-AZ doubles instance hours and storage. Each read replica adds another instance's worth of hours. Backup storage beyond 100 percent of provisioned storage is billed. None look big in isolation. Together they explain the four-figure RDS line that started at $400. The audit, the math, the fixes.

RDS bills double on schedule. The day production goes Multi-AZ, instance and storage hours both double. The day a read replica ships to take pressure off the primary, instance hours grow by another 100 percent of the original. The day backup retention crosses 14 days on a database whose backups already exceed the free allowance, the snapshot line shows up. Each change is a Tuesday afternoon Terraform diff. Six months later the workload that started at $400 a month sits at $2,400 and the team owns five different drivers.

The pattern is mechanical, which is good news, because the audit is mechanical too. Below is what to check, in the order that pays back fastest.

The pricing in one paragraph

RDS pricing has four moving parts. First, instance hours: $0.171 per hour for a db.m6i.large primary in us-east-1. Multi-AZ doubles them, and each read replica adds another instance's worth. Second, storage: $0.115 per GB-month for gp3, doubled on Multi-AZ. Third, IOPS: gp3 includes a baseline (12,000 IOPS on volumes of 400 GB and up, 3,000 below that on most engines), and provisioned IOPS above the baseline cost $0.020 per IOPS-month. Fourth, backup: free up to 100 percent of provisioned storage and $0.095 per GB-month past that. None of the four is a surprise on its own. The arithmetic across all four is the surprise.
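To make the arithmetic concrete, here is a minimal sketch of the four parts as one function. The rates are the ones quoted above. The 730-hour month, the function name, and the simplifications (each replica is Single-AZ with the same class and storage as the primary, extra IOPS are charged once) are assumptions for illustration, not AWS's billing logic.

```python
HOURS_PER_MONTH = 730  # assumption: average month length used for estimates


def rds_monthly_cost(
    instance_hourly=0.171,   # db.m6i.large primary, us-east-1 (rate from the text)
    storage_gb=100,
    multi_az=False,
    replicas=0,              # assumed Single-AZ, same class and storage as primary
    provisioned_iops=0,
    backup_gb=0.0,
    storage_rate=0.115,      # gp3, per GB-month
    included_iops=12_000,    # baseline assumed for volumes of 400 GB and up
    iops_rate=0.020,         # per IOPS-month above the baseline
    backup_rate=0.095,       # per GB-month above the free allowance
):
    """Rough monthly estimate, broken down by the four pricing parts."""
    copies = (2 if multi_az else 1) + replicas
    instance = instance_hourly * HOURS_PER_MONTH * copies
    storage = storage_gb * storage_rate * copies
    # Simplification: extra IOPS are charged once, on the primary only.
    iops = max(0, provisioned_iops - included_iops) * iops_rate
    # Backup is free up to 100 percent of provisioned storage.
    backup = max(0.0, backup_gb - storage_gb) * backup_rate
    return {
        "instance": instance,
        "storage": storage,
        "iops": iops,
        "backup": backup,
        "total": instance + storage + iops + backup,
    }


if __name__ == "__main__":
    for label, kwargs in [
        ("single-AZ", {}),
        ("Multi-AZ", {"multi_az": True}),
        ("Multi-AZ + 1 replica", {"multi_az": True, "replicas": 1}),
    ]:
        print(f"{label}: ${rds_monthly_cost(**kwargs)['total']:.2f}/month")
```

Calling it with your own instance rate and storage size shows how quickly the same database climbs as Multi-AZ and replicas are switched on.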

The audit query

Run this in every account and region in the org to get the instance inventory and the list of manual snapshots:

aws rds describe-db-instances \
  --query 'DBInstances[].[DBInstanceIdentifier,DBInstanceClass,Engine,MultiAZ,StorageType,AllocatedStorage,BackupRetentionPeriod]' \
  --output table

aws rds describe-db-snapshots \
  --snapshot-type manual \
  --query 'DBSnapshots[].[DBSnapshotIdentifier,DBInstanceIdentifier,AllocatedStorage,SnapshotCreateTime]' \
  --output table

The first table tells you which instances are on Multi-AZ, which are on gp2, and which have long backup retention. The second table lists every manual snapshot; any snapshot whose source database no longer appears in the first table is an orphan. The orphan list is usually the longest one in the report and the highest-ROI cleanup.

Five fixes, in order of payback

1. Move dev and staging off Multi-AZ

The point of Multi-AZ is automatic failover to a standby, typically completing in one to two minutes. Dev environments do not need that. Staging rarely does. Disable it on every non-production instance and both instance hours and storage are cut in half on those workloads. The change is a one-line ModifyDBInstance call, online, with no downtime on the primary because the standby simply gets removed.
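Finding the candidates can be scripted against the JSON form of the inventory. The sketch below assumes the `DBInstances` list returned by `describe-db-instances --output json` and a hypothetical naming convention in which non-production identifiers contain `dev`, `staging`, or `test`. If your org marks environments with tags instead, filter on those.

```python
# Hypothetical naming convention: adjust to however your org marks non-prod.
NON_PROD_MARKERS = ("dev", "staging", "test")


def multi_az_candidates(instances, markers=NON_PROD_MARKERS):
    """Return identifiers of Multi-AZ instances whose name looks non-production.

    `instances` is the DBInstances list from describe-db-instances.
    """
    hits = []
    for inst in instances:
        name = inst["DBInstanceIdentifier"].lower()
        if inst.get("MultiAZ") and any(marker in name for marker in markers):
            hits.append(inst["DBInstanceIdentifier"])
    return hits


if __name__ == "__main__":
    sample = [
        {"DBInstanceIdentifier": "orders-prod", "MultiAZ": True},
        {"DBInstanceIdentifier": "orders-dev", "MultiAZ": True},
        {"DBInstanceIdentifier": "billing-staging", "MultiAZ": True},
        {"DBInstanceIdentifier": "reports-dev", "MultiAZ": False},
    ]
    print(multi_az_candidates(sample))
```

Substring matching is crude, so treat the output as a review list, not a list to modify blindly.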

2. Move every gp2 volume to gp3

gp3 launched for RDS in late 2022. At $0.115/GB-month it includes a baseline of 12,000 IOPS and 500 MB/s on volumes of 400 GB and larger (3,000 IOPS and 125 MB/s below that on most engines). gp2 charges the same per-GB rate but delivers only 3 IOPS per GB, so a 500 GB gp2 volume gets 1,500 IOPS while a 500 GB gp3 volume gets 12,000. For most databases under 1 TB, gp3 is faster and 15 to 30 percent cheaper because you no longer over-provision storage to buy IOPS. The migration is online:

aws rds modify-db-instance \
  --db-instance-identifier mydb \
  --storage-type gp3 \
  --apply-immediately
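The over-provisioning effect is easy to quantify. This sketch encodes the gp2 rule of 3 IOPS per GB (with its documented floor of 100 and ceiling of 16,000) and, as an assumption, the RDS gp3 baseline for MySQL, MariaDB, and PostgreSQL: 3,000 IOPS below 400 GB and 12,000 at or above. The function names are illustrative.

```python
import math

GP2_MAX_IOPS = 16_000


def gp2_baseline_iops(size_gb):
    """gp2 baseline: 3 IOPS per GB, floor of 100, ceiling of 16,000."""
    return max(100, min(GP2_MAX_IOPS, 3 * size_gb))


def gp3_baseline_iops(size_gb):
    """Assumed RDS gp3 baseline: 3,000 IOPS below 400 GB, 12,000 at or above."""
    return 12_000 if size_gb >= 400 else 3_000


def gp2_size_for_iops(target_iops):
    """Smallest gp2 volume (in GB) whose baseline reaches target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("gp2 baseline cannot exceed 16,000 IOPS")
    return math.ceil(target_iops / 3)


if __name__ == "__main__":
    size = 500
    print(f"{size} GB gp2: {gp2_baseline_iops(size)} IOPS")
    print(f"{size} GB gp3: {gp3_baseline_iops(size)} IOPS")
    print(f"gp2 size needed for 12,000 IOPS: {gp2_size_for_iops(12_000)} GB")
```

The last line is the one that matters for the bill: matching gp3's baseline on gp2 means paying for storage the database does not need.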

3. Cover stable production instances with RIs

A 1-year No Upfront RI saves about 30 percent on instance hours, applies to both primary and standby on Multi-AZ, and applies to read replicas of the same class. At a 30 percent discount the full-year commitment costs about 8.4 months of on-demand spend, so the math is favorable on anything that keeps running for at least nine of the next twelve months. Three-year terms save 50 to 60 percent and are a fair commitment for production workloads on stable instance classes. The audit is easier than EC2 because RDS instances do not get cycled out the way fleet instances do.
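One way to sanity-check a commitment is to compute how long the instance has to keep running before the RI beats on-demand. The sketch below assumes a flat discount off the on-demand rate and a commitment that is paid for the whole term whether or not the instance survives. The function names and the 30 percent figure are illustrative inputs, not quoted RI prices.

```python
def ri_breakeven_months(discount, term_months=12):
    """Months of on-demand usage that cost the same as the full RI term."""
    return term_months * (1 - discount)


def ri_savings(on_demand_monthly, discount, months_running, term_months=12):
    """Savings from the RI versus on-demand over the term.

    Positive means the RI was cheaper; negative means the instance was
    retired early enough that on-demand would have cost less.
    """
    ri_total = on_demand_monthly * (1 - discount) * term_months
    on_demand_total = on_demand_monthly * min(months_running, term_months)
    return on_demand_total - ri_total


if __name__ == "__main__":
    print(f"break-even: {ri_breakeven_months(0.30):.1f} months")
    for months in (6, 9, 12):
        print(f"runs {months} months: {ri_savings(124.83, 0.30, months):+.2f}")
```

Under these assumptions a 30 percent discount pays off only if the instance runs for more than about 8.4 of the 12 months, which is why the commitment belongs on stable production classes.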

4. Sweep orphan snapshots and long retention

Manual snapshots stay alive past the database. Teams take a snapshot before a destructive migration, never delete it, delete the database six months later, and the snapshot keeps billing $0.095/GB-month with nothing to attach it to. The cleanup is mechanical: list manual snapshots, cross-reference against current databases, delete the orphans. On a 1 TB database with 35-day retention, dropping to 14 days cuts the backup line by more than half.
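The cross-reference can be sketched in a few lines, assuming the two audit commands were run with `--output json` and you have the `DBSnapshots` and `DBInstances` lists in hand. The cost estimate uses `AllocatedStorage` as an upper bound, since snapshots are billed on the data they actually store; the function names are hypothetical.

```python
def orphan_snapshots(snapshots, instances):
    """Manual snapshots whose source database no longer exists."""
    live = {inst["DBInstanceIdentifier"] for inst in instances}
    return [snap for snap in snapshots if snap["DBInstanceIdentifier"] not in live]


def orphan_cost_upper_bound(orphans, rate_per_gb_month=0.095):
    """Upper-bound monthly cost, treating AllocatedStorage (GB) as billed size."""
    return sum(snap["AllocatedStorage"] for snap in orphans) * rate_per_gb_month


if __name__ == "__main__":
    snapshots = [
        {"DBSnapshotIdentifier": "pre-migration", "DBInstanceIdentifier": "legacy-db", "AllocatedStorage": 1000},
        {"DBSnapshotIdentifier": "before-upgrade", "DBInstanceIdentifier": "orders", "AllocatedStorage": 200},
    ]
    instances = [{"DBInstanceIdentifier": "orders"}]
    orphans = orphan_snapshots(snapshots, instances)
    for snap in orphans:
        print(snap["DBSnapshotIdentifier"], snap["AllocatedStorage"], "GB")
    print(f"up to ${orphan_cost_upper_bound(orphans):.2f}/month")
```

Review the list before deleting anything: a snapshot of a retired database may be the only remaining copy of its data.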

5. Audit replica count

Read replicas are easy to add and easy to forget. Load tests that needed eight replicas leave eight replicas billing primary-instance hours indefinitely. CloudWatch ReadIOPS on each replica tells you which ones are actually serving traffic. The pattern we see most often: two heavily-used replicas and four idle ones, all the same instance class as the primary, billing four primary-instance equivalents for nothing.
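Once the ReadIOPS datapoints are exported, flagging idle replicas is a simple average against a cutoff. This sketch assumes you already have a list of ReadIOPS averages per replica (for example, hourly datapoints over two weeks). The threshold of 5 IOPS is a hypothetical starting point, not a recommended value.

```python
def idle_replicas(read_iops_by_replica, threshold=5.0):
    """Return replicas whose mean ReadIOPS falls below the threshold.

    `read_iops_by_replica` maps a replica identifier to a list of ReadIOPS
    datapoints. A replica with no datapoints at all is treated as idle.
    """
    idle = []
    for replica, datapoints in read_iops_by_replica.items():
        if not datapoints:
            idle.append(replica)
            continue
        if sum(datapoints) / len(datapoints) < threshold:
            idle.append(replica)
    return sorted(idle)


if __name__ == "__main__":
    metrics = {
        "orders-replica-1": [850.0, 920.5, 780.2],
        "orders-replica-2": [640.1, 700.0, 655.3],
        "orders-replica-3": [0.4, 0.2, 0.3],
        "orders-replica-4": [],
    }
    print(idle_replicas(metrics))
```

A replica that only serves a nightly report can look idle on a daily average, so check the peak as well before removing it.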

RDS vs Aurora, briefly

Aurora pricing is a different shape. Aurora Standard charges per I/O at $0.20 per million plus $0.10/GB-month storage. Aurora I/O Optimized charges $0.225/GB-month with no I/O charge, in exchange for instance rates about 30 percent higher. RDS with gp3 stays cheaper than either for predictable, small databases under 100 GB. For chatty workloads where I/O per GB stored runs high, Aurora I/O Optimized wins. The RDS cost calculator shows all three side by side on the same instance class so you can pick the cheapest path before you migrate.
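The crossover between the two Aurora modes follows directly from the storage and I/O rates above. The sketch below compares only those two components and deliberately ignores instance charges, so treat the result as a lower bound on the I/O volume at which I/O Optimized starts to win. Function names are illustrative.

```python
STANDARD_STORAGE_RATE = 0.10       # $/GB-month, Aurora Standard
STANDARD_IO_RATE = 0.20            # $ per million I/O requests
IO_OPTIMIZED_STORAGE_RATE = 0.225  # $/GB-month, no per-I/O charge


def aurora_standard_cost(storage_gb, io_millions):
    """Monthly storage plus I/O cost on Aurora Standard."""
    return storage_gb * STANDARD_STORAGE_RATE + io_millions * STANDARD_IO_RATE


def aurora_io_optimized_cost(storage_gb):
    """Monthly storage cost on Aurora I/O Optimized (I/O is not billed)."""
    return storage_gb * IO_OPTIMIZED_STORAGE_RATE


def breakeven_io_millions(storage_gb):
    """Monthly I/O (in millions) at which the two modes cost the same.

    Solves 0.10 * gb + 0.20 * x = 0.225 * gb for x.
    """
    return storage_gb * (IO_OPTIMIZED_STORAGE_RATE - STANDARD_STORAGE_RATE) / STANDARD_IO_RATE


if __name__ == "__main__":
    gb = 100
    print(f"break-even: {breakeven_io_millions(gb):.1f} million I/Os per month")
    for io in (10, 100, 500):
        std = aurora_standard_cost(gb, io)
        opt = aurora_io_optimized_cost(gb)
        print(f"{io}M I/Os: Standard ${std:.2f} vs I/O Optimized ${opt:.2f}")
```

On storage and I/O alone, the break-even works out to 0.625 million I/Os per GB stored per month.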

Aurora Serverless v2

Worth a separate paragraph because it is its own pricing model. Aurora Serverless v2 charges per ACU-second at $0.12 per ACU-hour. An ACU is roughly 2 GB of memory and matched compute. A workload that runs at 2 ACU 24/7 costs $175/month, higher than a comparable provisioned r6i.large. Serverless wins when the workload is spiky and scales down to 0.5 ACU for most of the day. It loses when the workload is steady, because the per-ACU rate is set up for spikes, not always-on.
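The steady-versus-spiky comparison can be worked through with a daily ACU profile. This is a rough sketch using the $0.12 per ACU-hour rate above. The 730-hour month, the hourly granularity (real billing is per second), and the example profiles are assumptions.

```python
HOURS_PER_MONTH = 730  # assumption: average month length used for estimates


def serverless_v2_monthly_cost(acu_by_hour, acu_hour_rate=0.12):
    """Estimate monthly cost from a typical day's ACU usage.

    `acu_by_hour` holds 24 values: the average ACU for each hour of the day.
    """
    if len(acu_by_hour) != 24:
        raise ValueError("expected 24 hourly ACU values")
    days_per_month = HOURS_PER_MONTH / 24
    return sum(acu_by_hour) * acu_hour_rate * days_per_month


if __name__ == "__main__":
    steady = [2.0] * 24               # always-on at 2 ACU
    spiky = [0.5] * 20 + [8.0] * 4    # idles at 0.5 ACU, peaks at 8 for 4 hours
    print(f"steady: ${serverless_v2_monthly_cost(steady):.2f}/month")
    print(f"spiky:  ${serverless_v2_monthly_cost(spiky):.2f}/month")
```

In this example the spiky profile peaks at four times the steady one yet costs less per month, because most of its hours sit at the 0.5 ACU floor.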

The before-lunch change

Run the audit query, look at the first row that comes back as gp2 with Multi-AZ on a non-prod environment, and disable Multi-AZ. That single change typically cuts a $400/month instance to $200/month with no SLA impact in dev. Plug your own instance class and storage into the RDS cost calculator to see the exact dollar value before you ship the diff.
