Cloud Horizon

May 8, 2026

Cosmos DB: the RUs vs storage tradeoff (and the model picker most teams skip)

Manual provisioned, Autoscale, and Serverless are not interchangeable. The wrong model on a Cosmos workload costs two to four times the right one. This post covers the autoscale tipping point, the serverless ceiling, the multi-region multiplier, and the indexing audit that cuts RUs in half before any pricing change.

Azure Cosmos DB has three throughput pricing models: Manual provisioned, Autoscale, and Serverless. The portal lets you switch between them with one dropdown. Most teams pick Manual once at container creation and never reconsider. A year later, the bill is two to four times what the same workload would cost on the right model. The math is not subtle, and the switch is a five-minute change.

Manual: cheap if and only if you are above 65 percent

Manual provisioned charges $0.008 per 100 RU/s per hour, $5.84 per 100 RU/s per month. The rate is flat. Whether you use every RU or none of them, you pay for the peak you reserved.
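The Manual math, as a quick sketch (illustrative only, using the rates quoted above and a 730-hour month):

```python
# Manual provisioned rate, as quoted above: $0.008 per 100 RU/s per hour.
MANUAL_RATE_PER_100RU_HOUR = 0.008
HOURS_PER_MONTH = 730

def manual_monthly_cost(provisioned_ru_s: float) -> float:
    """Monthly cost of a container provisioned at a fixed RU/s."""
    return (provisioned_ru_s / 100) * MANUAL_RATE_PER_100RU_HOUR * HOURS_PER_MONTH

print(round(manual_monthly_cost(100), 2))   # 5.84, matching the quoted monthly rate
print(round(manual_monthly_cost(4000), 2))  # 233.6
```

The rate is flat in RU/s, so cost scales linearly with whatever peak you reserve, used or not.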

That is the model's strength and its trap. Above roughly 65 percent steady utilization, no other model is cheaper. Below it, you are paying for capacity you are not using. Most production containers run at 20 to 40 percent average utilization because peak is sized for traffic spikes. Manual is the wrong default at that utilization.

Autoscale: the right default for variable workloads

Autoscale charges $0.012 per 100 max RU/s per hour, 1.5x the Manual rate. The catch the marketing buries: it bills at the highest RU/s observed in each hour, with a floor of 10 percent of the configured maximum.

On a container with a 4000 max RU/s, the floor is 400 RU/s of billing. Above that, every hour costs whatever that hour's peak was. A container running steadily at 30 percent utilization (1200 RU/s) bills at 1200 RU/s: $0.144 per hour, $105 per month. The same container on Manual at 4000 RU/s would cost $233 per month. Autoscale wins by 55 percent.

The break-even with Manual is around 65 percent steady utilization, because Autoscale's 1.5x rate offsets the capacity savings. Above 65 percent steady, switch to Manual. Below, Autoscale.
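The crossover falls out of the rates directly. A hedged sketch, assuming every hour bills at the same steady level (real hourly peaks would push the Autoscale cost up):

```python
# Rates as quoted in this post.
MANUAL_RATE = 0.008     # $ per 100 RU/s per hour, flat
AUTOSCALE_RATE = 0.012  # $ per 100 RU/s per hour, billed at each hour's peak
HOURS = 730

def manual_cost(max_ru: float) -> float:
    return max_ru / 100 * MANUAL_RATE * HOURS

def autoscale_cost(max_ru: float, utilization: float) -> float:
    # Billed at the hourly peak, with a 10 percent floor of the maximum.
    billed = max(0.10 * max_ru, utilization * max_ru)
    return billed / 100 * AUTOSCALE_RATE * HOURS

# The 4000 RU/s example from the text, steady at 30 percent:
print(round(autoscale_cost(4000, 0.30), 2))  # 105.12
print(round(manual_cost(4000), 2))           # 233.6

# First utilization where Autoscale stops being cheaper: 1.5 * u >= 1.
crossover = next(u / 100 for u in range(10, 101)
                 if autoscale_cost(4000, u / 100) >= manual_cost(4000))
print(crossover)  # 0.67
```

The 1.5x rate means Autoscale breaks even at 1/1.5, about 67 percent steady utilization, which is where the 65 percent rule of thumb comes from.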

Serverless: a different shape entirely

Serverless does not bill for provisioned capacity. It bills for the RUs your application actually consumes, $0.25 per million RUs. No minimum, no commit, no autoscale floor.

The break-even versus Autoscale on a given workload is around 50 million RUs per month. Below that, Serverless wins. Above, Autoscale wins because the per-RU rate is lower at scale. Serverless caps at 5000 RU/s peak per container and 50 GB storage, and it does not support multi-region replication. Any workload that needs replication is on Provisioned by default.
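A hedged sketch of the comparison, using the rates quoted above. It assumes consumption is spread perfectly evenly; real Autoscale bills each hour's peak, which raises its cost on bursty traffic and moves the break-even toward the ~50 million RUs per month figure:

```python
SERVERLESS_PER_MILLION = 0.25  # $ per million consumed RUs
AUTOSCALE_RATE = 0.012         # $ per 100 RU/s per hour
HOURS = 730

def serverless_cost(monthly_rus: float) -> float:
    return monthly_rus / 1_000_000 * SERVERLESS_PER_MILLION

def autoscale_cost_steady(monthly_rus: float, max_ru: float = 1000) -> float:
    # Perfectly even spread; billed RU/s can't go below 10% of the max.
    avg_ru_s = monthly_rus / (HOURS * 3600)
    billed = max(0.10 * max_ru, avg_ru_s)
    return billed / 100 * AUTOSCALE_RATE * HOURS

print(round(serverless_cost(50_000_000), 2))        # 12.5
print(round(autoscale_cost_steady(50_000_000), 2))  # 8.76 (floor-bound)
```

At perfectly steady consumption the Autoscale floor already wins below 50 million RUs, so the 50 million figure is a practical break-even for workloads with real hourly peaks, not a hard constant.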

The right shape for Serverless: dev and test environments, bursty event handlers, low-traffic SaaS tenant containers. The Cosmos DB calculator prices all three models on the same workload so the right answer is on the page.

The multi-region multiplier most teams forget

Cosmos DB pricing scales linearly with region count. A 4000 RU/s container replicated to three regions costs three times the single-region throughput cost. Storage replicates and bills per region too. This is the line item that quietly triples a Cosmos bill when someone enables a "global distribution" toggle for resilience without reading the pricing implication.

Enabling multi-region writes doubles the per-region rate on top of that. A 4000 RU/s container with multi-region writes across three regions costs six times the single-region single-write rate. The premium pays for conflict resolution and last-write-wins coordination. Most applications do not actually write from multiple regions and would be fine on single-write multi-region replication, which is half the cost.
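The multiplier stacks as a simple product. A sketch using the Manual rate quoted earlier:

```python
# Throughput bills once per region; multi-region writes double the per-region rate.
MANUAL_RATE = 0.008  # $ per 100 RU/s per hour
HOURS = 730

def monthly_throughput_cost(ru_s: float, regions: int, multi_write: bool) -> float:
    per_region_rate = MANUAL_RATE * (2 if multi_write else 1)
    return ru_s / 100 * per_region_rate * HOURS * regions

base = monthly_throughput_cost(4000, 1, False)                   # single region, single write
print(round(monthly_throughput_cost(4000, 3, False) / base, 1))  # 3.0x for three regions
print(round(monthly_throughput_cost(4000, 3, True) / base, 1))   # 6.0x with multi-region writes
```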

Two audit queries cover both:

az cosmosdb sql container list \
  --account-name my-cosmos \
  --resource-group prod-rg \
  --database-name app \
  --query "[].{name:name, throughput:options.throughput, autoscale:options.autoscaleSettings.maxThroughput}" \
  --output table

az cosmosdb show \
  --name my-cosmos \
  --resource-group prod-rg \
  --query "{regions:locations[].locationName, multiWrite:enableMultipleWriteLocations, freeTier:enableFreeTier}"

Indexing: the free 50 percent

Cosmos DB indexes every property of every document by default. Every write costs RUs proportional to the number of indexed properties. On a 50-property document, a write that should cost 5 RU instead costs 30 RU: six times the write throughput, billed against the provisioned peak.

Custom indexing on hot fields cuts RU per write by 50 to 70 percent on most write-heavy workloads. The policy is one JSON block on the container:

{
  "indexingPolicy": {
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
      { "path": "/userId/?" },
      { "path": "/createdAt/?" }
    ],
    "excludedPaths": [
      { "path": "/*" }
    ]
  }
}

The trade-off: queries on excluded paths now require a scan, billed at higher RU per read. The decision is workload specific. The audit pattern: list every container, sample recent queries from Application Insights, identify which paths are actually queried and which are dead writes. Run the change in dev first, observe RU per write drop, then promote.

Free tier, applied somewhere useful

Cosmos DB has a permanent free tier: 1000 RU/s and 25 GB storage on one Cosmos account per subscription. Roughly $58 per month off, forever. Most teams enable it on a dev subscription, leave it there, and never think about it again. On production, it would still cover the first 1000 RU/s of the largest container.

The audit move: identify the container in production where 1000 RU/s is the largest fraction of total throughput, and create the new free-tier-enabled Cosmos account there. Then migrate the container in. The migration is online with change-feed replication.
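The selection step can be sketched in a few lines. The container names and RU/s figures here are hypothetical, purely to show the shape of the comparison:

```python
# Free tier covers the first 1000 RU/s of one account per subscription.
FREE_TIER_RU = 1000

containers = {  # name -> provisioned RU/s (hypothetical example numbers)
    "orders": 10000,
    "sessions": 4000,
    "tenants": 1200,
}

# Pick the container where 1000 RU/s covers the largest fraction of throughput.
best = max(containers,
           key=lambda n: min(FREE_TIER_RU, containers[n]) / containers[n])
print(best)  # tenants: 1000 of its 1200 RU/s would be free
```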

Run the model audit this week

The query takes thirty seconds. Identify three things: which containers are on Manual at low utilization (Autoscale candidates), which are on Provisioned at low monthly RUs (Serverless candidates), and which have multi-region writes enabled where only one region writes. Plug each into the calculator with the peak RU/s and average utilization, and the dollar delta is the savings on the change.
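The three checks above collapse into one decision function. This is a sketch of this post's rules of thumb, not official Azure guidance, and it ignores the Serverless 5000 RU/s and storage caps:

```python
def pick_model(steady_utilization: float, monthly_million_rus: float,
               needs_multi_region: bool) -> str:
    """Rough model picker from this post's thresholds.

    steady_utilization: average RU utilization of the provisioned peak (0-1).
    monthly_million_rus: total RUs consumed per month, in millions.
    """
    # Serverless does not support multi-region replication.
    if not needs_multi_region and monthly_million_rus < 50:
        return "Serverless"
    if steady_utilization >= 0.65:
        return "Manual"
    return "Autoscale"

print(pick_model(0.30, 400, True))   # Autoscale
print(pick_model(0.80, 900, False))  # Manual
print(pick_model(0.20, 10, False))   # Serverless
```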

For the broader Azure database picture, the DTU vs vCore decision tree covers the same model-mistake pattern on Azure SQL. The two audits run in the same hour and usually return the largest single-quarter savings on an Azure tenant.
