Cloud Horizon

May 8, 2026

Lambda cost: the three knobs that matter (and the one that does not)

Lambda pricing has two components and three knobs that move the bill. Memory, architecture, duration. Provisioned Concurrency is the fourth knob most teams should not touch. The math, the audit query, and the change you can ship before lunch.

Lambda is one of the AWS services that looks free until it does not. The unit prices are tiny. $0.20 per million invocations. $0.0000166667 per GB-second. The tail of decimal places makes the bill feel theoretical right up to the day a serverless workload bills $40,000 in a month.

Once you cross 50 million invocations a month or so, three knobs determine almost the entire bill. Memory tuning. Architecture. Duration. Get those right and Lambda stays cheap. Get them wrong and Lambda is the line item the CFO calls about.
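The two pricing components multiply out simply. A back-of-the-envelope sketch at the x86 list rates quoted above (regional pricing varies, and the workload numbers are illustrative):

```python
def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_gb_s=0.0000166667,   # x86 duration rate
                        price_per_million_req=0.20):
    """Rough monthly Lambda bill: request charge + GB-second charge."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_charge = (invocations / 1_000_000) * price_per_million_req
    compute_charge = gb_seconds * price_per_gb_s
    return request_charge + compute_charge

# 50M invocations a month, 300 ms average, 512 MB
print(f"${lambda_monthly_cost(50_000_000, 300, 512):,.2f}")
```

At that volume the request charge is $10 and the compute charge is the rest, which is why the three knobs that follow all attack GB-seconds.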

Knob 1: memory tuning

Lambda allocates CPU proportional to memory. A function at 128 MB gets a fraction of a vCPU. A function at 1769 MB gets one full vCPU. Above that you get multi-core, with the cap at 10 GB and roughly 6 vCPUs.

That is the trick. CPU-bound work is often cheaper at higher memory because duration drops faster than the GB-second rate rises. A function that runs 800 ms at 256 MB might run 200 ms at 1024 MB. The GB-seconds are identical, the wall-clock time is a quarter, the user gets a faster response.
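That example as arithmetic, using the numbers from the paragraph above:

```python
def gb_seconds(duration_ms, memory_mb):
    """Billed compute for one invocation."""
    return (duration_ms / 1000) * (memory_mb / 1024)

slow = gb_seconds(800, 256)    # 800 ms at 256 MB
fast = gb_seconds(200, 1024)   # 200 ms at 1024 MB (one full vCPU)
print(slow, fast)  # same billed GB-seconds, 4x faster wall clock
```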

AWS publishes a tool called Lambda Power Tuning that runs each function across memory levels and shows you the cost-versus-speed sweet spot. It is a Step Functions state machine: deploy it once, and a run takes about 30 seconds per function. The output is a chart with cost on one axis and duration on the other. Pick the elbow.

Knob 2: ARM64 (Graviton)

ARM64 Lambdas are 20 percent cheaper per GB-second. Performance is equal or slightly better on most modern runtimes. The list of supported runtimes covers Python, Node.js, Java, .NET, Go, Ruby, and custom runtimes via the AL2023 base image.

The change is one Terraform attribute:

resource "aws_lambda_function" "handler" {
  # ...
  architectures = ["arm64"]  # was ["x86_64"]
}

The redeploy is a plan and apply. The bill drops 20 percent on the GB-second component from the next billing cycle. The only blocker is functions that bundle x86-only native binaries. In 2026 those are rare, but check before bulk-flipping. Common offenders are old versions of Pillow, compiled NumPy wheels, and certain headless Chromium binaries. The fix is to rebuild the layer for arm64, usually one CI tweak.

Knob 3: duration

Lambda is billed in 1 ms increments. Reducing duration by half halves the GB-second component. The biggest wins are not in the function code, they are at the boundary.

Cold starts are the loudest part of "duration" but rarely the biggest contributor to cost. The cost villains are functions that wait. Wait on RDS. Wait on a slow third-party API. Wait on a DNS lookup that should be cached. Every millisecond a Lambda spends idle is a billed millisecond.

The patterns that cut duration most:

  • Connection pooling with RDS Proxy or Data API for relational databases
  • Aggressive HTTP keep-alive on outbound calls, single client at module scope
  • Region-local dependencies, never cross-region for a Lambda hot path
  • Removing unused dependencies from the package, smaller cold start, faster init
  • Init-time prefetch of static config so request-time only does the work
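The module-scope keep-alive pattern from the second bullet looks like this in Python. A sketch only: the host, path, and handler shape are hypothetical, http.client connects lazily on first request, and a production version would add reconnect and retry handling:

```python
import http.client

# Module scope: created once per container and reused across every warm
# invocation, so the TCP and TLS handshakes are paid on the first call only.
CONN = http.client.HTTPSConnection("api.example.com", timeout=2)

def handler(event, context):
    # Keep-alive: the same open socket serves each warm invocation,
    # instead of a fresh handshake per request.
    CONN.request("GET", "/price")
    resp = CONN.getresponse()
    return {"statusCode": resp.status, "body": resp.read().decode()}
```

The same principle applies with requests (a module-level Session), urllib3 (a shared PoolManager), and the AWS SDKs (a module-level client).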

The fourth knob: Provisioned Concurrency, mostly do not touch

Provisioned Concurrency keeps a configurable number of warm Lambda containers ready. It eliminates cold starts. It also bills per hour whether or not traffic flows.

For latency-sensitive user-facing endpoints, Provisioned Concurrency is the right answer. For an API that serves a million requests an hour from a steady traffic pattern, the per-hour Provisioned Concurrency rate is also worth running the math on. There is a break-even where provisioned beats on-demand.
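The break-even falls out of three list rates, because Provisioned Concurrency charges for warm capacity by the hour but bills duration at a lower rate. A sketch using the published us-east-1 x86 numbers at the time of writing (assumptions; verify current pricing):

```python
ON_DEMAND   = 0.0000166667  # USD per GB-second, on-demand duration
PC_DURATION = 0.0000097222  # USD per GB-second, duration under PC
PC_WARM     = 0.0000041667  # USD per GB-second, just keeping PC allocated

# PC wins when: PC_WARM + u * PC_DURATION < u * ON_DEMAND,
# where u is utilization of the provisioned capacity (0..1).
break_even = PC_WARM / (ON_DEMAND - PC_DURATION)
print(f"break-even utilization: {break_even:.0%}")
```

At these rates the crossover sits around 60 percent utilization: a steady, busy endpoint can come out ahead on Provisioned Concurrency, while spiky or idle traffic pays the warm-capacity charge for nothing.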

For everything else, leave Provisioned Concurrency alone. The most common audit finding here is a team that turned it on years ago for a launch event and never turned it off. We see five-figure monthly bills on dead functions still keeping 10 instances warm 24/7.

The audit query

The Cost and Usage Report has a usage type column, lineItem/UsageType. Filter on anything containing "Lambda" and group by function (lineItem/ResourceId carries the function ARN). The functions sorted by GB-seconds at the top are your tuning candidates. Cross-reference with the Lambda console to see which ones are still on x86 architecture. That gets you 80 percent of the savings on a typical AWS account.
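If you export the CUR to CSV, the grouping is a few lines. A sketch assuming the standard CUR column names; adjust for your report configuration:

```python
import csv
from collections import Counter

def top_lambda_gb_seconds(cur_csv_path, n=10):
    """Sum Lambda GB-seconds per function from a CUR CSV export.

    Column names follow the standard CUR schema; usage types look like
    'USE1-Lambda-GB-Second' (with an '-ARM' suffix on arm64 functions).
    """
    totals = Counter()
    with open(cur_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if "Lambda-GB-Second" in row.get("lineItem/UsageType", ""):
                fn = row.get("lineItem/ResourceId", "unknown")
                totals[fn] += float(row.get("lineItem/UsageAmount", 0) or 0)
    return totals.most_common(n)
```

The same aggregation works in Athena over the CUR tables if the export is too large for a laptop.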

For the recursive-trigger pattern, group invocations by function name and look for one that bills several million invocations per day with low duration each. The classic shape is a Lambda that writes to S3 and is also subscribed to S3 events on the same bucket. Self-trigger loops bill silently until somebody notices.

What the calculator does for you

The free Lambda cost calculator shows the three knobs side by side. Plug in invocations, duration, memory, and architecture, and see x86 vs arm64 monthly cost on the same workload. Five-second sanity check before you commit to a Lambda design or after you suspect one is over budget.

For real numbers on a real account, the 14-day audit pulls actual Lambda spend, identifies x86 functions that should be ARM64, memory mismatches that Power Tuning would fix, and any recursive trigger patterns. Free, read-only IAM role, one-page summary.

