Practical models for CDN TTLs, cache keys and origin shielding to cut origin load

How CDN configuration reduces origin work and why that matters

Reducing origin requests lowers server compute and cross-region network transfer, which reduces energy use and often cost. Three CDN configuration levers are central to that reduction. Time-to-live (TTL) values determine how long an object is served from cache. Cache keys determine how many distinct cache entries exist for the same logical resource. Origin shielding instructs the CDN to funnel cache misses through a single intermediate location, so the origin sees one consolidated stream of misses rather than separate requests from every edge.

A simple model to quantify TTL effects

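To tune TTLs for sustainability it helps to reason with a tractable traffic model. Consider requests for a single object arriving at a particular CDN edge as a Poisson process with rate r requests per second. For Poisson arrivals the gaps between requests are exponentially distributed, so the probability that the time since the previous request exceeds the TTL is e^{-r * TTL}. Treating a request as a miss whenever that gap exceeds the TTL, which amounts to assuming the expiry timer restarts on each request, that probability is the cache miss rate at that edge. A cache whose TTL is fixed at fetch time misses somewhat more often, so treat the result as an optimistic estimate. The origin request rate produced by that object at that edge is r times the miss probability. Put another way, the expected origin requests per second for that object at a single POP equals r * e^{-r * TTL}.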

An example with round numbers illustrates the shape of the curve. If r equals 0.1 requests per second, which is six requests per minute, then a TTL of 600 seconds yields a miss rate of e^{-0.1 * 600} = e^{-60}, which is effectively zero. If r equals 0.001 requests per second, which is about one request every 17 minutes, then a TTL of 600 seconds gives a miss rate of e^{-0.001 * 600} = e^{-0.6}, about 0.55, so more than half of requests cause origin fetches. In this model the miss rate falls exponentially as the TTL grows. Use these kinds of computations to see whether a candidate TTL will materially reduce origin load for the traffic pattern you observe.
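The two cases above can be reproduced with a few lines of Python. This is a minimal sketch of the model as stated, and the function names are illustrative rather than taken from any CDN's tooling.

```python
import math

def miss_probability(rate_per_s: float, ttl_s: float) -> float:
    """Probability that the gap since the previous request exceeds the TTL."""
    return math.exp(-rate_per_s * ttl_s)

def origin_rate(rate_per_s: float, ttl_s: float) -> float:
    """Expected origin requests per second for one object at one POP."""
    return rate_per_s * miss_probability(rate_per_s, ttl_s)

# The two request rates from the example, both with a 600 second TTL.
for r in (0.1, 0.001):
    p = miss_probability(r, 600)
    print(f"r={r} req/s: miss probability {p:.3g}, origin rate {origin_rate(r, 600):.3g} req/s")
```

Sweeping ttl_s over a range of candidate values for a measured r shows where further TTL increases stop paying off.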

Applying the model to many objects and edges

Real sites serve many different objects across many POPs. The aggregate origin rate is the sum across objects and edges. Objects whose per-edge request rate is low relative to 1/TTL miss on most requests, so a long tail of them can generate much of the origin churn unless they share cache entries across edges or are grouped by shielding. For sustainability, focus initial effort on the small fraction of objects that produce the largest number of origin requests or the highest bytes per origin request. Those objects will typically be high-traffic HTML pages, frequently requested images, or API endpoints used by many clients.
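Summing the per-edge model over objects and POPs gives a ranking of where origin requests come from. The sketch below assumes per-object, per-POP rates have already been extracted from logs. The paths, POP names, rates and TTLs are made up for illustration.

```python
import math
from collections import defaultdict

def origin_rate_by_object(edge_rates, ttls):
    """Sum r * e^{-r * TTL} over POPs for each object, largest first.

    edge_rates maps (object, pop) -> requests per second at that POP.
    ttls maps object -> TTL in seconds.
    """
    totals = defaultdict(float)
    for (obj, _pop), r in edge_rates.items():
        totals[obj] += r * math.exp(-r * ttls[obj])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical measurements, not real data.
edge_rates = {
    ("/index.html", "pop-a"): 0.5,
    ("/index.html", "pop-b"): 0.2,
    ("/img/hero.jpg", "pop-a"): 0.05,
    ("/img/hero.jpg", "pop-b"): 0.01,
}
ttls = {"/index.html": 10, "/img/hero.jpg": 300}

ranking = origin_rate_by_object(edge_rates, ttls)
for obj, rate in ranking:
    print(f"{obj}: {rate:.4g} origin req/s")
```

Multiplying each rate by the object's response size gives the same ranking in bytes per second, which matters when a few large objects dominate transfer.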

Practical decision rule for choosing TTLs

Pick a TTL so that the expected per-edge origin request rate for an object is acceptably small. A compact rule of thumb is to compute r from historical edge logs and choose the TTL so that r * e^{-r * TTL} is less than a target origin rate for that object. Targets depend on origin capacity and sustainability goals. If you want to reduce origin requests by 90 percent relative to no caching, choose the TTL so that e^{-r * TTL} is 0.1, which gives TTL = ln(10) / r, approximately 2.3 / r. For an object receiving 0.01 requests per second at one POP, choose a TTL near 230 seconds to cut origin requests by 90 percent at that POP.
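The rule can be inverted to solve for the TTL directly. The following sketch, with hypothetical function names, covers both forms of the rule: a target miss probability and a target origin rate.

```python
import math

def ttl_for_miss_target(rate_per_s: float, target_miss: float) -> float:
    """Smallest TTL (seconds) with e^{-r * TTL} <= target_miss."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    if not 0 < target_miss < 1:
        raise ValueError("target_miss must be between 0 and 1")
    return math.log(1 / target_miss) / rate_per_s

def ttl_for_origin_target(rate_per_s: float, target_origin_rate: float) -> float:
    """Smallest TTL (seconds) with r * e^{-r * TTL} <= target_origin_rate."""
    if rate_per_s <= 0 or target_origin_rate <= 0:
        raise ValueError("rates must be positive")
    if target_origin_rate >= rate_per_s:
        return 0.0  # even with no caching the target is already met
    return math.log(rate_per_s / target_origin_rate) / rate_per_s

# 90 percent reduction for an object at 0.01 requests per second.
print(round(ttl_for_miss_target(0.01, 0.1)))
```

Both functions inherit the Poisson assumptions, so the result is a starting point to validate against logs rather than a guarantee.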

Cache keys and the cost of fragmentation

The cache key determines which requests share a cached response. Common components are host and path. Additional components can be query string parameters, selected request headers, and cookies. Every additional dimension multiplied into the key fragments the cache which reduces hit rates for any given POP and increases origin fetches.

Principles for cache key design

Keep the key as small as possible while preserving correctness. That is the single most important rule. Do not include opaque or high cardinality values unless they actually change the response. Avoid including cookies or headers that vary per user. Where personalization is required use separate cached layers that can be composed with a small personalized payload delivered client side or with edge compute that caches shared parts and stitches personalization without shattering the cache.

When query strings are used to control display parameters, normalize them. Decide which parameters are cache relevant, strip the rest at the edge, and sort the remaining parameters into a canonical order. Many CDNs provide query string allowlists and canonicalization options. Use those to prevent query parameter order or irrelevant tracking parameters from creating distinct cache entries.
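To make the normalization concrete, here is a minimal sketch using Python's standard urllib.parse module. The allowlist contents (width, format) are assumptions for illustration. On a real CDN the equivalent logic would live in the provider's cache key settings or edge code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: only these parameters change the response.
CACHE_RELEVANT = frozenset({"width", "format"})

def normalize_url(url: str, allowed=CACHE_RELEVANT) -> str:
    """Drop parameters outside the allowlist and sort the remainder."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in allowed
    ]
    kept.sort()
    # The fragment is never sent to the server, so it is dropped as well.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

a = normalize_url("https://example.com/img?utm_source=x&width=200&format=webp")
b = normalize_url("https://example.com/img?format=webp&width=200&fbclid=abc")
print(a == b)
```

Both inputs collapse to the same key, so they share one cache entry instead of two.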

Vary header and selective header inclusion

The Vary response header signals which request headers influence the response. Use it sparingly. Varying on Accept for image format negotiation is reasonable. Avoid varying on high-cardinality headers such as User-Agent unless different agents legitimately require different representations. Where negotiation is necessary, prefer server-side content negotiation that maps requests onto a small set of cacheable variants, or use separate URLs for variants so cache keys remain explicit and limited.
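One way to keep negotiated variants bounded is to collapse the raw Accept header into a short label and key the cache on that label. The sketch below is a deliberate simplification: it ignores q-values and wildcards, and the three variant names are assumptions rather than any CDN's built-in behavior.

```python
def image_variant(accept_header: str) -> str:
    """Collapse an Accept header into one of three cacheable variants."""
    media_types = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "image/avif" in media_types:
        return "avif"
    if "image/webp" in media_types:
        return "webp"
    return "jpeg"  # fallback that every client is assumed to handle

print(image_variant("image/avif,image/webp,*/*;q=0.8"))
print(image_variant("text/html, image/webp;q=0.9"))
print(image_variant("*/*"))
```

However many distinct Accept strings clients send, the cache holds at most three entries per image under this scheme.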

Origin shielding and its multiplicative effect

Origin shielding designates a single shield POP to fetch objects from the origin on behalf of many edge POPs. Without shielding, each edge miss can produce a separate origin request. With shielding, edge misses go to the shield POP, which answers most of them from its own cached copy and contacts the origin only when it does not hold a fresh copy itself.

How shielding reduces origin requests

Assume N edge POPs each see requests for the same object at rate r per POP. Without shielding, the expected origin request rate across all POPs is N * r * e^{-r * TTL}. With a shield POP between the edges and the origin, a simple approximation treats the shield as one cache facing the aggregated demand of N * r requests per second. In practice the shield sees only edge misses, so this is an idealization. The shield miss probability then becomes e^{-N * r * TTL} and the origin request rate becomes N * r * e^{-N * r * TTL}. Because N * r * TTL is N times larger than r * TTL, the exponential term is typically far smaller, which produces a large reduction in origin traffic.

For rough numeric intuition, take r equal to 0.01 per second and N equal to 50 edges, so N * r equals 0.5 per second. A TTL of 60 seconds yields a shield miss probability of e^{-30}, which is negligible. Without shielding each edge would have a miss probability of e^{-0.6}, about 0.55, so the aggregate origin rate would be about 0.27 requests per second, more than half of all requests for the object. Shielding therefore collapses many separate edge misses into at most one origin fetch per shield miss and dramatically reduces origin work.
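The comparison can be checked numerically. This sketch implements the two formulas from the model above and keeps its simplification that the shield behaves like a single cache facing the aggregate rate N * r. The function names are illustrative.

```python
import math

def origin_rate_unshielded(n_pops: int, rate_per_pop: float, ttl_s: float) -> float:
    """N * r * e^{-r * TTL}: every edge misses independently."""
    return n_pops * rate_per_pop * math.exp(-rate_per_pop * ttl_s)

def origin_rate_shielded(n_pops: int, rate_per_pop: float, ttl_s: float) -> float:
    """N * r * e^{-N * r * TTL}: the shield sees the aggregate rate."""
    aggregate = n_pops * rate_per_pop
    return aggregate * math.exp(-aggregate * ttl_s)

unshielded = origin_rate_unshielded(50, 0.01, 60)
shielded = origin_rate_shielded(50, 0.01, 60)
print(f"unshielded: {unshielded:.3g} req/s, shielded: {shielded:.3g} req/s")
```

Trying small values of n_pops shows how quickly the benefit appears: even a handful of POPs sharing one object shrinks the exponent substantially.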

When shielding is most useful and when it is not

Shielding is most effective for objects that are shared globally or across many POPs and where the shield POP has low additional latency to the edge. It is less effective when content is inherently regional and edges mostly serve content local to them or when origin responses must be personalized per POP. Consider shielding when origin cost per request is high or when the origin capacity needs protection from request storms. Shielding does not remove the need for appropriate TTLs and cache keys and should be combined with them.

Freshness strategies that avoid origin spikes

Long TTLs reduce origin requests but increase staleness. Use revalidation and stale-serving directives to balance freshness with origin protection. Two Cache-Control extension directives are particularly helpful. stale-while-revalidate allows the CDN to serve a stale object while it fetches an updated copy in the background. stale-if-error lets the CDN continue serving stale content if the origin is temporarily unavailable. Both reduce traffic spikes and improve perceived availability.
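As an illustration of how these directives combine in a response header, the following sketch assembles a Cache-Control value. The helper name and the specific durations are assumptions, and CDN support for each directive varies, so check your provider's documentation before relying on them.

```python
from typing import Optional

def cache_control(max_age: int,
                  stale_while_revalidate: Optional[int] = None,
                  stale_if_error: Optional[int] = None) -> str:
    """Build a Cache-Control value for a publicly cacheable response."""
    directives = ["public", f"max-age={max_age}"]
    if stale_while_revalidate is not None:
        directives.append(f"stale-while-revalidate={stale_while_revalidate}")
    if stale_if_error is not None:
        directives.append(f"stale-if-error={stale_if_error}")
    return ", ".join(directives)

# Illustrative durations: fresh for 5 minutes, stale-servable for 1 more
# minute during revalidation, and for a day if the origin is failing.
header = cache_control(300, stale_while_revalidate=60, stale_if_error=86400)
print(header)
```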

Where supported, prefer conditional requests with validators such as ETag or Last-Modified, so the origin sends only a 304 Not Modified when content has not changed. Conditional requests are cheaper than full responses and preserve strict freshness semantics while cutting bytes. Combine conditional validation with shielded fetches to minimize the number of validation requests hitting the origin.
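On the origin side, conditional handling comes down to comparing the validator the cache presents with the current one. The sketch below is framework-free and simplified: it derives a strong ETag from a content hash (one common choice among several) and does not handle weak validators or the "*" form of If-None-Match.

```python
import hashlib
from typing import Dict, Optional, Tuple

def make_etag(body: bytes) -> str:
    """Strong ETag derived from a truncated content hash (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a possibly conditional GET."""
    etag = make_etag(body)
    presented = [tag.strip() for tag in (if_none_match or "").split(",")]
    if etag in presented:
        return 304, {"ETag": etag}, b""  # validator matches: send no body
    return 200, {"ETag": etag}, body

status, headers, payload = respond(b"hello", None)
revalidated = respond(b"hello", headers["ETag"])
print(status, revalidated[0], len(revalidated[2]))
```

The second call carries no payload, which is where the byte savings come from when content is unchanged.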

Cache invalidation and purging

Purges force refetches, which creates origin load, and they add operational complexity. Design workflows that minimize manual purges. Use short TTLs for truly dynamic content and longer TTLs for stable assets. For content updates prefer targeted invalidation by URL or tag. Where possible use cache tags so a single update invalidates a logical set of URLs without a broad purge.
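The idea behind cache tags can be shown with a small in-memory index that maps each tag to the URLs carrying it. This is a conceptual sketch only. Real CDNs maintain this mapping themselves and expose it through their own purge APIs, and the class, tag names and URLs here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

class TagIndex:
    """Tracks which cached URLs carry which tags."""

    def __init__(self) -> None:
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def register(self, url: str, tags: Iterable[str]) -> None:
        """Record that a cached response for url carries these tags."""
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def urls_for(self, tag: str) -> List[str]:
        """URLs a purge of this tag would invalidate, in sorted order."""
        return sorted(self._urls_by_tag.get(tag, ()))

index = TagIndex()
index.register("/products/42", ["product-42", "category-shoes"])
index.register("/category/shoes", ["category-shoes"])
index.register("/", ["homepage"])

print(index.urls_for("category-shoes"))
```

Purging the category tag touches two URLs and leaves the homepage cached, which is the targeted behavior a full purge cannot give.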

Asset type guidance and configuration patterns

Set different policies by asset type. Images and static media usually deserve long TTLs and aggressive CDN caching. Version assets by filename to keep TTLs long while allowing instant updates when needed. Static assets may also benefit from immutable cache directives when content is content addressed or versioned.

HTML pages require a balanced approach. For editorial sites, a TTL on full pages combined with stale-while-revalidate usually provides good freshness at low origin cost. For e-commerce cart and checkout endpoints, set short TTLs and never store per-user responses in the shared cache. For API endpoints, consider response schemas that separate stable shared data from volatile per-user data so only the stable part is widely cached.
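A per-asset-type policy can be expressed as a small lookup from request path to Cache-Control value. Everything specific in this sketch is an assumption for illustration: the path prefixes, the extension list, the durations, and the premise that every static file is versioned by filename. The choice of no-store for cart and checkout is likewise one conservative option, not the only valid one.

```python
import posixpath

# Hypothetical site layout and durations.
PER_USER_PREFIXES = ("/cart", "/checkout")
VERSIONED_EXTENSIONS = {".js", ".css", ".png", ".jpg", ".webp", ".woff2"}

def policy_for(path: str) -> str:
    """Pick a Cache-Control value from the request path."""
    if path.startswith(PER_USER_PREFIXES):
        return "private, no-store"  # keep per-user pages out of shared caches
    extension = posixpath.splitext(path)[1].lower()
    if extension in VERSIONED_EXTENSIONS:
        # Safe only because the filename changes whenever the content does.
        return "public, max-age=31536000, immutable"
    return "public, max-age=300, stale-while-revalidate=60"  # HTML default

for p in ("/static/app.3f9a1c.js", "/articles/cdn-ttls", "/cart"):
    print(p, "->", policy_for(p))
```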

Configuration checklist to reduce origin load and emissions

Measure per object and per POP request rate from CDN logs to compute r.
Use the Poisson model to test candidate TTLs so that r * e^{-r * TTL} meets your origin rate targets.
Minimize cache key cardinality by excluding irrelevant query parameters, cookies and headers.
Apply shielding for objects requested across many POPs or when origin protection is required.
Enable conditional requests with validators and use stale-while-revalidate where supported.
Prefer cache tags and targeted invalidation over broad purges.

Measuring impact and validating changes

Quantify the change in origin request rate by comparing origin logs before and after each change, and measure bytes transferred from origin over the same periods. Monitor error rates and latency to ensure user experience is not degraded. Use synthetic tests to validate correctness for different cache keys and header combinations. Finally, estimate sustainability impact by converting reduced bytes and server time to energy and emissions using your hosting or infrastructure provider’s published factors or established measurement methods.
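The before-and-after comparison reduces to a few ratios. In this sketch the function names are illustrative and the request counts are hypothetical. The energy conversion takes the per-gigabyte factor as a required argument because that number must come from your provider or chosen methodology, and no default is assumed here.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from a baseline measurement."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before

def cache_hit_ratio(edge_requests: int, origin_requests: int) -> float:
    """Share of edge requests that did not reach the origin."""
    if edge_requests <= 0:
        raise ValueError("edge_requests must be positive")
    return 1.0 - origin_requests / edge_requests

def energy_saved_kwh(bytes_saved: float, kwh_per_gb: float) -> float:
    """Convert bytes no longer served by the origin to energy.

    kwh_per_gb is supplied by the caller from a published factor.
    """
    return bytes_saved / 1e9 * kwh_per_gb

# Hypothetical origin request counts for equal-length windows.
print(percent_reduction(before=1200, after=300))
print(cache_hit_ratio(edge_requests=10000, origin_requests=500))
```

Comparing windows of equal length at similar traffic levels keeps seasonal swings from being mistaken for configuration effects.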

Small experiments that scale

Start with a single high traffic asset or a representative bucket of assets. Apply a conservative TTL change or enable shielding for that set and measure origin request reduction and user facing latency. If results are positive roll changes out to other assets incrementally. This approach contains risk and produces measured sustainability improvements rather than guesswork.

Next technical choices to explore

Consider edge compute that performs lightweight personalization without shattering caches. Investigate image CDNs that perform format negotiation while preserving cacheability. Examine your analytics and monitoring to ensure you track cache hit ratio, per asset origin rate, and bytes saved as primary indicators of sustainability oriented CDN improvements.