Cache Key Architecture

Cache key architecture defines how CDNs and reverse proxies identify, store, and serve cached objects based on normalized request attributes. Deterministic cache keys, combined with the patterns in Static Asset Fingerprinting Fundamentals, prevent stale content delivery, eliminate cache fragmentation, and improve edge hit ratios.

Cache keys dictate CDN storage partitioning, TTL boundaries, and invalidation scope. Deterministic key generation prevents origin overload and unpredictable cache behavior. Header normalization and query string handling directly impact cache efficiency. Fingerprinting strategies must align with Content Hashing vs Semantic Versioning for predictable deployments.

Core Components of a Cache Key

A cache key is a composite string derived from specific HTTP request attributes. The CDN hashes this string to locate the cached object in edge storage. Misconfigured components immediately fragment cache storage and degrade performance.

| Component | Purpose | Production Rule |
| --- | --- | --- |
| Host Header | Multi-tenant routing | Normalize to lowercase. Strip www prefixes if canonicalized. |
| URI Path | Resource identifier | Include verbatim. Apply one trailing-slash policy consistently. |
| Query String | Dynamic parameters | Exclude entirely for static assets. Retain only v= or hash= if required. |
| Accept-Encoding | Compression negotiation | Include, normalized to the encodings actually served, so that each client receives a variant it can decode and compression-capable clients are not served uncompressed payloads. |
| Cookie / Authorization | Session state | Strictly exclude. Cache keys must remain stateless for public assets. |
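
Raw Accept-Encoding values vary widely between clients ("gzip, deflate, br", "br;q=1.0, gzip;q=0.8", and so on), so placing the header in the key unmodified can itself fragment the cache. The following is a minimal sketch of one way to collapse the header into a small set of tokens, assuming the edge only serves brotli, gzip, and uncompressed variants. The function name normalizeAcceptEncoding is hypothetical, and the sketch prefers brotli whenever the client accepts it rather than honoring relative q-value ordering.

```javascript
// Hypothetical helper: collapse a raw Accept-Encoding header into one of three
// cache-key tokens so that equivalent clients share a cache entry.
function normalizeAcceptEncoding(headerValue) {
  if (!headerValue) return 'identity';

  // Parse "gzip;q=0.8, br" into a map of coding -> q-value (default q=1).
  const accepted = new Map();
  for (const part of headerValue.toLowerCase().split(',')) {
    const [coding, ...params] = part.trim().split(';').map((s) => s.trim());
    if (!coding) continue;
    let q = 1;
    for (const p of params) {
      if (p.startsWith('q=')) {
        const parsed = Number.parseFloat(p.slice(2));
        if (!Number.isNaN(parsed)) q = parsed;
      }
    }
    accepted.set(coding, q);
  }

  // Prefer brotli, then gzip; a q-value of 0 means "not acceptable".
  if ((accepted.get('br') ?? 0) > 0) return 'br';
  if ((accepted.get('gzip') ?? 0) > 0) return 'gzip';
  return 'identity';
}
```

With this approach the key contains at most three encoding variants per object, regardless of how many distinct header strings clients send.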

Normalization Workflow:

  1. Intercept incoming request at the edge.
  2. Lowercase header names and the host. Leave the URI path case untouched, since paths are case-sensitive on most origins.
  3. Strip tracking parameters (utm_*, fbclid, gclid).
  4. Reconstruct the key string using only whitelisted attributes.
  5. Generate SHA-256 hash of the normalized string for internal storage lookup.
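
The workflow above can be sketched in Node.js as follows. This is an illustration under stated assumptions, not how any particular CDN builds its keys: buildCacheKey, the "|" separator, and the ALLOWED_PARAMS allowlist are all hypothetical, and the sketch keeps the path case-sensitive on the assumption that the origin treats /App.js and /app.js as different resources.

```javascript
const crypto = require('node:crypto');

// Tracking parameters to drop; the list mirrors the examples in the workflow.
const TRACKING_PARAMS = [/^utm_/i, /^fbclid$/i, /^gclid$/i];
// Hypothetical allowlist of query parameters that may stay in the key.
const ALLOWED_PARAMS = new Set(['v', 'hash']);

function buildCacheKey({ host, url, acceptEncoding = 'identity' }) {
  // URL parsing needs an absolute base; the scheme here is only a placeholder.
  const parsed = new URL(url, `https://${host}`);

  const kept = [];
  for (const [name, value] of parsed.searchParams) {
    if (TRACKING_PARAMS.some((re) => re.test(name))) continue;
    if (ALLOWED_PARAMS.has(name)) kept.push([name, value]);
  }
  // Sort so that ?hash=x&v=2 and ?v=2&hash=x map to the same key.
  kept.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const query = kept.map(([k, v]) => `${k}=${v}`).join('&');

  // Host is case-insensitive; the path keeps its original case.
  const normalized = [host.toLowerCase(), parsed.pathname, query, acceptEncoding].join('|');
  const digest = crypto.createHash('sha256').update(normalized).digest('hex');
  return { normalized, digest };
}
```

Two requests that differ only in host casing, parameter order, or tracking parameters produce the same digest, which is the property the workflow is meant to guarantee.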

Fingerprint Integration & Hash Selection

Embedding content hashes directly into URI paths enables immutable caching. When the hash changes, the CDN treats the request as a new object, bypassing manual invalidation. This requires strict build pipeline determinism.

Evaluate collision resistance via MD5 vs SHA-256 for Assets before selecting your hashing algorithm. SHA-256 is the industry standard for production deployments due to its negligible collision probability and cryptographic security.

Avoid timestamp-based keys (app.js?t=1698772100). Timestamps change on every deployment even when file contents do not, so every asset is re-fetched from the origin and long-TTL edge storage is bypassed.
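
To make the contrast concrete, here is a minimal sketch of content-based naming. The helper fingerprintName is hypothetical (build tools such as Vite or Webpack do this internally with their own hash formats), and truncating the digest to eight hex characters is an assumption that trades collision resistance for shorter file names.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Hypothetical helper: derive "name-<hash>.ext" from the file's bytes, so the
// name changes only when the content changes.
function fingerprintName(fileName, contents, hashLength = 8) {
  const digest = crypto.createHash('sha256').update(contents).digest('hex');
  const ext = path.extname(fileName);
  const base = path.basename(fileName, ext);
  return `${base}-${digest.slice(0, hashLength)}${ext}`;
}

// Same bytes give the same name on every build; a timestamp would not.
const first = fingerprintName('app.js', 'console.log("v1");');
const second = fingerprintName('app.js', 'console.log("v1");');
const changed = fingerprintName('app.js', 'console.log("v2");');
console.log(first === second, first === changed);
```

Because the name is a pure function of the bytes, unchanged assets keep their existing cache entries across deployments.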

Framework Configuration (Vite/Webpack Deterministic Hashing):

// vite.config.js
export default {
  build: {
    rollupOptions: {
      output: {
        // Deterministic chunk naming prevents cache key drift across builds
        chunkFileNames: 'assets/[name]-[hash].js',
        assetFileNames: 'assets/[name]-[hash].[ext]',
      },
    },
  },
};

Verification Step: Build the same commit twice into separate directories (build-a and build-b are placeholder names) and diff the checksums to confirm that identical content produces identical hashes and file names:

diff <(cd build-a/assets && sha256sum *.js | sort) <(cd build-b/assets && sha256sum *.js | sort)
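
The same check can be scripted for CI. The sketch below assumes two build output directories produced from the same commit; hashDirectory and diffManifests are hypothetical helper names, and the directory scan is intentionally non-recursive to keep the example short.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Map each file in a directory (non-recursive) to the SHA-256 of its bytes.
function hashDirectory(dir) {
  const manifest = {};
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isFile()) continue;
    const bytes = fs.readFileSync(path.join(dir, entry.name));
    manifest[entry.name] = crypto.createHash('sha256').update(bytes).digest('hex');
  }
  return manifest;
}

// Return the file names whose presence or content differs between two builds.
function diffManifests(a, b) {
  const names = new Set([...Object.keys(a), ...Object.keys(b)]);
  return [...names].filter((name) => a[name] !== b[name]).sort();
}
```

An empty result from diffManifests(hashDirectory(dirA), hashDirectory(dirB)) indicates the two builds are byte-identical, including their fingerprinted file names; any entry in the result points at a non-deterministic build step.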

Query Parameters vs Filename Hashing

CDNs typically default to caching every unique query string as a separate object. This behavior multiplies edge storage requirements, increases origin fetches, and reduces cache hit ratios. The architectural choice between query-based versioning and filename hashing dictates edge behavior and operational overhead.

| Strategy | Cache Key Impact | Invalidation Complexity | Edge Performance |
| --- | --- | --- | --- |
| Query Parameters (/app.js?v=1.2) | High fragmentation. Each v= value creates a new cache entry. | Manual purge required for base path. | Slower. Requires regex matching and parameter stripping at the edge. |
| Filename Hashing (/app-a1b2c3.js) | Zero fragmentation. Each file maps to exactly one key. | Automatic. New hash = new key. Old keys expire naturally. | Fastest. Direct URI path lookup. No normalization overhead. |

Detailed implementation comparisons are covered in Implementing cache keys with query parameters vs filenames.

Edge Case Handling: Dynamic parameters appended to static paths (/logo.png?size=large) must be stripped before cache key generation. Configure the edge to route to a single canonical key while passing dynamic parameters to the origin only on cache misses.
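
As an illustration of that edge behavior, the sketch below uses an in-memory Map as a stand-in for edge storage. The serve function and the fetchOrigin callback are hypothetical. Note the assumption built into the pattern: it is only safe when the stripped parameters do not change the response body, because every variant shares one cache entry.

```javascript
// Hypothetical in-memory stand-in for an edge cache.
const cache = new Map();

// Look up by host and path only. On a miss, call the origin with the original
// path and query string, then store the response under the canonical key.
function serve(requestUrl, fetchOrigin) {
  const url = new URL(requestUrl);
  const key = `${url.host}${url.pathname}`;

  if (cache.has(key)) {
    return { body: cache.get(key), status: 'HIT' };
  }
  const body = fetchOrigin(url.pathname + url.search);
  cache.set(key, body);
  return { body, status: 'MISS' };
}
```

A first request for /logo.png?size=large reaches the origin with its query string intact, while later requests for /logo.png with any other parameters are served from the single canonical entry.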

CDN Configuration & Edge Rules

Enforce cache key architecture using provider-specific configuration syntax. Deploy these rules at the edge to guarantee deterministic key generation.

Cloudflare Custom Cache Key

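{
  "cache_key": {
    "ignore_query_strings_order": true,
    "custom_key": {
      "query_string": { "exclude": { "all": true } }
    }
  }
}

Context: Excludes the query string from the cache key so that non-deterministic parameters cannot fragment the cache. The host is part of the key by default, and Cloudflare manages compressed variants itself, so Accept-Encoding is not listed as a custom key header. The object is the cache_key portion of the action_parameters for a set_cache_settings rule; check field names against the current Rulesets API schema before deploying. Deploy via Cloudflare API or Terraform cloudflare_ruleset resource.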

Fastly VCL Cache Key Manipulation

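sub vcl_hash {
  # Key on path and host only; the query string is deliberately left out
  set req.hash += req.url.path;
  set req.hash += req.http.host;
  #FASTLY hash
  return(hash);
}

Context: Explicitly constructs the cache hash from the URI path and host, ignoring volatile query parameters. Compression variants are handled by having the origin send Vary: Accept-Encoding rather than by hashing the raw header, which would create one entry per distinct header string. Apply as custom VCL or a VCL snippet (for example with fastly vcl custom create); fastly compute publish applies to Compute services, not VCL.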

Nginx Reverse Proxy Deterministic Routing

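proxy_cache_key "$scheme$proxy_host$uri$is_args$args";

# The regex is quoted because it contains braces
location ~* "[.-][a-f0-9]{8,}\.(js|css|png|jpg)$" {
  set $args "";                      # drop the query string before the key is computed
  proxy_cache static_cache;          # hypothetical zone defined by proxy_cache_path
  proxy_pass http://origin_backend;  # hypothetical upstream
  proxy_cache_valid 200 365d;
  add_header Cache-Control "public, max-age=31536000, immutable";
}

Context: Strips query strings from fingerprinted static assets at the proxy layer to enforce long TTLs without cache duplication. The location regex assumes a hex hash of at least eight characters preceded by a dot or hyphen; adjust it to the build tool's hash format. Place inside server {} block.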

AWS CloudFront Cache Policy

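{
  "Name": "static-assets-immutable",
  "MinTTL": 1,
  "DefaultTTL": 86400,
  "MaxTTL": 31536000,
  "ParametersInCacheKeyAndForwardedToOrigin": {
    "EnableAcceptEncodingGzip": true,
    "EnableAcceptEncodingBrotli": true,
    "HeadersConfig": { "HeaderBehavior": "none" },
    "CookiesConfig": { "CookieBehavior": "none" },
    "QueryStringsConfig": { "QueryStringBehavior": "none" }
  }
}

Context: The file is the CachePolicyConfig object passed to aws cloudfront create-cache-policy --cache-policy-config file://policy.json; the Name and TTL values are placeholders. Compression variants are keyed through EnableAcceptEncodingGzip and EnableAcceptEncodingBrotli rather than through a header whitelist. Enforces strict header/query control for every cache behavior the policy is attached to.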

Akamai Property Manager

{
  "behaviors": [
    {
      "name": "caching",
      "options": {
        "behavior": "MAX_AGE",
        "mustRevalidate": false,
        "ttl": "365d"
      }
    },
    {
      "name": "cacheKeyQueryParameters",
      "options": {
        "behavior": "IGNORE",
        "parameters": ["utm_source", "utm_medium", "utm_campaign"],
        "exactMatch": true
      }
    }
  ]
}

Context: Caches matching objects for one year and drops the listed tracking parameters from the cache key. Apply via Akamai CLI (akamai property-manager) or the Property Manager API (PAPI).

Invalidation Workflows & Purge Strategies

Automated invalidation must respect fingerprinted keys. Manual purges targeting base paths break immutable caching assumptions and trigger origin stampedes.

Step-by-Step CI/CD Purge Workflow:

  1. Tag all deployed assets with a Surrogate-Key header during origin response generation.
  2. On successful deployment, trigger a batch purge via CDN API using the surrogate key.
  3. Verify that the cache miss ratio falls back below 5% within 60 seconds of the purge.
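
The first two steps can be sketched as plain functions that a deployment script would call. This is an illustration, assuming Fastly's purge-by-key endpoint shape (POST /service/{service_id}/purge/{surrogate_key}); surrogateKeyHeader and buildPurgeRequest are hypothetical helper names, and the sketch only builds the request description without sending it.

```javascript
// Step 1: value for the Surrogate-Key response header. Multiple tags are
// separated by single spaces.
function surrogateKeyHeader(tags) {
  return tags.join(' ');
}

// Step 2: describe the purge call for one surrogate key. Sending it is left
// to the caller (for example with fetch or an HTTP client of choice).
function buildPurgeRequest(serviceId, surrogateKey, apiToken) {
  const service = encodeURIComponent(serviceId);
  const key = encodeURIComponent(surrogateKey);
  return {
    method: 'POST',
    url: `https://api.fastly.com/service/${service}/purge/${key}`,
    headers: { 'Fastly-Key': apiToken, Accept: 'application/json' },
  };
}
```

Keeping request construction separate from the network call makes the purge step easy to unit test and to dry-run in a pipeline before credentials are wired in.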

Automated Purge Trigger (cURL Example):

# Purge every object tagged with the surrogate key static-assets-v2
curl -X POST "https://api.fastly.com/service/$SERVICE_ID/purge/static-assets-v2" \
 -H "Fastly-Key: $FASTLY_API_TOKEN"

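Origin Fallback During Key Rotation: Configure stale-while-revalidate and stale-if-error directives so the edge can keep serving the previous response while it revalidates in the background or when the origin returns errors. Keep the previous release's fingerprinted files available at the origin as well, so that pages still referencing old hashes do not trigger 404 spikes during rollout or build pipeline delays.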
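
One way an origin might choose these directives per path is sketched below. The regex, the function name cacheControlFor, and every TTL value are assumptions for illustration: fingerprinted files get a one-year immutable policy, and everything else gets a short TTL with stale fallbacks.

```javascript
// Assumed naming convention: a hex hash of 8+ characters preceded by "." or "-",
// such as app-a1b2c3d4.js or logo.9f8e7d6c.png.
const FINGERPRINTED = /[.-][a-f0-9]{8,}\.(?:js|css|png|jpg)$/i;

function cacheControlFor(pathname) {
  if (FINGERPRINTED.test(pathname)) {
    // Content never changes under this name, so it can be cached for a year.
    return 'public, max-age=31536000, immutable';
  }
  // Entry points such as HTML must pick up new hashes quickly, but may be
  // served stale briefly while revalidating or when the origin is failing.
  return 'public, max-age=60, stale-while-revalidate=300, stale-if-error=86400';
}
```

Under this split, the stale directives protect the documents that reference assets, while the assets themselves rely on immutability rather than revalidation.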

Monitoring Commands: Track fragmentation and origin shield efficiency using provider metrics:

# Cloudflare GraphQL API query for cache hit ratio
# SINCE is a placeholder for a start date such as 2024-01-01
curl -X POST https://api.cloudflare.com/client/v4/graphql \
 -H "Authorization: Bearer $CF_TOKEN" \
 -H "Content-Type: application/json" \
 -d '{"query":"{viewer{zones(filter:{zoneTag:\"'"$ZONE_ID"'\"}){httpRequests1dGroups(limit:1,filter:{date_geq:\"'"$SINCE"'\"}){dimensions{date}sum{requests,cachedRequests,bytes,cachedBytes}}}}}"}'
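
Once the counters are retrieved, the ratio itself is simple arithmetic. The sketch below assumes each metrics group has been mapped to a cachedBytes and a totalBytes field; these names and the hitRatio helper are hypothetical and should be adapted to whatever the provider's response actually contains.

```javascript
// Hypothetical helper: overall cache hit ratio across several metric groups.
// Each group carries bytes served from cache and total bytes served.
function hitRatio(groups) {
  let cached = 0;
  let total = 0;
  for (const { cachedBytes, totalBytes } of groups) {
    cached += cachedBytes;
    total += totalBytes;
  }
  // No traffic means there is no meaningful ratio; report 0 instead of NaN.
  return total === 0 ? 0 : cached / total;
}
```

Summing before dividing weights each period by its traffic volume, which avoids a low-traffic day with a poor ratio distorting the overall figure.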

Common Pitfalls & Resolutions

| Issue | Root Cause | Resolution |
| --- | --- | --- |
| Cache fragmentation from non-deterministic query strings | CDN defaults to caching every unique query parameter combination, multiplying edge storage and reducing hit ratios. | Configure cache key rules to ignore or normalize query strings, retaining only fingerprint hashes or version parameters. |
| Mismatched compression or MIME type served due to missing Vary headers | Cache key does not account for Accept-Encoding (or Accept, where the origin negotiates formats), so one client's variant is served to another. | Explicitly include a normalized Accept-Encoding in cache key construction and configure origin to send correct Vary headers. |
| Premature invalidation of fingerprinted assets | Manual purge operations target base paths instead of exact fingerprinted URIs, breaking immutable caching assumptions. | Automate purges using surrogate keys or directory-level tags; never purge exact fingerprinted paths unless correcting build errors. |

Frequently Asked Questions

Should cache keys include the full query string for fingerprinted assets? No. Fingerprinted assets should use path-based keys with query strings stripped or normalized to prevent fragmentation and maintain immutable caching guarantees.

How do I handle cache invalidation when using content hash filenames? Invalidate via surrogate keys or directory tags rather than exact URLs. New hashes generate new keys automatically, rendering old keys obsolete without manual purging.

Does Accept-Encoding need to be part of the cache key? Yes, unless the CDN performs automatic content negotiation and serves a single compressed variant. Including it prevents serving uncompressed assets to gzip/brotli clients.

What is the operational impact of non-deterministic cache keys? Non-deterministic keys cause cache fragmentation, increased origin load, higher CDN egress costs, and unpredictable invalidation behavior during deployments.