MD5 vs SHA-256 for Static Asset Fingerprinting
Bundlers, CDNs, and browsers each have a stake in which hash function produces your asset fingerprints. The algorithm determines collision probability, SRI compatibility, and — to a lesser extent — build speed. This reference compares MD5, SHA-1, SHA-256, and BLAKE3 as used in modern build pipelines, explains why fingerprinting does not strictly require cryptographic strength, and documents the default choices made by Webpack, Vite, Rollup, and esbuild.
Why the Algorithm Choice Matters
For static asset fingerprinting, a hash performs two distinct jobs that have conflicting requirements:
- Cache-busting identifier: the hash becomes part of the filename or query string so that CDN edge nodes treat each content revision as a new object. The security bar here is low: you only need uniqueness across your own asset set. A hash collision in this context means two distinct files share the same fingerprinted URL, causing one to silently overwrite the other in the cache.
- Cryptographic integrity proof: when you ship a <script integrity="sha256-…"> attribute, the browser re-hashes the downloaded bytes and rejects any mismatch. This is Subresource Integrity (SRI). Here, the bar is high: a collision means an attacker can substitute a malicious file that produces the same hash value.
Because the jobs are different, the right algorithm depends on which job you are doing. For pure cache-busting, xxhash or truncated MD5 is acceptable; for subresource integrity validation, only SHA-256, SHA-384, or SHA-512 are accepted by browsers.
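As a minimal sketch of how the two jobs diverge, the following Node.js snippet derives both values from the same bytes with the built-in crypto module. The function names cacheBustFingerprint and sriAttribute and the sample bytes are illustrative assumptions, not part of any bundler's API.

```javascript
const crypto = require('node:crypto');

// Job 1: a short, filename-safe identifier. Truncation is acceptable here
// because the only requirement is uniqueness within your own asset set.
function cacheBustFingerprint(bytes, hexChars = 8) {
  return crypto.createHash('sha256').update(bytes).digest('hex').slice(0, hexChars);
}

// Job 2: the value for an integrity attribute. SRI needs the full digest,
// base64-encoded and prefixed with the algorithm name.
function sriAttribute(bytes, algorithm = 'sha256') {
  const digest = crypto.createHash(algorithm).update(bytes).digest('base64');
  return `${algorithm}-${digest}`;
}

// Hypothetical asset content, used only to show the two output shapes.
const sample = Buffer.from('console.log("hello");\n');
console.log(`app.${cacheBustFingerprint(sample)}.js`);
console.log(sriAttribute(sample));
```

The first value is safe to shorten; the second is not, which is why the same file usually ends up with two different hash strings attached to it.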
Algorithm Reference
MD5
MD5 produces a 128-bit (32 hex character) digest. It was designed in 1991 and is now cryptographically broken: researchers can produce deliberate collisions in seconds on commodity hardware. It remains prevalent in legacy Rails, Sprockets, and older Webpack configurations purely because it was the default before anyone cared about SRI.
Throughput on a modern x86_64 CPU is roughly 1.0–1.5 GB/s. For a typical frontend build producing 10–50 MB of output, the hash step takes under 50 ms regardless of algorithm — speed is not a meaningful differentiator.
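If you would rather measure than trust published figures, a rough timing sketch such as the one below can be run on your own build machine. The 32 MB zero-filled buffer is an assumed stand-in for a build's total output, and timeHash is a hypothetical helper name; results vary with CPU, Node.js version, and OpenSSL build.

```javascript
const crypto = require('node:crypto');

// Time a single hash pass over an in-memory buffer, in milliseconds.
function timeHash(algorithm, bytes) {
  const start = process.hrtime.bigint();
  crypto.createHash(algorithm).update(bytes).digest();
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6;
}

// Assumed payload size: roughly the output of a mid-sized frontend build.
const payload = Buffer.alloc(32 * 1024 * 1024, 0xab);

for (const algorithm of ['md5', 'sha1', 'sha256']) {
  try {
    console.log(`${algorithm}: ${timeHash(algorithm, payload).toFixed(1)} ms`);
  } catch {
    // Some runtimes (for example FIPS-enabled builds) disable older algorithms.
    console.log(`${algorithm}: not available in this runtime`);
  }
}
```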
Use MD5 for: nothing new. Remove it from any pipeline you touch.
SHA-1
SHA-1 produces a 160-bit (40 hex character) digest. It is also cryptographically broken for collision resistance (the SHAttered attack, 2017), but it appeared in some older Webpack 3 and Grunt pipelines. It is not accepted for SRI.
Use SHA-1 for: nothing new.
SHA-256
SHA-256 produces a 256-bit (64 hex character) digest. It is part of the SHA-2 family and is the current standard for both cache-busting fingerprints and SRI. Collision resistance is 2¹²⁸ operations — practically infinite for any asset pipeline. Throughput on x86_64 with hardware acceleration (Intel SHA Extensions) is 400–800 MB/s. On ARM64 (AWS Graviton, Apple M-series), hardware acceleration brings this to 500–900 MB/s.
Webpack 5 computes [contenthash] with the algorithm named by output.hashFunction. The default is md4, not SHA-256; set output.hashFunction: 'sha256' to get SHA-256 fingerprints. The :N suffix on the token controls the truncation length.
Use SHA-256 for: all new pipelines, SRI generation, anything that might grow beyond a few hundred assets.
BLAKE3
BLAKE3 produces a 256-bit digest (or any length via its extendable-output mode). It is not cryptographically broken and outperforms SHA-256 significantly: 1–6 GB/s on modern hardware without hardware acceleration. It is not yet a W3C-approved algorithm for SRI, so it cannot replace SHA-256 there.
esbuild uses its own fast non-cryptographic hash (derived from xxhash internally, sometimes described as a “content hash” without specifying the algorithm) because it optimises for build speed above all else. For SRI you would still compute SHA-256 separately.
Use BLAKE3 for: custom pipeline scripts where you need fast file fingerprinting and SRI is handled separately.
Comparison Table
| Property | MD5 | SHA-1 | SHA-256 | BLAKE3 |
|---|---|---|---|---|
| Output bits | 128 | 160 | 256 | 256 (variable) |
| Hex length (full) | 32 | 40 | 64 | 64 |
| Collision resistance | Broken | Broken | 2¹²⁸ ops | 2¹²⁸ ops |
| SRI (integrity=) | Rejected | Rejected | Accepted | Not standardised |
| HW acceleration | No | No | Yes (SHA-NI) | No (SIMD) |
| Speed (x86_64) | ~1.2 GB/s | ~900 MB/s | ~700 MB/s | ~4 GB/s |
| Webpack 5 default | No | No | No (default is md4; opt in via output.hashFunction) | No |
| Vite / Rollup default | No | No | Rollup 3 and earlier; Rollup 4 uses xxhash | No |
| esbuild default | No | No | No | No (xxhash-based) |
Fingerprinting Does Not Require Cryptographic Strength
A common misconception is that you need a cryptographically strong hash for cache-busting. You do not. The threat model for cache-busting is accidental collision inside your own build output — two of your own files happening to produce the same fingerprint. There is no adversary crafting files to produce deliberate collisions.
The mathematics of the birthday problem determine accidental collision probability. For an 8-hex-character (32-bit) truncation of any decent hash function, the collision probability across N assets is approximately 1 - e^(-N²/2³³), because there are about N²/2 pairs of files and 2³² possible fingerprints. At N = 10,000 assets, that is about 1.2%. At N = 1,000 assets, it is about 0.012%. The underlying hash algorithm (MD5 vs SHA-256) barely matters for this calculation; what matters is the truncation length.
However, SHA-256 is still the right choice because:
- It is the only option for SRI, and using one algorithm everywhere is simpler.
- You may later add SRI to assets you currently fingerprint without it.
- SHA-256 ships with every Node.js runtime, and bundlers that expose the algorithm (for example Webpack’s output.hashFunction) can select it with one line of configuration.
- MD5 is banned in many regulated environments (FIPS 140-2, PCI DSS) regardless of the use case.
For an analysis of the birthday-bound collision probabilities at specific truncation lengths, see the guide to safely truncating content hash length.
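The birthday estimate used in this section can be reproduced with a few lines of JavaScript. This is a sketch under the usual assumption that truncated fingerprints behave like uniformly random values; collisionProbability is a hypothetical helper name, and it uses the exact pair count N(N-1)/2, which is close to N²/2 for large N.

```javascript
// Probability that at least two of `assetCount` files share a fingerprint
// when each fingerprint is `hexChars` hex characters of a well-mixed hash.
function collisionProbability(assetCount, hexChars) {
  const space = Math.pow(16, hexChars);               // number of possible fingerprints
  const pairs = (assetCount * (assetCount - 1)) / 2;  // number of file pairs
  return -Math.expm1(-pairs / space);                 // 1 - e^(-pairs/space), numerically stable
}

for (const hexChars of [8, 12, 16]) {
  for (const assetCount of [1000, 10000]) {
    const percent = collisionProbability(assetCount, hexChars) * 100;
    console.log(`${hexChars} hex chars, ${assetCount} assets: ${percent.toExponential(2)}%`);
  }
}
```

Math.expm1 is used instead of 1 - Math.exp(...) so that the very small probabilities at 12 and 16 characters do not round to zero.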
Bundler Defaults
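Webpack 5: [contenthash] with output.hashFunction
Webpack 5 computes [contenthash] using the algorithm named by output.hashFunction. The default is md4, so set hashFunction: 'sha256' explicitly if you want SHA-256 fingerprints. The hash covers the module’s final byte content after all transformations, and the :N suffix controls the output length.
// webpack.config.js
module.exports = {
  mode: 'production',
  output: {
    // Webpack's default is 'md4'; opt in to SHA-256 explicitly
    hashFunction: 'sha256',
    filename: '[name].[contenthash:8].js',
    chunkFilename: '[name].[contenthash:8].chunk.js',
    assetModuleFilename: 'assets/[name].[contenthash:8][ext]'
  },
  optimization: {
    // Stable IDs keep unrelated chunk hashes from changing between builds
    moduleIds: 'deterministic',
    chunkIds: 'deterministic',
    runtimeChunk: 'single'
  }
};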
The :8 suffix truncates to 8 hex characters. Increase to :12 or :16 for monorepos or projects with thousands of chunks. The moduleIds: 'deterministic' setting is critical: without it, module IDs are assigned by order of processing, causing hash changes in unrelated chunks when any single file is added or removed.
Older Webpack 4 configurations used [hash] (build-wide hash — the same value for every output file) and [chunkhash] (chunk-level hash that does not account for extracted CSS). Both are superseded by [contenthash].
Webpack 4 and Webpack 5 both default to md4 (a fast legacy hash, not MD5) for output.hashFunction; Webpack 5 did not change the default to SHA-256. If you see md4 in old configs or error messages, it refers to this default, which you can override by setting output.hashFunction.
Vite: hashing delegated to Rollup
Vite delegates asset hashing to Rollup during production builds, so the algorithm behind the [hash] token depends on the Rollup version your Vite release bundles: Rollup 3 and earlier hash with SHA-256, while Rollup 4 uses xxhash. Length control uses the same colon syntax.
// vite.config.js
import { defineConfig } from 'vite';
export default defineConfig({
build: {
rollupOptions: {
output: {
entryFileNames: 'entry/[name]-[hash:8].js',
chunkFileNames: 'chunks/[name]-[hash:8].js',
assetFileNames: 'assets/[name]-[hash:8][extname]'
}
}
}
});
For details on the full Vite hashing configuration, see the Vite asset pipeline configuration guide.
Rollup 4: xxhash
Rollup 4 hashes with xxhash rather than SHA-256 (Rollup 3 and earlier used SHA-256). The [hash] token in output file names is 8 characters long by default. output.hashCharacters selects the character set (accepts 'hex', 'base64', 'base36'), and the pattern sets the length, for example [hash:12]:
// rollup.config.js
export default {
input: 'src/index.js',
output: {
dir: 'dist',
format: 'es',
entryFileNames: '[name]-[hash].js',
chunkFileNames: '[name]-[hash].js',
assetFileNames: 'assets/[name]-[hash][extname]'
}
};
For CDN deployment patterns with Rollup, see Rollup asset optimisation.
esbuild: xxhash-derived non-cryptographic hash
esbuild uses a fast non-cryptographic hash internally (based on xxhash). The [hash] token in esbuild entry/chunk naming is always 8 characters, drawn from an uppercase base32 alphabet (A–Z and 2–7) rather than hex, and cannot be changed via configuration: esbuild does not expose the algorithm or the length as tunable options.
// esbuild build script
const esbuild = require('esbuild');
esbuild.build({
entryPoints: ['src/app.js'],
bundle: true,
outdir: 'dist',
entryNames: '[name]-[hash]',
assetNames: 'assets/[name]-[hash]',
chunkNames: 'chunks/[name]-[hash]',
splitting: true,
format: 'esm'
});
Because esbuild’s hash is not SHA-256, you cannot reuse the fingerprint value in SRI attributes. Generate SRI hashes with a separate step after the esbuild output is produced. See integrating esbuild with CDN fingerprinting workflows for a complete example.
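One way to structure that separate step is a small post-build script that hashes every emitted file. The sketch below is an assumption about how such a step might look, not an esbuild feature: buildSriMap is a hypothetical helper, and the dist directory name is taken from the example configuration above.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Walk `dir` recursively and return { 'relative/path.js': 'sha256-…' }
// for every .js and .css file found.
function buildSriMap(dir, root = dir, out = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      buildSriMap(full, root, out);
    } else if (/\.(js|css)$/.test(entry.name)) {
      const digest = crypto.createHash('sha256').update(fs.readFileSync(full)).digest('base64');
      // Forward slashes so keys can be matched against URLs in HTML.
      const key = path.relative(root, full).split(path.sep).join('/');
      out[key] = `sha256-${digest}`;
    }
  }
  return out;
}

// Usage after the bundler has finished writing its output:
//   console.log(JSON.stringify(buildSriMap('dist'), null, 2));
```

The resulting map can then be consumed by whatever templating step writes the script and link tags.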
SRI Requires Full-Length SHA-256 (or Stronger)
When you add a <script integrity="sha256-…"> attribute, the browser verifies the file by:
- Downloading the resource.
- Computing SHA-256 over the received bytes.
- Base64-encoding the result.
- Comparing with the value in the integrity attribute.
The integrity value must be a full, untruncated SHA-256 (or SHA-384, SHA-512) hash, base64-encoded. The 8-character hex truncation you use for cache-busting filenames is not the same thing and cannot be used here.
# Generate a full SHA-256 for SRI — run after the file is in its final form
openssl dgst -sha256 -binary dist/assets/app.a1b2c3d4.js | openssl base64 -A
# Insert the output (without trailing newline) into your HTML template:
# <script src="/assets/app.a1b2c3d4.js"
# integrity="sha256-<output here>"
# crossorigin="anonymous"></script>
Build plugins handle this automatically. For Webpack use webpack-subresource-integrity; for Vite use vite-plugin-sri3. Both plugins run after the output is hashed, read the final file bytes, compute full SHA-256, and inject the integrity attribute into the HTML.
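To check a generated value before shipping it, the browser's comparison can be mirrored in Node.js. This is a simplified sketch: matchesIntegrity is a hypothetical helper that handles a single algorithm-digest token, whereas a real integrity attribute may list several tokens, and it treats unsupported algorithms as a failed check.

```javascript
const crypto = require('node:crypto');

// Algorithms that browsers accept in an integrity attribute.
const SRI_ALGORITHMS = new Set(['sha256', 'sha384', 'sha512']);

// Return true when `bytes` hash to the digest named in `integrity`,
// a single "<algorithm>-<base64 digest>" token.
function matchesIntegrity(bytes, integrity) {
  const dash = integrity.indexOf('-');
  if (dash === -1) return false;
  const algorithm = integrity.slice(0, dash);
  const expected = integrity.slice(dash + 1);
  if (!SRI_ALGORITHMS.has(algorithm)) return false; // e.g. md5 or sha1
  const actual = crypto.createHash(algorithm).update(bytes).digest('base64');
  return actual === expected;
}
```

Passing the 8-character filename fingerprint as the digest fails this check, which is the same mismatch a browser would report.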
Verifying the Algorithm in Use
You can confirm which algorithm produced a given hash by checking its length and character set in the output filenames, then verifying against a known file:
# Pick any output file with a fingerprint, e.g. dist/assets/app.a1b2c3d4.js
FILE=dist/assets/app.a1b2c3d4.js
FINGERPRINT=a1b2c3d4
# Compute full SHA-256 of the file
FULL=$(openssl dgst -sha256 "$FILE" | awk '{print $2}')
echo "Full SHA-256 : $FULL"
echo "First 8 chars: ${FULL:0:8}"
# Does the fingerprint match the first N chars of SHA-256?
if [[ "${FULL:0:${#FINGERPRINT}}" == "$FINGERPRINT" ]]; then
echo "MATCH — bundler is using SHA-256"
else
echo "NO MATCH — algorithm may differ (check bundle output plugin)"
fi
If the fingerprint does not match the SHA-256 prefix, the bundler is using a different algorithm, digest encoding, or salt (common with esbuild’s xxhash-based hash, Rollup 4’s xxhash, or Webpack’s default md4).
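The same prefix comparison can be extended to try several algorithms at once. The sketch below assumes the bundler hashed the final file bytes with a hex digest and no salt; if that assumption does not hold, no algorithm will match even when the right one is in the list. matchingAlgorithms and the candidate list are illustrative choices.

```javascript
const crypto = require('node:crypto');

// Algorithms to try, in order. Ones the runtime cannot provide are skipped.
const CANDIDATES = ['md4', 'md5', 'sha1', 'sha256', 'sha384', 'sha512'];

// Return the candidate algorithms whose hex digest of `bytes`
// starts with the given filename fingerprint.
function matchingAlgorithms(bytes, fingerprint) {
  const wanted = fingerprint.toLowerCase();
  return CANDIDATES.filter((algorithm) => {
    try {
      return crypto.createHash(algorithm).update(bytes).digest('hex').startsWith(wanted);
    } catch {
      return false; // algorithm unavailable in this Node.js/OpenSSL build
    }
  });
}

// Usage with a built file (the path is the example used above):
//   const fs = require('node:fs');
//   console.log(matchingAlgorithms(fs.readFileSync('dist/assets/app.a1b2c3d4.js'), 'a1b2c3d4'));
```

An empty result is a hint to look at the bundler's hashing options rather than proof of a specific algorithm.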
Migrating an Existing Pipeline from MD5 to SHA-256
If you are maintaining a pipeline that still uses MD5 (common in Rails/Sprockets apps ported to Node.js, or projects that started with Webpack 3), the migration to SHA-256 involves three phases: changing the bundler output configuration, updating any tooling that reads asset manifests, and invalidating the CDN entries that exist under the old MD5-based filenames.
Step 1: Update the bundler output configuration
For Webpack 4 → 5 migrations, the primary change is replacing the old [hash] build-hash token with [contenthash]. Webpack 5 still defaults to md4, so also set output.hashFunction to 'sha256'.
// Before (Webpack 4: default md4 hash via [hash] or [chunkhash])
module.exports = {
  output: {
    filename: '[name].[hash:8].js',
    chunkFilename: '[name].[chunkhash:8].chunk.js'
  }
};
// After (Webpack 5: SHA-256 via [contenthash] and an explicit hashFunction)
module.exports = {
  mode: 'production',
  output: {
    hashFunction: 'sha256',
    filename: '[name].[contenthash:8].js',
    chunkFilename: '[name].[contenthash:8].chunk.js',
    assetModuleFilename: 'assets/[name].[contenthash:8][ext]'
  },
  optimization: {
    moduleIds: 'deterministic',
    chunkIds: 'deterministic',
    runtimeChunk: 'single'
  }
};
For Rails/Sprockets pipelines migrating to a Node.js build step, the old .md5.json manifest file must be replaced by the new manifest format. The fingerprinted filename structure changes (MD5 produces 32-char hex; your new config produces 8-char SHA-256 truncations), so any server-side code that reads asset paths from the manifest must handle both formats during the cutover period.
Step 2: Generate and compare manifests before cutover
Run the old and new builds side by side and diff the manifests:
# Build with old config, capture manifest
npm run build:legacy
cp dist/asset-manifest.json /tmp/manifest-old.json
# Build with new config, capture manifest
npm run build
cp dist/asset-manifest.json /tmp/manifest-new.json
# Show which logical asset names changed (they all will, because the hash format changed)
diff <(jq -r 'keys[]' /tmp/manifest-old.json | sort) \
<(jq -r 'keys[]' /tmp/manifest-new.json | sort)
# Show the new filenames
jq . /tmp/manifest-new.json
All fingerprinted filenames will differ between the old and new builds because the hash algorithm changed. This is expected and deliberate — the goal is to ensure the new SHA-256-based fingerprints are stable across repeated builds before cutting over traffic.
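The stability check mentioned here can be automated by building twice with the new configuration and comparing the two manifests. The sketch below assumes a flat manifest that maps logical names to fingerprinted filenames; diffManifests and isStable are hypothetical helper names, and your manifest plugin's format may differ.

```javascript
// Compare two manifests of the assumed form { "app.js": "app.a1b2c3d4.js", ... }.
function diffManifests(before, after) {
  const added = Object.keys(after).filter((key) => !Object.hasOwn(before, key)).sort();
  const removed = Object.keys(before).filter((key) => !Object.hasOwn(after, key)).sort();
  const changed = Object.keys(after)
    .filter((key) => Object.hasOwn(before, key) && before[key] !== after[key])
    .sort();
  return { added, removed, changed };
}

// Two builds of the same commit should produce identical manifests.
function isStable(firstBuild, secondBuild) {
  const { added, removed, changed } = diffManifests(firstBuild, secondBuild);
  return added.length + removed.length + changed.length === 0;
}
```

Between the legacy and new builds every entry is expected to land in changed; between two runs of the new build, all three lists should be empty.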
Step 3: Deploy new files before updating the HTML
Upload the newly fingerprinted files to the CDN origin without deploying the updated HTML. Wait for the origin sync to complete, then deploy the HTML that references the new SHA-256 fingerprinted URLs. This ensures there is never a window where the browser loads an HTML page referencing assets that do not yet exist at the CDN.
# Sync new dist files to S3 / CDN origin — all new SHA-256 filenames
aws s3 sync dist/ s3://your-bucket/assets/ \
--cache-control "public, max-age=31536000, immutable" \
--exclude "*.html"
# After confirming sync, deploy the HTML
aws s3 cp dist/index.html s3://your-bucket/ \
--cache-control "no-cache, must-revalidate"
# Optionally purge the HTML cache at the CDN edge
# (Cloudflare example)
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/purge_cache" \
-H "Authorization: Bearer $CF_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"files": ["https://yourdomain.com/", "https://yourdomain.com/index.html"]}'
The old MD5-fingerprinted files can remain at the CDN temporarily — they will not be served once the HTML no longer references them, and they will naturally evict from edge caches after the TTL expires.
Step 4: Verify the new fingerprints are SHA-256
After deploying, verify that the fingerprinted filenames in the live HTML match the SHA-256 truncation:
# Extract fingerprinted script URLs from live HTML
curl -s https://yourdomain.com/ | grep -oE '/assets/[^"]+\.[a-f0-9]{8,16}\.js'
# Pick one and verify its hash against the file you built locally
FILE=dist/assets/app.a1b2c3d4.js
FINGERPRINT=$(echo "$FILE" | grep -oE '[a-f0-9]{8,16}')
FULL_SHA256=$(openssl dgst -sha256 "$FILE" | awk '{print $2}')
echo "Fingerprint in filename: $FINGERPRINT"
echo "First ${#FINGERPRINT} chars of SHA-256: ${FULL_SHA256:0:${#FINGERPRINT}}"
If these match, the migration is complete. If they do not, the bundler may still be using another algorithm: check that output.hashFunction is set to 'sha256' in webpack.config.js and that no plugin overrides it.
CDN Behaviour with Different Hash Lengths
Hash length has no direct effect on how CDNs route requests — CDNs treat the URL path as an opaque string for cache key purposes. However, the hash length indirectly affects three CDN-related concerns:
Cache key collision risk at the edge
Some CDN configurations normalise URLs before generating cache keys: stripping trailing slashes, lowercasing paths, or removing duplicate query parameters. These transformations do not affect hex fingerprints in path segments, but you should verify that your CDN does not truncate long URLs at a limit shorter than your longest asset path. In practice, all major CDNs (Cloudflare, CloudFront, Fastly, Akamai) support URL paths of at least 8,192 characters — far beyond any realistic asset path.
Manifest size and build artefact footprint
Each additional hex character in a hash adds one byte to every occurrence in the asset manifest. For a manifest with 5,000 entries, moving from 8 to 16 hex characters adds 40 KB to the manifest file — negligible. The filename length increase in your dist/ directory is similarly trivial.
Log file readability
Longer fingerprints are harder to read in access logs and error reports. This is a tooling consideration, not a correctness concern. If your monitoring or on-call tooling displays asset paths, consider whether 8 or 12 characters produces more actionable log lines. The difference between main.a1b2c3d4.js and main.a1b2c3d4e5f6.js in a log entry is marginal, but it matters when you are debugging a production incident at 3 AM.
Build Pipeline and CI Integration
A complete CI asset fingerprinting pipeline has four phases: build, verify, upload, and deploy. The algorithm check belongs in the verify phase.
# .github/workflows/deploy.yml
name: Build and deploy
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: '.nvmrc'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Build assets
        run: npm run build
        env:
          NODE_ENV: production
      - name: Verify hash uniqueness
        run: |
          # Extract all fingerprint segments from dist filenames
          # and fail the build if any two files share the same fingerprint
          node -e "
          const fs = require('fs');
          const files = fs.readdirSync('dist/assets');
          const hashes = files.map(f => f.match(/[a-f0-9]{8,16}/)?.[0]).filter(Boolean);
          const seen = new Set();
          const dupes = [];
          for (const h of hashes) {
            if (seen.has(h)) dupes.push(h);
            seen.add(h);
          }
          if (dupes.length) {
            console.error('Hash collision detected: ' + [...new Set(dupes)].join(', '));
            process.exit(1);
          }
          console.log('All ' + hashes.length + ' asset fingerprints are unique.');
          "
      - name: Upload assets to CDN origin (before HTML)
        run: |
          aws s3 sync dist/ s3://${{ secrets.S3_BUCKET }}/ \
            --exclude "*.html" \
            --cache-control "public, max-age=31536000, immutable"
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - name: Deploy HTML entry points
        run: |
          aws s3 cp dist/index.html s3://${{ secrets.S3_BUCKET }}/index.html \
            --cache-control "no-cache, must-revalidate"
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
The explicit ordering (assets first, HTML second) is what prevents users from loading a page that references not-yet-uploaded asset files during the deploy window. This is the same atomic deploy pattern described in cache-key architecture.
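A filename-only uniqueness check can raise false alarms when a bundle and its sidecar files (such as source maps) share one fingerprint. The sketch below is one possible refinement, written as a standalone function that could live in a hypothetical scripts/check-fingerprints.js; the fingerprint pattern and the list of sidecar suffixes are assumptions to adapt to your own naming scheme.

```javascript
// Matches the fingerprint segment in names such as app.a1b2c3d4.js
const FINGERPRINT = /[.-]([a-f0-9]{8,16})\./;

// Group file names by fingerprint and report fingerprints shared by
// files that are not just sidecars (.map, .LICENSE.txt) of one another.
function findCollisions(fileNames) {
  const byHash = new Map();
  for (const name of fileNames) {
    const match = name.match(FINGERPRINT);
    if (!match) continue;
    // Strip sidecar suffixes so app.a1b2c3d4.js and app.a1b2c3d4.js.map count once.
    const base = name.replace(/\.(map|LICENSE\.txt)$/, '');
    if (!byHash.has(match[1])) byHash.set(match[1], new Set());
    byHash.get(match[1]).add(base);
  }
  return [...byHash]
    .filter(([, names]) => names.size > 1)
    .map(([hash, names]) => ({ hash, files: [...names].sort() }));
}

// Usage in CI:
//   const collisions = findCollisions(require('node:fs').readdirSync('dist/assets'));
//   if (collisions.length) { console.error(collisions); process.exit(1); }
```

Two differently named files with byte-identical content will still be reported, so a flagged pair is worth comparing by content before treating it as a true collision.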
Ensuring Deterministic Hashes
Algorithm choice is meaningless if the hash changes on every build for the same source code. Non-deterministic inputs produce non-deterministic hashes. See deterministic build outputs for the full debugging workflow, but the short checklist is:
- Set moduleIds: 'deterministic' and chunkIds: 'deterministic' in Webpack.
- Pin the Node.js version with .nvmrc or volta.
- Remove Date.now() and BUILD_TIME injections from production builds.
- Commit package-lock.json or yarn.lock and do not allow floating versions in CI.
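A quick way to test the checklist is to build the same commit into two directories and compare them file by file. The following is a minimal sketch of that comparison; hashTree and nonDeterministicFiles are hypothetical helper names, and the two directory names in the usage comment are placeholders.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Map every file under `dir` to the SHA-256 of its contents.
function hashTree(dir, root = dir, out = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      hashTree(full, root, out);
    } else {
      const key = path.relative(root, full).split(path.sep).join('/');
      out[key] = crypto.createHash('sha256').update(fs.readFileSync(full)).digest('hex');
    }
  }
  return out;
}

// Return relative paths that exist on only one side or differ in content.
function nonDeterministicFiles(dirA, dirB) {
  const a = hashTree(dirA);
  const b = hashTree(dirB);
  const all = new Set([...Object.keys(a), ...Object.keys(b)]);
  return [...all].filter((file) => a[file] !== b[file]).sort();
}

// Usage: build twice into separate output directories, then
//   console.log(nonDeterministicFiles('dist-first', 'dist-second'));
```

An empty list means the two builds are byte-identical; any entry points at a file whose content, and therefore fingerprint, depends on something other than the source.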
Common Pitfalls
| Problem | Cause | Fix |
|---|---|---|
| SRI mismatch after deploy | Using the 8-char filename hash for integrity attribute | Generate full SHA-256 via openssl dgst -sha256 -binary separately |
| Hashes change on every CI run | Non-deterministic module IDs or timestamp injection | Set moduleIds: 'deterministic' and audit for Date.now() in build config |
| md4 or MD5 fingerprints in existing Webpack config | Default md4 hashFunction, or an explicit legacy setting, was never changed | Migrate to [contenthash] and set output.hashFunction: 'sha256' |
| esbuild fingerprint not matching SHA-256 | esbuild uses xxhash internally, not SHA-256 | Compute SRI separately after esbuild output; do not reuse the filename hash |
| CDN rejects long hash URLs | Overly strict URL-length or character-set rules on legacy CDN | Confirm alphanumeric hex is accepted; 8–16 chars is well within all CDN limits |
Frequently Asked Questions
Is MD5 acceptable for CDN cache-busting if I am not using SRI?
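For pure cache-busting it is technically adequate: MD5’s known weakness is deliberate collisions, which require an adversary, and the accidental collision rate depends on the truncation length rather than the algorithm. It is, however, rejected for SRI and banned in FIPS environments. SHA-256 is available in every Node.js runtime and selectable in bundlers that expose the algorithm, so there is no practical reason to choose MD5.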
Does switching from MD5 to SHA-256 slow down my build?
No. Both algorithms complete in under 50 ms for a typical frontend build. The hash step is not the bottleneck — transpilation, minification, and tree-shaking dominate build time.
Can I use BLAKE3 for asset fingerprinting?
BLAKE3 works for cache-busting fingerprints but is not accepted by browsers for SRI. If you do not use SRI, BLAKE3 is fine via a custom build plugin. If you use SRI, you must also generate SHA-256 separately, at which point you might as well use SHA-256 for the filename hash too.
What is md4 that appears in old Webpack error messages?
md4 is the default value of Webpack’s output.hashFunction, used for [hash], [chunkhash], and [contenthash] unless you override it. It is a fast legacy hash distinct from MD5, and it remains the default in Webpack 5. md4-related errors typically appear when Webpack 4 runs on Node.js 17 or later, where the bundled OpenSSL 3 no longer enables md4 by default.
Related
- Safely truncating content hash length — birthday-bound collision probabilities at 8, 12, and 16 hex characters, and bundler config to set hash length
- How to choose between content hash and version hash — decision matrix for per-file vs per-release fingerprinting
- Preventing hash collisions in large frontend projects — CI guardrails and manifest collision detection
- Subresource integrity validation — browser SRI mechanics and full-hash generation pipeline
- Static Asset Fingerprinting Fundamentals — parent overview covering cache keys, HTTP headers, and deployment patterns