feat(caching): new serialization format #20970
🦋 Changeset detected
Latest commit: 26efab0
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
This PR is packaged and the instant preview is available (26efab0). Install it locally:
npm i -D webpack@https://pkg.pr.new/webpack@26efab0
yarn add -D webpack@https://pkg.pr.new/webpack@26efab0
pnpm add -D webpack@https://pkg.pr.new/webpack@26efab0
Codecov Report
❌ Your patch check has failed because the patch coverage (78.71%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #20970 +/- ##
===========================================
- Coverage 91.12% 36.23% -54.90%
===========================================
Files 570 419 -151
Lines 57747 47616 -10131
Branches 15458 13049 -2409
===========================================
- Hits 52622 17252 -35370
- Misses 5125 30364 +25239
Merging this PR will degrade performance by 50.31%
Warning: Please fix the performance issues or acknowledge them on CodSpeed.
It's hard to get a good benchmark until #20972 lands.
Pull request overview
This PR replaces webpack’s persistent filesystem cache implementation and serialization stack with a new compact binary format (Encoder/Decoder + FileStore) and a segmented on-disk cache strategy (DiskCacheStrategy + cache index/segments), updating existing serializable classes and tests accordingly.
Changes:
- Introduce a new binary serialization system (lib/serialization/*) with type registration via TypeRegistry.
- Replace the pack-file filesystem cache with a new disk cache layout (DiskCacheStrategy, cache index + immutable segments).
- Update core webpack classes to use the new Encoder/Decoder serializer context API and add focused unit tests for the new serializer behavior.
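The idea of registering serializable types under a name can be sketched as follows. This is a minimal illustration under stated assumptions, not webpack's `TypeRegistry`: `SketchRegistry`, `Point`, and the `sketch/Point` name are all hypothetical, and the encoded record is a plain object rather than a binary payload.

```javascript
// Hypothetical registry for illustration only; this is NOT the API of
// webpack's TypeRegistry.
class SketchRegistry {
	constructor() {
		this.codecByName = new Map(); // type name -> { serialize, deserialize }
		this.nameByCtor = new Map(); // constructor -> type name
	}

	register(name, Ctor, codec) {
		if (this.codecByName.has(name)) {
			throw new Error(`duplicate type name: ${name}`);
		}
		this.codecByName.set(name, codec);
		this.nameByCtor.set(Ctor, name);
	}

	// Turn an instance into a record tagged with its registered name.
	encode(value) {
		const name = this.nameByCtor.get(value.constructor);
		if (name === undefined) throw new Error("unregistered type");
		return { type: name, fields: this.codecByName.get(name).serialize(value) };
	}

	// Look the codec up by name and rebuild the instance.
	decode(record) {
		const codec = this.codecByName.get(record.type);
		if (!codec) throw new Error(`unknown type name: ${record.type}`);
		return codec.deserialize(record.fields);
	}
}

// Hypothetical serializable class.
class Point {
	constructor(x, y) {
		this.x = x;
		this.y = y;
	}
}

const registry = new SketchRegistry();
registry.register("sketch/Point", Point, {
	serialize: (p) => [p.x, p.y],
	deserialize: ([x, y]) => new Point(x, y)
});

const restored = registry.decode(registry.encode(new Point(1, 2)));
console.log(restored instanceof Point, restored.x, restored.y);
```

Keying the decode side on a name instead of a constructor is what lets a reader reject or skip records whose type is not registered.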
Reviewed changes
Copilot reviewed 137 out of 138 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| .changeset/new-cache-serialization-format.md | Marks the cache format/serialization change as a major release note. |
| lib/AsyncDependenciesBlock.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/ContextModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/DependenciesBlock.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/Dependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/ExportsInfo.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/ExternalModule.js | Updates serializer context typedefs and serializer registration typing docs. |
| lib/FileSystemInfo.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/InitFragment.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/Module.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/NormalModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/RawModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/WebpackOptionsApply.js | Switches filesystem cache strategy from PackFileCacheStrategy to DiskCacheStrategy. |
| lib/asset/RawDataUrlModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/cache/DiskCacheStrategy.js | Adds new segmented on-disk cache strategy implementation. |
| lib/cache/IdleFileCachePlugin.js | Updates strategy type reference to DiskCacheStrategy. |
| lib/cache/MemoryCachePlugin.js | Refactors to use shared MemoryStore helper. |
| lib/cache/MemoryStore.js | Adds reusable in-memory cache store with optional generational GC. |
| lib/cache/MemoryWithGcCachePlugin.js | Refactors to delegate memory GC behavior to MemoryStore. |
| lib/cache/ResolverCachePlugin.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/cache/format/CacheIndex.js | Adds serializable cache index + entry/segment metadata types. |
| lib/cache/format/CacheSegment.js | Adds serializable segment container for immutable segment payloads. |
| lib/cache/format/SegmentManager.js | Adds segment persistence/loading, GC, compaction, and orphan sweeping. |
| lib/container/ContainerEntryModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/container/ContainerExposedDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/container/FallbackDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/container/FallbackModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/container/RemoteModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/css/CssModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/AMDDefineDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/AMDRequireArrayDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/AMDRequireContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/AMDRequireDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CachedConstDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsExportRequireDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsExportsDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsFullRequireDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsRequireContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsRequireDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CommonJsSelfReferenceDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ConstDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ContextElementDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CreateScriptUrlDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CssIcssExportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CssIcssImportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CssIcssSymbolDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CssImportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/CssUrlDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/DllEntryDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ExportsInfoDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ExternalModuleDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ExternalModuleInitFragment.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ExternalModuleInitFragmentDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyAcceptDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyEvaluatedImportSpecifierDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyExportExpressionDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyExportHeaderDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyExportImportedSpecifierDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyExportSpecifierDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyImportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HarmonyImportSpecifierDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HtmlScriptSrcDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/HtmlSourceDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ImportContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ImportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/JsonExportsDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/LocalModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/LocalModuleDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ModuleDecoratorDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ModuleDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ModuleInitFragmentDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/ProvidedDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/PureExpressionDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/RequireEnsureDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/RequireHeaderDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/RequireResolveContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/RequireResolveHeaderDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/RuntimeRequirementsDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/StaticExportsDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/URLContextDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/URLDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/UnsupportedDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/WebAssemblyExportImportedDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/WebAssemblyImportDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dependencies/WorkerDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dll/DelegatedModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/dll/DllModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/HookWebpackError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/JSONParseError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/ModuleBuildError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/ModuleError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/ModuleParseError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/ModuleWarning.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/errors/WebpackError.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/index.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/json/JsonData.js | Keeps external serializer registration, now backed by new util/serialization API. |
| lib/optimize/ConcatenatedModule.js | Updates serializer context typedef to Decoder. |
| lib/serialization/AggregateErrorSerializer.js | Removes legacy AggregateError serializer (replaced by builtins codec). |
| lib/serialization/ArraySerializer.js | Removes legacy array serializer middleware. |
| lib/serialization/DateObjectSerializer.js | Removes legacy Date serializer (now native tag in binary format). |
| lib/serialization/Decoder.js | Adds binary decoder implementation. |
| lib/serialization/Encoder.js | Adds binary encoder implementation. |
| lib/serialization/ErrorObjectSerializer.js | Removes legacy Error serializer (replaced by builtins codec). |
| lib/serialization/FileMiddleware.js | Removes legacy pack-file middleware implementation. |
| lib/serialization/FileStore.js | Adds file-based store supporting compression and separate values. |
| lib/serialization/Lazy.js | Adds shared lazy value helpers (create/is/serialize/deserialize). |
| lib/serialization/MapObjectSerializer.js | Removes legacy Map serializer (now native tag in binary format). |
| lib/serialization/NullPrototypeObjectSerializer.js | Removes legacy null-proto object serializer (now native tag). |
| lib/serialization/PlainObjectSerializer.js | Removes legacy plain-object serializer (now native tag). |
| lib/serialization/Reader.js | Adds binary reader implementation. |
| lib/serialization/RegExpObjectSerializer.js | Removes legacy RegExp serializer (now native tag). |
| lib/serialization/Serializer.js | Replaces middleware pipeline with Encoder/Decoder-based implementation. |
| lib/serialization/SerializerMiddleware.js | Removes legacy middleware base (replaced by direct binary format). |
| lib/serialization/SetObjectSerializer.js | Removes legacy Set serializer (now native tag). |
| lib/serialization/SingleItemMiddleware.js | Removes legacy single-item middleware. |
| lib/serialization/TypeRegistry.js | Adds codec registry + lazy loader mechanism for serializable types. |
| lib/serialization/Writer.js | Adds binary writer implementation. |
| lib/serialization/builtins.js | Adds built-in codecs for Error types (and AggregateError when available). |
| lib/serialization/format.js | Defines the binary serialization tag format constants. |
| lib/serialization/types.js | Removes legacy serialization type typedefs file. |
| lib/sharing/ConsumeSharedModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/sharing/ProvideSharedDependency.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/sharing/ProvideSharedModule.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/stats/DefaultStatsFactoryPlugin.js | Replaces typedef imports with local structural error typedefs. |
| lib/util/LazySet.js | Updates serializer context typedefs in docs to Decoder/Encoder. |
| lib/util/internalSerializables.js | Updates internal serializables mapping for new cache/format classes. |
| lib/util/makeSerializable.js | Reworks makeSerializable to register codecs via TypeRegistry. |
| lib/util/registerExternalSerializer.js | Updates serializer context typedefs to Decoder/Encoder. |
| lib/util/serialization.js | Reworks public serialization utilities to use TypeRegistry + new Serializer/FileStore. |
| lib/wasm-async/AsyncWebAssemblyModulesPlugin.js | Updates serializer context typedefs to Decoder/Encoder. |
| test/BinaryMiddleware.unittest.js | Removes legacy BinaryMiddleware unit tests. |
| test/Serializer.unittest.js | Adds unit tests for new Serializer, lazy values, separate files, and references. |
| tooling/print-cache-file.js | Updates tooling to deserialize with the new file serializer and print results. |
| types.d.ts | Updates public type declarations for new serialization API and cache strategy. |
return this._getIndex()
	.then(async (index) => {
		const entry = index.entries.get(identifier);
		if (!entry) return undefined;
		if (entry.etag !== stringEtag) return null;
		entry.lastAccess = Date.now();
		const segment = await this.segmentManager.loadSegment(
			index,
			entry.segmentId
		);
		return segment.get(identifier);

const oldMap = await this.loadSegment(index, id);
/** @type {Map<string, { etag: string | null, data: Data }>} */
const fresh = new Map();
for (const identifier of live) {
	const entry = index.entries.get(identifier);
	if (!entry) continue;
	fresh.set(identifier, {
		etag: entry.etag,
		data: oldMap.get(identifier)
	});
}
Ran some local benchmarks comparing the old (ObjectMiddleware + BinaryMiddleware pipeline) vs new (Encoder/Decoder) serialization on representative webpack-like data (5000 modules, 15000 dependencies, realistic path lengths and repeated type strings).

Deserialization performance concern

Serialization is ~1.6x faster: the middleware pipeline elimination clearly pays off. However, deserialization shows a slight regression in this test shape, and in other data shapes (more nested objects, Sets) the gap widened to ~1.5x slower for the new format.

The whole point of persistent cache is to speed up subsequent builds. Deserialization (cache restore) runs on every cold start, while serialization only runs when content changes. A regression on the read path directly undermines the primary purpose of the cache system.

The root cause is architectural: the old BinaryMiddleware batches consecutive same-type values (N nulls → 2 bytes via RLE, N booleans → bit-packed, N integers → single header + N values), which gives the CPU predictable linear reads. The new Encoder dispatches per-value via tag bytes, which introduces more branches per value decoded.

It would be great to see deserialization benchmarks on real-world projects (e.g. the webpack codebase itself, or a large app with thousands of modules) to quantify the actual impact on cold-start cache restore time.

The architectural simplification (4-layer middleware → single-pass Encoder/Decoder) and string deduplication are solid improvements. The deserialization path is the main thing worth investigating before this lands.
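To make the batching argument above concrete, here is a small sketch contrasting one tag byte per value with a run-length form for consecutive nulls. The tag values and function names are hypothetical and chosen only for illustration; they are not the constants or code of either BinaryMiddleware or the new Encoder, and integers are limited to 0..255 to keep the example short.

```javascript
// Hypothetical tag values for illustration; not webpack's format constants.
const TAG_NULL = 0x00; // a single null
const TAG_NULL_RUN = 0x01; // followed by one byte: run length (1..255)
const TAG_UINT8 = 0x02; // followed by one byte: the value (0..255)

// Per-value encoding: every value costs at least one tag byte.
function encodeTagged(values) {
	const out = [];
	for (const v of values) {
		if (v === null) out.push(TAG_NULL);
		else out.push(TAG_UINT8, v & 0xff);
	}
	return Uint8Array.from(out);
}

// Batched encoding: consecutive nulls collapse into [TAG_NULL_RUN, count].
function encodeBatched(values) {
	const out = [];
	let i = 0;
	while (i < values.length) {
		if (values[i] === null) {
			let run = 0;
			while (i < values.length && values[i] === null && run < 255) {
				run++;
				i++;
			}
			out.push(TAG_NULL_RUN, run);
		} else {
			out.push(TAG_UINT8, values[i] & 0xff);
			i++;
		}
	}
	return Uint8Array.from(out);
}

// One decoder for both forms. A run is expanded in a tight loop after a
// single tag dispatch, instead of one dispatch per null.
function decode(bytes) {
	const values = [];
	let pos = 0;
	while (pos < bytes.length) {
		const tag = bytes[pos++];
		if (tag === TAG_NULL) {
			values.push(null);
		} else if (tag === TAG_NULL_RUN) {
			const run = bytes[pos++];
			for (let k = 0; k < run; k++) values.push(null);
		} else if (tag === TAG_UINT8) {
			values.push(bytes[pos++]);
		} else {
			throw new Error(`unknown tag ${tag}`);
		}
	}
	return values;
}

const hundredNulls = new Array(100).fill(null);
console.log(encodeTagged(hundredNulls).length); // 100
console.log(encodeBatched(hundredNulls).length); // 2
```

The sketch only shows the size and dispatch-count difference; whether that translates into the measured read-path gap depends on the real data shapes, which is why real-project benchmarks matter here.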
@avivkeller We can't ship this in webpack@5 because it is a breaking change. It is also a lot of changes in one PR, so I can't review and validate it. Changes like this need discussion first. I don't know the architecture, and to be honest, from this logic we gain nothing except a lot of breaking changes.
Tip
I'm testing how this PR impacts benchmarks. Local benchmarks show a 4-5x improvement in caching time, but I want to open a PR to validate those gains.
Replaces webpack's persistent cache serialization stack and filesystem cache layout with a new compact binary serializer and segmented disk cache format.
The old middleware/object-serializer based pack-file cache is replaced by:
- DiskCacheStrategy, which stores a validated cache index plus immutable cache segments.
- lib/serialization/{Encoder,Decoder,Reader,Writer,FileStore,Lazy,TypeRegistry}, which provide the new binary serialization system.
- lib/cache/format/{CacheIndex,CacheSegment,SegmentManager}, which track entries, segment metadata, build snapshots, dependency resolution state, garbage collection, and compaction.
- serialize/deserialize context API.
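The "index plus immutable segments" layout described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `SegmentedCacheSketch` and its methods are hypothetical names, nothing is written to disk, and it does not reproduce the PR's DiskCacheStrategy or SegmentManager. The return convention (`undefined` for a missing entry, `null` for an etag mismatch) follows the snippet quoted in the review comments.

```javascript
// Illustrative in-memory model only; all names here are hypothetical.
class SegmentedCacheSketch {
	constructor() {
		this.entries = new Map(); // identifier -> { etag, segmentId }
		this.segments = new Map(); // segmentId -> Map(identifier -> data)
		this.nextSegmentId = 0;
	}

	// Store a batch as a brand-new segment. Existing segments are never
	// modified; the index is simply repointed at the newest copy.
	writeSegment(batch) {
		const segmentId = this.nextSegmentId++;
		const segment = new Map();
		for (const { identifier, etag, data } of batch) {
			segment.set(identifier, data);
			this.entries.set(identifier, { etag, segmentId });
		}
		this.segments.set(segmentId, segment);
		return segmentId;
	}

	// undefined: no entry; null: entry exists but the etag is stale.
	get(identifier, etag) {
		const entry = this.entries.get(identifier);
		if (!entry) return undefined;
		if (entry.etag !== etag) return null;
		return this.segments.get(entry.segmentId).get(identifier);
	}

	// Drop segments that no index entry points to any more.
	sweep() {
		const referenced = new Set();
		for (const { segmentId } of this.entries.values()) {
			referenced.add(segmentId);
		}
		for (const id of [...this.segments.keys()]) {
			if (!referenced.has(id)) this.segments.delete(id);
		}
	}
}

const cache = new SegmentedCacheSketch();
cache.writeSegment([{ identifier: "./a.js", etag: "v1", data: "A1" }]);
cache.writeSegment([{ identifier: "./a.js", etag: "v2", data: "A2" }]);
cache.sweep(); // the first segment is now unreferenced and is removed
console.log(cache.get("./a.js", "v2"), cache.segments.size);
```

Because segments are write-once in this model, stale data is reclaimed by dropping or rewriting whole segments rather than editing them in place.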