Flink: Fix HashKeyGenerator SelectorKey cache ignoring writeParallelism and distributionMode by Below0 · Pull Request #15740 · apache/iceberg

Below0 · 2026-03-23T16:53:54Z

Problem

HashKeyGenerator.SelectorKey was missing writeParallelism and distributionMode from its equals() and hashCode() methods. As a result, computeIfAbsent always hit the cache after the first record for a given table, silently reusing a stale KeySelector even when these values changed.

This contradicts the class-level Javadoc which states:

"Caching ensures that a new key selector is also created when … the user-provided metadata changes (e.g. distribution mode, write parallelism)."
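To make the failure mode concrete, here is a minimal, self-contained sketch of the same pattern. It does not use the real Iceberg classes: `TableOnlyKey` and `selectorFor` are hypothetical stand-ins, and the cached "selector" is just a descriptive string. It only assumes what the description above states, namely that the cache key's `equals()`/`hashCode()` did not cover `writeParallelism`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class StaleSelectorCacheDemo {

  // Hypothetical stand-in for the pre-fix SelectorKey: identity is the table name only.
  public static final class TableOnlyKey {
    private final String tableName;

    public TableOnlyKey(String tableName) {
      this.tableName = tableName;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof TableOnlyKey)) {
        return false;
      }
      return tableName.equals(((TableOnlyKey) o).tableName);
    }

    @Override
    public int hashCode() {
      return Objects.hash(tableName);
    }
  }

  // Returns a description of the "selector" the cache hands back for this request.
  public static String selectorFor(
      Map<TableOnlyKey, String> cache, String tableName, int writeParallelism) {
    // The lambda only runs on a cache miss, so a changed writeParallelism is ignored
    // once an entry for the table exists.
    return cache.computeIfAbsent(
        new TableOnlyKey(tableName), key -> "selector(parallelism=" + writeParallelism + ")");
  }

  public static void main(String[] args) {
    Map<TableOnlyKey, String> cache = new HashMap<>();
    System.out.println(selectorFor(cache, "db.tbl", 2));
    // The second request asks for parallelism 4 but gets the entry built for 2.
    System.out.println(selectorFor(cache, "db.tbl", 4));
  }
}
```

Both calls return the entry created for parallelism 2, which is the stale-selector behavior described above.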

Fix

Add writeParallelism and distributionMode to SelectorKey's fields, equals(), hashCode(), and toString(). The effective values passed to the cache key match those used in the computeIfAbsent lambda — distributionMode normalized via firstNonNull(..., NONE) and writeParallelism capped at maxWriteParallelism.
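A rough sketch of the shape of the fix, under stated assumptions: `Mode`, `Key`, and `keyFor` are hypothetical names rather than the actual Iceberg types, and the null check stands in for `MoreObjects.firstNonNull` so the snippet has no Guava dependency. The point it illustrates is that the key is built from the effective values (mode defaulted to `NONE`, parallelism capped), so requests that resolve to the same selector share one cache entry and requests that differ do not.

```java
import java.util.Objects;

public class EffectiveSelectorKeyDemo {

  // Hypothetical stand-in for Iceberg's DistributionMode.
  public enum Mode {
    NONE,
    HASH,
    RANGE
  }

  // Hypothetical cache key: all three fields take part in equals() and hashCode().
  public static final class Key {
    private final String tableName;
    private final Mode distributionMode;
    private final int writeParallelism;

    Key(String tableName, Mode distributionMode, int writeParallelism) {
      this.tableName = tableName;
      this.distributionMode = distributionMode;
      this.writeParallelism = writeParallelism;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return writeParallelism == other.writeParallelism
          && distributionMode == other.distributionMode
          && tableName.equals(other.tableName);
    }

    @Override
    public int hashCode() {
      return Objects.hash(tableName, distributionMode, writeParallelism);
    }

    @Override
    public String toString() {
      return "Key{" + tableName + ", " + distributionMode + ", " + writeParallelism + "}";
    }
  }

  // Normalizes the user-provided values before they reach the key.
  public static Key keyFor(
      String tableName, Mode requestedMode, int requestedParallelism, int maxWriteParallelism) {
    Mode effectiveMode = requestedMode != null ? requestedMode : Mode.NONE;
    int effectiveParallelism = Math.min(requestedParallelism, maxWriteParallelism);
    return new Key(tableName, effectiveMode, effectiveParallelism);
  }

  public static void main(String[] args) {
    System.out.println(keyFor("db.tbl", null, 2, 8));
    System.out.println(keyFor("db.tbl", Mode.HASH, 16, 8));
  }
}
```

Normalizing before building the key keeps the key and the selector in agreement: a null mode and an explicit `NONE` map to the same entry, as do any two requested parallelism values above the cap.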

Note

writeParallelism and distributionMode should remain stable per table during a streaming job. Changing these values mid-stream — especially when equality fields are set — can cause routing changes that break equality delete co-location, as the subtask assignment is not monotonic across different writeParallelism values (i.e., the subtask set for parallelism N is not guaranteed to be a subset of the set for parallelism N+1).

Making the subtask assignment monotonic (e.g., via a consistent ordering based on maxWriteParallelism) could address this limitation in a follow-up.
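As a generic illustration of why a parallelism change can move a key, consider plain modulo routing. This is an assumed rule chosen for simplicity, not the assignment logic HashKeyGenerator actually uses, and `subtaskFor` is a hypothetical helper.

```java
public class ModuloRoutingDemo {

  // Assumed routing rule for illustration only: subtask = hash mod parallelism.
  public static int subtaskFor(int keyHash, int writeParallelism) {
    return Math.floorMod(keyHash, writeParallelism);
  }

  public static void main(String[] args) {
    int keyHash = 5;
    // The same key hash is routed under two different parallelism settings.
    System.out.println("parallelism 2 -> subtask " + subtaskFor(keyHash, 2));
    System.out.println("parallelism 3 -> subtask " + subtaskFor(keyHash, 3));
  }
}
```

Under this assumed rule, a row with key hash 5 goes to subtask 1 at parallelism 2 and to subtask 2 at parallelism 3, so a later equality delete for the same key would no longer reach the writer that holds the earlier insert.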

Testing

Added two regression tests to TestHashKeyGenerator:

testCacheMissOnWriteParallelismChange
testCacheMissOnDistributionModeChange

Closes #15731

mxm

Thank you for the PR @Below0! Changes look good. Could you keep only the Flink 2.1 changes for now? We backport to the other Flink versions in a separate step.

Below0 · 2026-03-25T01:48:15Z

@mxm
Sure! I'll remove the changes for the other Flink versions and keep only the Flink 2.1 changes.

mxm

LGTM

mxm · 2026-03-25T14:06:20Z

+            MoreObjects.firstNonNull(dynamicRecord.distributionMode(), DistributionMode.NONE),
+            Math.min(dynamicRecord.writeParallelism(), maxWriteParallelism));


I see you added the same logic as for creating the actual KeySelector below. Should we consolidate the SelectorKey and the getKeySelector (below) parameters?

We can do that in a follow-up.
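One possible shape for that follow-up, sketched with hypothetical names (`Params`, `selector`, `build`) rather than the real `SelectorKey`/`getKeySelector` signatures: compute the effective values once, store them in the key, and let the factory read its inputs from the key so the two cannot drift apart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConsolidatedSelectorDemo {

  // Hypothetical key that doubles as the parameter object for the factory.
  public static final class Params {
    final String tableName;
    final String distributionMode;
    final int writeParallelism;

    Params(String tableName, String distributionMode, int writeParallelism) {
      this.tableName = tableName;
      this.distributionMode = distributionMode;
      this.writeParallelism = writeParallelism;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Params)) {
        return false;
      }
      Params other = (Params) o;
      return writeParallelism == other.writeParallelism
          && tableName.equals(other.tableName)
          && distributionMode.equals(other.distributionMode);
    }

    @Override
    public int hashCode() {
      return Objects.hash(tableName, distributionMode, writeParallelism);
    }
  }

  private final Map<Params, String> cache = new HashMap<>();
  private int builds = 0;

  // Effective values are derived in exactly one place, then used for key and selector.
  public String selector(String tableName, String mode, int requested, int maxParallelism) {
    Params params =
        new Params(tableName, mode != null ? mode : "NONE", Math.min(requested, maxParallelism));
    return cache.computeIfAbsent(params, this::build);
  }

  // The factory takes only the key, so it cannot see un-normalized inputs.
  private String build(Params p) {
    builds++;
    return "selector(" + p.tableName + ", " + p.distributionMode + ", " + p.writeParallelism + ")";
  }

  public int builds() {
    return builds;
  }

  public static void main(String[] args) {
    ConsolidatedSelectorDemo demo = new ConsolidatedSelectorDemo();
    System.out.println(demo.selector("db.tbl", null, 2, 8));
    System.out.println(demo.selector("db.tbl", "HASH", 4, 8));
    System.out.println("builds: " + demo.builds());
  }
}
```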

pvary · 2026-03-25T15:07:49Z

Merged to main.
Thanks @Below0 for the fix and @mxm for the review!

mxm · 2026-03-25T15:08:15Z

Thanks @Below0! Could you create a PR for backporting to 1.20 and 2.0?

Below0 · 2026-03-25T15:30:33Z

@mxm
Sure! I'll create backport PRs for v1.20 and v2.0.

github-actions Bot added the flink label Mar 23, 2026

Below0 force-pushed the fix-selector-key-cache-missing-fields branch 3 times, most recently from 2a44c8b to 68dc0d9, on March 23, 2026 17:08

Below0 marked this pull request as draft March 23, 2026 17:12

Below0 marked this pull request as ready for review March 23, 2026 17:34

mxm reviewed Mar 24, 2026

Below0 force-pushed the fix-selector-key-cache-missing-fields branch 2 times, most recently from c148ead to 9fea422, on March 25, 2026 02:10

Flink: Fix HashKeyGenerator SelectorKey cache ignoring writeParallelism and distributionMode

d1c0d5b

Below0 force-pushed the fix-selector-key-cache-missing-fields branch from 9fea422 to d1c0d5b on March 25, 2026 02:11

Below0 requested a review from mxm March 25, 2026 05:08

mxm approved these changes Mar 25, 2026

pvary approved these changes Mar 25, 2026

pvary merged commit 027f088 into apache:main Mar 25, 2026
16 checks passed

Below0 mentioned this pull request Mar 25, 2026

Flink: Backport: Fix HashKeyGenerator SelectorKey cache ignoring writeParallelism and distributionMode #15762

Merged

manuzhang pushed a commit to manuzhang/iceberg that referenced this pull request Mar 30, 2026

Flink: Fix HashKeyGenerator SelectorKey cache ignoring writeParallelism and distributionMode (apache#15740)

ecf6209

ldudas-marx pushed a commit to ldudas-marx/iceberg that referenced this pull request Mar 31, 2026

Flink: Fix HashKeyGenerator SelectorKey cache ignoring writeParallelism and distributionMode (apache#15740)

5120c68
