Skip to content

Spec: Deprecate Position delete files with row data#14045

Merged
pvary merged 2 commits into
apache:mainfrom
pvary:deprecate_pos_del
Sep 15, 2025
Merged

Spec: Deprecate Position delete files with row data#14045
pvary merged 2 commits into
apache:mainfrom
pvary:deprecate_pos_del

Conversation

@pvary

@pvary pvary commented Sep 10, 2025

Copy link
Copy Markdown
Contributor

Deprecate Position delete files with row data based on the following discussion thread: https://lists.apache.org/thread/8jw6pb2vq3ghmdqf1yvy8n5n6gg1fq5s

The voting thread: https://lists.apache.org/thread/tfy96bqmz1bmdxr73x17w3xxj3yzs606

The #13870 could be useful when we do the actual deprecation

@github-actions github-actions Bot added core Specification Issues that may introduce spec changes. labels Sep 10, 2025
Comment thread format/spec.md Outdated

### Position Delete Files with Row Data

The feature allowing position delete files to include row data, which was introduced in the V2 Iceberg specification, is now deprecated in V3. The Java reference implementation will deprecate this feature in version 1.11.0 and remove it in version 1.12.0, including for V2 tables. Other implementations are encouraged to follow the same deprecation and removal schedule.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit weird since we never introduce this feature anywhere. Also, the tense used in these sentences, is now deprecated, will deprecate this feature will become invalid over time.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feature is there in 1.10.0. Technically available (while it is not used anywhere as far as we know).
1.11.0 is not released yet, so it is not deprecated yet. That is why I have used future tense here.

But I'm open to suggestions, as English is not my mother language.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we would mention this here at all

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK either way but I lean towards mentioning it in the implementation notes (as done here) just because it was technically a capability enabled in the spec and even though we do not know of any writers actually producing the row data, on the offhand that some unknown implementation did produce it and wants to later use the Java library with the expectation to consume that, they can reference this historical context.

In terms of spec wording, I think I'd remove
"Other implementations are encouraged to adopt the same deprecation and removal timeline" because I'm not sure it changes much. If in the slim chance an unknown implementation cares about producing position deletes with row data for whatever reason then they probably also want to consume it (for whatever reason)?

@wypoon wypoon Sep 10, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is what I suggest for this section:

"Although the spec allows for including the deleted row itself (in addition to the path and position of the row in the data file) in v2 position delete files, writing the row is optional and no implementation currently writes it. The ability to write and read the row is supported in the Java implementation but is deprecated in version 1.11.0."

In PositionDelete.java, the deprecated methods have the comment "is deprecated in version 1.11.0", so we keep the same tense here. I don't think we need to say "will be removed in [the next release]" in the spec, as that is implied. In code, the method along with the javadoc comment will be removed, but we don't want to have to go back to the spec and update this section to say "... and is removed in version ..." (once it is done).

If other implementations do not even support writing/reading the deleted row, do we even need to say anything about them (in particular, that they should not do so in the future)? This appendix is "Implementation Notes", after all.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @wypoon's wording.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the changes suggested by @wypoon

@pvary pvary merged commit e0da08c into apache:main Sep 15, 2025
43 checks passed
@pvary

pvary commented Sep 15, 2025

Copy link
Copy Markdown
Contributor Author

Since the vote has been passed, I have merged the PR.
Thanks for everyone who reviewed it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Specification Issues that may introduce spec changes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants