Skip to content

Core: Store split offset for delete files#7011

Merged
szehon-ho merged 7 commits into
apache:masterfrom
singhpk234:feature/store-split-offset-delete-files
Mar 24, 2023
Merged

Core: Store split offset for delete files#7011
szehon-ho merged 7 commits into
apache:masterfrom
singhpk234:feature/store-split-offset-delete-files

Conversation

@singhpk234

@singhpk234 singhpk234 commented Mar 4, 2023

Copy link
Copy Markdown
Contributor

Fixes #6659

@github-actions github-actions Bot added the core label Mar 4, 2023
Comment thread core/src/main/java/org/apache/iceberg/FileMetadata.java Outdated
@rdblue

rdblue commented Mar 5, 2023

Copy link
Copy Markdown
Contributor

Overall this looks fine, but I see CI failures.

@github-actions github-actions Bot added the spark label Mar 6, 2023
@singhpk234 singhpk234 changed the title [WIP] Core: Store split offset for delete files Core: Store split offset for delete files Mar 6, 2023
@jackye1995 jackye1995 self-requested a review March 6, 2023 17:48
@singhpk234 singhpk234 force-pushed the feature/store-split-offset-delete-files branch from 5e4868d to 0d1f29b Compare March 6, 2023 19:46

@jackye1995 jackye1995 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, I checked all existing Util classes, did not find a good one to put this method either, so I am good with adding a KryoUtil class as well, unless there are other better suggestions.

@szehon-ho szehon-ho left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @singhpk234 for this, it is great, great to see the KryoUtil refactoring as well, just one suggestion for test for consideration.

Comment thread core/src/test/java/org/apache/iceberg/TestSplitPlanning.java Outdated

@amogh-jahagirdar amogh-jahagirdar left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @singhpk234 this is great, minor comments and a question.

Comment thread core/src/main/java/org/apache/iceberg/FileMetadata.java
Comment thread core/src/main/java/org/apache/iceberg/FileMetadata.java
@rdblue

rdblue commented Mar 8, 2023

Copy link
Copy Markdown
Contributor

This looks fine to me overall, but I'd prefer to use ImmutableList or long[] for split offsets in the builder to avoid an unnecessary new util class and copy method.

@singhpk234 singhpk234 force-pushed the feature/store-split-offset-delete-files branch from 40ee7ed to 19b35b3 Compare March 8, 2023 17:32
@singhpk234

Copy link
Copy Markdown
Contributor Author

This looks fine to me overall, but I'd prefer to use ImmutableList or long[] for split offsets in the builder to avoid an unnecessary new util class and copy method.

Ack made the changes

@amogh-jahagirdar amogh-jahagirdar left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me @singhpk234 !

@singhpk234 singhpk234 force-pushed the feature/store-split-offset-delete-files branch from 19b35b3 to 733a8c1 Compare March 23, 2023 22:00
@singhpk234 singhpk234 requested a review from szehon-ho March 23, 2023 22:02
@singhpk234 singhpk234 force-pushed the feature/store-split-offset-delete-files branch from 733a8c1 to 3fd1781 Compare March 23, 2023 22:24

@szehon-ho szehon-ho left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @singhpk234

@szehon-ho

Copy link
Copy Markdown
Member

Approved for test run.

@szehon-ho szehon-ho merged commit faf974e into apache:master Mar 24, 2023
@szehon-ho

Copy link
Copy Markdown
Member

Thanks @singhpk234 for change, and @rdblue @jackye1995 @amogh-jahagirdar for extra review!

private Map<Integer, ByteBuffer> upperBounds = null;
private ByteBuffer keyMetadata = null;
private Integer sortOrderId = null;
private List<Long> splitOffsets = null;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhpk234 I think we would need to copy/reset this field in clear() and in copy()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree @nastra, i think it go missed, let me raise a pr for this along with tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Store split offsets for delete files

6 participants