Core: Enhance remove snapshots efficiency by executing them in bulk#12670
Conversation
| class RemoveSnapshots implements MetadataUpdate { | ||
| private final Set<Long> snapshotIds; | ||
|
|
||
| public RemoveSnapshot(long snapshotId) { | ||
| this.snapshotId = snapshotId; | ||
| public RemoveSnapshots(Set<Long> snapshotIds) { | ||
| this.snapshotIds = snapshotIds; | ||
| } | ||
|
|
||
| public long snapshotId() { | ||
| return snapshotId; | ||
| public Set<Long> snapshotIds() { | ||
| return snapshotIds; | ||
| } | ||
|
|
||
| @Override | ||
| public void applyTo(TableMetadata.Builder metadataBuilder) { | ||
| metadataBuilder.removeSnapshots(ImmutableSet.of(snapshotId)); | ||
| metadataBuilder.removeSnapshots(snapshotIds); | ||
| } |
There was a problem hiding this comment.
We should leaveRemoveSnapshot update as is since this is still a valid update type in the protocol. I think the solution should involve some sort of buffering of the snapshots to remove instead of eagerly rewriting (it's the eager rewriting of the current snapshots in the builder that leads to the n^2 behavior). Only at the end, once all the snapshots to remove have been accumulated, would we want to go ahead and rewrite. Let me take a deeper look and get back with a concrete solution
There was a problem hiding this comment.
We should leaveRemoveSnapshot update as is since this is still a valid update type in the protocol.
@amogh-jahagirdar On the protocol, we already defined a list of snapshots to be removed... However, we just pass a single snapshot ID in the list. Why can't we change to have all snapshot IDs? Is there a reason for this to be like that?
There was a problem hiding this comment.
@ricardopereira33 Ah you are completely right I think, I misread the protocol, RemoveSnapshots https://github.com/apache/iceberg/blame/main/open-api/rest-catalog-open-api.yaml#L2833 already operates on a list of snapshots. it's seems like the implementation should actually be more open. Currently looks like there's some strict validation that there's only 1 element https://github.com/apache/iceberg/blame/main/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L559
I'm not sure the context on this. But fundamentally I think you are correct there's no protocol violation here. I'd need to see why this was implemented this way to begin with I also see this ToDo https://github.com/apache/iceberg/blame/main/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L420
For general library compatibility though we probably need to keep this update type, potentially deprecating it in favor of the new one.
There was a problem hiding this comment.
I can keep the RemoveSnapshot as a deprecated class 👍🏽
I'm bringing the contributors of that validation to the loop: @nastra @rdblue. Do you know why do we have this validation to restrict just 1 snapshot?
There was a problem hiding this comment.
yes the RemoveSnapshot class needs to stay so that we don't break existing APIs, but should be marked deprecated (and refer to using the new API - we do it similarly in other places that are deprecated). Then you can introduce a new RemoveSnapshots API, which the TableMetadata Builder would call
e3110ee to
48cfacd
Compare
| .put(MetadataUpdate.RemovePartitionStatistics.class, REMOVE_PARTITION_STATISTICS) | ||
| .put(MetadataUpdate.AddSnapshot.class, ADD_SNAPSHOT) | ||
| .put(MetadataUpdate.RemoveSnapshot.class, REMOVE_SNAPSHOTS) | ||
| .put(MetadataUpdate.RemoveSnapshots.class, REMOVE_SNAPSHOTS) |
There was a problem hiding this comment.
@nastra @amogh-jahagirdar In order to keep both RemoveSnapshot (deprecated) and RemoveSnapshots (new), we need to define a custom process when we are parsing the metadata updates from the JSON. Currently, RemoveSnapshot is pointing to the keyword remove-snapshots.
To add a new MetadataUpdate option
RemoveSnapshots, the alternatives are:
- Reuse the same keyword, and add a custom condition to distinguish
RemoveSnapshotandRemoveSnapshots. - Create a new keyword and have different parsing process for each metadata update.
However, the current protocol expects a list of values for the key remove-snapshots, so even if we replace the RemoveSnapshot completely, we are still using the protocol correctly. WDYT about this?
There was a problem hiding this comment.
I'm not sure if this is what was meant by the first alternative presented but I think we could just have 2 cases in the interim (until RemoveSnapshot is removed after a deprecation cycle)
.put(MetadataUpdate.RemoveSnapshots.class, REMOVE_SNAPSHOTS)
.put(MetadataUpdate.RemoveSnapshot.class, REMOVE_SNAPSHOTS)
When serializing the update, we'd be able to distinguish if it's removeSnapshot or RemoveSnapshots (instanceOf). you could update writeRemoveSnapshots to take in just a RemoveSnapshots instance, and for the RemoveSnapshot case, you'd construct a RemoveSnapshots passing in a list with a single snapshot element.
Then in the future once RemoveSnapshot is removed after deprecation, we could then just remove the additional handling for that case, and update the tests which have some expectations on the produced update types.
Let me know if that makes sense or not!
There was a problem hiding this comment.
Reuse the same keyword, and add a custom condition to distinguish RemoveSnapshot and RemoveSnapshots
@amogh-jahagirdar yes, that was exactly what I meant for the first option 😆 I will prepare the changes today, thank you!
| MetadataUpdate metadataUpdate; | ||
| if (snapshotIds.size() == 1) { | ||
| Long snapshotId = Iterables.getOnlyElement(snapshotIds); | ||
| metadataUpdate = new MetadataUpdate.RemoveSnapshot(snapshotId); |
There was a problem hiding this comment.
To keep the same behaviour as before, and to not break existing tests.
|
@ricardopereira33 overall the approach LGTM, I just left a few minor comments that would be good to address |
b7241e9 to
cf6b395
Compare
bbc63cc to
d363cf2
Compare
nastra
left a comment
There was a problem hiding this comment.
thanks @ricardopereira33 for fixing this. Let's also wait for @amogh-jahagirdar in case he has any feedback (we might also need to update the deprecation message depending on whether this will make it into 1.9.0)
There was a problem hiding this comment.
Approved with a nit. If we could address that along with the deprecation message to be 1.9.0, that'd be great
Thank you @ricardopereira33 for taking this on!
Edit: Ignore the nit, but we'll still need to update the deprecation message.
|
@nastra @amogh-jahagirdar Pushed the change you requested. Thank you guys for the fast reviews and feedback! 💪🏽 |
|
Thanks @ricardopereira33 and thanks @nastra for reviewing! |
| } | ||
|
|
||
| if (!snapshotIdsToRemove.isEmpty()) { | ||
| changes.add(new MetadataUpdate.RemoveSnapshots(snapshotIdsToRemove)); |
There was a problem hiding this comment.
Question: Since we are sending RemoveSnapshots instead of RemoveSnapshot, isn't it a breaking change? Old server cannot understand this message. We need to update (about implement handling of new payload) the server to understand this new payload?
There was a problem hiding this comment.
@ajantha-bhat the MetadataUpdateParser will still send a RemoveSnapshot for a single element for older clients
There was a problem hiding this comment.
@ajantha-bhat check out this thread #12670 (comment)
Essentially the protocol was always RemoveSnapshots (servers were always handling this kind of message), and we're not deviating from that. RemoveSnapshot was just an internal in-memory representation. Let me know if that makes sense.
There was a problem hiding this comment.
From the protocol level, it's backward compatible while since we have the following check
snapshotIds != null && snapshotIds.size() == 1, the new client against an old version which uses the old MetadataUpdateParser would still fail because of this check, right?
There was a problem hiding this comment.
@sfc-gh-aixu: Thats what I observed too and started this thread. We had to update the server to new Iceberg version to make it work.
…n bulk (apache#12670)" This reverts commit 06f667a.
…anges Rather than in Bulk (#13100)
…hot Changes Rather than in Bulk (apache#13100)
This PR changes the
remove-snapshotsoperation to run in bulk instead of single operations on the Rest Catalog server.The remove-snapshots operation signature already accepts a list of snapshots IDs. However, each snapshot ID is stored in a dedicated
Metadataobject. This makes the operations not very efficient, causing performance issues, mentioned on: #12642.The PR changes:
MetadataUpdateclassRemoveSnapshottoRemoveSnapshots.RemoveSnapshotsinstead of an individual snapshot ID.