Core: Fix metadata table scans when snapshot is null #3812

Merged
rdblue merged 1 commit into apache:master from bryanck:metadata-table-fix
Dec 29, 2021

@bryanck bryanck commented Dec 27, 2021

Contributor

This PR fixes an issue with scans of the snapshots and history metadata tables. When a table's current snapshot is not set, the scan returns no snapshots. A table can get into this state if it has snapshots and a REPLACE TABLE operation is then performed: the snapshots are still present but are not returned because the current snapshot is not set. Once data is committed to the table and the current snapshot is set again, the previously missing snapshots show up again.

Here is an example of how to reproduce the issue with Spark SQL:

create table local.default.foobar (id int) using iceberg;  
insert into local.default.foobar values (1);  
replace table local.default.foobar (id int, data string) using iceberg;  
select * from local.default.foobar.snapshots; -- no snapshots returned  

@bryanck bryanck changed the title Fixes metadata table scans when snapshot is null Core: Fix metadata table scans when snapshot is null Dec 27, 2021
@github-actions github-actions Bot added the core label Dec 27, 2021
@danielcweeks danielcweeks requested a review from rdblue December 27, 2021 22:39

bryanck commented Dec 27, 2021

Contributor Author

To give a bit more detail: SnapshotsTableScan and HistoryTableScan inherit from StaticTableScan to bypass the null snapshot check. However, during planning, newRefinedSearch() is called, which creates a plain StaticTableScan that does not override the null check. This PR therefore overrides newRefinedSearch() in the subclasses so that they instantiate the subclass (and keep bypassing the null check).
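
For readers unfamiliar with this kind of bug, here is a minimal, self-contained Java sketch of the pattern described above. It is not the Iceberg code: the class names (StaticScan, BrokenSnapshotsScan, FixedSnapshotsScan), the methods refine(), plan() and scan(), and the use of strings as snapshots are all hypothetical stand-ins chosen for illustration. It only shows how a factory method that returns the base type can silently drop a subclass override, and how overriding the factory restores it.

```java
import java.util.List;

// Simplified model only; every name here is hypothetical, not Iceberg's API.
class RefinedScanSketch {

    // Stand-in for the generic scan that contains the null snapshot check.
    static class StaticScan {
        protected final String currentSnapshot;   // null models "current snapshot not set"
        protected final List<String> allSnapshots;

        StaticScan(String currentSnapshot, List<String> allSnapshots) {
            this.currentSnapshot = currentSnapshot;
            this.allSnapshots = allSnapshots;
        }

        // Factory used during planning to build a refined copy of the scan.
        StaticScan refine() {
            return new StaticScan(currentSnapshot, allSnapshots);
        }

        // The null check: without a current snapshot, nothing is returned.
        List<String> plan() {
            if (currentSnapshot == null) {
                return List.of();
            }
            return allSnapshots;
        }

        // Planning always goes through a refined copy of the scan.
        final List<String> scan() {
            return refine().plan();
        }
    }

    // Bypasses the null check, but inherits refine(), which returns a plain
    // StaticScan. The override of plan() is therefore lost during planning.
    static class BrokenSnapshotsScan extends StaticScan {
        BrokenSnapshotsScan(String currentSnapshot, List<String> allSnapshots) {
            super(currentSnapshot, allSnapshots);
        }

        @Override
        List<String> plan() {
            return allSnapshots;
        }
    }

    // Also overrides refine() so the refined copy is still the subclass.
    static class FixedSnapshotsScan extends StaticScan {
        FixedSnapshotsScan(String currentSnapshot, List<String> allSnapshots) {
            super(currentSnapshot, allSnapshots);
        }

        @Override
        StaticScan refine() {
            return new FixedSnapshotsScan(currentSnapshot, allSnapshots);
        }

        @Override
        List<String> plan() {
            return allSnapshots;
        }
    }
}
```

In this sketch, calling scan() on a BrokenSnapshotsScan built with a null current snapshot yields an empty list, while the same call on a FixedSnapshotsScan yields all snapshots, which mirrors the before and after behavior described in this PR.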

@rdblue rdblue merged commit 246afd6 into apache:master Dec 29, 2021

rdblue commented Dec 29, 2021

Contributor

Thanks, @bryanck!

@bryanck bryanck deleted the metadata-table-fix branch January 3, 2022 05:17