Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands#6633

Merged
aokolnychyi merged 1 commit into apache:master from aokolnychyi:fix-merge-planning
Jan 24, 2023

@aokolnychyi

Contributor

This PR fixes predicate pushdown for copy-on-write MERGE commands, which was broken after #6534. This change includes a test that failed before the fix and exposes the resulting data correctness issue.

@github-actions github-actions Bot added the spark label Jan 20, 2023
val readRelation = buildRelationWithAttrs(relation, operationTable, metadataAttrs)
val readAttrs = readRelation.output

val (targetCond, joinCond) = splitMergeCond(cond, readRelation)

@aokolnychyi aokolnychyi Jan 20, 2023

Contributor Author

I reverted the changes from #6534 for copy-on-write operations. Pushing the join condition into a filter on the left side is not safe for LeftOuter and FullOuter joins: it changes the join output, which can lead to losing records that did not match the condition (see the new test).
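
To illustrate the point above, here is a minimal sketch using plain Scala collections rather than Spark or Iceberg code. The `Target` and `Source` row types, the `leftOuterJoin` helper, and the sample data are all hypothetical and exist only for this illustration. It compares a left outer join that keeps the whole condition in the join with one where the target-only part of the condition is first applied as a filter on the left side.

```scala
// Hypothetical row types, used only for this illustration.
case class Target(id: Int, dep: String)
case class Source(id: Int, salary: Int)

object PushdownSketch {
  // Left outer join: every left row appears at least once in the output,
  // paired with None when no right row satisfies the condition.
  def leftOuterJoin(
      left: Seq[Target],
      right: Seq[Source])(
      cond: (Target, Source) => Boolean): Seq[(Target, Option[Source])] = {
    left.flatMap { t =>
      val matches = right.filter(s => cond(t, s))
      if (matches.isEmpty) Seq((t, Option.empty[Source]))
      else matches.map(s => (t, Option(s)))
    }
  }

  val target: Seq[Target] = Seq(Target(1, "hr"), Target(2, "software"))
  val source: Seq[Source] = Seq(Source(1, 100))

  // Full condition: a target-only predicate plus an equality predicate.
  val cond: (Target, Source) => Boolean =
    (t, s) => t.dep == "hr" && t.id == s.id

  // Whole condition kept in the join: both target rows survive.
  val joined: Seq[(Target, Option[Source])] =
    leftOuterJoin(target, source)(cond)

  // Target-only predicate applied as a filter on the left side first:
  // the unmatched row Target(2, "software") is gone from the output.
  val pushed: Seq[(Target, Option[Source])] =
    leftOuterJoin(target.filter(_.dep == "hr"), source)((t, s) => t.id == s.id)

  def main(args: Array[String]): Unit = {
    println(joined.map(_._1.id)) // List(1, 2)
    println(pushed.map(_._1.id)) // List(1)
  }
}
```

In the sketch, the row with `id = 2` is present only when the condition stays in the join. Assuming a copy-on-write rewrite writes out whatever the join produces, a row dropped this way would not be carried over to the rewritten data.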

@amogh-jahagirdar amogh-jahagirdar left a comment

Contributor

Thanks for the explanation, @aokolnychyi. I added the test without the fix from this PR and stepped through the debugger to see why it was failing. I also stepped through the MoR pushdown cases, and I understand it better now. This fix makes sense to me, thanks!

@RussellSpitzer RussellSpitzer left a comment

Member

We did an in-depth walkthrough of this code, and this looks like the right thing to do.

@aokolnychyi aokolnychyi merged commit 127f887 into apache:master Jan 24, 2023
@aokolnychyi

Contributor Author

Thank you, @amogh-jahagirdar @RussellSpitzer!

3 participants