Skip to content

Spark 3.3: [FOLLOW UP] Support AS OF SPARK SQL syntax for TimeTravel E2E#5156

Merged
rdblue merged 2 commits into
apache:masterfrom
singhpk234:fix/asOfTimeTravel
Jun 29, 2022
Merged

Spark 3.3: [FOLLOW UP] Support AS OF SPARK SQL syntax for TimeTravel E2E#5156
rdblue merged 2 commits into
apache:masterfrom
singhpk234:fix/asOfTimeTravel

Conversation

@singhpk234

@singhpk234 singhpk234 commented Jun 29, 2022

Copy link
Copy Markdown
Contributor

Closes :

  1. [SUPPORT] Add AS OF syntax support #4365
  2. When will time travel be supported in SparkSQL? #1405
  3. Does Spark sql support time-travel? #2618

About the Changes :

  1. Make SparkSession Catalog implement newly introduced Catalog methods
  2. respect newly added data frame reader options via SupportsCatalogOptions
  3. Adapt the timestamp precision returned by spark to iceberg requirements.

P.S This change would not have been possible without the awesome work done by @huaxingao in spark to support AS OF grammar in spark natively.


Testing :

Added new UT's


cc @rdblue, @aokolnychyi , @RussellSpitzer , @jackye1995 , @huaxingao , @flyrain

@github-actions github-actions Bot added the spark label Jun 29, 2022

@Override
public Optional<String> extractTimeTravelVersion(CaseInsensitiveStringMap options) {
return Optional.ofNullable(PropertyUtil.propertyAsString(options, "versionAsOf", null));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did these strings come from? Are they standard in Spark?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked these string from the PR that introduced time travel in spark, but these doesn't seems to be a standard in Spark as there is no defined constants for the same.

To not conflict with existing options that are already in iceberg such as as-of-timestamp & SNAPSHOT_ID. Hence used the above.

// AS OF expects the timestamp if given in long format will be of seconds precision
long timestampInSeconds = TimeUnit.MILLISECONDS.toSeconds(timestamp);
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
String formattedDate = sdf.format(new Date(timestamp));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally prefer to use LocalDateTime and its associated format method, since java.util.Date has a time zone associated with it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, will make the changes :) !

@rdblue

rdblue commented Jun 29, 2022

Copy link
Copy Markdown
Contributor

I made a couple of comments about minor things, but overall I think this looks good and will commit when tests are passing.

@rdblue rdblue merged commit 3f3a987 into apache:master Jun 29, 2022
@rdblue

rdblue commented Jun 29, 2022

Copy link
Copy Markdown
Contributor

Thanks, @singhpk234!

namrathamyske pushed a commit to namrathamyske/iceberg that referenced this pull request Jul 10, 2022
…5156)

Co-authored-by: Prashant Singh <psinghvk@amazon.com>
namrathamyske pushed a commit to namrathamyske/iceberg that referenced this pull request Jul 10, 2022
…5156)

Co-authored-by: Prashant Singh <psinghvk@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants