Skip to content

Spec: Add implementation note on current-snapshot-id#12334

Merged
Fokko merged 8 commits into
apache:mainfrom
Fokko:fd-current-snapshot
Mar 6, 2025
Merged

Spec: Add implementation note on current-snapshot-id#12334
Fokko merged 8 commits into
apache:mainfrom
Fokko:fd-current-snapshot

Conversation

@Fokko

@Fokko Fokko commented Feb 19, 2025

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot added the Specification Issues that may introduce spec changes. label Feb 19, 2025
Comment thread format/spec.md Outdated
Co-authored-by: Russell Spitzer <russell.spitzer@GMAIL.COM>
Comment thread format/spec.md Outdated
Fokko and others added 2 commits February 19, 2025 22:04
* Add spec notes for snapshot id assignment

* Add launage for recommended generation of snapshot id

---------

Co-authored-by: Fokko Driesprong <fokko@apache.org>
Comment thread format/spec.md Outdated
Comment thread format/spec.md Outdated
Comment thread format/spec.md
Comment thread format/spec.md Outdated

The reference implementation uses a type 4 uuid and XORs the 4 most significant bytes with the 4 least significant bytes then ANDs with the maximum long value to arrive at a pseudo-random snapshot id with a low probability of collision.

The reference Java implementation writes `-1` for "no current snapshot" with V1 and V2 tables and considers this equivalent to omitted or `null`. This has never been formalized in the spec but for compatibility, other implementations can accept `-1` as `null`. Java will no longer write `-1` and will use `null` for "no current snapshot" for all tables with a version greater than or equal to V3.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Still have the confusion as #12334 (comment) , are these two sentence referring to the same implementations? To me, the language suggest they are separate ones ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The reference Java implementation writes `-1` for "no current snapshot" with V1 and V2 tables and considers this equivalent to omitted or `null`. This has never been formalized in the spec but for compatibility, other implementations can accept `-1` as `null`. Java will no longer write `-1` and will use `null` for "no current snapshot" for all tables with a version greater than or equal to V3.
Java writes `-1` for "no current snapshot" with V1 and V2 tables and considers this equivalent to omitted or `null`. This has never been formalized in the spec, but for compatibility, other implementations can accept `-1` as `null`. Java will no longer write `-1` and will use `null` for "no current snapshot" for all tables with a version greater than or equal to V3.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szehon-ho Good point; thanks for iterating on this. What do you think of this form? It doesn't read well to call out The reference Java implementation twice. WDYT?

@szehon-ho szehon-ho Mar 3, 2025

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just put 'The reference Java implementation' in the first pargraph then the first time we mention it?

The reference Java implementation uses a type 4 uuid and XORs the 4 most significant bytes with the 4 least significant bytes then ANDs with the maximum long value to arrive at a pseudo-random snapshot id with a low probability of collision.

And then can use Java threafter (including the second paragraph)? Then its more clear that both paragraph refer to the same one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread format/spec.md Outdated
Fokko and others added 2 commits March 3, 2025 22:31
Co-authored-by: Renjie Liu <liurenjie2008@gmail.com>
@Fokko Fokko merged commit 41ed450 into apache:main Mar 6, 2025
@Fokko

Fokko commented Mar 6, 2025

Copy link
Copy Markdown
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Specification Issues that may introduce spec changes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants