Core: introduce shared authentication refresh executor#12563
Conversation
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
69ab7f4 to
9587a2e
Compare
|
@nastra @danielcweeks can we consider this for 1.10 please? |
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
Pinging @nastra and @danielcweeks again: do you think we can consider this one for 1.10? |
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
@nastra @danielcweeks : ping for review, please. |
| * A shared {@link ScheduledExecutorService} that REST catalogs can use for refreshing their | ||
| * authentication data. | ||
| */ | ||
| public static ScheduledExecutorService getAuthRefreshPool() { |
There was a problem hiding this comment.
| public static ScheduledExecutorService getAuthRefreshPool() { | |
| public static ScheduledExecutorService authRefreshPool() { |
I'm aware that we already have some methods in this class with a get prefix, but those methods have been added at the very beginning and haven't been renamed
| /** | ||
| * An {@link AuthManager} that provides machinery for refreshing authentication data asynchronously, | ||
| * using a background thread pool. | ||
| * @deprecated since 1.9.0, will be removed in 1.10.0; use {@link ThreadPools#getAuthRefreshPool()}. |
There was a problem hiding this comment.
version info needs to be updated here, since the deprecation applies with 1.10.0 and we can only remove with 1.11.0
| * using a background thread pool. | ||
| * @deprecated since 1.9.0, will be removed in 1.10.0; use {@link ThreadPools#getAuthRefreshPool()}. | ||
| */ | ||
| @Deprecated |
There was a problem hiding this comment.
if we're deprecating this class, then I don't think we should be removing functionality here?
There was a problem hiding this comment.
in fact I believe we should only be deprecating this here without updating the functionality (as that would be a breaking change). Once RefreshingAuthManager is removed, we can switch the OAuth2Manager to use the new auth refresh pool, but I don't think we can deprecate + switch within the same release
There was a problem hiding this comment.
@nastra I don't mind doing the change in 2 steps, but in which case do you think this would be a breaking change? The API is not affected (and revapi didn't complain).
There was a problem hiding this comment.
what if somebody else from the community is extending RefreshingAuthManager and relying on its functionality? We're effectively removing the "refresh" part, which is a breaking change in behavior, even though RevAPI wasn't complaining
There was a problem hiding this comment.
What we can do though, is revert the changes to RefreshingAuthManager just in case someone is extending that class.
There was a problem hiding this comment.
Sorry I commented before I saw your comment 😅 – I agree that's an issue.
There was a problem hiding this comment.
What we can do though, is revert the changes to
RefreshingAuthManagerjust in case someone is extending that class.
Yes that's what I was proposing in my earlier comment
Currently, each `AuthManager` instance manages its own refresh pool. During the lifecycle of the catalog session, a few `AuthManager` instances may be created: one for the catalog itself, and a few more in components like the S3 signer client, and credential refreshers. Each one of them is currently creating their own pool, which is a waste of resources. This PR introduces a shared refresh pool for use by `AuthManager`s, thus allowing to minimize the amount of threads allocated for auth session refreshes. The shared pool is closed by a shutdown hook.
9587a2e to
6f256de
Compare
|
FYI I rebased the PR since it's been a while. |
| private long startTimeMillis; | ||
| private OAuthTokenResponse authResponse; | ||
| private AuthSessionCache sessionCache; | ||
| private ScheduledExecutorService refreshExecutor; |
There was a problem hiding this comment.
I don't think we need any changes to this class. We would update it once RefreshingAuthManager has been removed right?
There was a problem hiding this comment.
Well, for me there wasn't any risk of breaking change here – but OK. Let me change this.
| new ConfigEntry<>( | ||
| "iceberg.rest.auth.refresh.num-threads", | ||
| "ICEBERG_AUTH_REFRESH_NUM_THREADS", | ||
| 1, |
There was a problem hiding this comment.
given that we eventually want to share this thread pool, I wonder whether the default should be higher
There was a problem hiding this comment.
I think 1 is fine. Most of the time, we'll have only 1 auth manager per JVM, and sometimes 2 (e.g. request signing + credential refreshing) – but realistically, I think we won't see more than 2 per JVM. Given that the typical usage of this executor is to schedule token refreshes every hour or so, 1 is probably fine (also: this is the core pool size, not the maximum pool size). Wdyt?
There was a problem hiding this comment.
ah right, yeah ok I think 1 should be fine then
| assertThat(validatingSigner.icebergSigner) | ||
| .extracting("authManager") | ||
| .extracting("refreshExecutor") | ||
| assertThat(ThreadPools.authRefreshPool()) |
There was a problem hiding this comment.
I think we should roll back these test changes here and in the other test class, since nothing really changed in terms of the implementation
| try (OAuth2Manager manager = new OAuth2Manager("test"); | ||
| OAuth2Util.AuthSession session = manager.initSession(client, properties)) { | ||
| assertThat(session.headers()).isEmpty(); | ||
| assertThat(manager) |
There was a problem hiding this comment.
see my other comment (#12563 (comment)), so we should be rolling back the changes here
nastra
left a comment
There was a problem hiding this comment.
LGTM, I'll also wait a bit in case @danielcweeks wants to review this
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
|
@nastra @danielcweeks can we try to get this in please? It was already approved. |
Currently, each
AuthManagerinstance manages its own refresh pool. During the lifecycle of the catalog session, a fewAuthManagerinstances may be created: one for the catalog itself, and a few more in components like the S3 signer client, and credential refreshers. Each one of them is currently creating their own pool, which is a waste of resources.This PR introduces a shared refresh pool for use by
AuthManagers, thus allowing to minimize the amount of threads allocated for auth session refreshes. The shared pool is closed by a shutdown hook.