Skip to content

AWS: Prevent excessive creation of auth sessions in S3V4RestSignerClient#13215

Merged
danielcweeks merged 6 commits into
apache:mainfrom
adutra:signer-auth-manager-static
Jul 9, 2025
Merged

AWS: Prevent excessive creation of auth sessions in S3V4RestSignerClient#13215
danielcweeks merged 6 commits into
apache:mainfrom
adutra:signer-auth-manager-static

Conversation

@adutra

@adutra adutra commented Jun 2, 2025

Copy link
Copy Markdown
Contributor

This PR fixes an oversight in S3V4RestSignerClient introduced by the switch to the AuthManager API: the session cache was previously a static field, but became an instance field. As a result, too many token fetches are being observed when using request signing.

@github-actions github-actions Bot added the AWS label Jun 2, 2025
@adutra

adutra commented Jun 2, 2025

Copy link
Copy Markdown
Contributor Author

@nastra @danielcweeks @c-thiel FYI

@adutra adutra force-pushed the signer-auth-manager-static branch 2 times, most recently from d6e7493 to f72358f Compare June 2, 2025 10:42
return authSession;
}

static {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this further to the top?

@adutra

adutra commented Jun 2, 2025

Copy link
Copy Markdown
Contributor Author

Heads up: I need more changes to make this work properly.

@adutra adutra force-pushed the signer-auth-manager-static branch from f72358f to 58342c0 Compare June 2, 2025 12:16
@github-actions github-actions Bot added the core label Jun 2, 2025
@adutra

adutra commented Jun 2, 2025

Copy link
Copy Markdown
Contributor Author

Heads up: I need more changes to make this work properly.

Done, this is now ready for review.

FYI we needed to implement session caching for sessions obtained via the AuthManager#tableSession(RESTClient, Map<String,String>). This method was introduced precisely for request signers, but the default impl does not cache returned sessions.

sessionCache = newSessionCache(name, properties);
}

if (config.token() != null) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically the same logic as in catalogSession(), but with caching enabled.

@stevenzwu stevenzwu added this to the Iceberg 1.10.0 milestone Jun 2, 2025
@VisibleForTesting
AuthSession authSession() {
if (null == authSession) {
private AuthManager authManager() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please move the authManager() method just below authSession() as it's causing a diff that adds visual overhead during review

adutra added a commit to adutra/iceberg-auth-manager that referenced this pull request Jun 3, 2025
See apache/iceberg#13215 for context.

In Iceberg 1.9.0, "standalone" sessions created by the S3 request signer client weren't cached, and an AuthManager was created for each signer instance. This is a performance degradation compared to the situation pre-AuthManager, where sessions were cached by the signer itself, and the cache was a static field.

In Iceberg 1.10.0, this should be fixed, but also requires this AuthManager to implement proper caching of sessions created by the signer client.
adutra added a commit to adutra/iceberg-auth-manager that referenced this pull request Jun 3, 2025
See apache/iceberg#13215 for context.

In Iceberg 1.9.0, "standalone" sessions created by the S3 request signer client weren't cached, and an AuthManager was created for each signer instance. This is a performance degradation compared to the situation pre-AuthManager, where sessions were cached by the signer itself, and the cache was a static field.

In Iceberg 1.10.0, this should be fixed, but also requires this AuthManager to implement proper caching of sessions created by the signer client.
adutra added a commit to dremio/iceberg-auth-manager that referenced this pull request Jun 3, 2025
See apache/iceberg#13215 for context.

In Iceberg 1.9.0, "standalone" sessions created by the S3 request signer client weren't cached, and an AuthManager was created for each signer instance. This is a performance degradation compared to the situation pre-AuthManager, where sessions were cached by the signer itself, and the cache was a static field.

In Iceberg 1.10.0, this should be fixed, but also requires this AuthManager to implement proper caching of sessions created by the signer client.
Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java Outdated

@Override
public AuthSession tableSession(RESTClient sharedClient, Map<String, String> properties) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment thread core/src/main/java/org/apache/iceberg/rest/auth/OAuth2Manager.java

@nastra nastra left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few minor comments that would be good to address but overall the approach LGTM. @danielcweeks could you also take a look please?

@nastra nastra requested a review from danielcweeks July 3, 2025 13:36
@SuppressWarnings("immutables:incompat")
private volatile AuthSession authSession;
@SuppressWarnings("ShutdownHook")
private static void installShutdownHook() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is safe? It feels like we could be trading one problem for another because if an engine loads tasks in separate classloaders, we might have an issue with shutdown hooks piling up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I added the hook, it wasn't there before. The HTTP client for example wasn't being closed before.

I guess it's OK to remove the hook. (I can also look into adding phantom references here, but I'd prefer to do that in a follow-up task, since this PR is kind of urgent.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're making the situation here any worse and I find shutdown hooks not a great solution because we're adding complexity and still not shutting them down based on when they're no longer necessary, but rather when you're already exiting the jvm.

@danielcweeks danielcweeks merged commit f60591f into apache:main Jul 9, 2025
43 checks passed
snazy added a commit to projectnessie/nessie that referenced this pull request Sep 16, 2025
Since Iceberg 1.10 it is no longer possible to use different auth configurations with S3 remote signing. This is caused by apache/iceberg#13215 which keeps auth related configuration in static fields in `o.a.i.aws.s3.signer.S3V4RestSignerClient`.
snazy added a commit to snazy/nessie that referenced this pull request Sep 16, 2025
Since Iceberg 1.10 it is no longer possible to use different auth configurations with S3 remote signing. This is caused by apache/iceberg#13215 which keeps auth related configuration in static fields in `o.a.i.aws.s3.signer.S3V4RestSignerClient`.
snazy added a commit to projectnessie/nessie that referenced this pull request Sep 16, 2025
Since Iceberg 1.10 it is no longer possible to use different auth configurations with S3 remote signing. This is caused by apache/iceberg#13215 which keeps auth related configuration in static fields in `o.a.i.aws.s3.signer.S3V4RestSignerClient`.
adutra pushed a commit to dremio/iceberg-auth-manager that referenced this pull request Jun 8, 2026
…sionCache (#242)

The double-checked locking pattern in getOrCreateSessionCache() had a bug:
  the local variable 'cache' was not updated after another thread won the
  initialization race inside the synchronized block. The losing thread would
  return null, causing a NullPointerException at cache.cachedSession() in
  tableSession() (line 94).

  This surfaces when multiple threads call tableSession() simultaneously on
  the first Iceberg commit — for example, when an Iceberg Kafka Connect sink
  flushes several Parquet files in parallel. It became observable in
  iceberg-kafka-connect 1.11.0 after apache/iceberg#13215 changed
  S3V4RestSignerClient to call authManager().tableSession() on every S3
  request instead of caching the session per instance.

  Fix: re-read sessionCache from the volatile field inside the synchronized
  block so the losing thread picks up the value set by the winner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants