AWS: Add scheduled refresh for the S3FileIO held storage credentials #15678

Merged: danielcweeks merged 5 commits into apache:main from danielcweeks:fileio-credential-refresh-v2 on Mar 19, 2026

@danielcweeks

Contributor

The S3FileIO implementation never refreshes the credentials it holds directly from the table load. The S3 clients use a VendedCredentialProvider that refreshes credentials internally for the AWS client, but those updates are not reflected back to the FileIO.

If an S3FileIO instance is serialized to remote workers, the credentials may already be expired, triggering a thundering herd of immediate refresh requests.

This change addresses this problem by proactively updating the credentials in the FileIO so that only valid credentials are propagated to remote clients.
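
The idea of a proactive, scheduled refresh can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `RefreshingCredentialHolder`, `Credential`, and the `fetcher` supplier are hypothetical names, and the five-minute refresh window and one-second retry floor are assumptions (the window is borrowed from the test name `credentialRefreshWithinFiveMinuteWindow` in this thread).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: holds a credential and replaces it shortly before it expires.
final class RefreshingCredentialHolder implements AutoCloseable {

  // Hypothetical credential value: an opaque token plus its expiry time.
  static final class Credential {
    final String token;
    final Instant expiresAt;

    Credential(String token, Instant expiresAt) {
      this.token = token;
      this.expiresAt = expiresAt;
    }
  }

  // Assumed values; the real implementation may use different ones.
  private static final Duration REFRESH_WINDOW = Duration.ofMinutes(5);
  private static final Duration RETRY_FLOOR = Duration.ofSeconds(1);

  private final Supplier<Credential> fetcher;
  private final ScheduledExecutorService scheduler;
  private volatile Credential current;

  RefreshingCredentialHolder(Credential initial, Supplier<Credential> fetcher) {
    this.current = initial;
    this.fetcher = fetcher;
    this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "credential-refresh");
      t.setDaemon(true); // do not keep the JVM alive just for refreshes
      return t;
    });
    schedule(delayUntilRefresh(Instant.now(), initial.expiresAt));
  }

  // Time to wait so the refresh runs REFRESH_WINDOW before expiry, never negative.
  static Duration delayUntilRefresh(Instant now, Instant expiresAt) {
    Duration delay = Duration.between(now, expiresAt).minus(REFRESH_WINDOW);
    return delay.isNegative() ? Duration.ZERO : delay;
  }

  // The value that would be serialized to remote workers.
  Credential current() {
    return current;
  }

  private void schedule(Duration delay) {
    try {
      scheduler.schedule(this::refresh, delay.toMillis(), TimeUnit.MILLISECONDS);
    } catch (RejectedExecutionException e) {
      // The holder was closed concurrently; stop refreshing.
    }
  }

  private void refresh() {
    try {
      current = fetcher.get();
    } catch (RuntimeException e) {
      // Keep the previous credential and retry on the next scheduled run.
    }
    // Apply a floor so a failed fetch or a short-lived credential cannot cause a tight loop.
    Duration next = delayUntilRefresh(Instant.now(), current.expiresAt);
    schedule(next.compareTo(RETRY_FLOOR) < 0 ? RETRY_FLOOR : next);
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Under these assumptions, a credential that expires in 60 minutes is refreshed after 55 minutes, and one already inside the window is refreshed immediately, so a holder serialized at any point carries a credential with at least a few minutes of validity left.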

Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java Outdated
Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java
Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java Outdated
@nastra

nastra commented Mar 19, 2026

Contributor

I believe we have the exact same issue with GCSFileIO

@nastra nastra added this to the Iceberg 1.11.0 milestone Mar 19, 2026
Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java
}

@Test
public void credentialRefreshWithinFiveMinuteWindow() {

Contributor

Can we also add a test for multiple prefixes being sent? Especially for plumbing it through PrefixedS3Client.

Contributor Author

We're now delegating to the VendedCredentialProvider which already has tests for that case, so I don't think it's necessary to duplicate here.

Comment thread aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java Outdated
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
@danielcweeks danielcweeks merged commit 1009ef4 into apache:main Mar 19, 2026
34 checks passed
manuzhang pushed a commit to manuzhang/iceberg that referenced this pull request Mar 30, 2026
…pache#15678)

* AWS: Add scheduled refresh for the S3FileIO held storage credentials


Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>