DefaultCredentialsProvider should fail fast when Web Identity Token is configured but STS dependency is missing #6638

@wbingli

Describe the bug

When AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN environment variables are set (indicating explicit intent to use Web Identity Token authentication), but the software.amazon.awssdk:sts dependency is missing, the DefaultCredentialsProvider silently falls back to the next provider in the chain (typically EC2 instance metadata) instead of failing with a clear error.

This was previously discussed in #1915, which was closed with a warning-log-only solution. I'd like to reopen discussion because the current behavior remains problematic.

Expected behavior

When the SDK detects:

  1. Web Identity Token environment variables are set (AWS_WEB_IDENTITY_TOKEN_FILE, AWS_ROLE_ARN)
  2. AND a ClassNotFoundException occurs when initializing WebIdentityTokenFileCredentialsProvider because the STS module is not on the classpath

It should throw an exception rather than log a warning and continue. The presence of these environment variables indicates explicit user intent that cannot be satisfied due to a missing dependency.

Current behavior

  1. SDK detects Web Identity Token env vars
  2. Attempts to create WebIdentityTokenFileCredentialsProvider
  3. Gets ClassNotFoundException (missing STS)
  4. Logs a WARNING (easily missed in production logs)
  5. Silently falls back to EC2 IMDS
  6. Application runs with wrong IAM role (node role instead of pod role)
  7. Developer sees cryptic 403 errors on specific operations
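
To make the sequence above concrete, here is a toy model of the fallback. It is not the SDK's code: `Provider`, `resolve`, `podChainWithoutSts`, and the role names are hypothetical stand-ins, and the missing STS module is simulated by a provider that throws. It only shows how "log a warning and try the next provider" ends with the node role being used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class ChainFallbackSketch {

    /** Hypothetical stand-in for a credentials provider: a name plus a resolver. */
    static final class Provider {
        final String name;
        final Supplier<String> resolver;

        Provider(String name, Supplier<String> resolver) {
            this.name = name;
            this.resolver = resolver;
        }
    }

    /**
     * Tries each provider in order. A failing provider is recorded as a
     * warning in the supplied log and skipped, mirroring the behavior
     * described in this issue.
     */
    static String resolve(List<Provider> chain, List<String> log) {
        for (Provider p : chain) {
            try {
                return p.resolver.get();
            } catch (RuntimeException e) {
                log.add("WARN " + p.name + ": " + e.getMessage());
            }
        }
        throw new IllegalStateException("No provider in the chain could supply credentials");
    }

    /** Models a pod with Web Identity env vars set but no STS module on the classpath. */
    static List<Provider> podChainWithoutSts() {
        Provider webIdentity = new Provider("WebIdentityTokenFile", () -> {
            // Simulated failure; in the real SDK this originates from a ClassNotFoundException.
            throw new IllegalStateException("STS module not found on the classpath");
        });
        // Simulated IMDS lookup that succeeds and returns the node's role.
        Provider instanceMetadata = new Provider("InstanceProfile", () -> "node-role");
        return Arrays.asList(webIdentity, instanceMetadata);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        String role = resolve(podChainWithoutSts(), log);
        System.out.println("Resolved role: " + role);
        log.forEach(System.out::println);
    }
}
```

In this model the only trace of the misconfiguration is a single entry in `log`; the caller receives valid credentials for the wrong role and has no reason to suspect anything until a request is denied.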

Why the warning-only approach is insufficient

  1. Warnings are easily missed - Production logs contain thousands of lines. A single warning during startup is virtually invisible until someone spends hours debugging 403 errors.

  2. The environment variables express explicit intent - When these env vars are present, the user has clearly configured their environment for Web Identity authentication. A ClassNotFoundException here is a broken configuration, not a "try next option" situation.

  3. Silent fallback creates security risks - The application runs with the EC2 node's IAM role instead of the pod's intended role, potentially with broader permissions than intended.

  4. Real-world impact - Developers spend hours or days debugging this issue. The symptom (403 Forbidden) gives no hint that the root cause is a missing dependency.

Why "just use explicit provider" is not the answer

The suggestion to explicitly set WebIdentityTokenFileCredentialsProvider defeats the purpose of the default credentials chain. The chain's value is that the same application code works across environments:

  • Local development: uses ~/.aws/credentials or environment variables
  • EKS/Kubernetes: uses Web Identity Token (IRSA)
  • EC2: uses instance metadata

Requiring developers to hardcode a specific provider means writing environment-specific code or adding configuration complexity that the default chain was designed to eliminate.

Proposed solution

Modify DefaultCredentialsProvider to throw an exception (not just log a warning) when:

  • Web Identity Token environment variables are explicitly set
  • AND the STS dependency is missing (ClassNotFoundException)

This preserves the chain's flexibility for all other scenarios while failing fast when explicit configuration cannot be satisfied.
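
A minimal sketch of the check I have in mind, written against the JDK only so it can also serve as an application-level startup guard in the meantime. The class and method names (`WebIdentityFailFastSketch`, `check`, `classPresent`) are hypothetical, and using `StsClient` as the probe class is my assumption; the SDK may look up a different class internally.

```java
import java.util.Map;

final class WebIdentityFailFastSketch {

    // Assumption: any class shipped in software.amazon.awssdk:sts works as a presence probe.
    static final String STS_PROBE_CLASS = "software.amazon.awssdk.services.sts.StsClient";

    /** True when both Web Identity environment variables are set to non-empty values. */
    static boolean webIdentityConfigured(Map<String, String> env) {
        return isSet(env.get("AWS_WEB_IDENTITY_TOKEN_FILE")) && isSet(env.get("AWS_ROLE_ARN"));
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Checks whether a class can be located, without running its static initializers. */
    static boolean classPresent(String className) {
        try {
            Class.forName(className, false, WebIdentityFailFastSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Throws when Web Identity is configured but the probe class cannot be found. */
    static void check(Map<String, String> env, String probeClass) {
        if (webIdentityConfigured(env) && !classPresent(probeClass)) {
            throw new IllegalStateException(
                "AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN are set, but " + probeClass
                    + " is not on the classpath. Add the software.amazon.awssdk:sts dependency.");
        }
    }

    public static void main(String[] args) {
        // Fails at startup instead of falling back to another provider later.
        check(System.getenv(), STS_PROBE_CLASS);
        System.out.println("Web Identity configuration check passed");
    }
}
```

The check is a no-op unless both environment variables are set, so local development and plain EC2 deployments are unaffected; only the "configured but unsatisfiable" case turns into a hard failure.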

Environment

  • SDK version: (affects all versions with this behavior)
  • JDK version: N/A
  • OS: N/A (occurs in EKS/Kubernetes environments)

Related


Labels: feature-request, needs-triage
