[resourcedetection] EKS detector fails identifying EKS cluster when the configmap aws_auth is missing (ex: EKS Auto Mode) #39479

Closed
dloucasfx opened this issue Apr 17, 2025 · 2 comments · Fixed by #39486
Labels
bug Something isn't working processor/resourcedetection Resource detection processor

Comments

@dloucasfx
Contributor

dloucasfx commented Apr 17, 2025

Component(s)

processor/resourcedetection

What happened?

Description

The EKS detector relies on the presence of the aws-auth ConfigMap to identify a cluster as EKS.
This check does not cover all use cases. With EKS Auto Mode, for example, the aws-auth ConfigMap is not created.

According to the AWS docs: https://docs.aws.amazon.com/eks/latest/userguide/auth-configmap.html#aws-auth-configmap

The aws-auth ConfigMap is automatically created and applied to your cluster when you create a managed node group or when you create a node group using eksctl.

When you create an EKS cluster with Auto Mode enabled, aws-auth is neither created nor needed.
https://docs.aws.amazon.com/eks/latest/userguide/auth-configmap.html#aws-auth-configmap

Amazon EKS Auto Mode represents a significant evolution in Kubernetes infrastructure management, combining secure and scalable cluster infrastructure with integrated Kubernetes capabilities managed by AWS. The service provides fully-managed worker node operations, eliminating the need for customers to set up Managed Node Groups or AutoScaling groups.

Steps to Reproduce

Create an EKS cluster from the UI using the default settings (which enable EKS Auto Mode).

Expected Result

Error when EKS detector runs isEKS()

2025/04/14 18:19:48 settings.go:478: Set config to [/conf/relay.yaml]
2025/04/14 18:19:48 settings.go:539: Set memory limit to 450 MiB
2025/04/14 18:19:48 settings.go:524: Set soft memory limit set to 450 MiB
2025/04/14 18:19:48 settings.go:373: Set garbage collection target percentage (GOGC) to 400
2025/04/14 18:19:48 settings.go:414: set "SPLUNK_LISTEN_INTERFACE" to "0.0.0.0"
2025-04-14T18:19:48.440Z	info	[email protected]/service.go:193	Setting up own telemetry...
2025-04-14T18:19:48.441Z	info	[email protected]/memorylimiter.go:74	Memory limiter configured	{"otelcol.component.kind": "Processor", "limit_mib": 450, "spike_limit_mib": 90, "check_interval": 2}
2025-04-14T18:19:48.443Z	info	[email protected]/service.go:260	Starting otelcol...	{"Version": "v0.122.0", "NumCPU": 2}
2025-04-14T18:19:48.443Z	info	extensions/extensions.go:40	Starting extensions...
2025-04-14T18:19:48.443Z	info	extensions/extensions.go:44	Extension is starting...	{"otelcol.component.id": "health_check", "otelcol.component.kind": "Extension"}
2025-04-14T18:19:48.443Z	info	[email protected]/healthcheckextension.go:32	Starting health_check extension	{"otelcol.component.id": "health_check", "otelcol.component.kind": "Extension", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"CompressionAlgorithms":null,"ReadTimeout":0,"ReadHeaderTimeout":0,"WriteTimeout":0,"IdleTimeout":0,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2025-04-14T18:19:48.443Z	info	extensions/extensions.go:61	Extension started.	{"otelcol.component.id": "health_check", "otelcol.component.kind": "Extension"}
2025-04-14T18:19:48.443Z	info	internal/resourcedetection.go:137	began detecting resource information	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics"}
2025-04-14T18:19:48.455Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
Retrying fetching data...
2025-04-14T18:19:49.492Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
Retrying fetching data...
2025-04-14T18:19:54.173Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
Retrying fetching data...
2025-04-14T18:19:55.312Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
2025-04-14T18:20:03.446Z	warn	internal/resourcedetection.go:166	Context was cancelled: %w	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "error": "context deadline exceeded"}
2025-04-14T18:20:03.446Z	info	internal/resourcedetection.go:188	detected resource information	{"otelcol.component.id": "resourcedetection/k8s_cluster_name", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics", "otelcol.signal": "metrics", "resource": {}}
2025-04-14T18:20:03.446Z	info	internal/resourcedetection.go:137	began detecting resource information	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics/collector", "otelcol.signal": "metrics"}
2025-04-14T18:20:03.451Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics/collector", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
Retrying fetching data...
Retrying fetching data...
2025-04-14T18:20:04.872Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics/collector", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
2025-04-14T18:20:04.877Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics/collector", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}
Retrying fetching data...
2025-04-14T18:20:13.085Z	warn	internal/resourcedetection.go:159	failed to detect resource	{"otelcol.component.id": "resourcedetection", "otelcol.component.kind": "Processor", "otelcol.pipeline.id": "metrics/collector", "otelcol.signal": "metrics", "error": "isEks() error retrieving auth configmap: failed to retrieve ConfigMap kube-system/aws-auth: configmaps \"aws-auth\" not found"}

Actual Result

No error

OpenTelemetry Collector configuration

      resourcedetection:
        detectors:
        - env
        - eks
        - ec2
        - system
        override: true
        timeout: 15s

Collector version

main

@dloucasfx dloucasfx added bug Something isn't working needs triage New item requiring triage labels Apr 17, 2025
@github-actions github-actions bot added the processor/resourcedetection Resource detection processor label Apr 17, 2025
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Member

A maintainer has approved the PR that resolves this issue. Removing needs triage.

@crobert-1 crobert-1 removed the needs triage New item requiring triage label Apr 23, 2025