Same target from two different jobs missing after targetallocator upgrade 0.121.0+ #4044
Comments
Thank you for the report and detailed reproduction! Are you sure this also happens if the metric paths are different? The logic behind #3832 was that all the data that used to go into the hash calculation was present in the labels anyway. And that is definitely true for …
Thanks @swiatekm for the quick reply. The above debugger screenshot is for the config:
I was surprised not to see that there. I looked into it more now just to double-check, and the labels in the targetgroup and target returned by the discovery manager are the bare minimum: the address plus some static labels. The scrape manager only later takes these and combines them with more labels from the scrape config for `job`, `metrics_path`, etc.: https://github.com/prometheus/prometheus/blob/main/scrape/target.go#L445

Discovery manager target groups: https://github.com/prometheus/prometheus/blob/main/discovery/targetgroup/targetgroup.go

I am happy to help with making a PR with a fix, but I'm not sure what strategy we want to take for it.
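To make the label flow concrete, here is a minimal sketch, assuming the Prometheus `labels` and `common/model` packages; it is not code from the allocator or Prometheus itself, and the metrics path value is a placeholder. It contrasts the bare discovery-time label set with the richer set the scrape manager builds after merging in the scrape config:

```go
package main

import (
	"fmt"

	"github.com/prometheus/common/model"
	"github.com/prometheus/prometheus/model/labels"
)

func main() {
	// What the discovery manager hands over: essentially just the address.
	discovered := labels.FromMap(map[string]string{
		model.AddressLabel: "prom.domain:9001",
	})

	// What the scrape manager builds later from the scrape config;
	// job and metrics path are only merged in at that point.
	scraped := labels.FromMap(map[string]string{
		model.AddressLabel:     "prom.domain:9001",
		model.JobLabel:         "prometheus",
		model.MetricsPathLabel: "/metrics", // placeholder path
	})

	// The target allocator hashes the discovery-time set, which is the
	// same for every job that points at this address.
	fmt.Println(discovered.Hash(), scraped.Hash())
}
```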
That's a bit annoying. I don't want to revert the entirety of #3832, because it's conceptually sound and the performance improvement is significant. Unfortunately, Prometheus's …
Ok, sounds good. Thanks, that makes sense to me.
Thanks @swiatekm, I made an initial PR for this, let me know what you think.
Component(s)
target allocator
What happened?
Description
In practice, this happens when two different jobs point to the same target but each job has a different `metrics_path`. After the upgrade, the target is only scraped for one of the jobs, whereas before it was scraped for both.
I was able to change one of the unit tests to reproduce this and attach a debugger. I have narrowed it down to this PR: #3832, which changes the hash for a target from

`t.JobName + t.TargetURL + strconv.FormatUint(t.Labels.Hash(), 10)`

to

`t.Labels.Hash()`

The `targetgroup` map coming from the Prometheus discovery manager does not contain the job as a label when the targets are being processed by the Target Allocator and stored in the target list: the job is the key to a `targetgroup` object in the map, which does have the address as a label per target. The job is also not added as a label when processing the targets right before hashing them: https://github.com/swiatekm/opentelemetry-operator/blob/main/cmd/otel-allocator/internal/target/discovery.go#L195. The metrics path is also not a label at that time.
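The following is a short illustrative sketch of why the new hash collides where the old composite key did not. It assumes the Prometheus `labels` package and reuses the job names and address from this issue; it is not the allocator's actual code:

```go
package main

import (
	"fmt"
	"strconv"

	"github.com/prometheus/prometheus/model/labels"
)

func main() {
	// Discovery-time labels for the same address seen from two different jobs:
	// the sets are identical because job and __metrics_path__ are not labels yet.
	fromJob1 := labels.FromMap(map[string]string{"__address__": "prom.domain:9001"})
	fromJob2 := labels.FromMap(map[string]string{"__address__": "prom.domain:9001"})

	// Pre-#3832 key: job name + target URL + label hash, distinct per job.
	oldKey := func(job string, ls labels.Labels) string {
		return job + "prom.domain:9001" + strconv.FormatUint(ls.Hash(), 10)
	}
	fmt.Println(oldKey("prometheus", fromJob1) != oldKey("prometheus2", fromJob2)) // true: two targets kept

	// Post-#3832 key: just the label hash, identical for both jobs,
	// so one entry overwrites the other in the allocator's target map.
	fmt.Println(fromJob1.Hash() == fromJob2.Hash()) // true: only one target kept
}
```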
Steps to Reproduce

Use a scrape config pointing to the same target from two different jobs, with Target Allocator version 0.121.0 and up, for example:
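A minimal sketch of such a config, using the job names and target from the results below; the `metrics_path` values are illustrative placeholders, not taken from the original report:

```yaml
scrape_configs:
  - job_name: prometheus
    metrics_path: /metrics        # placeholder path
    static_configs:
      - targets: ["prom.domain:9001"]
  - job_name: prometheus2
    metrics_path: /metrics2       # placeholder path
    static_configs:
      - targets: ["prom.domain:9001"]
```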
Expected Result
Target `prom.domain:9001` will be allocated to a collector for both jobs, `prometheus` and `prometheus2`.
Target `prom.domain:9001` will be allocated to a collector for only one of the jobs, `prometheus` or `prometheus2`, but not both.

Kubernetes Version
1.31.8
Operator version
v0.121.0+
Collector version
v0.121.0+
Environment information
Environment
OS: "Ubuntu 20.04"
Log output
Additional context
No response