Failed to scrape Prometheus endpoint #23889
…a label for metrics received. (#24538)

**Description:** As a user, I must be able to correctly correlate metrics across arrays; to do so, the metrics provided must have context as to which array they came from. The Pure Storage Observability Practice decided it is best for the user to define a string that acts as a pretty name via the label `fa_array_name`, or `host` if the array does not provide a `host` label (i.e. if `host` is not provided as a label, the array is the host).

Bugs fixed: the prometheus receiver that is used as a dependency does not support `.` in label names in our implementation of adding labels; this may have an effect on future planning for OTel semantic conventions (e.g. `host.name`). This was causing the Prometheus scraper to fail after metrics were collected in `receiver.go`.

**Link to tracking Issue:** #23889 #21248 #22027

**Testing:**
- Tested with a live Pure Storage FlashArray and verified that labels were being added when exported via `logging`.
- There is error validation for `fa_array_name == ""`; however, there are open test items in #23271 being resolved that should include this new pretty-naming scheme.

**Documentation:** READMEs have been updated to be more verbose about how to scrape multiple arrays with multiple receiver instances.

Co-authored-by: Juraci Paixão Kröhling <[email protected]>
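For context, a minimal sketch of how the per-array pretty name described above might be wired up. The field names (`fa_array_name`, `endpoint`, the per-scope `array`/`hosts` lists, the `bearertokenauth` extension) are taken from the PR description and the purefa receiver README of that era, so treat this as illustrative rather than a definitive config:

```yaml
receivers:
  # One receiver instance per array; scraping multiple arrays means
  # declaring purefa/array01, purefa/array02, and so on.
  purefa/array01:
    # User-defined pretty name attached as the fa_array_name label
    # to metrics scraped from this array.
    fa_array_name: array01
    endpoint: http://pure-fa-exporter:9490/metrics
    array:
      - address: array01
        auth:
          authenticator: bearertokenauth/array01
    hosts:
      - address: array01
        auth:
          authenticator: bearertokenauth/array01
```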
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments; if you are unsure which component this issue relates to, please ping the code owners.
This should be resolved with #24538. Please confirm, @james-laing.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments; if you are unsure which component this issue relates to, please ping the code owners.
Tested with 0.83.0 and the latest 0.88.0; working as expected.
Hey guys, I don't quite understand whether the issue still persists or whether the Helm chart contains an outdated version. Here is the log output:
{"level":"info","ts":1700040909.5763159,"caller":"[email protected]/telemetry.go:84","msg":"Setting up own telemetry..."}
{"level":"info","ts":1700040909.5768528,"caller":"[email protected]/telemetry.go:201","msg":"Serving Prometheus metrics","address":"hidden:8888","level":"Basic"}
{"level":"info","ts":1700040909.5921855,"caller":"[email protected]/metrics.go:89","msg":"Metric filter configured","kind":"processor","name":"filter/dropping","pipeline":"metrics","include match_type":"","include expressions":[],"include metric names":[],"include metrics with resource attributes":null,"exclude match_type":"","exclude expressions":[],"exclude metric names":[],"exclude metrics with resource attributes":null}
{"level":"info","ts":1700040909.5987601,"caller":"[email protected]/memorylimiter.go:138","msg":"Using percentage memory limiter","kind":"processor","name":"memory_limiter","pipeline":"traces","total_memory_mib":16006,"limit_percentage":75,"spike_limit_percentage":15}
{"level":"info","ts":1700040909.5991478,"caller":"[email protected]/memorylimiter.go:102","msg":"Memory limiter configured","kind":"processor","name":"memory_limiter","pipeline":"traces","limit_mib":12004,"spike_limit_mib":2400,"check_interval":1}
{"level":"warn","ts":1700040909.6063027,"caller":"[email protected]/factory.go:49","msg":"jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18485 for more details.","kind":"receiver","name":"jaeger","data_type":"traces"}
{"level":"info","ts":1700040909.6087239,"caller":"[email protected]/service.go:143","msg":"Starting otelcol-contrib...","Version":"0.88.0","NumCPU":4}
{"level":"info","ts":1700040909.608817,"caller":"extensions/extensions.go:33","msg":"Starting extensions..."}
{"level":"info","ts":1700040909.6089096,"caller":"extensions/extensions.go:36","msg":"Extension is starting...","kind":"extension","name":"health_check"}
{"level":"info","ts":1700040909.6090906,"caller":"[email protected]/healthcheckextension.go:35","msg":"Starting health_check extension","kind":"extension","name":"health_check","config":{"Endpoint":"hidden:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
{"level":"info","ts":1700040909.6100233,"caller":"extensions/extensions.go:43","msg":"Extension started.","kind":"extension","name":"health_check"}
{"level":"info","ts":1700040909.6101332,"caller":"extensions/extensions.go:36","msg":"Extension is starting...","kind":"extension","name":"memory_ballast"}
{"level":"info","ts":1700040911.1884217,"caller":"[email protected]/memory_ballast.go:41","msg":"Setting memory ballast","kind":"extension","name":"memory_ballast","MiBs":6402}
{"level":"info","ts":1700040911.1895204,"caller":"extensions/extensions.go:43","msg":"Extension started.","kind":"extension","name":"memory_ballast"}
{"level":"info","ts":1700040911.2139206,"caller":"internal/resourcedetection.go:125","msg":"began detecting resource information","kind":"processor","name":"resourcedetection","pipeline":"traces"}
{"level":"info","ts":1700040911.2668836,"caller":"internal/resourcedetection.go:139","msg":"detected resource information","kind":"processor","name":"resourcedetection","pipeline":"traces","resource":{"cloud.account.id":"hidden","cloud.availability_zone":"hidden","cloud.platform":"gcp_kubernetes_engine","cloud.provider":"gcp","host.id":"224325797131441806","host.name":"hidden","k8s.cluster.name":"hidden"}}
{"level":"info","ts":1700040911.3124878,"caller":"[email protected]/otlp.go:83","msg":"Starting GRPC server","kind":"receiver","name":"otlp","data_type":"logs","endpoint":"hidden:4317"}
{"level":"info","ts":1700040911.3126836,"caller":"[email protected]/otlp.go:101","msg":"Starting HTTP server","kind":"receiver","name":"otlp","data_type":"logs","endpoint":"hidden:4318"}
{"level":"info","ts":1700040911.3155255,"caller":"[email protected]/metrics_receiver.go:230","msg":"Scrape job added","kind":"receiver","name":"prometheus","data_type":"metrics","jobName":"opentelemetry-collector"}
{"level":"info","ts":1700040911.3161445,"caller":"healthcheck/handler.go:132","msg":"Health Check state change","kind":"extension","name":"health_check","status":"ready"}
{"level":"info","ts":1700040911.316259,"caller":"[email protected]/service.go:169","msg":"Everything is ready. Begin running and processing data."}
{"level":"info","ts":1700040911.3351939,"caller":"[email protected]/metrics_receiver.go:239","msg":"Starting discovery manager","kind":"receiver","name":"prometheus","data_type":"metrics"}
{"level":"info","ts":1700040911.3395038,"caller":"[email protected]/metrics_receiver.go:281","msg":"Starting scrape manager","kind":"receiver","name":"prometheus","data_type":"metrics"}
{"level":"warn","ts":1700040924.346994,"caller":"internal/transaction.go:123","msg":"Failed to scrape Prometheus endpoint","kind":"receiver","name":"prometheus","data_type":"metrics","scrape_timestamp":1700040924343,"target_labels":"{__name__=\"up\", instance=\"hidden:9127\", job=\"opentelemetry-collector\"}"}
{"level":"warn","ts":1700040934.353167,"caller":"internal/transaction.go:123","msg":"Failed to scrape Prometheus endpoint","kind":"receiver","name":"prometheus","data_type":"metrics","scrape_timestamp":1700040934343,"target_labels":"{__name__=\"up\", instance=\"hidden:9127\", job=\"opentelemetry-collector\"}"} That's my helm chart
And as far as I understand helm chart version 0.73.1 uses collector version 0.88.0, at least that's what is written in opentelemetry-collector chart:
Here's my opentelemetry-collector config:
config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 10s
            static_configs:
              - targets:
                  - ${env:MY_POD_IP}:9127

Am I doing something wrong, or does the Helm chart use a different version?
@dmitrii-sisutech although the error message is the same, your issue is not the same. The issue documented here was specific to the development of the purefa receiver (receiver/purefa), which you are not using in your example; you are using receiver/prometheus. Check that the receiver is configured correctly: take a look at your config, ensure it is correct and compatible, and then post in the related component you believe has the issue.
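One concrete thing worth ruling out in a self-scrape setup like the one above (not necessarily the root cause here): the log shows the collector serving its own Prometheus metrics on port 8888, while the scrape job targets port 9127. A hedged sketch of keeping the two aligned, assuming the collector's own telemetry endpoint is the intended target:

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            # The target must match the address something is actually
            # serving Prometheus metrics on.
            - targets:
                - ${env:MY_POD_IP}:8888

service:
  telemetry:
    metrics:
      # Collector self-telemetry endpoint (defaults to 0.0.0.0:8888).
      address: ${env:MY_POD_IP}:8888
```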
For the record, in case anybody is interested: the problem was in
In my case the issue was RBAC-related. It would be good if the code gave more detailed information than simply 'failed'.
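For readers who land here with the RBAC variant of this error when scraping via Kubernetes service discovery, a minimal sketch of the permissions a collector's service account typically needs. The resource list mirrors what Prometheus itself is usually granted; the names here are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-scrape
rules:
  # Discover and scrape workloads exposed through the Kubernetes API.
  - apiGroups: [""]
    resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  # Allow scraping /metrics endpoints served by the API server and kubelets.
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector-scrape
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector-scrape
subjects:
  - kind: ServiceAccount
    name: otel-collector        # hypothetical service account name
    namespace: observability    # hypothetical namespace
```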
Component(s)
receiver/purefa
What happened?
Description
The receiver has been failing to scrape the Prometheus endpoint since PR #21100 was committed.
The last working version was otelcol-contrib_v0.76.3-PR21100; since the code was merged, the receiver has not successfully scraped any metrics.
Tested using otelcol-contrib_0.78.0 and otelcol-contrib_0.80.0; both produce the same error.
The exact same config file works fine with otelcol-contrib_v0.76.3-PR21100.
Steps to Reproduce
Test with otelcol-contrib_0.78.0 upwards
Expected Result
Successful scrape
Actual Result
`Failed to scrape Prometheus endpoint` warnings (see log output below).
Collector version
v0.80.0, v0.78.0, v0.76.3-PR21100
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
Log output
2023-07-03T06:58:10.866-0400 info [email protected]/metrics_receiver.go:242 Starting discovery manager {"kind": "receiver", "name": "prometheus/internal", "data_type": "metrics"}
2023-07-03T06:58:10.867-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "prometheus/internal", "data_type": "metrics", "jobName": "otel-collector"}
2023-07-03T06:58:10.868-0400 info [email protected]/metrics_receiver.go:281 Starting scrape manager {"kind": "receiver", "name": "prometheus/internal", "data_type": "metrics"}
2023-07-03T06:58:10.868-0400 info [email protected]/metrics_receiver.go:242 Starting discovery manager {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics"}
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "jobName": "purefa/array/array01"}
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "jobName": "purefa/hosts/array01"}
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "jobName": "purefa/directories/array01"}
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "jobName": "purefa/pods/array01"}
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "jobName": "purefa/volumes/array01"}
2023-07-03T06:58:10.869-0400 info healthcheck/handler.go:129 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2023-07-03T06:58:10.869-0400 info service/service.go:148 Everything is ready. Begin running and processing data.
2023-07-03T06:58:10.869-0400 info [email protected]/metrics_receiver.go:281 Starting scrape manager {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics"}
2023-07-03T06:58:10.995-0400 info clientutil/api.go:44 API key validation successful. {"kind": "exporter", "data_type": "metrics", "name": "datadog"}
2023-07-03T06:58:34.815-0400 warn internal/transaction.go:114 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "scrape_timestamp": 1688381914622, "target_labels": "{__name__=\"up\", deployment.environment=\"array01\", host.name=\"http://pure-ome:9490/metrics\", instance=\"pure-ome:9490\", job=\"purefa/array/array01\"}"}
2023-07-03T06:58:40.860-0400 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 2, "metrics": 46, "data points": 98}
2023-07-03T06:58:40.864-0400 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 4, "metrics": 11, "data points": 100}
2023-07-03T06:58:53.601-0400 info hostmetadata/metadata.go:194 Sent host metadata {"kind": "exporter", "data_type": "metrics", "name": "datadog"}
2023-07-03T06:58:55.879-0400 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 2, "metrics": 6, "data points": 6}
2023-07-03T06:59:05.793-0400 warn internal/transaction.go:114 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "scrape_timestamp": 1688381945635, "target_labels": "{__name__=\"up\", deployment.environment=\"array01\", instance=\"pure-ome:9490\", job=\"purefa/hosts/array01\"}"}
2023-07-03T06:59:07.262-0400 warn internal/transaction.go:114 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "scrape_timestamp": 1688381947095, "target_labels": "{__name__=\"up\", deployment.environment=\"array01\", host.name=\"http://pure-ome:9490/metrics\", instance=\"pure-ome:9490\", job=\"purefa/pods/array01\"}"}
2023-07-03T06:59:08.640-0400 warn internal/transaction.go:114 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "purefa/array01", "data_type": "metrics", "scrape_timestamp": 1688381948404, "target_labels": "{__name__=\"up\", deployment.environment=\"array01\", host.name=\"http://pure-ome:9490/metrics\", instance=\"pure-ome:9490\", job=\"purefa/directories/array01\"}"}
2023-07-03T06:59:10.867-0400 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 4, "metrics": 11, "data points": 100}
2023-07-03T06:59:10.874-0400 info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 8, "metrics": 19, "data points": 100}