Prometheus receiver not handling third-party Prometheus exporters sending untyped metrics #1103

Closed
Nicolas-MacBeth opened this issue Jun 10, 2020 · 9 comments

@Nicolas-MacBeth
Contributor

Nicolas-MacBeth commented Jun 10, 2020

Describe the bug
I am trying to send metrics from databases to the Collector using Prometheus exporters for third-party services, and the Collector does not seem to be able to handle them. My research has led me to understand that the problem lies in the Prometheus receiver, more specifically its handling of Prometheus' untyped metric TYPE.

The biggest problem here is that, because of that one "faulty" metric, all the others (several hundred) are discarded and not exported. Both of the metrics exporters that I have tried (2 out of 2) send this type of metric, so I think it's safe to assume others will too. How would the community like to remedy this on the Collector side? At the very least, I think the behavior should be a warning plus the rest of the metrics being handled/exported. Otherwise, how can we handle these untyped metrics? Here's the Prometheus reference.

In my case specifically, that metric could be substituted with a gauge, which fixed everything, but untyped is used as a default according to the Prometheus reference. I'm unsure whether it would be wise to "try to convert untyped to a gauge type by default", since an exporter passing incorrect values with a gauge type could break other things, but it could also be a great solution, since it seems untyped is used when the actual value passed with the labels is unnecessary (for example, the "faulty" metric was the following:

# TYPE pg_static untyped
pg_static{server="localhost:5432",short_version="12.2.0",version="PostgreSQL 12.2 (Debian 12.2-1+build2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.2.1-16) 9.2.1 20191030, 64-bit"} 1

here, the value 1 is not needed). Also, even if something passed as untyped "breaks" the receiver, that behavior is no worse than what we have now (nothing being exported).
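
For illustration, the workaround I mentioned amounted to substituting the type declaration so that the exporter exposes the same series as a gauge:

# TYPE pg_static gauge
pg_static{server="localhost:5432",short_version="12.2.0",version="PostgreSQL 12.2 (Debian 12.2-1+build2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.2.1-16) 9.2.1 20191030, 64-bit"} 1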

I'd appreciate some input on how this should be fixed, and depending on the complexity/work involved, would love to try my hand at fixing it. Thanks!

Extra info:

I tried the MySQL exporter and the PostgreSQL exporter. They both produce this log in the Collector every 10 seconds (the scrape period):

2020-06-10T14:53:39.615Z        INFO    internal/metrics_adjuster.go:357        Adjust - skipping unexpected point      
{"component_kind": "receiver", "component_type": "prometheus", "component_name": "prometheus", "type": "UNSPECIFIED"}

The same metrics with the same exporters appear just fine in the actual Prometheus service, but are dropped in the Collector, and since they're dropped, they're not exported to my backend of choice. Once again, the problem seems to be that the metricType (untyped) is not recognized by the Collector, so this line of code runs:

ma.logger.Info("Adjust - skipping unexpected point", zap.String("type", metricType.String()))
(notice the comment saying // this shouldn't run right above it).
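
To make the failure mode concrete, here is a minimal, hypothetical sketch of that code path as I understand it (simplified, not the actual Collector source; keepPoint is a made-up name, but the OpenCensus descriptor types and the log line match what I observe):

package sketch

import (
	metricspb "github.com/census-instrumentation/opencensus-proto/gen-go/metrics/v1"
	"go.uber.org/zap"
)

// keepPoint reports whether the adjuster keeps points of the given
// descriptor type. A Prometheus "untyped" metric reaches it as
// UNSPECIFIED, so only the default branch matches and the point is
// logged and dropped.
func keepPoint(logger *zap.Logger, metricType metricspb.MetricDescriptor_Type) bool {
	switch metricType {
	case metricspb.MetricDescriptor_GAUGE_DOUBLE,
		metricspb.MetricDescriptor_GAUGE_DISTRIBUTION:
		return true // gauges pass through without start-time adjustment
	case metricspb.MetricDescriptor_CUMULATIVE_DOUBLE,
		metricspb.MetricDescriptor_CUMULATIVE_DISTRIBUTION,
		metricspb.MetricDescriptor_SUMMARY:
		return true // cumulative points have their start time adjusted
	default:
		// "Untyped" Prometheus metrics end up here as UNSPECIFIED.
		logger.Info("Adjust - skipping unexpected point", zap.String("type", metricType.String()))
		return false
	}
}

Under the current behavior, an untyped metric maps to UNSPECIFIED, so every one of its points takes the default branch on every scrape, which matches the log above.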

More info: there are other untyped metrics exported by those two exporters, but they all have NaN as their values, so the Collector treats them as dummy metrics (with 0 values, if I understand correctly) and drops them, which is why they don't cause the same problem. However, the moment they have a real value they would pose the same problem. Those metrics also appear properly when collected by the standalone Prometheus service, with the NaN values showing up in the dashboard; that's another instance of the OTel Prometheus receiver behaving differently from the standalone Prometheus service. It's less of a problem for now, since at least the rest of the metrics are handled/exported.

Steps to reproduce
Download and run either of the two aforementioned databases, then download and build its exporter and the Collector. Run all three at once.

What did you expect to see?
I expected to see my metrics exported and appear in my backend of choice (Stackdriver).

What did you see instead?
I saw this warning message logged by the Collector built from the latest commit (9af3381):

2020-06-10T14:53:39.615Z        INFO    internal/metrics_adjuster.go:357        Adjust - skipping unexpected point      
{"component_kind": "receiver", "component_type": "prometheus", "component_name": "prometheus", "type": "UNSPECIFIED"}

Log when running the latest release binary (0.3.0):

{"level":"info","ts":1591373779.4030387,"caller":"internal/metrics_adjuster.go:357",
"msg":"Adjust - skipping unexpected point","type":"UNSPECIFIED"}

Also, those metrics were not exported, only dropped.

What version did you use?
OpenTelemetry Contrib:

  • Binary built from the latest commit: 9af3381
  • Version 0.3.0 release binary
    Neither works; the behavior is the same in both.

What config did you use?

extensions:
  health_check:
  zpages:
    endpoint: 0.0.0.0:55679
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'postgresql_exporter_collector'
          scrape_interval: 10s
          static_configs:
            - targets: ['localhost:9187']
processors:
  batch:
  queued_retry:
exporters:
  stackdriver:
    project: "placeholder"
    skip_create_metric_descriptor: true
  file:
    path: ./metric.json
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [stackdriver]
  extensions: [health_check, zpages]

Environment
OS: Debian 9
Go version: 1.14.3 linux/amd64

@Nicolas-MacBeth added the bug (Something isn't working) label Jun 10, 2020
@Nicolas-MacBeth
Contributor Author

Joshua MacDonald on Gitter seems to support defaulting to gauge when there is ambiguity; see https://gitter.im/open-telemetry/community.

I believe the conversion is done in this function: convToOCAMetricType.
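
If that's the right place, here's a rough sketch of what the change might look like, assuming the function is a switch over textparse.MetricType that currently falls through to UNSPECIFIED for unknown types (an illustration of the idea, not the exact existing code):

package sketch

import (
	metricspb "github.com/census-instrumentation/opencensus-proto/gen-go/metrics/v1"
	"github.com/prometheus/prometheus/pkg/textparse"
)

// convToOCAMetricType maps a Prometheus metric type to an OpenCensus
// descriptor type. The proposed change is the MetricTypeUnknown case:
// treat "untyped" metrics as gauges instead of returning UNSPECIFIED,
// so their points are no longer skipped by the metrics adjuster.
func convToOCAMetricType(metricType textparse.MetricType) metricspb.MetricDescriptor_Type {
	switch metricType {
	case textparse.MetricTypeCounter:
		return metricspb.MetricDescriptor_CUMULATIVE_DOUBLE
	case textparse.MetricTypeGauge:
		return metricspb.MetricDescriptor_GAUGE_DOUBLE
	case textparse.MetricTypeHistogram:
		return metricspb.MetricDescriptor_CUMULATIVE_DISTRIBUTION
	case textparse.MetricTypeSummary:
		return metricspb.MetricDescriptor_SUMMARY
	case textparse.MetricTypeUnknown:
		// Proposed: default "untyped" to a gauge.
		return metricspb.MetricDescriptor_GAUGE_DOUBLE
	default:
		return metricspb.MetricDescriptor_UNSPECIFIED
	}
}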

@draffensperger
Contributor

@Nicolas-MacBeth do you think this would be a good first issue for you to tackle, to do that conversion? @tigrannajaryan would you be open to that (I saw you assigned it to @asuresh4)?

@asuresh4
Member

@Nicolas-MacBeth - I'd be happy to help with the review if you would like to work on this.

@Nicolas-MacBeth
Contributor Author

Nicolas-MacBeth commented Jun 15, 2020

@asuresh4 I appreciate that. So, to confirm: should I take the approach where all cases of ambiguity are handled as gauges (i.e. a default for all metrics whose type is unknown/not supported), or only the untyped ones?

@asuresh4
Member

Looking at the commit that introduced the new metric types, it seems to me that metrics of untyped type map to the unknown type. If that's indeed true, I believe we could convert metrics with unknown type to gauge. However, I'm not sure whether the same holds for the other types introduced by that commit, especially the info and stateset types.

@bogdandrutu @tigrannajaryan - what do you guys think?

@tigrannajaryan self-assigned this Jun 17, 2020
@tigrannajaryan
Member

This is outside of my area of expertise.
@bogdandrutu perhaps you can help.

@tigrannajaryan removed their assignment Jun 17, 2020
@flands added this to the GA 1.0 milestone Jun 18, 2020
@asuresh4
Member

@Nicolas-MacBeth - I synced with @bogdandrutu on this offline and he's in agreement that we can convert metrics of type textparse.MetricTypeUnknown to gauge and get to the other unsupported types at a later point. Let me know if you would still like to work on the issue, or I can pick it up.

@Nicolas-MacBeth
Contributor Author

Okay, perfect! I'll make a PR soon; unless I'm missing anything it should be a simple fix. Thanks!

@Nicolas-MacBeth
Contributor Author

Hi @asuresh4, I made a PR if you want to check it out: #1194
