-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Permanently enable 'telemetry.newPipelineTelemetry' feature gate #12856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Permanently enable 'telemetry.newPipelineTelemetry' feature gate #12856
Conversation
5f9ec2d
to
c9cbf6a
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #12856 +/- ##
==========================================
+ Coverage 91.47% 91.52% +0.04%
==========================================
Files 494 494
Lines 26993 26982 -11
==========================================
+ Hits 24691 24694 +3
+ Misses 1818 1809 -9
+ Partials 484 479 -5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
c9cbf6a
to
474815c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just two suggestions
ec8eaf9
to
d8ac16f
Compare
It's been pointed out elsewhere that since the feature gate being enabled is a workaround for the problems it introduced, it's likely that users will already be using the gate. Therefore, removing it outright would break users. Our featuregate docs explicitly call out that We can also just move the gate along our normal path to |
c332465
to
3e6aeda
Compare
3e6aeda
to
66328e7
Compare
The spell check failure appears to be bogus and unrelated to this PR. |
Co-authored-by: Jade Guiton <[email protected]>
…ate (open-telemetry#12856)" This reverts commit dc8e2dd.
…behind feature gate (#12933) #### Context PR #12617 introduced logic to inject new instrumentation scope attributes in all internal telemetry to identify which Collector component it came from. These attributes had already been added to internal logs as regular log attributes, and this PR switched them to scope attributes for consistency. The new logic was placed behind an Alpha stage feature gate, `telemetry.newPipelineTelemetry`. Unfortunately, the default "off" state of the feature gate disabled the injection of component-identifying attributes entirely, which was a regression since they had been present in internal logs in previous releases. See issue #12870 for an in-depth discussion of this issue. To correct this, PR #12856 was filed, which stabilized the feature gate, making it on by default, with no way to disable it, and removed the logic that the feature gate used to toggle. This was thought to be the simplest way to mitigate the regression in the "off" state, since we planned to stabilize the feature eventually anyways. Unfortunately, it was found that the "on" state of the feature gate causes a different issue: [the Prometheus exporter](https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus) is the default way of exporting the Collector's internal metrics, accessible at `collector:8888/metrics`. This exporter does not currently have any support for instrumentation scope attributes, meaning that metric streams differentiated by said attributes but not by any other identifying property will appear as aliases to Prometheus, which causes an error. This completely breaks the export of Collector metrics through Prometheus under some simple configurations, which is a release blocker. #### Description To fix this issue, this PR sets the `telemetry.newPipelineTelemetry` feature gate back to "Alpha" (off by default), and reintroduces logic to disable the injection of the new instrumentation scope attributes when the gate is off, but only in internal metrics. Note that the new logic is still used unconditionally for logs and traces, to avoid reintroducing the logs issue (#12870). This should avoid breaking the Collector in its default configuration while we try to get a fix in the Prometheus exporter. #### Link to tracking issue No tracking issue currently, will probably file one later. #### Testing I performed some simple manual testing with a config file like the following: ```yaml receivers: otlp: [...] processors: batch: exporters: debug: [...] service: pipelines: logs: receivers: [otlp] processors: [batch] exporters: [debug] traces: receivers: [otlp] processors: [batch] exporters: [debug] telemetry: metrics: level: detailed traces: [...] logs: [...] ``` The two batch processors create aliased metric streams, which are only differentiated by the new component attributes. I checked that: 1. this config causes an error in the Prometheus exporter on main; 2. the error is resolved by default after applying this PR; 3. the error reappears when enabling the feature gate (this is expected) 4. scope attributes are added on the traces and logs no matter the state of the gate.
…peline components (#12812) Depends on #12856 Resolves #12676 This is a reboot of #11311, incorporating metrics defined in the [component telemetry RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md) and attributes added in #12617. The basic pattern is: - When building any pipeline component which produces data, wrap the "next consumer" with instrumentation to measure the number of items being passed. This wrapped consumer is then passed into the constructor of the component. - When building any pipeline component which consumes data, wrap the component itself. This wrapped consumer is saved onto the graph node so that it can be retrieved during graph assembly. --------- Co-authored-by: Pablo Baeyens <[email protected]>
Discussed offline in relation to #12812
Introduction of this gate had some unintended side effects, such as removing attributes from loggers.