Skip to content

Permanently enable 'telemetry.newPipelineTelemetry' feature gate #12856

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 17, 2025

Conversation

djaglowski
Copy link
Member

Discussed offline in relation to #12812

Introduction of this gate had some unintended side effects, such as removing attributes from loggers.

@github-actions github-actions bot requested review from dmathieu and mx-psi April 15, 2025 14:03
Copy link

codecov bot commented Apr 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.52%. Comparing base (78663f9) to head (208b49b).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12856      +/-   ##
==========================================
+ Coverage   91.47%   91.52%   +0.04%     
==========================================
  Files         494      494              
  Lines       26993    26982      -11     
==========================================
+ Hits        24691    24694       +3     
+ Misses       1818     1809       -9     
+ Partials      484      479       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@jade-guiton-dd jade-guiton-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just two suggestions

@djaglowski
Copy link
Member Author

It's been pointed out elsewhere that since the feature gate being enabled is a workaround for the problems it introduced, it's likely that users will already be using the gate. Therefore, removing it outright would break users.

Our featuregate docs explicitly call out that alpha gates may be immediately deprecated, but also that this is equivalent to permanently disabling the gate. Therefore, from a functionality perspective, stable is the most direct way to lock the gate into a permanently enabled state.

We can also just move the gate along our normal path to beta, but it's unclear that anyone would benefit from disabling the gate, and also IMO sends the wrong message that the gate is progressing as normal.

@djaglowski djaglowski closed this Apr 16, 2025
@djaglowski djaglowski reopened this Apr 16, 2025
@djaglowski djaglowski changed the title Remove 'telemetry.newPipelineTelemetry' feature gate Permanently enable 'telemetry.newPipelineTelemetry' feature gate Apr 16, 2025
@djaglowski
Copy link
Member Author

The spell check failure appears to be bogus and unrelated to this PR.

@bogdandrutu bogdandrutu added this pull request to the merge queue Apr 17, 2025
Merged via the queue into open-telemetry:main with commit dc8e2dd Apr 17, 2025
56 checks passed
@djaglowski djaglowski deleted the rm-attributes-gate branch April 17, 2025 17:36
songy23 added a commit to songy23/opentelemetry-service that referenced this pull request Apr 24, 2025
@songy23 songy23 mentioned this pull request Apr 24, 2025
github-merge-queue bot pushed a commit that referenced this pull request Apr 28, 2025
…behind feature gate (#12933)

#### Context

PR #12617 introduced logic to inject new instrumentation scope
attributes in all internal telemetry to identify which Collector
component it came from. These attributes had already been added to
internal logs as regular log attributes, and this PR switched them to
scope attributes for consistency. The new logic was placed behind an
Alpha stage feature gate, `telemetry.newPipelineTelemetry`.

Unfortunately, the default "off" state of the feature gate disabled the
injection of component-identifying attributes entirely, which was a
regression since they had been present in internal logs in previous
releases. See issue #12870 for an in-depth discussion of this issue.

To correct this, PR #12856 was filed, which stabilized the feature gate,
making it on by default, with no way to disable it, and removed the
logic that the feature gate used to toggle. This was thought to be the
simplest way to mitigate the regression in the "off" state, since we
planned to stabilize the feature eventually anyways.

Unfortunately, it was found that the "on" state of the feature gate
causes a different issue: [the Prometheus
exporter](https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters/prometheus)
is the default way of exporting the Collector's internal metrics,
accessible at `collector:8888/metrics`. This exporter does not currently
have any support for instrumentation scope attributes, meaning that
metric streams differentiated by said attributes but not by any other
identifying property will appear as aliases to Prometheus, which causes
an error. This completely breaks the export of Collector metrics through
Prometheus under some simple configurations, which is a release blocker.

#### Description

To fix this issue, this PR sets the `telemetry.newPipelineTelemetry`
feature gate back to "Alpha" (off by default), and reintroduces logic to
disable the injection of the new instrumentation scope attributes when
the gate is off, but only in internal metrics. Note that the new logic
is still used unconditionally for logs and traces, to avoid
reintroducing the logs issue (#12870).

This should avoid breaking the Collector in its default configuration
while we try to get a fix in the Prometheus exporter.

#### Link to tracking issue
No tracking issue currently, will probably file one later.

#### Testing

I performed some simple manual testing with a config file like the
following:

```yaml
receivers:
  otlp: [...]
processors:
  batch:
exporters:
  debug: [...]
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
  telemetry:
    metrics:
      level: detailed    
    traces: [...]
    logs: [...]
```

The two batch processors create aliased metric streams, which are only
differentiated by the new component attributes. I checked that:
1. this config causes an error in the Prometheus exporter on main;
2. the error is resolved by default after applying this PR;
3. the error reappears when enabling the feature gate (this is expected)
4. scope attributes are added on the traces and logs no matter the state
of the gate.
github-merge-queue bot pushed a commit that referenced this pull request May 12, 2025
…peline components (#12812)

Depends on
#12856

Resolves #12676

This is a reboot of #11311, incorporating metrics defined in the
[component telemetry
RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md)
and attributes added in #12617.

The basic pattern is:
- When building any pipeline component which produces data, wrap the
"next consumer" with instrumentation to measure the number of items
being passed. This wrapped consumer is then passed into the constructor
of the component.
- When building any pipeline component which consumes data, wrap the
component itself. This wrapped consumer is saved onto the graph node so
that it can be retrieved during graph assembly.

---------

Co-authored-by: Pablo Baeyens <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants