prometheus exporter should use same normalization code as otel collector #6704
@ywwg, I think this is a good idea. I see that […]. Moreover, we would ask for a stable release of […].
@pellared Could we cross that bridge when it actually becomes a problem? For cyclic dependencies to arise from otlptranslator depending on pdata, […]
@pellared Thinking about it some more, I don't think it makes sense to define OTel metric types in a Prometheus library (otlptranslator). The definitions are not Prometheus specific, and should therefore be in an open-telemetry owned library. If there are cyclic dependencies from otlptranslator depending on pdata, […]
The intention is that the collector-contrib prometheusexporter and remotewriteexporter will depend on this otlptranslator package in order to unify the translation logic. Would that qualify as a cyclic dependency? I would prefer not to extract yet another library without a concrete need to do so. Would it be a problem to wait until that actually happens, and then fix it? Or is there some ossification that could take place that would make it harder to fix later?

@aknuds1 While "otlptranslator" lives in the Prometheus organization, my understanding was that we wanted to consider it a joint venture between OTel and Prometheus, and that it is not A Prometheus Library or An OTel Library. We chose to put it in the prometheus org because most of the logic revolves around Prometheus' standards for normalization. But its dependence on OpenTelemetry means the code necessarily needs to be a collaboration between the two organizations, because it is the critical interface point between the two systems. So I would prefer not to make decisions based on "who owns" the code, but rather on the mutual needs of the two codebases.
That's what I'm arguing for. Let's not create a problem for ourselves unless it's actually necessary (i.e. we experience that otlptranslator's dependency on pdata causes a cyclical dependency when trying to build). I hope we're on the same page in this regard then?
@ywwg My argument here isn't fundamentally about ownership, but that I would like to avoid replicating OTel metric type definitions in otlptranslator. Let's say we were to add metric type definitions to otlptranslator; those definitions would still live on in […]. OTOH, if we wanted to put the canonical OTel metric type definitions in a single library that doesn't cause cyclic dependencies, otlptranslator doesn't make sense to me for that role, considering that Prometheus is only one OTel backend. Does that make sense?

So long as there's no real technical obstacle to keeping metric type definitions in […]
To clarify, we're not talking about a package-level cyclic dependency, but rather a product-level one. For example, imagine we need to introduce a new metric data type (this is the kind of change I am mostly concerned about). We wouldn't be able to add it to […]. Meanwhile, any Collector component that depends on […].

However, notice there is an even bigger problem that you already have. How would you handle a translation for a new metric data type in […]? For me the design is already broken and requires fixing 😉
Thanks very much for the clarification @pellared. I will try to digest it once I have some time. We have an off-site this week, so I'm fairly occupied because of that.
@ywwg, I totally agree with you. I guess we could also have it in the https://github.com/open-telemetry organization (it does not matter to me personally). The problem right now is that its design is tightly coupled to the Collector (as mentioned in #6704 (comment)), making it not really reusable. I guess it may be worth doing a spike/prototype to validate whether it is even worth having a reusable component, as introducing new types and mapping everything may be both less performant and harder to maintain than duplicating some of the translation code in two places. Or maybe […]
@pellared OK, I understand the nature of the problem now. Thanks for laying it out for those of us unfamiliar with the OTel ecosystem (I'm coming from the Prometheus side)!
I think that ideally, it would be best to have the OTel metric type definitions in one place, and it doesn't make sense for that one place to be an OTel/Prometheus interop library, at least if interop libraries for other OTel backends should want to use them too. Could you explain how it would be harder to maintain a free-standing (i.e. independent of the Collector) OTel library containing the metric type definitions, rather than duplicating them between libraries (couldn't there be more than two places, if backends other than Prometheus need to use them)? Also, what could make this solution less performant? I can't really see what would lead to the latter.

To make sure we're on the same page: by spike/prototype, do you mean a spike that uses prometheus/otlptranslator in open-telemetry/opentelemetry-collector-contrib and open-telemetry/opentelemetry-go? Or do you mean a spike of an OTel library containing the metric type definitions? I'm generally for the idea of doing a spike :)
Maintaining another repository, making releases, and making sure that the API does not have any breaking changes is not free.
Here I can be wrong. I just thought that maybe we would need to translate some types from our representation to the library's representation. I would need to see the code and, ideally, benchmarks 😉
There can be multiple spikes so that we can even compare different designs.
@pellared What do you think of a spike that uses the prometheus/otlptranslator branch containing the OTel metric type definitions, in open-telemetry/opentelemetry-collector-contrib and open-telemetry/opentelemetry-go? I think it would be valuable as a PoC of the otlptranslator API design.
Yes. See: prometheus/otlptranslator#29 (comment)
FYI @dashpole
Hey, I know this isn't directly related to the issue at hand (but somewhat related), but the current experience is a little odd with […]. If we do:

```go
// Assumed import paths for the aliases used below.
import (
	expprom "go.opentelemetry.io/otel/exporters/prometheus"
	otelmetricsdk "go.opentelemetry.io/otel/metric" // WithDescription/WithUnit come from the metric API package
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

exporter, _ := expprom.New()
meterProvider := sdkmetric.NewMeterProvider(
	sdkmetric.WithReader(exporter),
)
meter := meterProvider.Meter("example")
counter, _ := meter.Int64Counter(
	"example.counter",
	otelmetricsdk.WithDescription("Example Counter"),
	otelmetricsdk.WithUnit("unit"),
)
```
then the counter is exposed with the dotted name […], which is now valid by default since the default name validation scheme from […] is UTF-8. This feels like bad dev UX atm, especially since we want to expose metrics following the best practices: […].

For now you can fix this by setting the NameValidationScheme to the legacy one, which will eventually be removed. In the above meter constructor, manually changing […]. It feels like conflicting practices being used. Wondering if a temporary patch would be useful to change the way the naming scheme is handled in […].
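For concreteness, a minimal sketch of that stopgap (forcing the legacy validation scheme), assuming the deprecated global `model.NameValidationScheme` variable from `github.com/prometheus/common/model` is still available; this is not an endorsed long-term fix:

```go
package main

import (
	"github.com/prometheus/common/model"

	otelprom "go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	// Force the pre-UTF-8 behaviour so dotted OTel names get escaped again,
	// as described in the comment above. This global is deprecated and slated
	// for removal, so treat it strictly as a temporary workaround.
	model.NameValidationScheme = model.LegacyValidation

	exporter, err := otelprom.New()
	if err != nil {
		panic(err)
	}
	_ = sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter))
}
```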
Indirectly related to open-telemetry/opentelemetry-go#6704 Signed-off-by: Alexandre Lamarre <[email protected]>
Instead of changing the default "NameValidationScheme", we should change the default translation method. See: https://github.com/prometheus/docs/blob/main/content/docs/guides/opentelemetry.md#utf-8. Probably we want this to be UnderscoreEscapingWithSuffixes until OTel depends on the new shared otlptranslator repo.
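To illustrate the difference between the two strategies, here are the exposed names one would expect for an OTel counter named "example.counter" with unit "s"; the exact outputs are assumptions based on the linked Prometheus guide:

```go
// Illustrative only; not output produced by any of the code in this thread.
const (
	// UnderscoreEscapingWithSuffixes: dots are escaped to underscores and the
	// unit and _total suffixes are appended.
	underscoreEscapingWithSuffixes = "example_counter_seconds_total"

	// NoTranslation: the UTF-8 name is passed through unchanged, without suffixes.
	noTranslation = "example.counter"
)
```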
A Prometheus docs update went live and moved the link to: https://github.com/prometheus/docs/blob/main/docs/guides/opentelemetry.md#utf-8
@ywwg, at the Prometheus WG we had talked about keeping the existing options (negotiation for escaping, config for suffixes). Does your comment above mean you want to switch to using the same "strategies" as Prometheus in the configuration?
There are also inconsistencies between […]: one adds the plain-text suffix anyway if it is not in the UCUM map (see opentelemetry-go/exporters/prometheus/exporter.go, lines 512 to 514 at e578799), while the other does not add the suffix if it is not in the UCUM map. I see that https://github.com/prometheus/otlptranslator takes the former stance to handle units (probably intentionally). I'm not sure what the timeline is for stabilizing https://github.com/prometheus/otlptranslator, but in the interim can we consider changing the behaviour of […]?
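To make the two behaviours concrete, a hypothetical sketch; the map and function names are invented for illustration and are not the exporter's actual code:

```go
package unitsketch

// A tiny stand-in for the UCUM-to-name mapping used when building unit suffixes.
var ucumToName = map[string]string{
	"s":  "seconds",
	"By": "bytes",
	// ... remaining UCUM units elided
}

// Lenient: fall back to the raw unit string when it is not in the map, so a
// unit of "things" still produces a "_things" suffix.
func unitSuffixLenient(unit string) string {
	if name, ok := ucumToName[unit]; ok {
		return name
	}
	return unit
}

// Strict: only known UCUM units produce a suffix; unknown units yield "".
func unitSuffixStrict(unit string) string {
	return ucumToName[unit]
}
```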
aha, I see -- I meant that negotiation is for determining what Prometheus can accept and configuration is for what the client wants to send. And "NameValidationScheme" indicates what the code can support. This is a lot of moving parts but I think it all holds together.
So one way to migrate is: set up endpoints to always produce underscore+suffix names. Prometheus says it can Accept underscores, or it can Accept UTF-8 -- doesn't matter, the endpoint will always produce escaped names. Later, the operator can update the endpoint configuration to produce untranslated names.

David, you talked about a situation where a customer has many endpoints and wants to update them all at the same time. One way to do this would be: […]
The only hole I'm seeing in this more complex plan is the need for separate configuration on the endpoints for UTF-8 vs escaped name formation. I have some new prom docs that try to explain this: https://github.com/prometheus/docs/blob/main/docs/instrumenting/content_negotiation.md
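For illustration, a rough sketch of the negotiation side, assuming the "escaping" Accept-header parameter described in the linked content-negotiation doc; the parameter and value names here are assumptions and should be verified against that doc:

```go
package negotiationsketch

import (
	"mime"
	"net/http"
)

// wantsUTF8Names reports whether the scraper advertised that it accepts
// untranslated UTF-8 metric names. Simplification: assumes one media type per
// Accept header value.
func wantsUTF8Names(r *http.Request) bool {
	for _, accept := range r.Header.Values("Accept") {
		if _, params, err := mime.ParseMediaType(accept); err == nil {
			if params["escaping"] == "allow-utf-8" {
				return true
			}
		}
	}
	// Default to escaped (underscore) names when UTF-8 support is not advertised.
	return false
}
```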
Thanks, makes sense. I think having configuration is fine. Do you think this configuration should be on the OpenTelemetry exporter, or should it be part of the prometheus client? E.g. on something like HandlerOpts, rather than an option in go.opentelemetry.io/otel/exporters/prometheus? I would prefer that OTel exporters be "dumb" wrappers around prometheus clients as much as possible, so that defining a metric using OTel with a Prometheus exporter vs defining the metric using the prometheus client produces the same result.
…ntrib (#6839) Related to #6704 (comment) --------- Signed-off-by: Alexandre Lamarre <[email protected]> Co-authored-by: David Ashpole <[email protected]>
Edit: rereading what you said: I have been assuming this should be configuration on the Prometheus endpoint code, not on the OTel exporter; i.e., the config lives in the place where the conversion from OTel to Prometheus is being done, not in a pure-OTel part of the code.
I'm not sure what you mean by "prometheus client" here -- I am used to talking about the "OTel Exporter" to mean the thing that sends native OTel to an OTel endpoint (which could be Prometheus) and the "OTel Prometheus Exporter", which is the thing that exposes a /metrics endpoint on the collector.
I think I agree that the configuration should go in the Prometheus part of the code so that it applies everywhere.
I think in an ideal world, controlling the character set and the addition of suffixes would be a feature built into Prometheus clients (e.g. client_golang). OpenTelemetry would benefit from that as a user of the client, but it wouldn't be OpenTelemetry-specific. But that would also presumably mean preserving the existing behavior of the Prometheus client as the default, which I believe is equivalent to NoTranslation. I'm not sure if we are comfortable with that being the default for OTel SDKs today.
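As a sketch of the two placements being discussed: the options below on the OTel exporter do exist today, while the knob on client_golang's promhttp.HandlerOpts is purely hypothetical and does not exist:

```go
package placementsketch

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"

	otelprom "go.opentelemetry.io/otel/exporters/prometheus"
)

func sketch() error {
	// Today the knobs live on the OTel exporter side; these two options exist
	// in go.opentelemetry.io/otel/exporters/prometheus.
	_, err := otelprom.New(
		otelprom.WithoutUnits(),           // drop unit suffixes
		otelprom.WithoutCounterSuffixes(), // drop the _total suffix
	)
	if err != nil {
		return err
	}

	// The comment above argues for putting this on the Prometheus client
	// instead, e.g. as a hypothetical HandlerOpts field (no such field exists
	// in client_golang today).
	reg := prometheus.NewRegistry()
	_ = promhttp.HandlerFor(reg, promhttp.HandlerOpts{
		// NamingScheme: promhttp.NoTranslation, // hypothetical option
	})
	return nil
}
```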
Problem Statement
Currently the Prometheus exporter has hand-rolled code to convert metric names from the OTel standard to Prometheus conventions. This is out of sync with the exporters in otel-collector-contrib and the Prometheus OTel endpoint.
Proposed Solution
We should use the new https://github.com/prometheus/otlptranslator library to standardize the way metric and label names are translated, including the different possible configurations.
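A rough sketch of what calling the library could look like, assuming a MetricNamer/LabelNamer API along the lines of what the otlptranslator repo exposes; the field, method, and constant names here are assumptions to be checked against the actual package, not a confirmed API:

```go
package main

import (
	"fmt"

	"github.com/prometheus/otlptranslator" // assumed module path
)

func main() {
	// Assumed API shape; verify against the otlptranslator package before use.
	namer := otlptranslator.MetricNamer{
		WithMetricSuffixes: true,  // append unit and _total suffixes
		UTF8Allowed:        false, // escape to the classic [a-zA-Z0-9_:] charset
	}
	name := namer.Build(otlptranslator.Metric{
		Name: "example.counter",
		Unit: "s",
		Type: otlptranslator.MetricTypeMonotonicCounter,
	})
	fmt.Println(name) // expected: example_counter_seconds_total

	labelNamer := otlptranslator.LabelNamer{UTF8Allowed: false}
	fmt.Println(labelNamer.Build("service.name")) // expected: service_name
}
```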
Alternatives
We could clone the current otel-collector-contrib code, but that code will soon be updated to use the new library, and it seems unnecessary to take that extra step.
Additional Context
There is a related issue here: open-telemetry/opentelemetry-collector-contrib#35459