End to end trace duration metric #37597
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
/label connector/signaltometricsconnector
To clarify, in your example trace, is there an asynchronous operation? That is, the root span A does not encompass the whole transaction, and span C ends after span A does. Otherwise you could just use the duration of span A as the trace duration.

The problem you are describing is non-trivial because it requires the same machine to have access to all the spans of a trace at the same time. It makes the system stateful, and if you are running more than one collector it would require using the

There is a processor that can calculate trace duration, the tail sampling processor, but it does so for the purposes of sampling, not for spanmetrics. I don't work on or own the span metrics connector though, so I'll wait for codeowner input. Perhaps there is some prior discussion on this, but I couldn't find any.
Hey @jamesmoessis, thanks for the response. Yeah, unfortunately the spans aren't synchronous: service A operates on the request, then closes its span, and then there may be up to a few seconds or so before service B begins work (same for C). So summing the durations of the sub-spans gives a different number than the time from the start span opening to the end span closing. Understood on your point about statefulness. I was also looking at the groupbytrace processor (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/groupbytraceprocessor/README.md), which seems like it could work with the span metrics connector? Something like: traces come in -> groupbytraceprocessor with 60s hold -> spanmetrics connector -> transform processor to store the service A span's start timestamp in cache, and then overwrite the duration metrics for service B and service C?
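To make the "hold" idea in the proposed pipeline concrete, here is a minimal Python sketch of buffering spans per trace ID and releasing a trace only after a wait window has elapsed. It is an illustration of the concept, not the groupbytrace processor's implementation; the `TraceBuffer` class, its method names, and the explicit `now` argument are all hypothetical and exist only for this example.

```python
from collections import defaultdict


class TraceBuffer:
    """Hypothetical buffer: holds spans per trace ID for a fixed wait window."""

    def __init__(self, wait_seconds: float):
        self.wait_seconds = wait_seconds
        self._spans = defaultdict(list)  # trace_id -> buffered spans
        self._first_seen = {}            # trace_id -> arrival time of first span

    def add(self, trace_id, span, now):
        # The hold window starts when the first span of a trace arrives.
        self._first_seen.setdefault(trace_id, now)
        self._spans[trace_id].append(span)

    def release_expired(self, now):
        """Return {trace_id: spans} for traces whose hold window has elapsed."""
        ready = [
            trace_id
            for trace_id, seen in self._first_seen.items()
            if now - seen >= self.wait_seconds
        ]
        released = {}
        for trace_id in ready:
            released[trace_id] = self._spans.pop(trace_id)
            del self._first_seen[trace_id]
        return released


buf = TraceBuffer(wait_seconds=60)
buf.add("t1", "span-A", now=0)
buf.add("t1", "span-B", now=3)
buf.add("t1", "span-C", now=6)

early = buf.release_expired(now=30)  # window not yet elapsed
late = buf.release_expired(now=60)   # window elapsed, whole trace released
```

The buffer is exactly the state that makes this approach stateful: every span of a trace has to reach the same process before the window closes. In this sketch, a span that arrives after its trace was released would start a new buffer entry rather than join the released group.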
@decimalst that would work if you only have one collector, or you somehow ensure that all spans from the same trace end up in the same collector. If you are running multiple collectors then you need two layers of collectors, with the load balancing exporter.
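The reason a first layer keyed on trace ID helps can be sketched in a few lines of Python: if the routing decision is a pure function of the trace ID, every span of a trace lands on the same second-layer collector. This is a simplified stand-in that uses a hash and a modulo; it is not the load balancing exporter's actual algorithm, and the `pick_collector` function and the collector addresses are hypothetical.

```python
import hashlib


def pick_collector(trace_id: str, collectors: list[str]) -> str:
    """Map a trace ID to one collector deterministically (illustrative only)."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(collectors)
    return collectors[index]


# Hypothetical second-layer collector addresses.
second_layer = ["collector-0:4317", "collector-1:4317", "collector-2:4317"]

# Three spans of the same trace, possibly received by different
# first-layer collectors, all resolve to the same target.
trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
targets = {pick_collector(trace_id, second_layer) for _ in range(3)}
```

A plain modulo reshuffles most traces whenever the list of collectors changes, which is one reason a real deployment would rely on the exporter rather than on a scheme like this.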
Thanks @jamesmoessis for the clear explanation. I also believe that this case is more complex than a single spanmetrics connector can handle.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Component(s)
connector/spanmetrics
Is your feature request related to a problem? Please describe.
Hi, I have a scenario where I have a trace, with three sub spans, from three different services. In terms of flow, service A will always produce the root span for the trace, then service B does some work on the request, and service C finalizes the request and publishes it.
I want to calculate two metrics - the end-to-end duration of the trace, defined as the time from when span 1 from service A starts to when the span from service C closes.
Describe the solution you'd like
Ideally, the ability to iterate through spans grouped by the same trace ID, and then output the end-to-end duration of the trace, defined as the time from the start of the root span from service A to the end of the span from service C.
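As a rough illustration of the requested calculation, the sketch below groups spans by trace ID and takes the latest end time minus the earliest start time. It assumes the flow described above, where service A's root span starts first and service C's span ends last, so earliest start and latest end coincide with those two spans. The `Span` record and the `end_to_end_durations` function are hypothetical and much simpler than real OTLP spans.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical minimal span record; real spans carry many more fields.
    trace_id: str
    service: str
    start_ns: int  # start time in nanoseconds
    end_ns: int    # end time in nanoseconds


def end_to_end_durations(spans):
    """Return {trace_id: duration_ns} from earliest span start to latest span end."""
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span.trace_id].append(span)
    return {
        trace_id: max(s.end_ns for s in group) - min(s.start_ns for s in group)
        for trace_id, group in by_trace.items()
    }


SECOND = 1_000_000_000
spans = [
    Span("t1", "A", 0 * SECOND, 1 * SECOND),  # root span closes early
    Span("t1", "B", 3 * SECOND, 4 * SECOND),  # starts after a gap
    Span("t1", "C", 6 * SECOND, 7 * SECOND),  # finalizes the request
]

durations = end_to_end_durations(spans)
summed = sum(s.end_ns - s.start_ns for s in spans)
```

With these made-up timestamps the end-to-end duration is 7 seconds, while the sum of the individual span durations is only 3 seconds, which is the gap between the two numbers that the asynchronous hand-offs introduce.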
Describe alternatives you've considered
Maybe the signaltometrics connector instead https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/signaltometricsconnector ?
We could also maybe achieve this with a transform processor based on the spanmetrics durations.
Additional context
No response