[receiver/prometheusremotewritereceiver] Handle multiple timeseries with same metric name+type+unit #38453

perebaj · 2025-03-07T13:00:22Z

Description

This PR is part of the Linux Foundation mentee program. It handles data point slices that must be added to the same metric when incoming time series share identifying attributes: the same resource, metricName, scopeName, scopeVersion, unitRef, and timeseries type. The reference for this behavior is the OpenTelemetry Protocol data model: https://opentelemetry.io/docs/specs/otel/metrics/data-model/#opentelemetry-protocol-data-model

Besides that, this PR adds two new test cases. The first validates the behavior described above.

The second validates an error case in which the unitRef passed doesn't match the symbols slice, causing a panic. This test case is important to guarantee that the error is handled correctly.

Link to tracking issue

Fixes part of #37277 (bullet point: Handle multiple timeseries with same metric name+type+unit).

…bels, same datadatpoint slice

ArthurSens

It's been a while since David and I decided we needed a cache for what you're doing now. So, I'm making an effort to remember why we wanted this cache in the first place. I'm separating the content here into three parts!

Why caching?

Starting with the fundamentals of why people build caches: a cache's primary purpose is to increase data retrieval performance by reducing the need to access the underlying, slower storage layer.

In terms of the code you're working on:

The data we want to retrieve: A Metric that may already exist, with the same metricName+unit+description as the one we're currently processing.
The slower storage layer: A loop over all ResourceMetrics, ScopeMetrics, and Metrics until we find the one that fits.

You are correct that a cache for us here will be a hashmap (I'm looking at the intraRequestCache argument you created). Accessing a hashmap element is O(1) while looping over an array is O(n). We sacrifice a bit of memory for a faster translation! But the cache doesn't make sense if, after looking it up, we still need to loop over something else to see if it matches.
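A minimal sketch of that trade-off, using plain Go structs as stand-ins for the pdata types. The names `metric`, `metricKey`, `findByScan`, and `buildIndex` are hypothetical and exist only for illustration; they are not part of the receiver.

```go
package main

import "fmt"

// metric is a hypothetical stand-in for pmetric.Metric, used only for illustration.
type metric struct {
	name string
	unit string
}

// findByScan walks the slice until it finds a match: O(n) per lookup.
func findByScan(metrics []*metric, name, unit string) *metric {
	for _, m := range metrics {
		if m.name == name && m.unit == unit {
			return m
		}
	}
	return nil
}

// metricKey is a hypothetical composite key; comparable structs can be used as map keys.
type metricKey struct {
	name string
	unit string
}

// buildIndex spends extra memory on a map so that later lookups are O(1) on average.
func buildIndex(metrics []*metric) map[metricKey]*metric {
	index := make(map[metricKey]*metric, len(metrics))
	for _, m := range metrics {
		index[metricKey{name: m.name, unit: m.unit}] = m
	}
	return index
}

func main() {
	metrics := []*metric{
		{name: "http_requests", unit: "1"},
		{name: "request_duration", unit: "s"},
	}
	index := buildIndex(metrics)

	scanned := findByScan(metrics, "request_duration", "s")
	cached := index[metricKey{name: "request_duration", unit: "s"}]
	fmt.Println(scanned == cached) // compares the pointers returned by both lookup paths
}
```

The map only pays off if its key captures everything that identifies the metric; otherwise a second loop is still needed after the lookup.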

OTel Data Structure

Now, ignoring the cache problem, let's look at the OTel Metric Data Model.

Several parts identify a particular Metric. The first ones on the list are the Resource Attributes and Instrumentation Scope! In code, we can see that because we cannot access a Metric without accessing the mentioned information first.

rm := otelMetrics.ResourceMetrics().AppendEmpty()
// Add resource attributes
sm := rm.ScopeMetrics().AppendEmpty()
sm.Scope().SetName(...)
sm.Scope().SetVersion(...)
metric := sm.Metrics().AppendEmpty()
// Notice how we can never create metrics without first creating resource/scope!!

// If we wanted our metric to have different Instrumentation Scope information, we would need to do:
sm2 := rm.ScopeMetrics().AppendEmpty()
sm2.Scope().SetName(...)
sm2.Scope().SetVersion(...)
// We can't re-use the previous metric here because there isn't anything like
// sm.Metrics().At(x).UpdateMetric(sm2.Metrics().AppendEmpty())
// We're forced to create a new metric for the new Scope; therefore, we have a brand new metric just because we changed the Scope
metric2 := sm2.Metrics().AppendEmpty()

// The same pattern repeats if we want a new Attribute in our Resource.

The second part is more obvious: metric name, unit, and type are identifying. Looking at the code:

// The code that creates the Resource and ScopeMetrics is omitted
metric1 := sm.Metrics().AppendEmpty()
metric1.SetName("new-metric")
metric1.SetUnit("s")

// Type is defined while creating data points. E.g., if it's a gauge metric, we set the metric type by writing:
gauge := metric1.SetEmptyGauge()

// What we want to avoid is calling SetEmptyGauge() again if we notice that a new timeseries came in with the same
// identifying information above: metric name, unit, and type.

Getting back to our problem: Caching OTel Metrics

We want to avoid looping over ResourceMetrics, ScopeMetrics, and Metrics to find something that matches name + type + unit. We want direct access to a Metric object with the information we seek.

I'd like you to work through this and propose something based on the information I just gave you. I'm going to give you two hints, though! The position of the TODO comment probably sent you in the wrong direction; I'm not sure that's the best place to build and handle the caching logic. Passing the existing ResourceMetrics cache we have today doesn't seem correct; we want a new cache!
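Purely as an illustration of the idea, one possible shape for such a cache is sketched below. Everything here is an assumption: the types are plain Go stand-ins for the pdata types, `resourceID` stands in for some hash of the resource attributes (how to compute it is left open), and `metricCache` and `getOrCreate` are hypothetical names, not the receiver's API.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins for the pdata types, for illustration only.

// metricKey gathers the identifying parts discussed above.
// resourceID is assumed to be some hash of the resource attributes.
type metricKey struct {
	resourceID   uint64
	scopeName    string
	scopeVersion string
	name         string
	unit         string
	metricType   string
}

// metric holds the data points that share one identity.
type metric struct {
	key        metricKey
	dataPoints []float64
}

// metricCache gives direct access to an already-created metric.
type metricCache map[metricKey]*metric

// getOrCreate returns the cached metric for key, creating it on first use.
// The boolean reports whether a new metric had to be created.
func (c metricCache) getOrCreate(key metricKey) (*metric, bool) {
	if m, ok := c[key]; ok {
		return m, false
	}
	m := &metric{key: key}
	c[key] = m
	return m, true
}

func main() {
	cache := metricCache{}
	key := metricKey{
		resourceID: 1, scopeName: "scope", scopeVersion: "v1",
		name: "new-metric", unit: "s", metricType: "gauge",
	}

	// Two timeseries with the same identity share one data point slice.
	first, _ := cache.getOrCreate(key)
	first.dataPoints = append(first.dataPoints, 1.0)
	second, _ := cache.getOrCreate(key)
	second.dataPoints = append(second.dataPoints, 2.0)

	// Changing the scope yields a different key, hence a brand new metric.
	otherKey := key
	otherKey.scopeName = "other-scope"
	other, created := cache.getOrCreate(otherKey)

	fmt.Println(len(first.dataPoints), first == second, created, len(other.dataPoints))
}
```

Because the key already includes the resource and scope identity, a hit needs no further loop over ResourceMetrics or ScopeMetrics to confirm the match.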

.chloggen/separate_timeseries_same_labels_same_datapoints.yaml

ArthurSens · 2025-03-07T19:41:14Z

+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: timeseries that has the same labels and name/unitRef metric Metadata should belongs to the same datapoints slice.


Maybe too much detail that doesn't add much value?