[exporter/kafkaexporter] Enable Partitioning by Specific Attribute for Logs & Metrics #38484

danielkatz · 2025-03-09T13:05:24Z

Component(s)

exporter/kafka

Is your feature request related to a problem? Please describe.

Currently, the Kafka exporter supports partitioning traces by trace id, which is very useful for context-aware data processing. However, for logs and metrics, we only have options like partition_metrics_by_resource_attributes and partition_logs_by_resource_attributes. These work well for load-balancing Kafka partitions but fall short when we need to partition data based on a single, specific attribute of a log/metric (such as trace_id or tenant_id). This limitation complicates tasks like merging related logs or metrics for enhanced processing.

Describe the solution you'd like

I propose introducing configuration options that allow specifying one attribute for partitioning logs and metrics. For instance, options like:

partition_logs_by_attribute: <name_of_the_attribute>
partition_metrics_by_attribute: <name_of_the_attribute>

This would extend the functionality provided for traces to logs and metrics, enabling context-aware data routing (e.g., merging by a specific attribute like trace_id, tenant_id, etc.) and improving the overall flexibility of the Kafka exporter.
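To make the request concrete, here is a minimal Go sketch of what partitioning by a single attribute could mean. It is not the exporter's implementation: `LogRecord`, `MessageKey`, and `PartitionFor` are hypothetical names, the record type is a simplified stand-in for the collector's pdata types, and FNV-1a is only an assumed hash (the hash actually used depends on the Kafka client and its partitioner).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// LogRecord is a hypothetical, simplified stand-in for a log record.
// The real exporter works with the collector's pdata types instead.
type LogRecord struct {
	Body       string
	Attributes map[string]string
}

// MessageKey returns the value of the configured attribute as the Kafka
// message key. The boolean is false when the record lacks the attribute,
// in which case the caller would send the message without a key.
func MessageKey(rec LogRecord, attribute string) ([]byte, bool) {
	v, ok := rec.Attributes[attribute]
	if !ok {
		return nil, false
	}
	return []byte(v), true
}

// PartitionFor illustrates how a hash partitioner maps a key to a partition.
// FNV-1a is an assumption made for this sketch; numPartitions must be > 0.
func PartitionFor(key []byte, numPartitions uint32) uint32 {
	h := fnv.New32a()
	h.Write(key)
	return h.Sum32() % numPartitions
}

func main() {
	records := []LogRecord{
		{Body: "login ok", Attributes: map[string]string{"tenant_id": "acme"}},
		{Body: "payment failed", Attributes: map[string]string{"tenant_id": "acme"}},
		{Body: "heartbeat", Attributes: map[string]string{}},
	}
	for _, rec := range records {
		key, ok := MessageKey(rec, "tenant_id")
		if !ok {
			fmt.Printf("%q: no key, partition left to the client\n", rec.Body)
			continue
		}
		fmt.Printf("%q: key=%s partition=%d\n", rec.Body, key, PartitionFor(key, 12))
	}
}
```

Records that carry the same `tenant_id` produce the same key and therefore land on the same partition, which is the property the proposal is after.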

Describe alternatives you've considered

I could not find any, apart from exporting to Kafka as-is and then having a custom service consume the data and re-emit it into Kafka with the proper partitioning.

Additional context

No response

github-actions · 2025-03-09T13:05:40Z

Pinging code owners:

exporter/kafka: @pavolloffay @MovieStoreGuy

See Adding Labels via Comments if you do not have permissions to add labels yourself.

Frapschen · 2025-03-14T08:10:01Z

More details need to be discussed regarding the scenario where a user sets both partition_logs_by_resource_attributes: true and partition_metrics_by_attribute: <name_of_the_attribute>. What behavior will the collector exhibit in this case?

For me, I prefer the partition_metrics_by_attribute configuration to take precedence.

danielkatz · 2025-03-17T13:39:10Z

> More details need to be discussed regarding the scenario where a user sets both partition_logs_by_resource_attributes: true and partition_metrics_by_attribute: <name_of_the_attribute>. What behavior will the collector exhibit in this case?
>
> For me, I prefer the partition_metrics_by_attribute configuration to take precedence.

I agree.
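The precedence rule agreed on here could be sketched in Go as follows. This is only an illustration under stated assumptions: the `Config` struct, the `KeyStrategy` type, and `LogsKeyStrategy` are hypothetical, `PartitionLogsByAttribute` is the option proposed in this issue rather than an existing one, and the sketch assumes both options apply to the same signal (logs).

```go
package main

import "fmt"

// Config is a hypothetical subset of the exporter configuration.
// PartitionLogsByAttribute is the option proposed in this issue.
type Config struct {
	PartitionLogsByResourceAttributes bool
	PartitionLogsByAttribute          string
}

// KeyStrategy names how the message key for a log message is derived.
type KeyStrategy int

const (
	NoKey KeyStrategy = iota
	ResourceAttributesHash
	SingleAttribute
)

func (s KeyStrategy) String() string {
	switch s {
	case ResourceAttributesHash:
		return "resource attributes hash"
	case SingleAttribute:
		return "single attribute"
	default:
		return "no key"
	}
}

// LogsKeyStrategy applies the precedence discussed above: when both options
// are set, the single-attribute option wins.
func LogsKeyStrategy(cfg Config) KeyStrategy {
	switch {
	case cfg.PartitionLogsByAttribute != "":
		return SingleAttribute
	case cfg.PartitionLogsByResourceAttributes:
		return ResourceAttributesHash
	default:
		return NoKey
	}
}

func main() {
	both := Config{
		PartitionLogsByResourceAttributes: true,
		PartitionLogsByAttribute:          "tenant_id",
	}
	fmt.Println(LogsKeyStrategy(both))
}
```

An alternative design would be to reject such a configuration at validation time instead of silently picking one option; the thread does not settle that question.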

namco1992 · 2025-03-27T03:49:21Z

I'd like to work on this issue if it's accepted by the owners/maintainers as this is a blocker for us too.

Also cc the new owner @axw

axw · 2025-03-27T05:31:11Z

I'm definitely on board with partitioning by attributes. I'd ideally like to see a more complete design for the configuration, e.g.

Should this be only about span/data point/log record-level attributes? What about resource or scope attributes?
Should we include the ability to partition on client metadata?
Are we just talking about setting the message key, or also grouping data that share the same key?

In #38985 I'm planning to make topic & encoding configuration signal-specific (and I have a WIP branch for the exporter). I think it makes sense to do the same for partitioning, which would enable us to partition on properties that are signal-specific.

> These work well for load-balancing Kafka partitions but fall short when we need to partition data based on a single, specific attribute of a log/metric (such as trace_id or tenant_id).

Generally speaking I would expect trace_id and tenant_id to sit at different scopes:

trace_id is always at the span and log record-level
tenant_id is application specific so it could come from anywhere

I haven't thought about all of this in great depth, but here's a rough proposal for discussion:

Introduce logs::message_key, metrics::message_key, traces::message_key
Deprecate partition_traces_by_id, partition_metrics_by_resource_attributes, and partition_logs_by_resource_attributes

I don't know what the new message_key config would look like exactly, but it should support at least:

hashing on client metadata (e.g. otlpreceiver request headers)
hashing the resource attributes, equivalent to partition_<signal>_by_resource_attributes (any signal)
arbitrary resource attributes (any signal)
trace_id (traces and logs only)
arbitrary span/datapoint/log record attributes (depending on signal)
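
As a rough illustration of the key sources listed above, the Go sketch below resolves a message key from each of them. Everything in it is hypothetical: the `Record` and `KeySource` types, the `Kind` strings, and the use of SHA-256 over sorted attributes are assumptions for this sketch and do not reflect how the exporter hashes resource attributes or how a future message_key setting would be spelled.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Record is a hypothetical flattened view of one span, data point, or log record.
type Record struct {
	ClientMetadata     map[string]string
	ResourceAttributes map[string]string
	TraceID            string
	Attributes         map[string]string
}

// KeySource is a hypothetical description of where the message key comes from.
type KeySource struct {
	Kind string // client_metadata, resource_attributes_hash, resource_attribute, trace_id, attribute
	Name string // metadata or attribute name, where applicable
}

// hashAttributes hashes all attributes in sorted order so that the result
// does not depend on map iteration order.
func hashAttributes(attrs map[string]string) string {
	names := make([]string, 0, len(attrs))
	for name := range attrs {
		names = append(names, name)
	}
	sort.Strings(names)
	h := sha256.New()
	for _, name := range names {
		h.Write([]byte(name))
		h.Write([]byte{0})
		h.Write([]byte(attrs[name]))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// ResolveKey returns the message key for a record, or false when the source
// is unknown or the record has no value for it.
func ResolveKey(src KeySource, r Record) (string, bool) {
	switch src.Kind {
	case "client_metadata":
		v, ok := r.ClientMetadata[src.Name]
		return v, ok
	case "resource_attributes_hash":
		return hashAttributes(r.ResourceAttributes), true
	case "resource_attribute":
		v, ok := r.ResourceAttributes[src.Name]
		return v, ok
	case "trace_id":
		return r.TraceID, r.TraceID != ""
	case "attribute":
		v, ok := r.Attributes[src.Name]
		return v, ok
	default:
		return "", false
	}
}

func main() {
	rec := Record{
		ClientMetadata:     map[string]string{"x-tenant": "acme"},
		ResourceAttributes: map[string]string{"service.name": "api"},
		TraceID:            "4bf92f3577b34da6a3ce929d0e0e4736",
		Attributes:         map[string]string{"tenant_id": "acme"},
	}
	sources := []KeySource{
		{Kind: "client_metadata", Name: "x-tenant"},
		{Kind: "resource_attributes_hash"},
		{Kind: "trace_id"},
		{Kind: "attribute", Name: "tenant_id"},
	}
	for _, src := range sources {
		key, ok := ResolveKey(src, rec)
		fmt.Printf("%s(%s): key=%q ok=%v\n", src.Kind, src.Name, key, ok)
	}
}
```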

We could potentially use OTTL for defining the message key, which would give the greatest flexibility. That way we may also get to use OTTL's context inference for deciding how to group data before encoding to a message. That may imply a single message for each span if you're using trace_id though, which would be a change from how things are today.
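
To illustrate the grouping question, here is a small Go sketch that groups spans by trace_id so that each distinct key would become one message. It is one possible reading of "grouping data that share the same key", not a description of what the exporter does today; `Span` and `GroupByTraceID` are hypothetical names and the span type is heavily simplified.

```go
package main

import "fmt"

// Span is a hypothetical, simplified span with only the fields needed here.
type Span struct {
	TraceID string
	Name    string
}

// GroupByTraceID groups spans that share a trace ID. It returns the trace IDs
// in first-seen order along with the grouped spans, so that each group could
// be encoded as a single message keyed by its trace ID.
func GroupByTraceID(spans []Span) ([]string, map[string][]Span) {
	var order []string
	groups := make(map[string][]Span)
	for _, s := range spans {
		if _, seen := groups[s.TraceID]; !seen {
			order = append(order, s.TraceID)
		}
		groups[s.TraceID] = append(groups[s.TraceID], s)
	}
	return order, groups
}

func main() {
	spans := []Span{
		{TraceID: "t1", Name: "GET /"},
		{TraceID: "t2", Name: "POST /pay"},
		{TraceID: "t1", Name: "db.query"},
	}
	order, groups := GroupByTraceID(spans)
	for _, id := range order {
		fmt.Printf("message key=%s spans=%d\n", id, len(groups[id]))
	}
}
```

With a high-cardinality key such as trace_id, a batch can split into many small messages, which is the trade-off raised above.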

pjanotti · 2025-05-12T23:32:38Z

@axw is this something that you plan to work on? Or should we add the help wanted label?

axw · 2025-05-13T01:12:30Z

@pjanotti this can be solved by #39199 and #39208. Once I have a sponsor for #39199 I'll work on it.

danielkatz added the enhancement and needs triage labels Mar 9, 2025

github-actions bot added the exporter/kafka label Mar 9, 2025

danielkatz changed the title from "[exporter/kafkaexporter]" to "[exporter/kafkaexporter] Enable Partitioning by Specific Resource Attribute for Logs & Metrics" Mar 9, 2025

danielkatz changed the title from "[exporter/kafkaexporter] Enable Partitioning by Specific Resource Attribute for Logs & Metrics" to "[exporter/kafkaexporter] Enable Partitioning by Specific Attribute for Logs & Metrics" Mar 9, 2025

namco1992 mentioned this issue Mar 27, 2025

Extend the support for partition_logs_by_resource_attributes to raw encoding #38999

Closed

axw mentioned this issue Apr 7, 2025

New component: OTTL-based request partitioning processor #39199

Open

pjanotti removed the needs triage label May 12, 2025
