Support for configuring Kafka consumer rebalance strategy and group instance ID #39513
Comments
/label internal/kafka
Pinging code owners for internal/kafka: @pavolloffay @MovieStoreGuy @axw. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.
A code owner has responded positively to the PR that resolves this issue, removing |
… Kafka consumer rebalance strategy and group instance ID (#39517)
As mentioned in issue [39513](#39513):
This enhancement introduces two optional settings: group_rebalance_strategy and group_instance_id. These allow users to override the default Range-based rebalance strategy and optionally provide a static instance ID (as per KIP-345) for cooperative sticky balancing. This is particularly useful when handling high-cardinality metric workloads, as it reduces rebalance impact, improves cache reuse, and boosts CPU efficiency. Both settings are optional to maintain full backward compatibility.
Co-authored-by: Srinivas Venkata Bevara, Antoine Toulme, Vashistha Kumar Singh, vs667919
Component(s)
receiver/kafka
Is your feature request related to a problem? Please describe.
Support for configuring Sticky Rebalancing Strategy (Consumer.Group.Rebalance.Strategy = sticky) in the Kafka receiver implementation via IBM Sarama client.
This includes:
The kafkareceiver currently uses the IBM Sarama client with the default range rebalancing strategy for consumer group coordination. This often leads to uneven partition assignments and large-scale rebalances when pods are restarted or scaled, causing unnecessary cache reloading, CPU spikes, and latency due to metric metadata being recomputed or fetched again.
This is especially problematic in large-scale OpenTelemetry Collector deployments that rely on consistent partition ownership for optimized caching and reduced memory churn.
Describe the solution you'd like
Expose support for the sticky rebalancing strategy (stickyBalanceStrategy) in the kafkareceiver using the IBM Sarama client.
Specifically:
- Add a configuration option in kafkareceiver that sets the Sarama client's Consumer.Group.Rebalance.Strategy
- Allow optionally specifying Group.InstanceId to leverage static membership (e.g., group_instance_id: ${POD_NAME} for StatefulSets)
- Default to the current range strategy if no value is provided, to maintain backward compatibility
Example config:
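A minimal sketch of what such a configuration might look like, assuming the two settings are exposed under the names `group_rebalance_strategy` and `group_instance_id` (the names used in the linked PR description). The broker address and group ID are placeholders and do not come from this issue:

```yaml
receivers:
  kafka:
    # Placeholder connection settings (assumptions, not taken from the issue).
    brokers: ["kafka-0.example.internal:9092"]
    group_id: otel-collector

    # Proposed setting: override the default range strategy with sticky.
    group_rebalance_strategy: sticky

    # Proposed setting: static membership ID. It should be unique and stable
    # per replica, which is why a StatefulSet pod name is a natural choice.
    # The issue writes this as ${POD_NAME}; ${env:POD_NAME} is the collector's
    # explicit environment-variable form.
    group_instance_id: ${env:POD_NAME}
```

Leaving both proposed keys out would keep the current behavior (range strategy, no static membership), which is the backward-compatibility requirement stated above.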
Note: Group.InstanceId (static membership, KIP-345) is only supported on Kafka 2.3 and later.
This would allow consumers to maintain a more consistent partition-to-replica assignment across restarts and reduce the operational load during scaling events.
Describe alternatives you've considered
Additional context
Sticky balancing in Sarama:
https://github.com/IBM/sarama/blob/main/balance_strategy.go
https://github.com/IBM/sarama/blob/main/consumer_group.go
KafkaReceiver implementation:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/internal/kafka/client.go
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kafkareceiver
This enhancement will help large-scale OTel deployments (millions of unique time series) reduce rebalance impact and improve cache and CPU efficiency.