Skip to content

Consolidate common Kafka configuration within internal/kafka #38411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
axw opened this issue Mar 6, 2025 · 8 comments · Fixed by #38817
Closed

Consolidate common Kafka configuration within internal/kafka #38411

axw opened this issue Mar 6, 2025 · 8 comments · Fixed by #38817

Comments

@axw
Copy link
Contributor

axw commented Mar 6, 2025

Component(s)

exporter/kafka, extension/observer/kafkatopicsobserver, internal/kafka, receiver/kafka, receiver/kafkametrics

Describe the issue you're reporting

There are currently four components that use Kafka in some way:

  • exporter/kafkaexporter
  • receiver/kafkareceiver
  • receiver/kafkametricsreceiver
  • extension/observer/kafkatopicsobserver

These all depend on internal/kafka, but they each independently define a common subset of configuration. There are some subtle differences between them, and some subtle issues, e.g.

  • inconsistent default for client ID:
    • kafkaexporter defaults to "sarama"
    • kafkareceiver defaults to "otel-collector"
    • kafkametricsreceiver defaults to "otel-metrics-receiver"
    • kafkatopicsobserver hardcodes it to "sarama" (it doesn't configure it; sarama defaults to this), and does not even provide configuration
  • configuration is apparently being cargo-culted, e.g. SessionTimeout and HeartbeatTimeout exist in kafkatopicsobserver, but those configuration attributes are only relevant to Kafka consumers -- which the observer is not.

I propose we move common configuration to a new internal/kafka/configkafka package, modelled after confighttp, etc. Configuration will be split by purpose, i.e. general client config, consumer group config, producer config.

At the same time I propose we make the defaults for this common config consistent across each of the above components. This would be a breaking change, since it implies changing the default client ID. Users who depend on the client ID can explicitly configure it before/during upgrade to the new collector version.

I have a WIP branch at https://github.com/axw/opentelemetry-collector-contrib/tree/configkafka for an example of how this could look. There's also some unrelated cleanups/improvements in there that I intend to create PR(s) out of. Probably the most interesting parts would be:

If there's agreement, I will clean up my branch and create a PR.

Copy link
Contributor

github-actions bot commented Mar 6, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@MovieStoreGuy
Copy link
Contributor

Makes sense to me.

@dmitryax
Copy link
Member

dmitryax commented Mar 7, 2025

Sounds reasonable to me!

@axw
Copy link
Contributor Author

axw commented Mar 10, 2025

Part 1 is ready for review: #38482

MovieStoreGuy pushed a commit that referenced this issue Mar 12, 2025
#### Description

This is the first part of
#38411,
creating a package with common configuration structs for Kafka-related
components.

The new types have stronger validation, and all validation is done in
one place rather than in each component. I have therefore removed some
tests from the components related to validating authentication config,
and ensured coverage within internal/kafka.

The configkafka package is complete, but to keep the PR contained I've
only updated the components to use the new `AuthenticationConfig` type
for now. In future PRs I will:
- update the components to embed ClientConfig, ConsumerConfig, and
ProducerConfig where appropriate
- introduce functions in the internal/kafka package for creating Sarama
clients based on configkafka types
 - unify the defaults across the components

#### Link to tracking issue

First part of
#38411

#### Testing

Added unit tests.

#### Documentation

N/A, no user-facing changes yet.
@axw
Copy link
Contributor Author

axw commented Mar 14, 2025

Part 2: #38597

MovieStoreGuy pushed a commit that referenced this issue Mar 17, 2025
#### Description

Remove the duplicated client config from the Config struct, and embed
configkafka.ClientConfig. Remove the duplicated cluster admin client
creation code, and use kafka.NewSaramaClientConfig.

As a result of this change:
- protocol_version config is no longer required, it has a default like
the other Kafka components do
 - client_id config is now supported, and defaults to "otel-collector"
 - metadata config is now supported
- validation of client config has been removed from this package, since
it's covered by validation in the configkafka types

#### Link to tracking issue

Part of
#38411

#### Testing

Ran the unit tests.

#### Documentation

Updated README.
@axw
Copy link
Contributor Author

axw commented Mar 17, 2025

Part 3: #38634

@axw
Copy link
Contributor Author

axw commented Mar 17, 2025

Part 4, exporter: #38665

Nearly there...

axw added a commit to axw/opentelemetry-collector-contrib that referenced this issue Mar 17, 2025
Use configkafka.ClientConfig and configkafka.ConsumerConfig,
and the newly extract consumer client. This is a non-functional
change, as the config defaults do not change for this component.

Relates to open-telemetry#38411
MovieStoreGuy pushed a commit that referenced this issue Mar 20, 2025
#### Description

Use configkafka and extract a function for constructing a
sarama.SyncProducer given configkafka.ClientConfig and
configkafka.ProducerConfig.

As part of the change, `client_id` has been updated to default to
"otel-collector" instead of "sarama" by default. This is a breaking
change for anyone relying on client ID for monitoring or quota
management.

#### Link to tracking issue

Part of #38411

#### Testing

Updated unit tests.

Manually tested against Redpanda:

`$ docker run --publish=9093:9093 --health-cmd "rpk cluster health |
grep 'Healthy:.*true'"
docker.redpanda.com/redpandadata/redpanda:v23.1.11 redpanda start
--kafka-addr=internal://0.0.0.0:9092,external://0.0.0.0:9093 --smp=1
--memory=1G --mode=dev-container`

Ran the collector with `kafkametrics -> kafkaexporter -> Redpanda ->
kafkareceiver -> debugexporter`:

```yaml
receivers:
  kafka:
    brokers: [localhost:9093]
  kafkametrics:
    brokers: [localhost:9093]
    collection_interval: 10s
    scrapers: [topics, consumers]

exporters:
  debug:
    verbosity: detailed
  kafka:
    brokers: [localhost:9093]

service:
  pipelines:
    metrics/kafka:
      receivers: [kafka]
      exporters: [debug]
    metrics:
      receivers: [kafkametrics]
      exporters: [kafka]
```

#### Documentation

Updated README.
MovieStoreGuy pushed a commit that referenced this issue Mar 20, 2025
#### Description

Remove the duplicated client config from the Config struct, and embed
configkafka.ClientConfig. Remove the duplicated (cluster admin) client
creation code, and use kafka.NewSaramaClient.

This means that client_id now defaults to "otel-collector" instead of
"otel-metrics-receiver", which is a breaking change.

We also add "metadata.refresh_interval" to ClientConfig, defaulting to
10m, and deprecate "refresh_frequency" from the receiver config. If
metadata.refresh_interval is not explicitly set, but refresh_frequency
is, then the latter will be used for now.

I discovered some of the existing tests are not working correctly while
working on this PR, but fixing that will require a much more extensive
refactoring effort. I will follow up with another PR once this one is
merged.

#### Link to tracking issue

Part of
#38411

#### Testing

Updated unit tests.

#### Documentation

Updated README.
axw added a commit to axw/opentelemetry-collector-contrib that referenced this issue Mar 20, 2025
Use configkafka.ClientConfig and configkafka.ConsumerConfig,
and the newly extracted consumer client. This is a non-functional
change, as the config defaults do not change for this component.

Relates to open-telemetry#38411
@axw
Copy link
Contributor Author

axw commented Mar 20, 2025

Part 5 (last one!): #38817

Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025
#### Description

Use configkafka and extract a function for constructing a
sarama.SyncProducer given configkafka.ClientConfig and
configkafka.ProducerConfig.

As part of the change, `client_id` has been updated to default to
"otel-collector" instead of "sarama" by default. This is a breaking
change for anyone relying on client ID for monitoring or quota
management.

#### Link to tracking issue

Part of open-telemetry#38411

#### Testing

Updated unit tests.

Manually tested against Redpanda:

`$ docker run --publish=9093:9093 --health-cmd "rpk cluster health |
grep 'Healthy:.*true'"
docker.redpanda.com/redpandadata/redpanda:v23.1.11 redpanda start
--kafka-addr=internal://0.0.0.0:9092,external://0.0.0.0:9093 --smp=1
--memory=1G --mode=dev-container`

Ran the collector with `kafkametrics -> kafkaexporter -> Redpanda ->
kafkareceiver -> debugexporter`:

```yaml
receivers:
  kafka:
    brokers: [localhost:9093]
  kafkametrics:
    brokers: [localhost:9093]
    collection_interval: 10s
    scrapers: [topics, consumers]

exporters:
  debug:
    verbosity: detailed
  kafka:
    brokers: [localhost:9093]

service:
  pipelines:
    metrics/kafka:
      receivers: [kafka]
      exporters: [debug]
    metrics:
      receivers: [kafkametrics]
      exporters: [kafka]
```

#### Documentation

Updated README.
Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025
…lemetry#38634)

#### Description

Remove the duplicated client config from the Config struct, and embed
configkafka.ClientConfig. Remove the duplicated (cluster admin) client
creation code, and use kafka.NewSaramaClient.

This means that client_id now defaults to "otel-collector" instead of
"otel-metrics-receiver", which is a breaking change.

We also add "metadata.refresh_interval" to ClientConfig, defaulting to
10m, and deprecate "refresh_frequency" from the receiver config. If
metadata.refresh_interval is not explicitly set, but refresh_frequency
is, then the latter will be used for now.

I discovered some of the existing tests are not working correctly while
working on this PR, but fixing that will require a much more extensive
refactoring effort. I will follow up with another PR once this one is
merged.

#### Link to tracking issue

Part of
open-telemetry#38411

#### Testing

Updated unit tests.

#### Documentation

Updated README.
Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025
#### Description

Use configkafka.ClientConfig and configkafka.ConsumerConfig, and the
newly extracted consumer client. Redundant validation tests have been
removed, as they are now covered by configkafka.

This is a non-functional change, as the config defaults do not change
for this component. Nevertheless I have manually tested -- see Testing
section below.

#### Link to tracking issue

Fixes open-telemetry#38411

#### Testing

Updated unit tests.

Manually tested against Redpanda:

`$ docker run --publish=9093:9093 --health-cmd "rpk cluster health |
grep 'Healthy:.*true'"
docker.redpanda.com/redpandadata/redpanda:v23.1.11 redpanda start
--kafka-addr=internal://0.0.0.0:9092,external://0.0.0.0:9093 --smp=1
--memory=1G --mode=dev-container`

Ran the collector with `kafkametrics -> kafkaexporter -> Redpanda ->
kafkareceiver -> debugexporter`:

```yaml
receivers:
  kafka:
    brokers: [localhost:9093]
  kafkametrics:
    brokers: [localhost:9093]
    collection_interval: 10s
    scrapers: [topics, consumers]

exporters:
  debug:
    verbosity: detailed
  kafka:
    brokers: [localhost:9093]

service:
  pipelines:
    metrics/kafka:
      receivers: [kafka]
      exporters: [debug]
    metrics:
      receivers: [kafkametrics]
      exporters: [kafka]
```

#### Documentation

Updated README to clarify that `protocol_version` is not required.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants