[receiver/kafka] 0.124.0 release broke default log text encoding #39793

Closed
kuiperda opened this issue Apr 30, 2025 · 5 comments · Fixed by #39806
Labels
bug Something isn't working receiver/kafka

kuiperda commented Apr 30, 2025

Component(s)

receiver/kafka

What happened?

Description

Release 0.124.0 updated the kafka receiver's topic and encoding fields.

0.124.0+ Collectors using text_utf-8 as their log::encoding encounter this error:

receiver: invalid component type: invalid character(s) in type "text_utf-8"

As part of the update, this PR made a change that causes an error whenever the - character appears in the encoding name. See this function, specifically the call to component.NewType(encoding):

// encodingToComponentID converts an encoding string to a component ID using the given encoding as type.
func encodingToComponentID(encoding string) (*component.ID, error) {
	componentType, err := component.NewType(encoding)
	if err != nil {
		return nil, fmt.Errorf("invalid component type: %w", err)
	}
	id := component.NewID(componentType)
	return &id, nil
}

NewType creates a type. It returns an error if the type is invalid. A type must
- have at least one character,
- start with an ASCII alphabetic character and
- can only contain ASCII alphanumeric characters and '_'.
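
To make it concrete why text_utf-8 is rejected, here is a minimal standalone sketch of the three rules quoted above. It does not use the collector's code: the typePattern regex and the checkTypeName helper are hypothetical names of mine, and the real component.NewType may enforce additional constraints that are not quoted here.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// typePattern encodes the quoted rules: a leading ASCII letter followed by
// any number of ASCII letters, digits, or '_'. This is an illustrative
// pattern and an assumption, not the regex the collector actually uses.
var typePattern = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)

// checkTypeName is a hypothetical helper that returns nil when name
// satisfies the quoted rules and an error otherwise.
func checkTypeName(name string) error {
	if name == "" {
		return errors.New("type must have at least one character")
	}
	if !typePattern.MatchString(name) {
		return fmt.Errorf("invalid character(s) in type %q", name)
	}
	return nil
}

func main() {
	// The hyphenated name fails because '-' is outside the allowed set.
	for _, name := range []string{"text_utf8", "text_utf-8", ""} {
		fmt.Printf("%q -> %v\n", name, checkTypeName(name))
	}
}
```

Under these rules text_utf8 passes, while text_utf-8 fails only because of the hyphen.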

Looking at the newLogsUnmarshaler function in the same file, utf8 and utf16 appear to be the expected formats now, but the README still recommends utf-8 and the default still appears to be utf-8. There is a test validating usage of utf16, but none for utf8 or utf-8.

I do not have Kafka set up, but the collector fails with this error before it even complains that there are no brokers to connect to.

Steps to Reproduce

Run a 0.124.0 collector with a kafkareceiver using text_utf-8 as the log encoding. It will immediately error due to the - in the encoding. The config I shared is still using the old encoding/topic fields, but nesting them under logs: instead still hits the same error.

Expected Result

The receiver should not error when using a hyphenated value like text_utf-8 as the log encoding.

Furthermore, the recommended and default format for text encoding should work. If the breaking change was intentional, the documentation should be updated accordingly.

Tests should be added to cover this case.

Actual Result

The collector errors due to the - in the log encoding.

Collector version

0.124.0

Environment information

Environment

OS: macOS/darwin Sequoia 15.0.1
Compiler (if manually compiled): go 1.24.0

OpenTelemetry Collector configuration

receivers:
    kafka/logs:
        brokers:
            - localhost:9092
        client_id: otel-collector
        encoding: text_utf-8
        group_id: otel-collector
        metadata:
            full: true
        protocol_version: 2.0.0
        topic: otlp_logs
exporters:
    nop/devnull: null
service:
    pipelines:
        logs:
            receivers:
                - kafka/logs
            processors: []
            exporters:
                - nop/devnull
    telemetry:
        metrics:
            readers:
                - pull:
                    exporter:
                        prometheus:
                            host: localhost
                            port: 8888

Log output

cannot start pipelines: failed to start "kafka/logs" receiver: invalid component type: invalid character(s) in type "text_utf-8"

Additional context

No response

kuiperda added the bug and needs triage labels Apr 30, 2025

axw commented May 1, 2025

Thanks for reporting this @kuiperda, and sorry for the breakage. Working on a fix now.


axw commented May 1, 2025

/label -needs-triage


kuiperda commented May 1, 2025

Thanks for the swift response!


axw commented May 2, 2025

Fix is up: #39806

atoulme pushed a commit that referenced this issue May 2, 2025
#### Description

Fix support for text encodings with hyphens in their names.

If the encoding name has a hyphen then it is an invalid extension ID,
but we should not return an error due to this if it's a built-in
encoding.

#### Link to tracking issue

Fixes #39793

#### Testing

Added a new unit test covering hyphenated text encoding names (fails
without the associated fix).

#### Documentation

N/A
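
The approach described in the commit message above can be sketched roughly as follows: check for a built-in encoding first, and only validate the name as an extension ID when it is not built in. This is a standalone illustration, not the code from the fix. The names isBuiltinText, resolveEncoding, and idPattern are hypothetical, and treating "text" or any "text_<charset>" name as built in is an assumption made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// idPattern approximates the component type rules quoted in the issue.
// It is an assumption, not the collector's own validation.
var idPattern = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)

// isBuiltinText reports whether encoding names the built-in text encoding,
// assumed here to be "text" or "text_<charset>" (for example "text_utf-8").
func isBuiltinText(encoding string) bool {
	return encoding == "text" || strings.HasPrefix(encoding, "text_")
}

// resolveEncoding is a hypothetical helper. It returns "builtin" for
// built-in text encodings, "extension" for names that are valid extension
// IDs, and an error for anything else.
func resolveEncoding(encoding string) (string, error) {
	// Built-in encodings are checked first, so a hyphen in the charset
	// never reaches the extension ID validation.
	if isBuiltinText(encoding) {
		return "builtin", nil
	}
	if !idPattern.MatchString(encoding) {
		return "", fmt.Errorf("invalid component type: invalid character(s) in type %q", encoding)
	}
	return "extension", nil
}

func main() {
	for _, enc := range []string{"text_utf-8", "my_encoding", "my-encoding"} {
		kind, err := resolveEncoding(enc)
		fmt.Printf("%q -> kind=%q err=%v\n", enc, kind, err)
	}
}
```

The ordering is the point of the sketch: a hyphenated name is only an error when it would have to be interpreted as an extension ID.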