Allow num_consumers to be greater than 1 if batch enabled and partitioner defined #13607
Codecov Report
❌ Your patch status has failed because the patch coverage (33.33%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #13607 +/- ##
=======================================
Coverage ? 91.52%
=======================================
Files ? 639
Lines ? 41547
Branches ? 0
=======================================
Hits ? 38024
Misses ? 2704
Partials ? 819
I'm afraid that my understanding of the exporterhelper's batcher has diverged from what's really happening. It would help if there were a document explaining how partitioning and num_consumers relate, or how much concurrency I can get without a partitioner configured.
As a thought experiment, I am interested in "simple concurrent batching", let's say, where there is no meaningful partition function and all data goes to the same logical destination. I understand the idea that only one active thread can take the mutex and locate a full batch, but the way I've thought about the [...]
Can you explain the current situation or refer to an explanation of how I can get simple concurrent batching?
(Updated for posterity following the clarification in this comment)
👋 @jmacd this is an excellent question. I have also spent quite some time wrapping my head around the current situation, so let me take a stab at explaining it based on what I have understood so far.
TL;DR if you configure batching using the exporterhelper's [...]
Case 1: Configuring exporterhelper with queue and batch but without partitioner
The number of consumers for the queue, which basically refers to the number of goroutines consuming from the queue and pushing to the partition batcher, is in this case forced to be [...]
Case 2: Configuring exporterhelper without batching
In this case, the batcher is basically disabled (ref). The disabled batcher directly calls the consume function in sync (ref). The [...]
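To make Case 2 concrete, here is a minimal Go sketch of a queue drained by several consumers that each call the export function synchronously, which is roughly what a disabled (pass-through) batcher amounts to. It is not the exporterhelper code: `runWithoutBatching`, the channel-based queue, and the counters are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWithoutBatching drains a queue with numConsumers goroutines. Each
// consumer calls export synchronously, as a pass-through batcher would.
// It returns the number of exported items and the highest number of
// exports observed in flight at the same time. (Hypothetical helper.)
func runWithoutBatching(n, numConsumers int) (exported, peak int64) {
	queue := make(chan int, n)
	for i := 0; i < n; i++ {
		queue <- i
	}
	close(queue)

	var inFlight, maxSeen, total atomic.Int64
	export := func(int) {
		cur := inFlight.Add(1)
		// Record the highest concurrency level seen so far.
		for {
			old := maxSeen.Load()
			if cur <= old || maxSeen.CompareAndSwap(old, cur) {
				break
			}
		}
		total.Add(1)
		inFlight.Add(-1)
	}

	var wg sync.WaitGroup
	for i := 0; i < numConsumers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range queue {
				export(item) // synchronous: the consumer waits for the export
			}
		}()
	}
	wg.Wait()
	return total.Load(), maxSeen.Load()
}

func main() {
	exported, peak := runWithoutBatching(100, 4)
	fmt.Println("exported:", exported, "peak within limit:", peak <= 4)
}
```

In this model the export concurrency can never exceed the number of consumers, because every in-flight export occupies one consumer goroutine.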
@sfc-gh-sili would you comment? I would expect for num_consumers to work, without partitioning as a requirement. The change here asks for much less (and I want more). While one export is happening (a single-batch consumer) the other num_consumers-1 should be busy forming batches and exporting.
Co-authored-by: Joshua MacDonald <[email protected]>
a0d335d
@jmacd Reading through the code, my understanding is that when both batch and queue are enabled:
With the current setting "Reading from the queue and adding the item to the batch" is bounded by 1 goroutine at a time, and I guess we kept things that way for single batch because adding to the batch is guarded by a lock anyway.
I believe the current implementation allows [...]
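The arrangement described in this comment can be sketched roughly as follows: a single goroutine reads from the queue and appends to a lock-guarded batch, while full batches are handed to a pool of flush workers so that exporting does not block the reader. This is a simplified model under stated assumptions, not the actual exporterhelper implementation; `batcher`, `newBatcher`, `consume`, `shutdown`, and `runPipeline` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher accumulates items under a mutex and hands full batches to a
// bounded pool of flush workers. (Hypothetical type for illustration.)
type batcher struct {
	mu      sync.Mutex
	pending []int
	maxSize int
	flushes chan []int     // full batches waiting for a worker
	wg      sync.WaitGroup // tracks running flush workers
}

func newBatcher(maxSize, workers int, export func([]int)) *batcher {
	b := &batcher{maxSize: maxSize, flushes: make(chan []int, workers)}
	for i := 0; i < workers; i++ {
		b.wg.Add(1)
		go func() {
			defer b.wg.Done()
			for batch := range b.flushes {
				export(batch)
			}
		}()
	}
	return b
}

// consume appends one item. Only the append is serialized by the lock;
// the export itself runs on a worker goroutine.
func (b *batcher) consume(item int) {
	b.mu.Lock()
	b.pending = append(b.pending, item)
	var full []int
	if len(b.pending) >= b.maxSize {
		full, b.pending = b.pending, nil
	}
	b.mu.Unlock()
	if full != nil {
		b.flushes <- full // blocks only when every worker slot is busy
	}
}

// shutdown flushes the remainder and waits for the workers to finish.
func (b *batcher) shutdown() {
	b.mu.Lock()
	rest := b.pending
	b.pending = nil
	b.mu.Unlock()
	if len(rest) > 0 {
		b.flushes <- rest
	}
	close(b.flushes)
	b.wg.Wait()
}

// runPipeline pushes n items through a single queue consumer and returns
// how many items and batches were exported.
func runPipeline(n, maxSize, workers int) (items int, batches int) {
	var mu sync.Mutex
	b := newBatcher(maxSize, workers, func(batch []int) {
		mu.Lock()
		items += len(batch)
		batches++
		mu.Unlock()
	})
	queue := make(chan int, n)
	for i := 0; i < n; i++ {
		queue <- i
	}
	close(queue)
	for item := range queue { // the single consumer
		b.consume(item)
	}
	b.shutdown()
	return items, batches
}

func main() {
	items, batches := runPipeline(10, 4, 3)
	fmt.Println(items, batches) // prints "10 3"
}
```

In this sketch the reader is limited to one goroutine, but up to `workers` exports can be in flight at once, so export concurrency is decoupled from the number of queue consumers.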
Thanks for the clarification @sfc-gh-sili, I can see where my understanding went wrong. I missed the point that flush happens via a worker pool and is not in sync with the [...]
Right, however, with the introduction of the partition batcher, I think this PR would still make sense, as the locks would be per instance of the partition batcher, and we could achieve true concurrency when the queue is read for more than one partition. I guess the ideal solution here would be to have a partitioned queue in addition to a partitioned batcher so that we can eliminate the contention caused by a heavily loaded partition. This could also deal with any head-of-line blocking issues if we assigned at least one worker from the pool to each partition.
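The per-partition locking idea can be illustrated with a small Go sketch in which every partition key owns a batcher with its own mutex, so several queue consumers contend only when they hit the same partition. This is an assumption-laden model rather than the collector's partition batcher; `item`, `partitionBatcher`, `partitionedBatcher`, and `consumeConcurrently` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// item carries a partition key alongside its payload. (Hypothetical type.)
type item struct {
	partition string
	value     int
}

// partitionBatcher guards one partition's pending batch with its own lock.
type partitionBatcher struct {
	mu      sync.Mutex
	pending []int
}

// partitionedBatcher maps a partition key to that partition's batcher.
type partitionedBatcher struct {
	parts sync.Map // string -> *partitionBatcher
}

// add appends to the batch of the item's partition. Consumers working on
// different partitions never wait on the same mutex.
func (p *partitionedBatcher) add(it item) {
	v, _ := p.parts.LoadOrStore(it.partition, &partitionBatcher{})
	pb := v.(*partitionBatcher)
	pb.mu.Lock()
	pb.pending = append(pb.pending, it.value)
	pb.mu.Unlock()
}

// sizes reports the pending batch size per partition.
func (p *partitionedBatcher) sizes() map[string]int {
	out := map[string]int{}
	p.parts.Range(func(k, v any) bool {
		pb := v.(*partitionBatcher)
		pb.mu.Lock()
		out[k.(string)] = len(pb.pending)
		pb.mu.Unlock()
		return true
	})
	return out
}

// consumeConcurrently drains the items with numConsumers goroutines and
// returns the per-partition batch sizes.
func consumeConcurrently(items []item, numConsumers int) map[string]int {
	queue := make(chan item, len(items))
	for _, it := range items {
		queue <- it
	}
	close(queue)

	var p partitionedBatcher
	var wg sync.WaitGroup
	for i := 0; i < numConsumers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for it := range queue {
				p.add(it)
			}
		}()
	}
	wg.Wait()
	return p.sizes()
}

func main() {
	items := []item{{"a", 1}, {"b", 2}, {"a", 3}, {"c", 4}, {"b", 5}, {"a", 6}}
	fmt.Println(consumeConcurrently(items, 4)) // prints "map[a:3 b:2 c:1]"
}
```

Note that with more than one consumer the order of items inside a partition's batch is no longer guaranteed, and a single shared queue can still let one busy partition occupy all consumers, which is the head-of-line concern a partitioned queue would address.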
I'm late to the party, but this conversation about [...]
The behavior I'm interested in achieving is the following:
It seems like this is almost possible using [...]
Does it seem reasonable to allow for separate Batching and Queueing senders? Maybe @lahsivjar or @sfc-gh-sili could comment. (And thanks for all of the recent work in this area, I've been very excited to learn about it!)
I think I see now that this is possible. Something that I missed initially was the wiring for [...]
I think this makes my scenario possible by configuring the following:
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [github.com/stretchr/testify](https://github.com/stretchr/testify) | require | minor | `v1.10.0` -> `v1.11.1` | | [go.opentelemetry.io/collector/component](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.35.0` -> `v1.45.0` | | [go.opentelemetry.io/collector/component/componenttest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.129.0` -> `v0.139.0` | | [go.opentelemetry.io/collector/confmap](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.35.0` -> `v1.45.0` | | [go.opentelemetry.io/collector/consumer](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.35.0` -> `v1.45.0` | | [go.opentelemetry.io/collector/consumer/consumertest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.129.0` -> `v0.139.0` | | [go.opentelemetry.io/collector/pdata](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.35.0` -> `v1.45.0` | | [go.opentelemetry.io/collector/processor](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.35.0` -> `v1.45.0` | | [go.opentelemetry.io/collector/processor/processortest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.129.0` -> `v0.139.0` | --- ### Release Notes <details> <summary>stretchr/testify (github.com/stretchr/testify)</summary> ### [`v1.11.1`](https://github.com/stretchr/testify/releases/tag/v1.11.1) [Compare Source](stretchr/testify@v1.11.0...v1.11.1) This release fixes [#​1785](stretchr/testify#1785) introduced in v1.11.0 where expected argument values implementing the stringer interface (`String() string`) with a method which mutates their value, when passed to mock.Mock.On (`m.On("Method", <expected>).Return()`) or actual argument values passed to mock.Mock.Called may no longer match 
one another where they previously did match. The behaviour prior to v1.11.0 where the stringer is always called is restored. Future testify releases may not call the stringer method at all in this case. #### What's Changed - Backport [#​1786](stretchr/testify#1786) to release/1.11: mock: revert to pre-v1.11.0 argument matching behavior for mutating stringers by [@​brackendawson](https://github.com/brackendawson) in [#​1788](stretchr/testify#1788) **Full Changelog**: <stretchr/testify@v1.11.0...v1.11.1> ### [`v1.11.0`](https://github.com/stretchr/testify/releases/tag/v1.11.0) [Compare Source](stretchr/testify@v1.10.0...v1.11.0) #### What's Changed ##### Functional Changes v1.11.0 Includes a number of performance improvements. - Call stack perf change for CallerInfo by [@​mikeauclair](https://github.com/mikeauclair) in [#​1614](stretchr/testify#1614) - Lazily render mock diff output on successful match by [@​mikeauclair](https://github.com/mikeauclair) in [#​1615](stretchr/testify#1615) - assert: check early in Eventually, EventuallyWithT, and Never by [@​cszczepaniak](https://github.com/cszczepaniak) in [#​1427](stretchr/testify#1427) - assert: add IsNotType by [@​bartventer](https://github.com/bartventer) in [#​1730](stretchr/testify#1730) - assert.JSONEq: shortcut if same strings by [@​dolmen](https://github.com/dolmen) in [#​1754](stretchr/testify#1754) - assert.YAMLEq: shortcut if same strings by [@​dolmen](https://github.com/dolmen) in [#​1755](stretchr/testify#1755) - assert: faster and simpler isEmpty using reflect.Value.IsZero by [@​dolmen](https://github.com/dolmen) in [#​1761](stretchr/testify#1761) - suite: faster methods filtering (internal refactor) by [@​dolmen](https://github.com/dolmen) in [#​1758](stretchr/testify#1758) ##### Fixes - assert.ErrorAs: log target type by [@​craig65535](https://github.com/craig65535) in [#​1345](stretchr/testify#1345) - Fix failure message formatting for Positive and Negative asserts in [#​1062](stretchr/testify#1062) - 
Improve ErrorIs message when error is nil but an error was expected by [@​tsioftas](https://github.com/tsioftas) in [#​1681](stretchr/testify#1681) - fix Subset/NotSubset when calling with mixed input types by [@​siliconbrain](https://github.com/siliconbrain) in [#​1729](stretchr/testify#1729) - Improve ErrorAs failure message when error is nil by [@​ccoVeille](https://github.com/ccoVeille) in [#​1734](stretchr/testify#1734) - mock.AssertNumberOfCalls: improve error msg by [@​3scalation](https://github.com/3scalation) in [#​1743](stretchr/testify#1743) ##### Documentation, Build & CI - docs: Fix typo in README by [@​alexandear](https://github.com/alexandear) in [#​1688](stretchr/testify#1688) - Replace deprecated io/ioutil with io and os by [@​alexandear](https://github.com/alexandear) in [#​1684](stretchr/testify#1684) - Document consequences of calling t.FailNow() by [@​greg0ire](https://github.com/greg0ire) in [#​1710](stretchr/testify#1710) - chore: update docs for Unset [#​1621](stretchr/testify#1621) by [@​techfg](https://github.com/techfg) in [#​1709](stretchr/testify#1709) - README: apply gofmt to examples by [@​alexandear](https://github.com/alexandear) in [#​1687](stretchr/testify#1687) - refactor: use %q and %T to simplify fmt.Sprintf by [@​alexandear](https://github.com/alexandear) in [#​1674](stretchr/testify#1674) - Propose Christophe Colombier (ccoVeille) as approver by [@​brackendawson](https://github.com/brackendawson) in [#​1716](stretchr/testify#1716) - Update documentation for the Error function in assert or require package by [@​architagr](https://github.com/architagr) in [#​1675](stretchr/testify#1675) - assert: remove deprecated build constraints by [@​alexandear](https://github.com/alexandear) in [#​1671](stretchr/testify#1671) - assert: apply gofumpt to internal test suite by [@​ccoVeille](https://github.com/ccoVeille) in [#​1739](stretchr/testify#1739) - CI: fix shebang in .ci.\*.sh scripts by [@​dolmen](https://github.com/dolmen) in 
[#​1746](stretchr/testify#1746) - assert,require: enable parallel testing on (almost) all top tests by [@​dolmen](https://github.com/dolmen) in [#​1747](stretchr/testify#1747) - suite.Passed: add one more status test report by [@​Ararsa-Derese](https://github.com/Ararsa-Derese) in [#​1706](stretchr/testify#1706) - Add Helper() method in internal mocks and assert.CollectT by [@​dolmen](https://github.com/dolmen) in [#​1423](stretchr/testify#1423) - assert.Same/NotSame: improve usage of Sprintf by [@​ccoVeille](https://github.com/ccoVeille) in [#​1742](stretchr/testify#1742) - mock: enable parallel testing on internal testsuite by [@​dolmen](https://github.com/dolmen) in [#​1756](stretchr/testify#1756) - suite: cleanup use of 'testing' internals at runtime by [@​dolmen](https://github.com/dolmen) in [#​1751](stretchr/testify#1751) - assert: check test failure message for Empty and NotEmpty by [@​ccoVeille](https://github.com/ccoVeille) in [#​1745](stretchr/testify#1745) - deps: fix dependency cycle with objx (again) by [@​dolmen](https://github.com/dolmen) in [#​1567](stretchr/testify#1567) - assert.Empty: comprehensive doc of "Empty"-ness rules by [@​dolmen](https://github.com/dolmen) in [#​1753](stretchr/testify#1753) - doc: improve godoc of top level 'testify' package by [@​dolmen](https://github.com/dolmen) in [#​1760](stretchr/testify#1760) - assert.ErrorAs: simplify retrieving the type name by [@​ccoVeille](https://github.com/ccoVeille) in [#​1740](stretchr/testify#1740) - assert.EqualValues: improve test coverage to 100% by [@​dolmen](https://github.com/dolmen) in [#​1763](stretchr/testify#1763) - suite.Run: simplify running of Setup/TeardownSuite by [@​renzoarreaza](https://github.com/renzoarreaza) in [#​1769](stretchr/testify#1769) - assert.CallerInfo: micro optimization by using LastIndexByte by [@​dolmen](https://github.com/dolmen) in [#​1767](stretchr/testify#1767) - assert.CallerInfo: micro cleanup by [@​dolmen](https://github.com/dolmen) in 
[#​1768](stretchr/testify#1768) - assert: refactor Test*FileExists and Test*DirExists tests to enable parallel testing by [@​dolmen](https://github.com/dolmen) in [#​1766](stretchr/testify#1766) - suite.Run: refactor handling of stats for improved readability by [@​dolmen](https://github.com/dolmen) in [#​1764](stretchr/testify#1764) - tests: improve captureTestingT helper by [@​ccoVeille](https://github.com/ccoVeille) in [#​1741](stretchr/testify#1741) - build(deps): bump actions/checkout from 4 to 5 by [@​dependabot](https://github.com/dependabot)\[bot] in [#​1778](stretchr/testify#1778) #### New Contributors - [@​greg0ire](https://github.com/greg0ire) made their first contribution in [#​1710](stretchr/testify#1710) - [@​techfg](https://github.com/techfg) made their first contribution in [#​1709](stretchr/testify#1709) - [@​mikeauclair](https://github.com/mikeauclair) made their first contribution in [#​1614](stretchr/testify#1614) - [@​cszczepaniak](https://github.com/cszczepaniak) made their first contribution in [#​1427](stretchr/testify#1427) - [@​architagr](https://github.com/architagr) made their first contribution in [#​1675](stretchr/testify#1675) - [@​tsioftas](https://github.com/tsioftas) made their first contribution in [#​1681](stretchr/testify#1681) - [@​siliconbrain](https://github.com/siliconbrain) made their first contribution in [#​1729](stretchr/testify#1729) - [@​bartventer](https://github.com/bartventer) made their first contribution in [#​1730](stretchr/testify#1730) - [@​Ararsa-Derese](https://github.com/Ararsa-Derese) made their first contribution in [#​1706](stretchr/testify#1706) - [@​renzoarreaza](https://github.com/renzoarreaza) made their first contribution in [#​1769](stretchr/testify#1769) - [@​3scalation](https://github.com/3scalation) made their first contribution in [#​1743](stretchr/testify#1743) **Full Changelog**: <stretchr/testify@v1.10.0...v1.11.0> </details> <details> <summary>open-telemetry/opentelemetry-collector 
(go.opentelemetry.io/collector/component)</summary> ### [`v1.45.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1450v01390) ##### 🛑 Breaking changes 🛑 - `cmd/mdatagen`: Make stability.level a required field for metrics ([#​14070](open-telemetry/opentelemetry-collector#14070)) - `cmd/mdatagen`: Replace `optional` field with `requirement_level` field for attributes in metadata schema ([#​13913](open-telemetry/opentelemetry-collector#13913)) The `optional` boolean field for attributes has been replaced with a `requirement_level` field that accepts enum values: `required`, `conditionally_required`, `recommended`, or `opt_in`. - `required`: attribute is always included and cannot be excluded - `conditionally_required`: attribute is included by default when certain conditions are met (replaces `optional: true`) - `recommended`: attribute is included by default but can be disabled via configuration (replaces `optional: false`) - `opt_in`: attribute is not included unless explicitly enabled in user config When `requirement_level` is not specified, it defaults to `recommended`. - `pdata/pprofile`: Remove deprecated `PutAttribute` helper method ([#​14082](open-telemetry/opentelemetry-collector#14082)) - `pdata/pprofile`: Remove deprecated `PutLocation` helper method ([#​14082](open-telemetry/opentelemetry-collector#14082)) ##### 💡 Enhancements 💡 - `all`: Add FIPS and non-FIPS implementations for allowed TLS curves ([#​13990](open-telemetry/opentelemetry-collector#13990)) - `cmd/builder`: Set CGO\_ENABLED=0 by default, add the `cgo_enabled` configuration to enable it. ([#​10028](open-telemetry/opentelemetry-collector#10028)) - `pkg/config/configgrpc`: Errors of type status.Status returned from an Authenticator extension are being propagated as is to the upstream client. 
([#​14005](open-telemetry/opentelemetry-collector#14005)) - `pkg/config/configoptional`: Adds new `configoptional.AddEnabledField` feature gate that allows users to explicitly disable a `configoptional.Optional` through a new `enabled` field. ([#​14021](open-telemetry/opentelemetry-collector#14021)) - `pkg/exporterhelper`: Replace usage of gogo proto for persistent queue metadata ([#​14079](open-telemetry/opentelemetry-collector#14079)) - `pkg/pdata`: Remove usage of gogo proto and generate the structs with pdatagen ([#​14078](open-telemetry/opentelemetry-collector#14078)) ##### 🧰 Bug fixes 🧰 - `exporter/debug`: add queue configuration ([#​14101](open-telemetry/opentelemetry-collector#14101)) <!-- previous-version --> ### [`v1.44.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1440v01380) ##### 🛑 Breaking changes 🛑 - `all`: Remove deprecated type `TracesConfig` ([#​14036](open-telemetry/opentelemetry-collector#14036)) - `pkg/exporterhelper`: Add default values for `sending_queue::batch` configuration. ([#​13766](open-telemetry/opentelemetry-collector#13766)) Setting `sending_queue::batch` to an empty value now results in the same setup as the default batch processor configuration. - `all`: Add unified print-config command with mode support (redacted, unredacted), json support (unstable), and validation support. ([#​11775](open-telemetry/opentelemetry-collector#11775)) This replaces the `print-initial-config` command. See the `service` package README for more details. The original command name `print-initial-config` remains an alias, to be retired with the feature flag. ##### 💡 Enhancements 💡 - `all`: Add `keep_alives_enabled` option to ServerConfig to control HTTP keep-alives for all components that create an HTTP server. 
([#​13783](open-telemetry/opentelemetry-collector#13783)) - `pkg/otelcol`: Avoid unnecessary mutex in collector logs, replace by atomic pointer ([#​14008](open-telemetry/opentelemetry-collector#14008)) - `cmd/mdatagen`: Add lint/ordering validation for metadata.yaml ([#​13781](open-telemetry/opentelemetry-collector#13781)) - `pdata/xpdata`: Refactor JSON marshaling and unmarshaling to use `pcommon.Value` instead of `AnyValue`. ([#​13837](open-telemetry/opentelemetry-collector#13837)) - `pkg/exporterhelper`: Expose `MergeCtx` in exporterhelper's queue batch settings\` ([#​13742](open-telemetry/opentelemetry-collector#13742)) ##### 🧰 Bug fixes 🧰 - `all`: Fix zstd decoder data corruption due to decoder pooling for all components that create an HTTP server. ([#​13954](open-telemetry/opentelemetry-collector#13954)) - `pkg/otelcol`: Remove UB when taking internal logs and move them to the final zapcore.Core ([#​14009](open-telemetry/opentelemetry-collector#14009)) This can happen because of a race on accessing `logsTaken`. - `pkg/confmap`: Fix a potential race condition in confmap by closing the providers first. ([#​14018](open-telemetry/opentelemetry-collector#14018)) <!-- previous-version --> ### [`v1.43.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1430v01370) ##### 💡 Enhancements 💡 - `cmd/mdatagen`: Improve validation for resource attribute `enabled` field in metadata files ([#​12722](open-telemetry/opentelemetry-collector#12722)) Resource attributes now require an explicit `enabled` field in metadata.yaml files, while regular attributes are prohibited from having this field. This improves validation and prevents configuration errors. - `all`: Changelog entries will now have their component field checked against a list of valid components. ([#​13924](open-telemetry/opentelemetry-collector#13924)) This will ensure a more standardized changelog format which makes it easier to parse. 
- `pkg/pdata`: Mark featuregate pdata.useCustomProtoEncoding as stable ([#​13883](open-telemetry/opentelemetry-collector#13883)) <!-- previous-version --> ### [`v1.42.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1420v01360) ##### 💡 Enhancements 💡 - `xpdata`: Add Serialization and Deserialization of AnyValue ([#​12826](open-telemetry/opentelemetry-collector#12826)) - `debugexporter`: add support for batching ([#​13791](open-telemetry/opentelemetry-collector#13791)) The default queue size is 1 - `configtls`: Add early validation for TLS server configurations to fail fast when certificates are missing instead of failing at runtime. ([#​13130](open-telemetry/opentelemetry-collector#13130), [#​13245](open-telemetry/opentelemetry-collector#13245)) - `mdatagen`: Expose stability level in generated metric documentation ([#​13748](open-telemetry/opentelemetry-collector#13748)) - `internal/tools`: Add support for modernize in Makefile ([#​13796](open-telemetry/opentelemetry-collector#13796)) ##### 🧰 Bug fixes 🧰 - `otelcol`: Fix a potential deadlock during collector shutdown. ([#​13740](open-telemetry/opentelemetry-collector#13740)) - `otlpexporter`: fix the validation of unix socket endpoints ([#​13826](open-telemetry/opentelemetry-collector#13826)) <!-- previous-version --> ### [`v1.41.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1410v01350) ##### 💡 Enhancements 💡 - `exporterhelper`: Add new `exporter_queue_batch_send_size` and `exporter_queue_batch_send_size_bytes` metrics, showing the size of telemetry batches from the exporter. ([#​12894](open-telemetry/opentelemetry-collector#12894)) <!-- previous-version --> ### [`v1.40.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1400v01340) ##### 💡 Enhancements 💡 - `pdata`: Add custom grpc/encoding that replaces proto and calls into the custom marshal/unmarshal logic in pdata. 
([#​13631](open-telemetry/opentelemetry-collector#13631)) This change should not affect other gRPC calls since it fallbacks to the default grpc/proto encoding if requests are not pdata/otlp requests. - `pdata`: Avoid copying the pcommon.Map when same origin ([#​13731](open-telemetry/opentelemetry-collector#13731)) This is a very large improvement if using OTTL with map functions since it will avoid a map copy. - `exporterhelper`: Respect `num_consumers` when batching and partitioning are enabled. ([#​13607](open-telemetry/opentelemetry-collector#13607)) ##### 🧰 Bug fixes 🧰 - `pdata`: Correctly parse OTLP payloads containing non-packed repeated primitive fields ([#​13727](open-telemetry/opentelemetry-collector#13727), [#​13730](open-telemetry/opentelemetry-collector#13730)) This bug prevented the Collector from ingesting most Histogram, ExponentialHistogram, and Profile payloads. <!-- previous-version --> ### [`v1.39.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1390v01330) ##### 🛑 Breaking changes 🛑 - `all`: Increase minimum Go version to 1.24 ([#​13627](open-telemetry/opentelemetry-collector#13627)) ##### 💡 Enhancements 💡 - `otlphttpexporter`: Add `profiles_endpoint` configuration option to allow custom endpoint for profiles data export ([#​13504](open-telemetry/opentelemetry-collector#13504)) The `profiles_endpoint` configuration follows the same pattern as `traces_endpoint`, `metrics_endpoint`, and `logs_endpoint`. When specified, profiles data will be sent to the custom URL instead of the default `{endpoint}/v1development/profiles`. - `pdata`: Add support for local memory pooling for data objects. ([#​13678](open-telemetry/opentelemetry-collector#13678)) This is still an early experimental (alpha) feature. Do not recommended to be used production. 
To enable use "--featuregate=+pdata.useProtoPooling" - `pdata`: Optimize CopyTo messages to avoid any copy when same source and destination ([#​13680](open-telemetry/opentelemetry-collector#13680)) - `receiverhelper`: New feature flag to make receiverhelper distinguish internal vs. downstream errors using new `otelcol_receiver_failed_x` and `otelcol_receiver_requests` metrics ([#​12207](open-telemetry/opentelemetry-collector#12207), [#​12802](open-telemetry/opentelemetry-collector#12802)) This is a breaking change for the semantics of the otelcol\_receiver\_refused\_metric\_points, otelcol\_receiver\_refused\_log\_records and otelcol\_receiver\_refused\_spans metrics. These new metrics and semantics are enabled through the `receiverhelper.newReceiverMetrics` feature gate. - `debugexporter`: Add support for entity references in debug exporter output ([#​13324](open-telemetry/opentelemetry-collector#13324)) - `pdata`: Fix unnecessary allocation of a new state when adding new values to pcommon.Map ([#​13634](open-telemetry/opentelemetry-collector#13634)) - `service`: Implement refcounting for pipeline data owned memory. ([#​13631](open-telemetry/opentelemetry-collector#13631)) This feature is protected by `--featuregate=+pdata.useProtoPooling`. - `service`: Add a debug-level log message when a consumer returns an error. ([#​13357](open-telemetry/opentelemetry-collector#13357)) - `xpdata`: Optimize xpdata/context for persistent queue when only one value for key ([#​13636](open-telemetry/opentelemetry-collector#13636)) - `otlpreceiver`: Log the listening addresses of the receiver, rather than the configured endpoints. 
([#​13654](open-telemetry/opentelemetry-collector#13654)) - `pdata`: Use the newly added proto marshaler/unmarshaler for the official proto Marshaler/Unmarshaler ([#​13637](open-telemetry/opentelemetry-collector#13637)) If any problems observed with this consider to disable the featuregate `--feature-gates=-pdata.useCustomProtoEncoding` <!-- cspell:ignore MLKEM mlkem --> - `configtls`: Enable X25519MLKEM768 as per draft-ietf-tls-ecdhe-mlkem ([#​13670](open-telemetry/opentelemetry-collector#13670)) ##### 🧰 Bug fixes 🧰 - `exporterhelper`: Prevent uncontrolled goroutines in batcher due to a incorrect worker pool behaviour. ([#​13689](open-telemetry/opentelemetry-collector#13689)) - `service`: Ensure the insecure configuration is accounted for when normalizing the endpoint. ([#​13691](open-telemetry/opentelemetry-collector#13691)) - `configoptional`: Allow validating nested types ([#​13579](open-telemetry/opentelemetry-collector#13579)) `configoptional.Optional` now implements `xconfmap.Validator` - `batchprocessor`: Fix UB in batch processor when trying to read bytes size after adding request to pipeline ([#​13698](open-telemetry/opentelemetry-collector#13698)) This bug only happens id detailed metrics are enabled and also an async (sending queue enabled) exporter that mutates data is configure. <!-- previous-version --> ### [`v1.38.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1380v01320) ##### 🛑 Breaking changes 🛑 - `componentstatus`: Change the signature of the componentstatus.NewEvent to accept multiple options. ([#​13210](open-telemetry/opentelemetry-collector#13210)) Changes the signature of the component.NewEvent to accept multiple EventBuilderOption, like the new WithAttributes constructor. 
##### 🚩 Deprecations 🚩 - `service`: move service.noopTraceProvider feature gate to deprecated stage ([#​13492](open-telemetry/opentelemetry-collector#13492)) The functionality of the feature gate is available via configuration with the following telemetry settings: ``` service: telemetry: traces: level: none ``` - `mdatagen`: Remove the deletion of `generated_component_telemetry_test.go`. ([#​12067](open-telemetry/opentelemetry-collector#12067)) This file used to be generated by mdatagen. Starting with 0.122.0, the code deletes that file. It is no longer necessary to delete the file, as code has had time to upgrade to mdatagen and delete the file. - `service`: The `telemetry.disableHighCardinalityMetrics` feature gate is deprecated ([#​13537](open-telemetry/opentelemetry-collector#13537)) The feature gate is now deprecated since metric views can be configured. The feature gate will be removed in v0.134.0. The metric attributes removed by this feature gate are no longer emitted by the Collector by default, but if needed, you can achieve the same functionality by configuring the following metric views: ```yaml service: telemetry: metrics: level: detailed views: - selector: meter_name: "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc" stream: attribute_keys: excluded: ["net.sock.peer.addr", "net.sock.peer.port", "net.sock.peer.name"] - selector: meter_name: "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp" stream: attribute_keys: excluded: ["net.host.name", "net.host.port"] ``` Note that this requires setting `service::telemetry::metrics::level: detailed`. If you have a strong use case for using views in combination with a different level, please show your interest in [#​10769](open-telemetry/opentelemetry-collector#10769). ##### 💡 Enhancements 💡 - `pdata`: Generate Logs/Traces/Metrics/Profiles and p\[log|trace|metric|profile]ExportResponse with pdatagen. 
([#13597](open-telemetry/opentelemetry-collector#13597))
  This change brings consistency on how these structs are written and remove JSON marshaling/unmarshaling hand written logic.
- `confighttp`: Add option to configure ForceAttemptHTTP2 to support HTTP/1 specific transport settings. ([#13426](open-telemetry/opentelemetry-collector#13426))
- `pdata`: Avoid unnecessary buffer copy when JSON marshal fails. ([#13598](open-telemetry/opentelemetry-collector#13598))
- `cmd/mdatagen`: Use a custom host implementation for lifecycle tests ([#13589](open-telemetry/opentelemetry-collector#13589))
  Use a custom noop host implementation that implements all non-deprecated, publicly-accessible interfaces implemented by the Collector service.
- `processorhelper`: Add processor internal duration metric. ([#13231](open-telemetry/opentelemetry-collector#13231))
- `pdata`: Improve RemoveIf for slices to not reference anymore the removed memory ([#13522](open-telemetry/opentelemetry-collector#13522))

##### 🧰 Bug fixes 🧰

- `pdata`: Fix null pointer access when copying into a slice with larger cap but smaller len. ([#13523](open-telemetry/opentelemetry-collector#13523))
- `confighttp`: Fix middleware configuration field name from "middleware" to "middlewares" for consistency with configgrpc ([#13444](open-telemetry/opentelemetry-collector#13444))
- `memorylimiterextension, memorylimiterprocessor`: Memory limiter extension and processor shutdown don't throw an error if the component was not started first. ([#9687](open-telemetry/opentelemetry-collector#9687))
  The components would throw an error if they were shut down before being started. With this change, they will no longer return an error, conforming to the lifecycle of components expected.
- `confighttp`: Reuse zstd Reader objects ([#11824](open-telemetry/opentelemetry-collector#11824))

<!-- previous-version -->

### [`v1.37.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1370v01310)

##### 🛑 Breaking changes 🛑

- `confighttp`: Move `confighttp.framedSnappy` feature gate to beta. ([#10584](open-telemetry/opentelemetry-collector#10584))

##### 💡 Enhancements 💡

- `exporter/debug`: Move to alpha stability except profiles ([#13487](open-telemetry/opentelemetry-collector#13487))
- `exporterhelper`: Enable `exporter.PersistRequestContext` feature gate by default. ([#13437](open-telemetry/opentelemetry-collector#13437))
  Request context is now preserved by default when using persistent queues. Note that Auth extensions context is not propagated through the persistent queue.
- `pdata`: Use pdatagen to generate marshalJSON without using gogo proto jsonpb. ([#13450](open-telemetry/opentelemetry-collector#13450))
- `otlpreceiver`: Remove usage of gogo proto which uses reflect.Value.MethodByName. Removes one source of disabling DCE. ([#12747](open-telemetry/opentelemetry-collector#12747))
- `exporterhelper`: Fix metrics split logic to consider metrics description into the size. ([#13418](open-telemetry/opentelemetry-collector#13418))
- `service`: New pipeline instrumentation now differentiates internal failures from downstream errors ([#13234](open-telemetry/opentelemetry-collector#13234))
  With the telemetry.newPipelineTelemetry feature gate enabled, the "received" and "produced" metrics related to a component now distinguish between two types of errors:
  - "outcome = failure" indicates that the component returned an internal error;
  - "outcome = refused" indicates that the component successfully emitted data, but returned an error coming from a downstream component processing that data.
- `pdata`: Remove usage of text/template from pdata, improves DCE. ([#12747](open-telemetry/opentelemetry-collector#12747))
- `architecture`: New Tier 3 platform riscv64 allowing the collector to be built and distributed for this platform. ([#13462](open-telemetry/opentelemetry-collector#13462))

##### 🧰 Bug fixes 🧰

- `exporterhelper`: Prevents the exporter for being stuck when telemetry data is bigger than batch.max\_size ([#12893](open-telemetry/opentelemetry-collector#12893))
- `mdatagen`: Fix import paths for mdatagen component ([#13069](open-telemetry/opentelemetry-collector#13069))
- `otlpreceiver`: Error handler correctly fallbacks to content type ([#13414](open-telemetry/opentelemetry-collector#13414))
- `pdata/pprofiles`: Fix profiles JSON unmarshal logic for originalPayload. The bytes have to be base64 encoded. ([#13483](open-telemetry/opentelemetry-collector#13483))
- `xpdata`: Fix unmarshaling JSON for entities, add e2e tests to avoid this in the future. ([#13480](open-telemetry/opentelemetry-collector#13480))
- `service`: Downgrade dependency of prometheus exporter in OTel Go SDK ([#13429](open-telemetry/opentelemetry-collector#13429))
  This fixes the bug where collector's internal metrics are emitted with an unexpected suffix in their names when users configure the service::telemetry::metrics::readers with Prometheus
- `service`: Revert Default internal metrics config now enables `otel_scope_` labels ([#12939](open-telemetry/opentelemetry-collector#12939), [#13344](open-telemetry/opentelemetry-collector#13344))
  Reverting change temporarily due to prometheus exporter downgrade. This unfortunately re-introduces the bug that instrumentation scope attributes cause errors in Prometheus exporter. See [#12939](http://github.com/open-telemetry/opentelemetry-collector/issues/12939) for details.
- `builder`: Remove undocumented handling of `DIST_*` environment variables replacements ([#13335](open-telemetry/opentelemetry-collector#13335))

<!-- previous-version -->

### [`v1.36.1`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1361v01301)

##### 🧰 Bug fixes 🧰

- `service`: Fixes bug where internal metrics are emitted with an unexpected suffix in their names when users configure `service::telemetry::metrics::readers` with Prometheus. ([#13449](open-telemetry/opentelemetry-collector#13449))
  See more details on [open-telemetry/opentelemetry-go#7039](open-telemetry/opentelemetry-go#7039)

<!-- previous-version -->

### [`v1.36.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1360v01300)

##### ❗ Known Issues ❗

- Due to a [bug](open-telemetry/opentelemetry-go#7039) in the prometheus exporter, if you are configuring a prometheus exporter, the collector's internal metrics will be emitted with an unexpected suffix in its name. For example, the metric `otelcol_exporter_sent_spans__spans__total` instead of `otelcol_exporter_sent_spans_total`. The workaround is to manually configure `without_units: true` in your prometheus exporter config

```yaml
service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: 0.0.0.0
                port: 8888
                without_units: true
```

If you are using the collector's default Prometheus exporter for exporting internal metrics you are unaffected.

##### 🛑 Breaking changes 🛑

- `exporter/otlp`: Remove deprecated batcher config from OTLP, use queuebatch ([#13339](open-telemetry/opentelemetry-collector#13339))

##### 💡 Enhancements 💡

- `exporterhelper`: Enable items and bytes sizers for persistent queue ([#12881](open-telemetry/opentelemetry-collector#12881))
- `exporterhelper`: Refactor persistent storage size backup to always record it. ([#12890](open-telemetry/opentelemetry-collector#12890))
- `exporterhelper`: Add support to configure a different Sizer for the batcher than the queue ([#13313](open-telemetry/opentelemetry-collector#13313))
- `yaml`: Replaced `sigs.k8s.io/yaml` with `go.yaml.in/yaml` for improved support and long-term maintainability. ([#13308](open-telemetry/opentelemetry-collector#13308))

##### 🧰 Bug fixes 🧰

- `exporterhelper`: Fix exporter.PersistRequestContext feature gate ([#13342](open-telemetry/opentelemetry-collector#13342))
- `exporterhelper`: Preserve all metrics metadata when batch splitting. ([#13236](open-telemetry/opentelemetry-collector#13236))
  Previously, when large batches of metrics were processed, the splitting logic in `metric_batch.go` could cause the `name` field of some metrics to disappear. This fix ensures that all metric fields are properly preserved when `metricRequest` objects are split.
- `service`: Default internal metrics config now enables `otel_scope_` labels ([#12939](open-telemetry/opentelemetry-collector#12939), [#13344](open-telemetry/opentelemetry-collector#13344))
  By default, the Collector exports its internal metrics using a Prometheus exporter from the opentelemetry-go repository. With this change, the Collector no longer sets "without\_scope\_info" to true in its configuration. This means that all exported metrics will have `otel_scope_name`, `otel_scope_schema_url`, and `otel_scope_version` labels corresponding to the instrumentation scope metadata for that metric. This notably prevents an error when multiple metrics are only distinguished by their instrumentation scopes and end up aliased during export. If this is not desired behavior, a Prometheus exporter can be explicitly configured with this option enabled.

<!-- previous-version -->

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config.
Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.t000-n.de/t.behrendt/tracebasedlogsampler/pulls/23
Reviewed-by: t.behrendt <[email protected]>
Co-authored-by: Renovate Bot <[email protected]>
Co-committed-by: Renovate Bot <[email protected]>
Description
As per #12473, the number of consumers was set to `1` to prevent contention due to single-threaded batchers. However, with the introduction of multiBatcher, batching is done in multiple goroutines.

Ideally, the number of consumers would be updated dynamically to match the number of partitions. However, this could also cause contention, because the queue is not partitioned and hotspots can arise when one partition receives more messages than the others. This PR makes it possible to statically configure the number of consumers to be greater than `1` (based on the otel config). This should be better than the current behavior, which bottlenecks consumption even with partitioned batchers.

Link to tracking issue
Related to #12473
Testing
N/A
Documentation
N/A