[pkg/stanza] Remove batching in LogEmitter behind feature gate (#38428)
#### Description
Removes batching in LogEmitter to prevent data loss during ungraceful
shutdown of the collector. See
#35456
for details.
This is done behind a feature gate, as it may have a negative
performance impact, depending on user configuration. See the added
documentation for the feature gate.
On the implementation side, this was done by renaming the existing
`LogEmitter` struct to `BatchingLogEmitter` and introducing a new
`SynchronousLogEmitter`; see `pkg/stanza/adapter/emitter.go`.
#### Link to tracking issue
- Fixes #35456
#### Testing
Added unit tests in `pkg/stanza/adapter/emitter_test.go`.
Adapted the benchmarks in `pkg/stanza/adapter/receiver_test.go` to run for
both the existing `BatchingLogEmitter` and the new `SynchronousLogEmitter`.
#### Documentation
Added documentation for the feature gate.
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: pkg/stanza
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Prevent data loss in Stanza-based receivers on ungraceful shutdown of the collector
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [35456]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  Enable the `stanza.synchronousLogEmitter` feature gate to unlock this feature.
+  See the [documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/README.md) for more information.
+
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
pkg/stanza/README.md
28 additions & 0 deletions
@@ -49,6 +49,34 @@ Common functionality for all of these receivers is provided by the adapter packa
 - A special `emitter` operator, combined with a `converter` which together act as a bridge from the operator sequence to the
 OpenTelemetry Collector's pipelines.

+### Feature Gates
+
+#### `stanza.synchronousLogEmitter`
+
+The `stanza.synchronousLogEmitter` feature gate prevents possible data loss during an ungraceful shutdown of the collector by emitting logs in LogEmitter synchronously,
+instead of batching the logs in LogEmitter's internal buffer. See related issue <https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/35456>.
+
+LogEmitter is a component in Stanza that passes logs from Stanza pipeline to the collector's pipeline.
+LogEmitter keeps an internal buffer of logs and only emits the logs as a single batch when the buffer is full (or when flush timeout elapses).
+This was done in order to increase performance, as processing data in batches is much more performant than processing each entry separately.
+However, this has the disadvantage of losing the data in the buffer in case of an ungraceful shutdown of the collector.
+To prevent this, enable this feature gate to make LogEmitter synchronous, eliminating the risk of data loss.
+
+Note that enabling this feature gate may have negative performance impact in some situations, see below.
+
+The performance impact does not occur when using receivers based on Stanza inputs that support batching. Currently these are: File Log receiver. See caveat below.
+
+The performance impact may be observed when using receivers based on Stanza inputs that do not support batching. Currently these are: Journald receiver, Named Pipe receiver, Syslog receiver, TCP Log receiver, UDP Log receiver, Windows EventLog receiver.
+
+The caveat is that even when using a receiver that supports batching (like the File Log receiver), the performance impact may still be observed when additional operators are configured (see `operators` configuration option).
+This is because Stanza transform operators currently don't support processing logs in batches, so even if the File Log receiver's File input operator creates a batch of logs,
+the next operator in Stanza pipeline will split every batch into single entries.
+
+The planned schedule for this feature gate is the following:
+
+- Introduce as `Alpha` (disabled by default) in v0.122.0
+- Move to `Beta` (enabled by default) after transform operators support batching and after all receivers that are selected to support batching support it
+
 ### FAQ

 Q: Why don't we make every parser and transform operator into a distinct OpenTelemetry processor?