[processor/logdedup] feat: add ottl condition to logdedup processor (#35443)
**Description:**
Adds OTTL Condition field to Deduplicate Logs Processor
**Link to tracking Issue:** Closes #35440
**Testing:**
- Tested functionality with BindPlane
- Added unit tests for the condition logic
**Documentation:** Added documentation to the logdedup processor README
about the condition field and an example configuration with a condition.
---------
Co-authored-by: Mike Goldsmith <[email protected]>
**File changed:** `processor/logdedupprocessor/README.md` (38 additions, 7 deletions)
```diff
@@ -15,7 +15,7 @@

 ## How It Works
 1. The user configures the log deduplication processor in the desired logs pipeline.
-2. All logs sent to the processor are aggregated over the configured `interval`. Logs are considered identical if they have the same body, resource attributes, severity, and log attributes.
+2. If the processor does not have configured `conditions`, all logs are considered eligible for aggregation. If the processor does have configured `conditions`, all log entries where at least one of the `conditions` evaluates to `true` are considered eligible for aggregation. Eligible identical logs are aggregated over the configured `interval`. Logs are considered identical if they have the same body, resource attributes, severity, and log attributes. Logs that do not match any condition in `conditions` are passed onward in the pipeline without aggregating.
 3. After the interval, the processor emits a single log with the count of logs that were deduplicated. The emitted log will have the same body, resource attributes, severity, and log attributes as the original log. The emitted log will also have the following new attributes:
    - `log_count`: The count of logs that were deduplicated over the interval. The name of the attribute is configurable via the `log_count_attribute` parameter.
```
```diff
@@ -25,13 +25,17 @@

 **Note**: The `ObservedTimestamp` and `Timestamp` of the emitted log will be the time that the aggregated log was emitted and will not be the same as the `ObservedTimestamp` and `Timestamp` of the original logs.

 ## Configuration
 | Field | Type | Default | Description |
 | --- | --- | --- | --- |
 | interval | duration | `10s` | The interval at which logs are aggregated. The counter will reset after each interval. |
+| conditions | []string | `[]` | A slice of [OTTL] expressions used to evaluate which log records are deduped. All paths in the [log context] are available to reference. All [converters] are available to use. |
 | log_count_attribute | string | `log_count` | The name of the count attribute of deduplicated logs that will be added to the emitted aggregated log. |
 | timezone | string | `UTC` | The timezone of the `first_observed_timestamp` and `last_observed_timestamp` timestamps on the emitted aggregated log. The available locations depend on the local IANA Time Zone database. [This page](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) contains many examples, such as `America/New_York`. |
 | exclude_fields | []string | `[]` | Fields to exclude from duplication matching. Fields can be excluded from the log `body` or `attributes`. These fields will not be present in the emitted aggregated log. Nested fields must be `.` delimited. If a field contains a `.`, it can be escaped by using a `\`; see the [example config](#example-config-with-excluded-fields).<br><br>**Note**: The entire `body` cannot be excluded. If the body is a map, then fields within it can be excluded. |

 The following config is an example configuration for the log deduplication processor. It is configured with an aggregation interval of `60 seconds`, a timezone of `America/Los_Angeles`, and a log count attribute of `dedup_count`. It has no fields being excluded.
```
````diff
@@ -82,3 +86,30 @@ service:
       processors: [logdedup]
       exporters: [googlecloud]
 ```
+
+### Example Config with Conditions
+
+The following config is an example configuration that only performs the deduping process on telemetry where Attribute `ID` equals `1` OR where Resource Attribute `service.name` equals `my-service`:
````
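The example configuration itself is cut off in this capture. A minimal sketch of what such a config could look like, assuming OTTL log-context syntax for the two conditions described above; the `filelog` receiver, `googlecloud` exporter, and file path are placeholders, not taken from the PR:

```yaml
receivers:
  filelog:
    include: [./example/*.log]

processors:
  logdedup:
    # Only logs matching at least one condition are deduplicated;
    # all other logs pass through the pipeline unchanged.
    conditions:
      - attributes["ID"] == 1
      - resource.attributes["service.name"] == "my-service"

exporters:
  googlecloud:

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [logdedup]
      exporters: [googlecloud]
```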