Commit e2c5886

douglascamata authored and chengchuanpeng committed

[pkg/stanza] Adopt semantic convention for the log file path attribute (open-telemetry#37210)
#### Description

This PR adopts [the semantic convention for the log file path attribute](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/logs.md#log-file), which should be `attributes["log.file.path"]`. It fixes the default value for the `recombine` operator's `source_identifier`.
1 parent 4fd84f1 commit e2c5886
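
For context, the filelog receiver is what populates this attribute: when `include_file_path` is enabled, each entry carries `attributes["log.file.path"]` per the semantic conventions. A minimal sketch (collector config; the include path is hypothetical):

```yaml
receivers:
  filelog:
    include: [/var/log/app/*.log]  # hypothetical path
    # Adds attributes["log.file.path"] to every emitted entry,
    # matching the log.file.* semantic conventions.
    include_file_path: true
```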

File tree

8 files changed: +169 −135 lines changed

Lines changed: 28 additions & 0 deletions

```diff
@@ -0,0 +1,28 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: bug_fix
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: pkg/stanza
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Fix default source identifier in recombine operator
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [37210]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  Its default value is now aligned with the semantic conventions: `attributes["log.file.path"]`
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: [user]
```

pkg/stanza/docs/operators/recombine.md

Lines changed: 16 additions & 16 deletions

```diff
@@ -4,22 +4,22 @@ The `recombine` operator combines consecutive logs into single logs based on sim
 
 ### Configuration Fields
 
-| Field | Default | Description |
-| --- | --- | --- |
-| `id` | `recombine` | A unique identifier for the operator. |
-| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries. |
-| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](../types/on_error.md). |
-| `is_first_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the first entry in a multiline series. |
-| `is_last_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the last entry in a multiline series. |
-| `combine_field` | required | The [field](../types/field.md) from all the entries that will be recombined. |
-| `combine_with` | `"\n"` | The string that is put between the combined entries. This can be an empty string as well. When using special characters like `\n`, be sure to enclose the value in double quotes: `"\n"`. |
-| `max_batch_size` | 1000 | The maximum number of consecutive entries that will be combined into a single entry. |
-| `max_unmatched_batch_size` | 100 | The maximum number of consecutive entries that will be combined into a single entry before the match occurs (with `is_first_entry` or `is_last_entry`), e.g. `max_unmatched_batch_size=0` - all entries combined, `max_unmatched_batch_size=1` - all entries uncombined until the match occurs, `max_unmatched_batch_size=100` - entries combined into 100-entry-packages until the match occurs |
-| `overwrite_with` | `newest` | Whether to use the fields from the `oldest` or the `newest` entry for all the fields that are not combined. |
-| `force_flush_period` | `5s` | Flush timeout after which entries will be flushed aborting the wait for their sub parts to be merged with. |
-| `source_identifier` | `$attributes["file.path"]` | The [field](../types/field.md) to separate one source of logs from others when combining them. |
-| `max_sources` | 1000 | The maximum number of unique sources allowed concurrently to be tracked for combining separately. |
-| `max_log_size` | 0 | The maximum bytes size of the combined field. Once the size exceeds the limit, all received entries of the source will be combined and flushed. "0" of max_log_size means no limit. |
+| Field | Default | Description |
+| --- | --- | --- |
+| `id` | `recombine` | A unique identifier for the operator. |
+| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries. |
+| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](../types/on_error.md). |
+| `is_first_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the first entry in a multiline series. |
+| `is_last_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the last entry in a multiline series. |
+| `combine_field` | required | The [field](../types/field.md) from all the entries that will be recombined. |
+| `combine_with` | `"\n"` | The string that is put between the combined entries. This can be an empty string as well. When using special characters like `\n`, be sure to enclose the value in double quotes: `"\n"`. |
+| `max_batch_size` | 1000 | The maximum number of consecutive entries that will be combined into a single entry. |
+| `max_unmatched_batch_size` | 100 | The maximum number of consecutive entries that will be combined into a single entry before the match occurs (with `is_first_entry` or `is_last_entry`), e.g. `max_unmatched_batch_size=0` - all entries combined, `max_unmatched_batch_size=1` - all entries uncombined until the match occurs, `max_unmatched_batch_size=100` - entries combined into 100-entry-packages until the match occurs |
+| `overwrite_with` | `newest` | Whether to use the fields from the `oldest` or the `newest` entry for all the fields that are not combined. |
+| `force_flush_period` | `5s` | Flush timeout after which entries will be flushed aborting the wait for their sub parts to be merged with. |
+| `source_identifier` | `attributes["log.file.path"]` | The [field](../types/field.md) to separate one source of logs from others when combining them. |
+| `max_sources` | 1000 | The maximum number of unique sources allowed concurrently to be tracked for combining separately. |
+| `max_log_size` | 0 | The maximum bytes size of the combined field. Once the size exceeds the limit, all received entries of the source will be combined and flushed. "0" of max_log_size means no limit. |
 
 Exactly one of `is_first_entry` and `is_last_entry` must be specified.
```

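The new default can be seen in a minimal `recombine` configuration. In this sketch (operator values are hypothetical) `source_identifier` is omitted, so entries from different files are kept apart by `attributes["log.file.path"]`:

```yaml
operators:
  - type: recombine
    combine_field: body
    # Hypothetical multiline rule: a new entry starts with a date.
    is_first_entry: body matches "^\\d{4}-\\d{2}-\\d{2}"
    # source_identifier is left unset; it now defaults to
    # attributes["log.file.path"], per the semantic conventions.
```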
pkg/stanza/operator/input/file/input_test.go

Lines changed: 2 additions & 1 deletion

```diff
@@ -13,6 +13,7 @@ import (
 	"github.com/stretchr/testify/require"
 
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
+	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/testutil"
 )
 
@@ -62,7 +63,7 @@ func TestAddFileResolvedFields(t *testing.T) {
 	e := waitForOne(t, logReceived)
 	require.Equal(t, filepath.Base(symLinkPath), e.Attributes["log.file.name"])
-	require.Equal(t, symLinkPath, e.Attributes["log.file.path"])
+	require.Equal(t, symLinkPath, e.Attributes[attrs.LogFilePath])
 	require.Equal(t, filepath.Base(resolved), e.Attributes["log.file.name_resolved"])
 	require.Equal(t, resolved, e.Attributes["log.file.path_resolved"])
 	if runtime.GOOS != "windows" {
```

pkg/stanza/operator/parser/container/config.go

Lines changed: 2 additions & 1 deletion

```diff
@@ -12,14 +12,15 @@ import (
 
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/errors"
+	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/transformer/recombine"
 )
 
 const (
 	operatorType = "container"
-	recombineSourceIdentifier = "log.file.path"
+	recombineSourceIdentifier = attrs.LogFilePath
 	recombineIsLastEntry = "attributes.logtag == 'F'"
 	removeOriginalTimeFieldFeatureFlag = "filelog.container.removeOriginalTimeField"
 )
```

pkg/stanza/operator/parser/container/parser.go

Lines changed: 2 additions & 1 deletion

```diff
@@ -17,6 +17,7 @@ import (
 
 	"github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/timeutils"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
+	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator"
 	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper"
 )
 
@@ -30,7 +31,7 @@ const (
 	crioPattern = "^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
 	containerdPattern = "^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
 	logpathPattern = "^.*(\\/|\\\\)(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\\-]+)(\\/|\\\\)(?P<container_name>[^\\._]+)(\\/|\\\\)(?P<restart_count>\\d+)\\.log$"
-	logPathField = "log.file.path"
+	logPathField = attrs.LogFilePath
 	crioTimeLayout = "2006-01-02T15:04:05.999999999Z07:00"
 	goTimeLayout = "2006-01-02T15:04:05.999Z"
 )
```
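
The `logpathPattern` above encodes the kubelet's pod-log directory layout (`<namespace>_<pod>_<uid>/<container>/<restart>.log`). A standalone sketch of what it extracts; the example path is hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// Same regular expression as logpathPattern in parser.go,
// written as a raw string instead of an escaped literal.
var logpathPattern = regexp.MustCompile(
	`^.*(\/|\\)(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)(\/|\\)(?P<container_name>[^\._]+)(\/|\\)(?P<restart_count>\d+)\.log$`)

func main() {
	// Hypothetical path in the /var/log/pods layout used by the kubelet.
	path := "/var/log/pods/default_my-pod_0123abcd-89ab-cdef-0123-456789abcdef/my-container/0.log"
	m := logpathPattern.FindStringSubmatch(path)
	if m == nil {
		fmt.Println("no match")
		return
	}
	// Print each named capture group and its value.
	for i, name := range logpathPattern.SubexpNames() {
		if name != "" {
			fmt.Printf("%s=%s\n", name, m[i])
		}
	}
}
```

For the path above this yields `namespace=default`, `pod_name=my-pod`, `container_name=my-container`, and `restart_count=0`, along with the uid.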
