Commit 0385b21

[pkg/stanza] Option for setting max log size in syslog parser (#33777)
**Description:** add `MaxLogSize` parameter to `syslog` parser. Note that for this option to be available, `enable_octet_counting` needs to be set to `true`, as this is an option that is exclusive to the `octetcounting` parser in the [syslog library](github.com/leodido/go-syslog).

One aspect I'm not sure about yet is the placement of the `max_log_size` option: right now, this option is also set within the `TCP` input configuration, whereas this new option would be one layer above, i.e. in the syslog base config. This means the option could potentially be set to different values in the parser and TCP input config, for example:

```
receivers:
  syslog:
    protocol: rfc5424
    enable_octet_counting: true
    max_log_size: 200000000 # 200MiB
    tcp:
      listen_address: :4278
      max_log_size: 100000000 # 100MiB
exporters:
  debug:
service:
  pipelines:
    logs:
      receivers: [syslog]
      exporters: [debug]
```

For now I have implemented this in a way where, if nothing is set in the tcp input config, the `max_log_size` value from the syslog base config will be used. If set in the tcp config, the tcp input will use that more specific value. To me this makes the most sense right now, but I appreciate any feedback on this.

**Link to tracking Issue:** #33182

**Testing:** so far added a unit test for the syslog parser; will also add some tests for the syslog input config to test the behavior described above.

**Documentation:** TODO, will add once we have figured out all open questions

---------

Signed-off-by: Florian Bacher <[email protected]>
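The fallback described in the commit message (prefer the TCP-level value, otherwise use the value from the syslog base config) can be sketched as follows. This is only an illustration: `resolveMaxLogSize` and its parameters are hypothetical names and do not appear in this commit, and the sketch assumes that `0` means "not set".

```go
package main

// resolveMaxLogSize is a hypothetical helper, not part of this commit.
// It returns the TCP-level limit when one is set (> 0) and otherwise
// falls back to the limit from the syslog base config.
func resolveMaxLogSize(baseMax, tcpMax int64) int64 {
	if tcpMax > 0 {
		// The more specific TCP setting wins.
		return tcpMax
	}
	// Nothing set on the TCP input: inherit the base config value.
	return baseMax
}
```

With the values from the example config, `resolveMaxLogSize(200000000, 100000000)` yields `100000000`, while `resolveMaxLogSize(200000000, 0)` yields `200000000`.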
1 parent 3e5c046 commit 0385b21

6 files changed, +91 -19 lines changed
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: bug_fix
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: syslogreceiver
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: "Allow to define `max_octets` for octet counting RFC5424 syslog parser"
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [33182]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: [user]

pkg/stanza/operator/input/syslog/config.go

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ func (c Config) Build(set component.TelemetrySettings) (operator.Operator, error
 	syslogParserCfg.BaseConfig = c.BaseConfig
 	syslogParserCfg.SetID(inputBase.ID() + "_internal_parser")
 	syslogParserCfg.OutputIDs = c.OutputIDs
+	syslogParserCfg.MaxOctets = c.MaxOctets
 	syslogParser, err := syslogParserCfg.Build(set)
 	if err != nil {
 		return nil, fmt.Errorf("failed to resolve syslog config: %w", err)

pkg/stanza/operator/parser/syslog/config.go

Lines changed: 2 additions & 0 deletions
@@ -55,6 +55,7 @@ type BaseConfig struct {
 	EnableOctetCounting bool `mapstructure:"enable_octet_counting,omitempty"`
 	AllowSkipPriHeader bool `mapstructure:"allow_skip_pri_header,omitempty"`
 	NonTransparentFramingTrailer *string `mapstructure:"non_transparent_framing_trailer,omitempty"`
+	MaxOctets int `mapstructure:"max_octets,omitempty"`
 }
 
 // Build will build a JSON parser operator.
@@ -105,5 +106,6 @@ func (c Config) Build(set component.TelemetrySettings) (operator.Operator, error
 		enableOctetCounting: c.EnableOctetCounting,
 		allowSkipPriHeader: c.AllowSkipPriHeader,
 		nonTransparentFramingTrailer: c.NonTransparentFramingTrailer,
+		maxOctets: c.MaxOctets,
 	}, nil
 }

pkg/stanza/operator/parser/syslog/parser.go

Lines changed: 14 additions & 3 deletions
@@ -33,6 +33,7 @@ type Parser struct {
 	enableOctetCounting bool
 	allowSkipPriHeader bool
 	nonTransparentFramingTrailer *string
+	maxOctets int
 }
 
 // Process will parse an entry field as syslog.
@@ -96,7 +97,7 @@ func (p *Parser) buildParseFunc() (parseFunc, error) {
 	switch {
 	// Octet Counting Parsing RFC6587
 	case p.enableOctetCounting:
-		return newOctetCountingParseFunc(), nil
+		return newOctetCountingParseFunc(p.maxOctets), nil
 	// Non-Transparent-Framing Parsing RFC6587
 	case p.nonTransparentFramingTrailer != nil && *p.nonTransparentFramingTrailer == LFTrailer:
 		return newNonTransparentFramingParseFunc(nontransparent.LF), nil
@@ -291,13 +292,23 @@ func postprocess(e *entry.Entry) error {
 	return cleanupTimestamp(e)
 }
 
-func newOctetCountingParseFunc() parseFunc {
+func newOctetCountingParseFunc(maxOctets int) parseFunc {
 	return func(input []byte) (message sl.Message, err error) {
 		listener := func(res *sl.Result) {
 			message = res.Message
 			err = res.Error
 		}
-		parser := octetcounting.NewParser(sl.WithBestEffort(), sl.WithListener(listener))
+
+		parserOpts := []sl.ParserOption{
+			sl.WithBestEffort(),
+			sl.WithListener(listener),
+		}
+
+		if maxOctets > 0 {
+			parserOpts = append(parserOpts, sl.WithMaxMessageLength(maxOctets))
+		}
+
+		parser := octetcounting.NewParser(parserOpts...)
 		reader := bytes.NewReader(input)
 		parser.Parse(reader)
 		return
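
To illustrate what a message-length limit means for octet-counted framing, here is a minimal standalone sketch. It does not use the go-syslog library and is not how that library implements `WithMaxMessageLength`; `checkFrameLength` and `errNoLengthPrefix` are hypothetical names. It assumes an RFC 6587 frame of the form `MSG-LEN SP SYSLOG-MSG` and, unlike the parser above (which leaves the library default in place when `maxOctets` is not positive), simply skips the check when no limit is given.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"strconv"
)

// errNoLengthPrefix is a hypothetical error for frames without "MSG-LEN SP".
var errNoLengthPrefix = errors.New("missing octet-count prefix")

// checkFrameLength reads the declared length at the start of an
// octet-counted frame and rejects it when it exceeds maxOctets.
// A maxOctets value <= 0 disables the check in this sketch.
func checkFrameLength(frame []byte, maxOctets int) (int, error) {
	sp := bytes.IndexByte(frame, ' ')
	if sp <= 0 {
		return 0, errNoLengthPrefix
	}
	n, err := strconv.Atoi(string(frame[:sp]))
	if err != nil {
		return 0, fmt.Errorf("invalid octet count %q: %w", frame[:sp], err)
	}
	if maxOctets > 0 && n > maxOctets {
		return n, fmt.Errorf("message too long: declared %d octets, limit %d", n, maxOctets)
	}
	return n, nil
}
```

A frame that starts with `215 ` is rejected by this sketch when the limit is `214` and accepted when the limit is `0` or at least `215`.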

pkg/stanza/operator/parser/syslog/parser_test.go

Lines changed: 30 additions & 0 deletions
@@ -81,6 +81,36 @@ func TestSyslogParseRFC5424_SDNameTooLong(t *testing.T) {
 	}
 }
 
+func TestSyslogParseRFC5424_Octet_Counting_MessageTooLong(t *testing.T) {
+	cfg := basicConfig()
+	cfg.Protocol = RFC5424
+	cfg.EnableOctetCounting = true
+	cfg.MaxOctets = 214
+
+	body := `215 <86>1 2015-08-05T21:58:59.693Z 192.168.2.132 SecureAuth0 23108 ID52020 [SecureAuth@27389 UserHostAddress="192.168.2.132" Realm="SecureAuth0" UserID="Tester2" PEN="27389"] Found the user for retrieving user's profile`
+
+	set := componenttest.NewNopTelemetrySettings()
+	op, err := cfg.Build(set)
+	require.NoError(t, err)
+
+	fake := testutil.NewFakeOutput(t)
+	err = op.SetOutputs([]operator.Operator{fake})
+	require.NoError(t, err)
+
+	newEntry := entry.New()
+	newEntry.Body = body
+	err = op.Process(context.Background(), newEntry)
+	require.Error(t, err)
+	require.Contains(t, err.Error(), "message too long to parse. was size 215, max length 214")
+
+	select {
+	case e := <-fake.Received:
+		require.Equal(t, body, e.Body)
+	case <-time.After(time.Second):
+		require.FailNow(t, "Timed out waiting for entry to be processed")
+	}
+}
+
 func TestSyslogProtocolConfig(t *testing.T) {
 	for _, proto := range []string{"RFC5424", "rfc5424", "RFC3164", "rfc3164"} {
 		cfg := basicConfig()
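
The test body above starts with a decimal length followed by a space, which is the octet-counting frame format from RFC 6587 section 3.4.1. When writing further test inputs, a small helper along these lines could build such frames; `frameOctetCounted` is a hypothetical name and is not part of this commit or of the test utilities it uses.

```go
package main

import "strconv"

// frameOctetCounted is a hypothetical helper for building test input.
// It prefixes msg with its length in bytes and a single space,
// following the "MSG-LEN SP SYSLOG-MSG" layout of RFC 6587.
func frameOctetCounted(msg string) string {
	// len(msg) counts bytes (octets), not runes, which is what the
	// framing requires for non-ASCII payloads.
	return strconv.Itoa(len(msg)) + " " + msg
}
```

For example, `frameOctetCounted("abc")` returns `"3 abc"`.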

receiver/syslogreceiver/README.md

Lines changed: 17 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -16,22 +16,23 @@ Parses Syslogs received over TCP or UDP.
1616

1717
## Configuration
1818

19-
| Field | Default | Description |
20-
|-------------------------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
21-
| `tcp` | `nil` | Defined tcp_input operator. (see the TCP configuration section) |
22-
| `udp` | `nil` | Defined udp_input operator. (see the UDP configuration section) |
23-
| `protocol` | required | The protocol to parse the syslog messages as. Options are `rfc3164` and `rfc5424` |
24-
| `location` | `UTC` | The geographic location (timezone) to use when parsing the timestamp (Syslog RFC 3164 only). The available locations depend on the local IANA Time Zone database. [This page](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) contains many examples, such as `America/New_York`. |
25-
| `enable_octet_counting` | `false` | Wether or not to enable [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
26-
| `allow_skip_pri_header` | `false` | Allow parsing records without the PRI header. If this setting is enabled, messages without the PRI header will be successfully parsed. The `SeverityNumber` and `SeverityText` fields as well as the `priority` and `facility` attributes will not be set on the log record. If this setting is disabled (the default), messages without PRI header will throw an exception. To set this setting to `true`, the `enable_octet_counting` setting must be `false`.|
27-
| `non_transparent_framing_trailer` | `nil` | The framing trailer, either `LF` or `NUL`, when using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.2) Non-Transparent-Framing (Syslog RFC 5424 and TCP only). |
28-
| `attributes` | {} | A map of `key: value` labels to add to the entry's attributes |
29-
| `resource` | {} | A map of `key: value` labels to add to the entry's resource |
30-
| `operators` | [] | An array of [operators](../../pkg/stanza/docs/operators/README.md#what-operators-are-available). See below for more details |
31-
| `retry_on_failure.enabled` | `false` | If `true`, the receiver will pause reading a file and attempt to resend the current batch of logs if it encounters an error from downstream components. |
32-
| `retry_on_failure.initial_interval` | `1 second` | Time to wait after the first failure before retrying. |
33-
| `retry_on_failure.max_interval` | `30 seconds` | Upper bound on retry backoff interval. Once this value is reached the delay between consecutive retries will remain constant at the specified value. |
34-
| `retry_on_failure.max_elapsed_time` | `5 minutes` | Maximum amount of time (including retries) spent trying to send a logs batch to a downstream consumer. Once this value is reached, the data is discarded. Retrying never stops if set to `0`. |
19+
| Field | Default | Description |
20+
|-------------------------------------|--------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
21+
| `tcp` | `nil` | Defined tcp_input operator. (see the TCP configuration section) |
22+
| `udp` | `nil` | Defined udp_input operator. (see the UDP configuration section) |
23+
| `protocol` | required | The protocol to parse the syslog messages as. Options are `rfc3164` and `rfc5424` |
24+
| `location` | `UTC` | The geographic location (timezone) to use when parsing the timestamp (Syslog RFC 3164 only). The available locations depend on the local IANA Time Zone database. [This page](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) contains many examples, such as `America/New_York`. |
25+
| `enable_octet_counting` | `false` | Wether or not to enable [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
26+
| `max_octets` | `8192` | The maximum octets for messages using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.1) Octet Counting on syslog parsing (Syslog RFC 5424 and TCP only). |
27+
| `allow_skip_pri_header` | `false` | Allow parsing records without the PRI header. If this setting is enabled, messages without the PRI header will be successfully parsed. The `SeverityNumber` and `SeverityText` fields as well as the `priority` and `facility` attributes will not be set on the log record. If this setting is disabled (the default), messages without PRI header will throw an exception. To set this setting to `true`, the `enable_octet_counting` setting must be `false`. |
28+
| `non_transparent_framing_trailer` | `nil` | The framing trailer, either `LF` or `NUL`, when using [RFC 6587](https://www.rfc-editor.org/rfc/rfc6587#section-3.4.2) Non-Transparent-Framing (Syslog RFC 5424 and TCP only). |
29+
| `attributes` | {} | A map of `key: value` labels to add to the entry's attributes |
30+
| `resource` | {} | A map of `key: value` labels to add to the entry's resource |
31+
| `operators` | [] | An array of [operators](../../pkg/stanza/docs/operators/README.md#what-operators-are-available). See below for more details |
32+
| `retry_on_failure.enabled` | `false` | If `true`, the receiver will pause reading a file and attempt to resend the current batch of logs if it encounters an error from downstream components. |
33+
| `retry_on_failure.initial_interval` | `1 second` | Time to wait after the first failure before retrying. |
34+
| `retry_on_failure.max_interval` | `30 seconds` | Upper bound on retry backoff interval. Once this value is reached the delay between consecutive retries will remain constant at the specified value. |
35+
| `retry_on_failure.max_elapsed_time` | `5 minutes` | Maximum amount of time (including retries) spent trying to send a logs batch to a downstream consumer. Once this value is reached, the data is discarded. Retrying never stops if set to `0`. |
3536

3637
### Operators
3738
