[pkg/ottl] ParseSeverity function #35079

djaglowski · 2024-09-09T16:39:13Z

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

OTTL can directly set severity number and text, but it would be nice if there was an optimized function for interpreting values. Stanza currently has a dedicated severity interpreter which has a couple advantages over using OTTL:

Configuration is simple. The user can specify a mapping of values, ranges, or pre-defined categories, along with the level to which these values should be interpreted.
Performance costs is almost entirely paid for at startup by building a full mapping which can then be used for instant interpretation of any value.

Describe the solution you'd like

The primary challenge with OTTL is that its functional nature may not lend itself well to specifying mappings. I'm not sure if there is a way to do this today, but I'd like users to be able to specify a function that contains an arbitrary number of mapping options. e.g. ParseSeverity(attributes["sev"], AsError("err", "error", "NOOO"), AsInfo("info", "hey"), AsInfoRange(1, 100), ...)

Describe alternatives you've considered

No response

Additional context

No response

The text was updated successfully, but these errors were encountered:

github-actions · 2024-09-09T16:47:58Z

Pinging code owners:

pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley

See Adding Labels via Comments if you do not have permissions to add labels yourself.

TylerHelmuth · 2024-09-10T14:56:57Z

@djaglowski we support map literals now, will that help? The grammar supports:

ParseSeverity(attributes["sev"], {"err": ["error", "NOOO"], "info": ["hey"]})

As a reference for discussion, the existing solution would be multiple statements with conditions:

set(severity_number, SEVERITY_NUMBER_ERROR) where attributes["sev"] == "error" or attributes["sev"] == "NOOO"
set(severity_number, SEVERITY_NUMBER_INFO) where attributes["sev"] == "hey" or (attributes["status code"] >= 1 and attributes["status code"] < 100)

Any additional function will need to be simpler than those statements.

TylerHelmuth · 2024-09-10T15:04:51Z

It is possible that the solution doesn't need to be an additional function and instead could be a additional section in the transformprocessor config.

djaglowski · 2024-09-10T15:22:49Z

we support map literals now, will that help?

I don't think this is sufficient for ranges, which are a common use case with severity interpretation.

Any additional function will need to be simpler than those statements.

I don't think that's difficult but performance should also be a consideration here.

I'll give another example of statements which could be much clearer and more performant with a better solution.

In this case, HTTP status codes are interpreted into severity number and text:

set(severity_number, 5) where (IsMatch(body["status"], "^1[0-9]{2}$") and true)
set(severity_text, "debug") where (IsMatch(body["status"], "^1[0-9]{2}$") and true)
set(severity_number, 9) where (IsMatch(body["status"], "^2[0-9]{2}$") and true)
set(severity_text, "info") where (IsMatch(body["status"], "^2[0-9]{2}$") and true)
set(severity_number, 9) where (IsMatch(body["status"], "^3[0-9]{2}$") and true)
set(severity_text, "info") where (IsMatch(body["status"], "^3[0-9]{2}$") and true)
set(severity_number, 13) where (IsMatch(body["status"], "^4[0-9]{2}$") and true)
set(severity_text, "warn") where (IsMatch(body["status"], "^4[0-9]{2}$") and true)
set(severity_number, 17) where (IsMatch(body["status"], "^5[0-9]{2}$") and true)
set(severity_text, "error") where (IsMatch(body["status"], "^5[0-9]{2}$") and true)

This is both difficult to understand and not very performant since each statement must execute against each log record, and because the statements themselves are non-trivial to evaluate.

In contrast, the stanza is far simpler to understand:

severity:
  parse_from: body["status"]
  mapping:
    debug:
      - min: 100
        max: 199
    info:
      - min: 200
        max: 299
      - min: 300
        max: 399
    warn:
      - min: 400
        max: 499
    error:
      - min: 500
        max: 599

# (Actually it can be even simpler but this is a special case where aliases represent predefined ranges)
# mapping:
#   debug: 1xx
#   info:
#     - 2xx
#     - 3xx
#   warn: 4xx
#   error: 5xx

Additionally, there is a substantial performance optimization which occurs with the stanza configuration. Specifically, we're able to put all values into a map at startup, so that every log records requires exactly one map lookup.

I don't have strong opinions about exactly which form this should take in OTTL but I think it should be supported in some way that avoids such dense and wasteful statements.

s71m · 2024-10-24T00:55:07Z

hi, i vote for this feature request, because

 receivers:
   syslog:
    udp:
      listen_address: 0.0.0.0:54525
    protocol: rfc5424

in that case logs will contain SeverityText, there will be "err", "crit", but I would like to compare it with the more well-known "ERROR", "FATAL" etc.
And I broke my head how to do this, because severity is a top level of entry and no possibility to get access here

      - type: severity_parser
        parse_from: severity_text

And only transformation can lead to the solution:

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # --- DEBUG ---
          - set(severity_text, "DEBUG") where severity_text == "trace"
          - set(severity_text, "DEBUG") where severity_text == "debug"
          - set(severity_text, "DEBUG") where severity_text == "DEBUG"

          # --- INFO ---
          - set(severity_text, "INFO") where severity_text == "info"
          - set(severity_text, "INFO") where severity_text == "INFO"

          # --- WARNING ---
          - set(severity_text, "WARNING") where severity_text == "warn"
          - set(severity_text, "WARNING") where severity_text == "WARN"

          # --- ERROR ---
          - set(severity_text, "ERROR") where severity_text == "err"
          - set(severity_text, "ERROR") where severity_text == "error"

          # --- CRITICAL ---
          - set(severity_text, "CRITICAL") where severity_text == "crit"
          - set(severity_text, "CRITICAL") where severity_text == "fatal"
          - set(severity_text, "CRITICAL") where severity_text == "FATAL"

djaglowski · 2024-10-24T23:03:44Z

@s71m if you are using udp receiver you can define your own severity mapping as part of severity_parser.

s71m · 2024-10-25T01:23:02Z

@s71m if you are using udp receiver you can define your own severity mapping as part of severity_parser.

@djaglowski thx for reply! I'm using both, syslog-ng on remote vps actually via tcp, but for severity need to define field "parse_from", and i cant figured it out, which field i need to pass.

bacherfl · 2025-01-09T09:30:47Z

I'd like to look into implementing this

TylerHelmuth · 2025-01-09T19:50:12Z

@bacherfl if you've got ideas please share them here before working on a PR. We haven't yet landed on a proper way to express the map feature that stanza has in OTTL.

bacherfl · 2025-01-10T09:24:35Z

@bacherfl if you've got ideas please share them here before working on a PR. We haven't yet landed on a proper way to express the map feature that stanza has in OTTL.

alright, will post my thoughts here as soon as i have a possible approach

bacherfl · 2025-01-15T07:10:23Z

One thing we might do here is make use of the map literals we introduced recently. This would allow to specify the log level mapping in a similar way as in stanza:

ParseSeverity(body["status"], {"error":{"range":{"min": 500, "max":599}}, "info":{"range":{"min":200, "max": 299}}})

In the example above, the numeric code is used to determine the error level, but the struct being used by the ParseSeverity function could also be extended with e.g. an array of strings (or regexes) that would map to a certain log level, e.g.:

{
	"error":{
		"mapFrom": [
			"err",
                        "fail"
		]
	},
	"info": {...},
        "warn": {...}.
        // other log levels
}

@TylerHelmuth @djaglowski please let me know your thoughts - do you think this could be a valid approach here?

djaglowski · 2025-01-15T16:22:02Z

@bacherfl, it seems promising.

One limitation of this vs stanza would be that in stanza you can provide multiple independent items for each level, where each item may be a literal match or something else like a range.

e.g.

    error:
      - nooo
      - bad
      - min: 500
        max: 599
    info:
      - ok
      - success
      - min: 200
        max: 299

Do you see a way to allow this? Maybe each severity key could point to a slice that can contain any combination of strings and "min/max structs"?

ParseSeverity(
    body["status"],
    {
        "error": [
            "nooo", 
            "bad",
           { "min": 500, "max": 599 }
        ], 
        "info": [
            "ok", 
            "success",
           { "min": 200, "max": 299 }
        ]
    }
)

TylerHelmuth · 2025-01-15T16:44:44Z

@djaglowski OTTL definitely supports that syntax and it looks like that would be a one-to-one match with stanza.

bacherfl · 2025-01-16T05:59:41Z

Thanks for the feedback! That's a good suggestion and will make the structure more concise. In this case I will proceed to turn my PoC into a proper PR, with these changes in mind. I will keep you updated once it's ready for a first round of reviews

TylerHelmuth · 2025-01-16T06:08:24Z

Since map literal syntax in OTTL is kinda gross, should we add something to the transformprocessor ContextStatement struct to defining map literals in yaml that could get turned into strings and passed into the OTTL parser?

Would be pretty cool to be able to copy and paste a filelogreciever severity config into the transformprocessor config

bacherfl · 2025-01-16T07:47:03Z

Do you mean something like this?

processors:
  transform:
    log_statements:
      - context: log
        statements:
          - set(attributes["level"], ParseSeverity(severity_number, ${literal:severity-mapping}))
        literals:
          severity-mapping:
            error:
              - nooo
              - bad
              - min: 500
                max: 599
            info:
              - ok
              - success
              - min: 200
                max: 299

And then the transform processor will, before calling ottl.ParseStatement(), convert the severity-mapping into the Json representation and replace ${literal:severity-mapping}

TylerHelmuth · 2025-01-16T17:50:46Z

Ya something like that.

We've never done statement templating like this before and it is risky, but it really like being able to reuse stanza's yaml config.

@evan-bradley @edmocosta @djaglowski what do you think?

bacherfl · 2025-01-17T05:52:48Z

That sounds like a good idea - Let's see what the others say, I'd be happy to look into implementing this after this function is done

edmocosta · 2025-01-17T09:44:01Z

I understand this issue is about parsing severities, but I think a converter capable of doing value translations/lookups would be useful for other use cases as well, not only for parsing severities. That said, I'm wondering if we should implement a broader/generic solution for that instead? It would initially support this issue's requirements (literal values and ranges), and could be extended on the future to support for example, slices (multiple lookups), regexes, fallback values, functions, etc. WDYT?

We've never done statement templating like this before and it is risky, but it really like being able to reuse stanza's yaml config.

I agree it might be risky, and I'm wondering how we would implement it. Would it be specific for the transformprocessor? Or should we extend the existing confmap expanding capabilities (${...}) to support some kind of yaml files/literals schema expander, for example: ${yamlfile:literals.yaml::severity-mapping}?

I like the idea of being able to interpolate/reuse configurations, but I'm still not sure if I would make it part of the transform processor's configuration. Furthermore, the ${literal:...} placeholder would also collide with the existing confmap expanding functionality, which I think would give users the impression that it can be used anywhere in the configuration file.
We can definitely work those things around, but I'm curious to see what other folks says.

Thoughts?

djaglowski · 2025-01-17T17:30:48Z

I like the idea of offering another way to define literals, but can I suggest that be a separate issue? I think severity parsing will be useful even if the syntax is verbose.

TylerHelmuth · 2025-01-17T17:35:27Z

Ok lets move the parsing literals idea into a new issue and move forward with #35079 (comment). Handling literals can be an enhancement later.

github-actions · 2025-03-19T03:36:58Z

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

pkg/ottl: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley @edmocosta

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions · 2025-05-19T03:43:15Z

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

pkg/ottl: @TylerHelmuth @evan-bradley @edmocosta

See Adding Labels via Comments if you do not have permissions to add labels yourself.

djaglowski added enhancement New feature or request needs triage New item requiring triage labels Sep 9, 2024

github-actions bot added the pkg/ottl label Sep 9, 2024

github-actions bot mentioned this issue Sep 10, 2024

Weekly Report: 2024-09-03 - 2024-09-10 #35086

Closed

github-actions bot mentioned this issue Sep 17, 2024

Weekly Report: 2024-09-10 - 2024-09-17 #35228

Closed

This was referenced Sep 24, 2024

Weekly Report: 2024-09-17 - 2024-09-24 #35377

Closed

Weekly Report: 2024-09-24 - 2024-10-01 #35498

Closed

github-actions bot mentioned this issue Oct 8, 2024

Weekly Report: 2024-10-01 - 2024-10-08 #35659

Closed

atoulme removed the needs triage New item requiring triage label Oct 12, 2024

TylerHelmuth added the priority:p2 Medium label Nov 15, 2024

TylerHelmuth changed the title ~~[ottl] ParseSeverity function~~ [pkg/ottl] ParseSeverity function Nov 15, 2024

bacherfl linked a pull request Jan 17, 2025 that will close this issue

[pkg/ottl] Add ParseSeverity function #37280

Draft

github-actions bot added the Stale label Mar 19, 2025

djaglowski removed the Stale label Mar 19, 2025

github-actions bot added the Stale label May 19, 2025

edmocosta removed the Stale label May 19, 2025

[pkg/ottl] ParseSeverity function #35079

[pkg/ottl] ParseSeverity function #35079

Comments

djaglowski commented Sep 9, 2024

Component(s)

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

github-actions bot commented Sep 9, 2024

Uh oh!

TylerHelmuth commented Sep 10, 2024

Uh oh!

TylerHelmuth commented Sep 10, 2024

Uh oh!

djaglowski commented Sep 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

s71m commented Oct 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

djaglowski commented Oct 24, 2024

Uh oh!

s71m commented Oct 25, 2024

Uh oh!

bacherfl commented Jan 9, 2025

Uh oh!

TylerHelmuth commented Jan 9, 2025

Uh oh!

bacherfl commented Jan 10, 2025

Uh oh!

bacherfl commented Jan 15, 2025

Uh oh!

djaglowski commented Jan 15, 2025

Uh oh!

TylerHelmuth commented Jan 15, 2025

Uh oh!

bacherfl commented Jan 16, 2025

Uh oh!

TylerHelmuth commented Jan 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bacherfl commented Jan 16, 2025

Uh oh!

TylerHelmuth commented Jan 16, 2025

Uh oh!

bacherfl commented Jan 17, 2025

Uh oh!

edmocosta commented Jan 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

djaglowski commented Jan 17, 2025

Uh oh!

TylerHelmuth commented Jan 17, 2025

Uh oh!

github-actions bot commented Mar 19, 2025

Uh oh!

github-actions bot commented May 19, 2025

Uh oh!

djaglowski commented Sep 10, 2024 •

edited

Loading

s71m commented Oct 24, 2024 •

edited

Loading

TylerHelmuth commented Jan 16, 2025 •

edited

Loading

edmocosta commented Jan 17, 2025 •

edited

Loading