Proposal for language additions to OTTL #30800

@jsuereth

Description

Component(s)

pkg/ottl

Is your feature request related to a problem? Please describe.

We were attempting to use OTTL to transform Google's structured logging format in GKE (see: https://cloud.google.com/logging/docs/structured-logging) to the current example data model in OpenTelemetry (see: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model-appendix.md#google-cloud-logging).

Effectively, we're trying to parse a JSON log body and extract components into OTLP equivalents:

{
  "severity":"ERROR",
  "message":"There was an error in the application.",
  "httpRequest":{
    "requestMethod":"GET"
  },
  "times":"2020-10-12T07:20:50.52Z",
  "logging.googleapis.com/insertId":"42",
  "logging.googleapis.com/labels":{
    "user_label_1":"value_1",
    "user_label_2":"value_2"
  },
  "logging.googleapis.com/operation":{
    "id":"get_data",
    "producer":"github.com/MyProject/MyApplication",
     "first":"true"
  },
  "logging.googleapis.com/sourceLocation":{
    "file":"get_data.py",
    "line":"142",
    "function":"getData"
  },
  "logging.googleapis.com/spanId":"000000000000004a",
  "logging.googleapis.com/trace":"projects/my-projectid/traces/06796866738c859f2f19b7cfb3214824",
  "logging.googleapis.com/trace_sampled":false
}
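
For readers less familiar with OTTL, the intended transformation can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `lift_fields` and the dictionary-based record shape are hypothetical, the body is assumed to be a JSON object, and the key names follow the sample payload above rather than any OTTL or OpenTelemetry API.

```python
import json

LABELS = "logging.googleapis.com/labels"
SOURCE_LOCATION = "logging.googleapis.com/sourceLocation"
SPAN_ID = "logging.googleapis.com/spanId"


def lift_fields(raw_body: str) -> dict:
    """Parse a JSON log body and lift selected fields out of it.

    Hypothetical helper: the returned dict only mimics a log record.
    """
    body = json.loads(raw_body)
    record = {"attributes": {}, "body": body}
    # User labels become plain attributes; a missing or null value is skipped.
    record["attributes"].update(body.pop(LABELS, None) or {})
    # The source location is kept as one structured attribute.
    if SOURCE_LOCATION in body:
        record["attributes"]["gcp.source_location"] = body.pop(SOURCE_LOCATION)
    # Top-level fields map onto record fields and leave the body.
    if SPAN_ID in body:
        record["span_id"] = body.pop(SPAN_ID)
    if "severity" in body:
        record["severity_text"] = body.pop("severity")
    return record
```

Every lifted key is also removed from the body, which is the "extract" half of the requirement and the part that turns out to be verbose in OTTL.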

In doing so, we ran into a lot of friction in OTTL, including many duplicated expressions.

Here's a "simplified" version (i.e. it still omits some if/where statements, and it needs new built-in functions for span processing):

context: log
statements:
- set(body, ParseJSON(body["message"])) where (body != nil and body["message"] != nil)
- merge_maps(attributes, body["logging.googleapis.com/labels"], "upsert") where body["logging.googleapis.com/labels"] != nil
- delete_key(body, "logging.googleapis.com/labels") where (body != nil and body["logging.googleapis.com/labels"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/httpRequest"]) where (body != nil and body["logging.googleapis.com/httpRequest"] != nil)
- delete_key(body, "logging.googleapis.com/httpRequest") where (body != nil and body["logging.googleapis.com/httpRequest"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.http_request"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/logName"]) where (body != nil and body["logging.googleapis.com/logName"] != nil)
- delete_key(body, "logging.googleapis.com/logName") where (body != nil and body["logging.googleapis.com/logName"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.log_name"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/severity"]) where (body != nil and body["logging.googleapis.com/severity"] != nil)
- delete_key(body, "logging.googleapis.com/severity") where (body != nil and body["logging.googleapis.com/severity"] != nil)
- set(cache["value"], cache["__field_0"])
- set(severity_text, cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/sourceLocation"]) where (body != nil and body["logging.googleapis.com/sourceLocation"] != nil)
- delete_key(body, "logging.googleapis.com/sourceLocation") where (body != nil and body["logging.googleapis.com/sourceLocation"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.source_location"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/spanId"]) where (body != nil and body["logging.googleapis.com/spanId"] != nil)
- delete_key(body, "logging.googleapis.com/spanId") where (body != nil and body["logging.googleapis.com/spanId"] != nil)
- set(cache["value"], cache["__field_0"])
- set(span_id, cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/trace"]) where (body != nil and body["logging.googleapis.com/trace"] != nil)
- delete_key(body, "logging.googleapis.com/trace") where (body != nil and body["logging.googleapis.com/trace"] != nil)
- set(cache["value"], cache["__field_0"])
- set(trace_id, cache["value"]) where (cache != nil and cache["value"] != nil)
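
The repetition above comes from one pattern applied to each field: check that the key exists, copy it, delete it from the body, then assign it. As a rough sketch of that single pattern (the helper `move_key` is hypothetical and is not an OTTL function), the same intent in Python fits in one small function:

```python
def move_key(source, key, target, target_key) -> bool:
    """Move source[key] to target[target_key] when the value is present.

    Hypothetical helper mirroring the guarded set/delete_key pairs above.
    Returns True when a value was moved.
    """
    # Equivalent of: where (body != nil and body[key] != nil)
    if source is None or source.get(key) is None:
        return False
    # Equivalent of the set(...) plus delete_key(...) pair.
    target[target_key] = source.pop(key)
    return True


body = {"logging.googleapis.com/logName": "projects/p/logs/app", "message": "hi"}
attributes = {}
moved = move_key(body, "logging.googleapis.com/logName", attributes, "gcp.log_name")
```

In the OTTL version the guard is restated on every statement because each statement is evaluated independently, which is the duplication this proposal aims to remove.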

Describe the solution you'd like

We'd like to propose a new expression-focused syntax for OTTL that would allow the previous OTTL to look like this:

on log
when log.body is aStringMessage(parsedJson(json))
yield log with {
  attributes: attributes with json["logging.googleapis.com/labels"] with {
    "gcp.log_name": json["logging.googleapis.com/logName"]
  },
  spanID: StringToSpanID(json["logging.googleapis.com/spanId"]),
  traceID: StringToTraceID(json["logging.googleapis.com/trace"]),
  severity_text: json["logging.googleapis.com/severity"],
  body: json with {
    "logging.googleapis.com/labels": nil,
    "logging.googleapis.com/logName": nil,
    "logging.googleapis.com/severity": nil,
    "logging.googleapis.com/sourceLocation": nil,
    "logging.googleapis.com/spanId": nil,
    "logging.googleapis.com/trace": nil,
  },
}
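
One possible reading of the `with` operator in the example above is a non-mutating structural update: later maps override earlier ones, and a nil value removes the key. That reading is an assumption made for illustration, not a definition taken from the prototype. Under that assumption, a minimal Python model (the name `with_fields` is hypothetical) could look like this:

```python
def with_fields(base: dict, *updates) -> dict:
    """Return a copy of base with each update applied in order.

    Assumed semantics: a None value deletes the key, and a None update
    (for example a missing labels map) is ignored.
    """
    result = dict(base)  # never mutate the input
    for update in updates:
        if update is None:
            continue
        for key, value in update.items():
            if value is None:
                result.pop(key, None)
            else:
                result[key] = value
    return result


json_body = {
    "logging.googleapis.com/labels": {"user_label_1": "value_1"},
    "logging.googleapis.com/logName": "app",
    "message": "hi",
}
attributes = with_fields(
    {"existing": "x"},
    json_body.get("logging.googleapis.com/labels"),
    {"gcp.log_name": json_body.get("logging.googleapis.com/logName")},
)
new_body = with_fields(
    json_body,
    {"logging.googleapis.com/labels": None, "logging.googleapis.com/logName": None},
)
```

Treating nil as "absent" lets one expression cover both the assignment and the existence check that the current OTTL version spreads across separate `where` clauses.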

At a high level we propose the following:

  • Support string-formatting literals, e.g. "I can reference {expression}s".
  • Add a type system for better "prior to evaluation" error messages, including the ability to
    get error messages without running the collector (e.g. Go, Rust, TypeScript).
  • Allow operations against structural data, preferably with a JSON-like feel (e.g. Jsonnet, TypeScript, Dart).
    • Assign multiple values at the same time.
    • Have the visual structure of the data mirrored in the code.
  • Provide list comprehensions (e.g. TypeScript, Python, Kotlin) that simplify operating against
    lists of "KeyValueList", i.e. Attributes.
    • This should dramatically reduce the number of built-in functions OTTL needs to be useful.
    • This should replace the need for: limit, truncate_all, replace_*, keep_keys, delete_keys.
  • Add pattern matching, or "binding patterns", to simplify dealing with generic AnyValue attributes and log bodies, reducing the need to duplicate intent between the where clause and the statements.
  • (Optional) Only reserve new keywords in a "stateful" context. (This would require migrating to a stateful lexer, but could help avoid reducing the set of available identifiers.)
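
To make the comprehension point concrete, here is how the listed built-ins could be expressed with Python dict comprehensions. This is an analogy only: the attribute values are made up, the `limit` line ignores the priority-key behaviour of the real OTTL function, and nothing here is proposed OTTL syntax.

```python
attributes = {
    "http.method": "GET",
    "http.url": "https://example.com/a/very/long/path",
    "user.id": "42",
    "debug": "x",
}

# keep_keys: keep only the listed keys.
kept = {k: v for k, v in attributes.items() if k in {"http.method", "user.id"}}

# delete a key: keep everything except it.
dropped = {k: v for k, v in attributes.items() if k != "debug"}

# truncate_all: cap every string value at 10 characters.
truncated = {k: v[:10] if isinstance(v, str) else v for k, v in attributes.items()}

# limit: keep at most the first 2 entries (simplified, no priority keys).
limited = dict(list(attributes.items())[:2])
```

Each operation is a filter or a map over key-value pairs, which is why a single comprehension form could stand in for several dedicated functions.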

Describe alternatives you've considered

We investigated leveraging CEL or Lua for this purpose.

Unfortunately, both have shortcomings that we think this proposal would address:

  • CEL lacks anonymous structural definitions. All key-value objects must be constructed as full types.
  • CEL and Lua lack pattern-matching syntax. We believe pattern matching, with an explicit "guard" vs. "execution" interpretation, can dramatically simplify many OTTL expressions.
  • CEL is inherently expression-based (returns a value), while OTTL is inherently imperative (mutates values).
  • Lua is a full-fledged programming language and could require more care to limit runtime complexity.

Additional context

I have a fully implemented prototype transpiler that takes this new syntax and generates existing OTTL statements from it. The prototype includes grammar suggestions and rationale.

I would like us to consider whether OTTL should expand its expression syntax prior to #28892.
