Description
Component(s)
pkg/ottl
Is your feature request related to a problem? Please describe.
We were attempting to use OTTL to transform Google's structured logging format in GKE (see: https://cloud.google.com/logging/docs/structured-logging) to the current example data model in OpenTelemetry (see: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model-appendix.md#google-cloud-logging).
Effectively, we're trying to parse a JSON log body and extract components into OTLP equivalents:
{
"severity":"ERROR",
"message":"There was an error in the application.",
"httpRequest":{
"requestMethod":"GET"
},
"times":"2020-10-12T07:20:50.52Z",
"logging.googleapis.com/insertId":"42",
"logging.googleapis.com/labels":{
"user_label_1":"value_1",
"user_label_2":"value_2"
},
"logging.googleapis.com/operation":{
"id":"get_data",
"producer":"github.com/MyProject/MyApplication",
"first":"true"
},
"logging.googleapis.com/sourceLocation":{
"file":"get_data.py",
"line":"142",
"function":"getData"
},
"logging.googleapis.com/spanId":"000000000000004a",
"logging.googleapis.com/trace":"projects/my-projectid/traces/06796866738c859f2f19b7cfb3214824",
"logging.googleapis.com/trace_sampled":false
}
While writing this transformation, we ran into a lot of friction in OTTL and many duplicated expressions.
Here's a "simplified" (i.e. still missing some if/where statements, and new required built-in functions for span processing) version:
context: log
statements:
- set(body, ParseJSON(body["message"])) where (body != nil and body["message"] != nil)
- merge_maps(attributes, body["logging.googleapis.com/labels"], "upsert") where body["logging.googleapis.com/labels"] != nil
- delete_key(body, "logging.googleapis.com/labels") where (body != nil and body["logging.googleapis.com/labels"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/httpRequest"]) where (body != nil and body["logging.googleapis.com/httpRequest"] != nil)
- delete_key(body, "logging.googleapis.com/httpRequest") where (body != nil and body["logging.googleapis.com/httpRequest"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.http_request"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/logName"]) where (body != nil and body["logging.googleapis.com/logName"] != nil)
- delete_key(body, "logging.googleapis.com/logName") where (body != nil and body["logging.googleapis.com/logName"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.log_name"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/severity"]) where (body != nil and body["logging.googleapis.com/severity"] != nil)
- delete_key(body, "logging.googleapis.com/severity") where (body != nil and body["logging.googleapis.com/severity"] != nil)
- set(cache["value"], cache["__field_0"])
- set(severity_text, cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/sourceLocation"]) where (body != nil and body["logging.googleapis.com/sourceLocation"] != nil)
- delete_key(body, "logging.googleapis.com/sourceLocation") where (body != nil and body["logging.googleapis.com/sourceLocation"] != nil)
- set(cache["value"], cache["__field_0"])
- set(attributes["gcp.source_location"], cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/spanId"]) where (body != nil and body["logging.googleapis.com/spanId"] != nil)
- delete_key(body, "logging.googleapis.com/spanId") where (body != nil and body["logging.googleapis.com/spanId"] != nil)
- set(cache["value"], cache["__field_0"])
- set(span_id, cache["value"]) where (cache != nil and cache["value"] != nil)
- delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)
- set(cache["__field_0"], body["logging.googleapis.com/trace"]) where (body != nil and body["logging.googleapis.com/trace"] != nil)
- delete_key(body, "logging.googleapis.com/trace") where (body != nil and body["logging.googleapis.com/trace"] != nil)
- set(cache["value"], cache["__field_0"])
- set(trace_id, cache["value"]) where (cache != nil and cache["value"] != nil)
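The repetition in the statements above is mechanical: after the initial parse and label merge, every field move expands to the same five statements. As a rough illustration (not part of OTTL or of the prototype mentioned below), the following Python sketch regenerates those statements from a table of field/target pairs. The helper name `move_field_statements` and the `MOVES` table are hypothetical and exist only for this example.

```python
PREFIX = "logging.googleapis.com/"


def move_field_statements(field: str, target: str) -> list[str]:
    """Return the five OTTL statements that the listing repeats for one field."""
    key = f'body["{PREFIX}{field}"]'
    return [
        # Clear the scratch slot left over from the previous field.
        'delete_key(cache, "__field_0") where (cache != nil and cache["__field_0"] != nil)',
        # Copy the field out of the body, then remove it from the body.
        f'set(cache["__field_0"], {key}) where (body != nil and {key} != nil)',
        f'delete_key(body, "{PREFIX}{field}") where (body != nil and {key} != nil)',
        # Stage the value and assign it to its OTLP destination.
        'set(cache["value"], cache["__field_0"])',
        f'set({target}, cache["value"]) where (cache != nil and cache["value"] != nil)',
    ]


# (source field, OTTL target path), in the order used in the listing.
MOVES = [
    ("httpRequest", 'attributes["gcp.http_request"]'),
    ("logName", 'attributes["gcp.log_name"]'),
    ("severity", "severity_text"),
    ("sourceLocation", 'attributes["gcp.source_location"]'),
    ("spanId", "span_id"),
    ("trace", "trace_id"),
]

statements = [s for field, target in MOVES for s in move_field_statements(field, target)]

for statement in statements:
    print(f"- {statement}")
```

Six table rows produce thirty statements, which is the kind of boilerplate an expression-oriented syntax would let users avoid writing by hand.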
Describe the solution you'd like
We'd like to propose a new expression-focused syntax for OTTL that would allow the previous OTTL to look like this:
on log
when log.body is aStringMessage(parsedJson(json))
yield log with {
attributes: attributes with json["logging.googleapis.com/labels"] with {
"gcp.log_name": json["logging.googleapis.com/logName"]
},
spanID: StringToSpanID(json["logging.googleapis.com/spanId"]),
traceID: StringToTraceID(json["logging.googleapis.com/trace"]),
severity_text: json["logging.googleapis.com/severity"],
body: json with {
"logging.googleapis.com/labels": nil,
"logging.googleapis.com/logName": nil,
"logging.googleapis.com/severity": nil,
"logging.googleapis.com/sourceLocation": nil,
"logging.googleapis.com/spanId": nil,
"logging.googleapis.com/trace": nil,
},
}
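To make the intended meaning of `with` more concrete, here is a minimal Python sketch of one possible reading: `with` returns a new map with the right-hand entries layered on top, and an entry whose value is nil removes the key. That nil-removes-the-key rule is an assumption, inferred from the example above, where the fields set to nil are the ones the current OTTL deletes from the body. The function name `with_` is hypothetical.

```python
from typing import Any, Mapping


def with_(base: Mapping[str, Any], *overlays: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of `base` with each overlay applied left to right.

    Assumed semantics: a None (nil) value removes the key instead of storing None.
    The input maps are never mutated.
    """
    result = dict(base)
    for overlay in overlays:
        for key, value in overlay.items():
            if value is None:
                result.pop(key, None)
            else:
                result[key] = value
    return result


json_body = {
    "message": "There was an error in the application.",
    "logging.googleapis.com/labels": {"user_label_1": "value_1"},
    "logging.googleapis.com/severity": "ERROR",
}
attributes = {"existing": "kept"}

# attributes with json[".../labels"] with { "gcp.log_name": json[".../logName"] }
new_attributes = with_(
    attributes,
    json_body["logging.googleapis.com/labels"],
    {"gcp.log_name": json_body.get("logging.googleapis.com/logName")},
)

# body: json with { ".../labels": nil, ".../severity": nil }
new_body = with_(
    json_body,
    {"logging.googleapis.com/labels": None, "logging.googleapis.com/severity": None},
)

print(new_attributes)
print(new_body)
```

Because nothing is mutated in place, every field of the yielded record can be described as one expression over the original input, which is what lets the visual structure of the code mirror the structure of the result.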
At a high level we propose the following:
- Support string-formatting literals, e.g. "I can reference {expression}s".
- Add a type system for better "prior to evaluation" error messages, including the ability to get error messages without running the collector (e.g. Go, Rust, TypeScript).
- Allow operations to operate against structural data, preferably with a JSON-like feel (e.g. Jsonnet, TypeScript, Dart).
  - Assign multiple values at the same time.
  - Have the visual structure mirrored in the code.
- Provide list comprehensions (e.g. TypeScript, Python, Kotlin) that simplify operating against lists of "KeyValueList", i.e. Attributes.
  - This should dramatically reduce the need for built-in functions to be worthwhile.
  - This should replace the need for: limit, truncate_all, replace_*, keep_keys, delete_keys.
- Pattern matching or "binding patterns", to simplify dealing with generic AnyValue attributes and log bodies, reducing the need to duplicate intent between the where clause and statements.
- (optional) Only reserve new keywords in a "stateful" context. (This would require migrating to a stateful lexer, but could help avoid reducing the available identifiers.)
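As an illustration of the list-comprehension point, the sketch below uses Python dict comprehensions as a stand-in for the proposed syntax, whose exact grammar is not shown in this issue. Each comprehension does roughly what one of the existing built-ins does today (keeping a set of keys, deleting a key, truncating string values). The attribute names and values are made up for the example.

```python
attributes = {
    "http.method": "GET",
    "http.url": "https://example.com/a/very/long/path",
    "user.id": 42,
}

# Roughly keep_keys(attributes, ["http.method", "user.id"])
kept = {k: v for k, v in attributes.items() if k in {"http.method", "user.id"}}

# Roughly delete_key(attributes, "http.url")
dropped = {k: v for k, v in attributes.items() if k != "http.url"}

# Roughly truncate_all(attributes, 10): only string values are shortened.
truncated = {k: v[:10] if isinstance(v, str) else v for k, v in attributes.items()}

print(kept)
print(dropped)
print(truncated)
```

All three are the same construct with a different filter or value expression, which is the argument for a single comprehension form over one built-in function per operation.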
Describe alternatives you've considered
We investigated leveraging CEL or Lua for this purpose.
Unfortunately, there are a few shortcomings that we think this proposal would alleviate:
- CEL lacks anonymous structural definitions. All key-value objects must be constructed as full types.
- CEL and Lua lack pattern-matching syntax. We believe pattern matching, with explicit "guard" vs. "execution" interpretation, can dramatically simplify many OTTL expressions.
- CEL is inherently expression-based (returns a value), while OTTL is inherently imperative (mutates values).
- Lua is a full-fledged programming language and could require more care to limit runtime complexity.
Additional context
I have a fully implemented prototype transpiler that takes this new syntax and generates current OTTL statements from it. The prototype includes grammar suggestions and rationale.
I would like to consider whether OTTL should expand its expression syntax prior to #28892.